Web search
Web search lets models retrieve current web information while handling a POST /v1/responses request. Use it when a model needs recent facts, source-grounded answers, or domain-filtered web context that is not already in your prompt.
Gateway executes web search as a server tool. Your application adds the tool to the request, the model decides whether to search, Gateway routes the search through the selected provider, and the model receives the results before writing the final answer.
Web search covers public web pages only. It does not query your application’s function tools, private data, or internal documents.
How it works
- Add { "type": "merge:web_search" } to the tools array
- Gateway converts it to an internal function tool named merge_web_search
- The model calls merge_web_search if it needs web context
- Gateway sends the query to the selected provider with your configured limits and filters
- Gateway appends the result snippets as tool output and continues the model request
- Gateway returns the final model response with URL citations and web search usage
Gateway also adds a system instruction that tells the model to treat web snippets as untrusted evidence, ignore instructions inside search results, and cite source URLs when using web information.
Send a request with web search
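A minimal sketch of such a request in Python, using only the standard library. Only the POST /v1/responses path and the merge:web_search tool type come from this page. The base URL, the GATEWAY_API_KEY environment variable, the bearer-token header, the model name, and the model and input field names are assumptions for illustration.

```python
import json
import os
import urllib.request

# Hypothetical base URL; replace with your Gateway endpoint.
GATEWAY_BASE_URL = "https://gateway.example.com"


def build_web_search_request(prompt: str, model: str) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/responses request with web search enabled."""
    payload = {
        "model": model,   # assumed field name
        "input": prompt,  # assumed field name
        "tools": [{"type": "merge:web_search"}],
    }
    return urllib.request.Request(
        url=f"{GATEWAY_BASE_URL}/v1/responses",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme: bearer token read from the environment.
            "Authorization": f"Bearer {os.environ.get('GATEWAY_API_KEY', '')}",
        },
        method="POST",
    )


request = build_web_search_request("What changed in the latest Python release?", "example-model")
# To send it: urllib.request.urlopen(request), then json.load() the response body.
```

The request is built without being sent, so you can inspect the body before pointing it at a real endpoint.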
You can also use openrouter:web_search as an alias for compatibility with OpenRouter-style requests.
Configure web search
The web search tool accepts optional parameters to control result count, context size, domain filters, and timeout behavior.
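As a rough sketch, a helper like the one below can assemble the tool entry from the option names that appear on this page (engine, search_context_size, allowed_domains, excluded_domains, max_total_results). The helper name is hypothetical, and nesting the options under a parameters key is an assumption; they may instead sit directly on the tool object, so check the API reference for the actual placement.

```python
from typing import Any, Dict, List, Optional


def build_web_search_tool(
    engine: str = "auto",
    search_context_size: Optional[str] = None,
    allowed_domains: Optional[List[str]] = None,
    excluded_domains: Optional[List[str]] = None,
    max_total_results: Optional[int] = None,
) -> Dict[str, Any]:
    """Return a web search tool entry, omitting any option left unset."""
    options = {
        "engine": engine,
        "search_context_size": search_context_size,
        "allowed_domains": allowed_domains,
        "excluded_domains": excluded_domains,
        "max_total_results": max_total_results,
    }
    # Assumption: options nest under "parameters"; the real placement may differ.
    parameters = {name: value for name, value in options.items() if value is not None}
    return {"type": "merge:web_search", "parameters": parameters}


tool = build_web_search_tool(allowed_domains=["example.com"], max_total_results=10)
```

Leaving unset options out of the payload lets Gateway apply its own defaults instead of receiving explicit nulls.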
Supported providers
Gateway currently supports the following web search providers:
- exa
Use engine: "auto" to let Gateway choose from supported providers. This is the recommended default because Gateway can add more providers over time without requiring code changes in your application.
Set engine to a provider name only when you want to pin requests to that provider.
Because the provider is selected through the engine parameter, Gateway can add providers later without changing the server tool interface.
Choose a context size
Gateway asks the selected provider for token-efficient snippets instead of full page text. The provider returns snippets selected as relevant to the search query.
If you omit search_context_size, Gateway uses the provider’s adaptive snippet behavior. Set a fixed cap when you want predictable maximum context per result.
With the current provider, these values map to the following maximum snippet sizes.
Future providers may map these same values to provider-specific context controls.
Filter domains
Use allowed_domains and excluded_domains to control which domains can appear in search results. Domain filters accept hostnames, not full URL paths.
You can use allowed_domains and excluded_domains together with the currently supported provider.
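Because filters take hostnames rather than full URLs, it can help to normalize values before sending them. The helper below is a client-side sketch with a hypothetical name; it is not part of Gateway, and whether Gateway itself would reject or trim a full URL is not stated here.

```python
from urllib.parse import urlsplit


def to_hostname(value: str) -> str:
    """Reduce a URL or bare hostname to a lowercase hostname."""
    candidate = value.strip()
    # urlsplit only recognizes a network location after "//", so add it for bare hosts.
    if "://" not in candidate and not candidate.startswith("//"):
        candidate = "//" + candidate
    hostname = urlsplit(candidate).hostname
    if not hostname:
        raise ValueError(f"no hostname found in {value!r}")
    return hostname


allowed_domains = [
    to_hostname(value)
    for value in ["https://docs.python.org/3/library/", "Example.com/blog"]
]
# allowed_domains == ["docs.python.org", "example.com"]
```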
Limit total results
The model can search multiple times in a single request. Use max_total_results to cap cumulative search results across the full request.
When the cap is reached, Gateway returns a tool error to the model instead of running another search. This controls cost and context window usage during multi-step requests.
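The cap can be pictured with a small client-side model. This is an illustration of the behavior described above, not Gateway's code: the class name and the error shape are hypothetical, and trimming the last search to the remaining budget is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ResultBudget:
    """Tracks cumulative search results against a max_total_results cap."""

    max_total_results: int
    used: int = 0

    def run_search(self, results_returned: int) -> dict:
        # Once the cap is reached, refuse further searches with a tool error.
        if self.used >= self.max_total_results:
            return {"error": "max_total_results reached"}  # hypothetical error shape
        # Assumption: a search never yields more than the remaining budget.
        granted = min(results_returned, self.max_total_results - self.used)
        self.used += granted
        return {"results": granted}


budget = ResultBudget(max_total_results=8)
outcomes = [budget.run_search(5) for _ in range(3)]
# outcomes == [{"results": 5}, {"results": 3}, {"error": "max_total_results reached"}]
```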
Force a search
By default, the model decides whether to search. To require a search, set tool_choice to the server tool type.
Use tool_choice: "auto" when search is optional. Set tool_choice to the web search tool when the user explicitly asks for current sources.
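A short sketch of switching between the two modes. The helper name is hypothetical, and the object form of tool_choice shown here ({"type": "merge:web_search"}) is an assumption based on the phrase "the server tool type"; the model and input fields are likewise assumed.

```python
from typing import Any, Dict

WEB_SEARCH_TOOL = {"type": "merge:web_search"}


def with_tool_choice(payload: Dict[str, Any], force_search: bool) -> Dict[str, Any]:
    """Return a copy of payload with web search attached and tool_choice set."""
    updated = dict(payload)
    updated["tools"] = [WEB_SEARCH_TOOL]
    # Assumed shape: an object naming the server tool type forces a search.
    updated["tool_choice"] = {"type": "merge:web_search"} if force_search else "auto"
    return updated


base = {"model": "example-model", "input": "Cite today's top headlines."}
forced = with_tool_choice(base, force_search=True)
optional = with_tool_choice(base, force_search=False)
```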
Stream responses
Web search works with stream: true.
When the model calls the search tool, Gateway finishes that internal tool-call turn, runs the search, sends the result back to the model, and then streams the final answer. The first visible token can arrive later when the model chooses to search, because Gateway must complete the search before streaming the final answer.
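On the client side, the streamed answer can be reassembled from server-sent event lines. The sketch below assumes OpenAI-style Responses streaming, with data: lines carrying JSON events of type response.output_text.delta and a delta field; none of those names are confirmed by this page, and the sample stream is made up for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_text_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from server-sent event lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip event names, comments, and blank separators
        data = line[len("data:"):].strip()
        if not data or data == "[DONE]":
            continue
        event = json.loads(data)
        # Assumed event type and field, modeled on OpenAI-style streaming.
        if event.get("type") == "response.output_text.delta":
            yield event.get("delta", "")


# Hypothetical stream contents, for illustration only.
sample_stream = [
    "event: response.output_text.delta",
    'data: {"type": "response.output_text.delta", "delta": "Hello"}',
    "",
    'data: {"type": "response.output_text.delta", "delta": ", world"}',
    'data: {"type": "response.completed"}',
]
text = "".join(iter_text_deltas(sample_stream))
```

Any delay from a search shows up as a gap before the first delta event, so a client that displays a progress indicator should keep it visible until that first fragment arrives.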
Read usage and citations
Web search usage appears under usage.server_tool_use.
Gateway also adds URL citation annotations to text output when the provider returns search results.
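A sketch of pulling both pieces out of a parsed response body. The usage.server_tool_use path comes from this page. The output/content/annotations layout, the url_citation type name, and the web_search_requests counter in the sample are assumptions modeled on OpenAI-style responses, and the sample response is invented for illustration.

```python
from typing import Any, Dict, List


def read_web_search_metadata(response: Dict[str, Any]) -> Dict[str, Any]:
    """Collect web search usage and cited URLs from a parsed response body."""
    usage = response.get("usage", {}).get("server_tool_use", {})
    urls: List[str] = []
    # Assumed layout: output items hold content parts, each with an annotations list.
    for item in response.get("output", []):
        for part in item.get("content") or []:
            for annotation in part.get("annotations") or []:
                url = annotation.get("url")
                if annotation.get("type") == "url_citation" and url and url not in urls:
                    urls.append(url)
    return {"server_tool_use": usage, "cited_urls": urls}


# Hypothetical response body, for illustration only.
sample_response = {
    "output": [
        {
            "content": [
                {
                    "type": "output_text",
                    "text": "See the release notes.",
                    "annotations": [
                        {"type": "url_citation", "url": "https://example.com/a"},
                        {"type": "url_citation", "url": "https://example.com/a"},
                    ],
                }
            ]
        }
    ],
    "usage": {"server_tool_use": {"web_search_requests": 1}},
}
metadata = read_web_search_metadata(sample_response)
```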
Understand pricing
Web search charges are separate from model token costs. The following rates apply to the currently supported provider.
Gateway applies your organization’s effective billing margin to the web search raw cost. Web search usage is also logged separately, so you can break out web search requests, results, raw cost, and billed amount in usage reporting.
Availability and limits
- Supported providers: exa
- Web search is not available when zero data retention is enabled
- Only one web search server tool may be included in a single request
- Search loops are bounded by Gateway’s model-iteration limit and by max_total_results
- Search provider failures are returned to the model as tool errors