Web search

Give models access to current web information through Gateway

Web search lets models retrieve current web information while handling a POST /v1/responses request. Use it when a model needs recent facts, source-grounded answers, or domain-filtered web context that is not already in your prompt.

Gateway executes web search as a server tool. Your application adds the tool to the request, the model decides whether to search, Gateway routes the search through the selected provider, and the model receives the results before writing the final answer.

Web search covers public web pages only. It does not search your application’s function tools, private data, or internal documents.

How it works

  1. Add { "type": "merge:web_search" } to the tools array
  2. Gateway converts it to an internal function tool named merge_web_search
  3. The model calls merge_web_search if it needs web context
  4. Gateway sends the query to the selected provider with your configured limits and filters
  5. Gateway appends the result snippets as tool output and continues the model request
  6. Gateway returns the final model response with URL citations and web search usage

Gateway also adds a system instruction that tells the model to treat web snippets as untrusted evidence, ignore instructions inside search results, and cite source URLs when using web information.

curl https://api-gateway.merge.dev/v1/responses \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4-5",
    "input": [
      {
        "type": "message",
        "role": "user",
        "content": "What were the biggest AI infrastructure announcements this week? Include sources."
      }
    ],
    "tools": [
      { "type": "merge:web_search" }
    ]
  }'
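
The same request can be sketched in Python using only the standard library. The helper names build_request and send are hypothetical, the 60-second client timeout is an arbitrary choice, and the sketch assumes a non-streaming request returns a JSON body.

```python
import json
import urllib.request

GATEWAY_URL = "https://api-gateway.merge.dev/v1/responses"


def build_request(prompt, model="anthropic/claude-sonnet-4-5"):
    """Build a /v1/responses payload with the web search server tool attached."""
    return {
        "model": model,
        "input": [{"type": "message", "role": "user", "content": prompt}],
        "tools": [{"type": "merge:web_search"}],
    }


def send(payload, api_key):
    """POST the payload to Gateway and return the decoded JSON body (assumed JSON)."""
    request = urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


payload = build_request(
    "What were the biggest AI infrastructure announcements this week? Include sources."
)
print(json.dumps(payload, indent=2))
```

Call send(payload, "YOUR_API_KEY") to submit the request. Keeping payload construction separate from the network call makes the payload easy to inspect or test before anything is sent.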

You can also use openrouter:web_search as an alias for compatibility with OpenRouter-style requests.

{
  "tools": [
    { "type": "openrouter:web_search" }
  ]
}

The web search tool accepts optional parameters to control result count, context size, domain filters, and timeout behavior.

{
  "type": "merge:web_search",
  "parameters": {
    "engine": "auto",
    "max_results": 5,
    "max_total_results": 20,
    "search_context_size": "medium",
    "allowed_domains": ["arxiv.org", "nature.com"],
    "excluded_domains": ["reddit.com"],
    "user_location": {
      "country": "US"
    },
    "timeout_seconds": 10
  }
}

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| engine | string | auto | Search provider selection. Use auto to let Gateway choose the provider, or set a supported provider explicitly |
| max_results | integer | 5 | Maximum results per search call. Must be between 1 and 25 |
| max_total_results | integer | max_results * 5 | Maximum total results across all search calls in one request. Must be between 1 and 250 |
| search_context_size | string | Adaptive snippets | Amount of snippet context per result. Use low, medium, or high |
| allowed_domains | string[] | None | Only return results from these domains |
| excluded_domains | string[] | None | Exclude results from these domains |
| user_location | object | None | Optional location hint. Gateway currently uses country when provided |
| timeout_seconds | number | 10 | Timeout for the search provider request, in seconds |
| api_key | string | Merge-managed key | Optional search-provider API key override for this request |
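
The documented ranges can be checked on the client before a request is sent. The following is a minimal sketch: web_search_tool is a hypothetical helper, not part of any Gateway SDK, and it validates only the limits listed in the table above. How Gateway itself reports out-of-range values is not covered here.

```python
ALLOWED_CONTEXT_SIZES = {"low", "medium", "high"}


def web_search_tool(max_results=5, max_total_results=None,
                    search_context_size=None, **extra):
    """Return a merge:web_search tool entry, rejecting out-of-range values early."""
    if not 1 <= max_results <= 25:
        raise ValueError("max_results must be between 1 and 25")
    parameters = {"max_results": max_results}

    # Omitting max_total_results leaves the documented default (max_results * 5).
    if max_total_results is not None:
        if not 1 <= max_total_results <= 250:
            raise ValueError("max_total_results must be between 1 and 250")
        parameters["max_total_results"] = max_total_results

    # Omitting search_context_size leaves the adaptive snippet behavior.
    if search_context_size is not None:
        if search_context_size not in ALLOWED_CONTEXT_SIZES:
            raise ValueError("search_context_size must be low, medium, or high")
        parameters["search_context_size"] = search_context_size

    # Pass other documented parameters (engine, allowed_domains, ...) through unchanged.
    parameters.update(extra)
    return {"type": "merge:web_search", "parameters": parameters}


tool = web_search_tool(max_results=5, max_total_results=15,
                       allowed_domains=["arxiv.org"])
print(tool)
```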

Supported providers

Gateway currently supports the following web search providers:

  • exa

Use engine: "auto" to let Gateway choose from supported providers. This is the recommended default because Gateway can add more providers over time without requiring code changes in your application.

{
  "type": "merge:web_search",
  "parameters": {
    "engine": "auto"
  }
}

Set engine to a provider name only when you want to pin requests to that provider.

{
  "type": "merge:web_search",
  "parameters": {
    "engine": "exa"
  }
}

Because the provider is selected through the engine parameter, Gateway can add more providers later behind the same server tool.

Choose a context size

Gateway asks the selected provider for token-efficient snippets instead of full page text. The provider returns snippets selected as relevant to the search query.

If you omit search_context_size, Gateway uses the provider’s adaptive snippet behavior. Set a fixed cap when you want predictable maximum context per result.

With the current provider, these values map to the following maximum snippet sizes.

| Value | Current provider snippet cap |
| --- | --- |
| low | 5,000 characters |
| medium | 15,000 characters |
| high | 30,000 characters |

Future providers may map these same values to provider-specific context controls.
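
To see how a cap translates into worst-case context, multiply it by the number of results. The sketch below uses the current provider's caps from the table; max_snippet_chars is a hypothetical helper, and the figures are upper bounds in characters (not tokens), since snippets can be shorter than the cap.

```python
# Current provider snippet caps, in characters per result.
SNIPPET_CAP_CHARS = {"low": 5_000, "medium": 15_000, "high": 30_000}


def max_snippet_chars(search_context_size, results):
    """Upper bound on snippet characters for a given number of results."""
    return SNIPPET_CAP_CHARS[search_context_size] * results


# One search returning the default 5 results at medium context.
print(max_snippet_chars("medium", 5))   # 75000

# A whole request capped at 20 total results at high context.
print(max_snippet_chars("high", 20))    # 600000
```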

Filter domains

Use allowed_domains and excluded_domains to control which domains can appear in search results. Domain filters accept hostnames, not full URL paths.

{
  "type": "merge:web_search",
  "parameters": {
    "allowed_domains": ["arxiv.org", "nature.com"],
    "excluded_domains": ["reddit.com"]
  }
}

You can use allowed_domains and excluded_domains together with the currently supported provider.
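
Because the filters take hostnames rather than full URLs, it can help to normalize user-supplied values first. This is a minimal sketch using Python's urllib.parse; to_hostname and domain_filters are hypothetical helpers. It does not strip prefixes such as www, because how Gateway matches subdomains is not specified here.

```python
from urllib.parse import urlsplit


def to_hostname(value):
    """Reduce a URL or bare hostname to a lowercase hostname."""
    candidate = value.strip()
    if "://" not in candidate:
        # A leading // makes urlsplit treat the text as a network location.
        candidate = "//" + candidate
    hostname = urlsplit(candidate).hostname
    if not hostname:
        raise ValueError(f"cannot extract a hostname from {value!r}")
    return hostname


def domain_filters(allowed=(), excluded=()):
    """Build the domain filter parameters, dropping duplicates but keeping order."""
    parameters = {}
    if allowed:
        parameters["allowed_domains"] = list(dict.fromkeys(map(to_hostname, allowed)))
    if excluded:
        parameters["excluded_domains"] = list(dict.fromkeys(map(to_hostname, excluded)))
    return parameters


print(domain_filters(
    allowed=["https://arxiv.org/abs/1706.03762", "nature.com"],
    excluded=["Reddit.com/r/MachineLearning"],
))
```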

Limit total results

The model can search multiple times in a single request. Use max_total_results to cap cumulative search results across the full request.

{
  "type": "merge:web_search",
  "parameters": {
    "max_results": 5,
    "max_total_results": 15
  }
}

When the cap is reached, Gateway returns a tool error to the model instead of running another search. This controls cost and context window usage during multi-step requests.

By default, the model decides whether to search. To require a search, set tool_choice to the server tool type.

{
  "model": "anthropic/claude-sonnet-4-5",
  "input": [
    {
      "type": "message",
      "role": "user",
      "content": "Find recent sources about GPU export control changes."
    }
  ],
  "tools": [
    { "type": "merge:web_search" }
  ],
  "tool_choice": { "type": "merge:web_search" }
}

Use tool_choice: "auto" when search is optional. Set tool_choice to the web search tool when the user explicitly asks for current sources.
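
One way to switch between optional and required search in application code is a small payload helper. The sketch below is illustrative: with_web_search is a hypothetical function, and mirroring an existing openrouter:web_search alias in tool_choice is an assumption rather than documented behavior.

```python
WEB_SEARCH_TYPES = {"merge:web_search", "openrouter:web_search"}


def with_web_search(payload, require_search=False):
    """Return a copy of payload with one web search tool and a matching tool_choice."""
    updated = dict(payload)
    tools = list(payload.get("tools", []))

    # Only one web search server tool may appear in a request, so reuse an existing one.
    existing = next((t for t in tools if t.get("type") in WEB_SEARCH_TYPES), None)
    if existing is None:
        existing = {"type": "merge:web_search"}
        tools.append(existing)

    updated["tools"] = tools
    # Assumption: tool_choice names the same tool type that appears in tools.
    updated["tool_choice"] = {"type": existing["type"]} if require_search else "auto"
    return updated


base = {"model": "anthropic/claude-sonnet-4-5", "input": []}
print(with_web_search(base)["tool_choice"])                       # auto
print(with_web_search(base, require_search=True)["tool_choice"])  # {'type': 'merge:web_search'}
```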

Stream responses

Web search works with stream: true.

When the model calls the search tool, Gateway finishes that internal tool-call turn, runs the search, sends the result back to the model, and then streams the final answer. The first visible token can arrive later when the model chooses to search, because Gateway must complete the search before streaming the final answer.

{
  "model": "anthropic/claude-sonnet-4-5",
  "stream": true,
  "input": [
    {
      "type": "message",
      "role": "user",
      "content": "What changed in the latest Kubernetes release?"
    }
  ],
  "tools": [
    { "type": "merge:web_search" }
  ]
}

Read usage and citations

Web search usage appears under usage.server_tool_use.

{
  "usage": {
    "input_tokens": 105,
    "output_tokens": 250,
    "server_tool_use": {
      "web_search_requests": 2,
      "web_search_results": 10
    }
  }
}

Gateway also adds URL citation annotations to text output when the provider returns search results.

{
  "type": "text",
  "text": "The latest release adds...",
  "annotations": [
    {
      "type": "url_citation",
      "url": "https://kubernetes.io/blog/release-notes",
      "title": "Kubernetes release notes",
      "content": "Relevant excerpt from the page"
    }
  ]
}
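
To render a source list, you can collect the url_citation annotations from a text item. This is a minimal sketch based on the annotation shape shown above; url_citations is a hypothetical helper, and where text items sit within the full response body is not assumed here.

```python
def url_citations(text_item):
    """Return unique (title, url) pairs from a text item's url_citation annotations."""
    seen = set()
    citations = []
    for annotation in text_item.get("annotations", []):
        if annotation.get("type") != "url_citation":
            continue
        url = annotation.get("url")
        if not url or url in seen:
            continue
        seen.add(url)
        # Fall back to the URL when a citation has no title.
        citations.append((annotation.get("title") or url, url))
    return citations


item = {
    "type": "text",
    "text": "The latest release adds...",
    "annotations": [
        {
            "type": "url_citation",
            "url": "https://kubernetes.io/blog/release-notes",
            "title": "Kubernetes release notes",
            "content": "Relevant excerpt from the page",
        }
    ],
}
for title, url in url_citations(item):
    print(f"{title}: {url}")
```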

Understand pricing

Web search charges are separate from model token costs. The following rates apply to the currently supported provider.

| Item | Price |
| --- | --- |
| Search request | $0.007 per request |
| Included results | 10 results per request |
| Additional results | $0.001 per result above the included result count |

Gateway applies your organization’s effective billing margin to the web search raw cost. Web search usage is also logged separately, so you can break out web search requests, results, raw cost, and billed amount in usage reporting.
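
As a worked example, the raw cost can be estimated from the rates in the table. The sketch below assumes the included results are counted per search request and leaves out the billing margin, because how that margin is applied is not specified here. raw_search_cost is a hypothetical helper.

```python
from decimal import Decimal

REQUEST_PRICE = Decimal("0.007")       # per search request
INCLUDED_RESULTS = 10                  # results included with each request
EXTRA_RESULT_PRICE = Decimal("0.001")  # per result above the included count


def raw_search_cost(results_per_request):
    """Raw web search cost in USD, given the result count of each search request."""
    total = Decimal("0")
    for results in results_per_request:
        extra = max(0, results - INCLUDED_RESULTS)
        total += REQUEST_PRICE + EXTRA_RESULT_PRICE * extra
    return total


# Two searches of 5 results each: both stay within the included results.
print(raw_search_cost([5, 5]))  # 0.014

# One search of 25 results: 15 results are billed on top of the request price.
print(raw_search_cost([25]))    # 0.022
```

Decimal avoids the rounding noise that binary floats introduce when summing small currency amounts.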

Availability and limits

  • Supported providers: exa
  • Web search is not available when zero data retention is enabled
  • Only one web search server tool may be included in a single request
  • Search loops are bounded by Gateway’s model-iteration limit and by max_total_results
  • Search provider failures are returned to the model as tool errors

Next steps