Merge API Documentation

Web search lets models retrieve current web information while handling a POST /v1/responses request. Use it when a model needs recent facts, source-grounded answers, or domain-filtered web context that is not already in your prompt.

Gateway executes web search as a server tool. Your application adds the tool to the request, the model decides whether to search, Gateway routes the search through the selected provider, and the model receives the results before writing the final answer.

Web search covers public web pages only. It does not search private data or internal documents, and it does not call your application’s function tools.

How it works

Add { "type": "merge:web_search" } to the tools array
Gateway converts it to an internal function tool named merge_web_search
The model calls merge_web_search if it needs web context
Gateway sends the query to the selected provider with your configured limits and filters
Gateway appends the result snippets as tool output and continues the model request
Gateway returns the final model response with URL citations and web search usage

Gateway also adds a system instruction that tells the model to treat web snippets as untrusted evidence, ignore instructions inside search results, and cite source URLs when using web information.

Send a request with web search

curl https://api-gateway.merge.dev/v1/responses \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4-5",
    "input": [
      {
        "type": "message",
        "role": "user",
        "content": "What were the biggest AI infrastructure announcements this week? Include sources."
      }
    ],
    "tools": [
      { "type": "merge:web_search" }
    ]
  }'

You can also use openrouter:web_search as an alias for compatibility with OpenRouter-style requests.

{
  "tools": [
    { "type": "openrouter:web_search" }
  ]
}
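For applications that do not shell out to curl, the same request can be assembled in Python. The following is a minimal sketch using only the standard library; the helper name `build_search_request` is hypothetical, and the endpoint, headers, and body shape are copied from the curl example above. It builds the request without sending it.

```python
import json
import urllib.request

# Endpoint taken from the curl example in this guide.
GATEWAY_URL = "https://api-gateway.merge.dev/v1/responses"


def build_search_request(api_key, prompt, tool_type="merge:web_search",
                         model="anthropic/claude-sonnet-4-5"):
    """Build (but do not send) a POST /v1/responses request with web search.

    Hypothetical helper for illustration; not part of any Merge SDK.
    """
    body = {
        "model": model,
        "input": [{"type": "message", "role": "user", "content": prompt}],
        "tools": [{"type": tool_type}],
    }
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Same request, using the OpenRouter-compatible alias for the tool type.
request = build_search_request(
    "YOUR_API_KEY",
    "What were the biggest AI infrastructure announcements this week?",
    tool_type="openrouter:web_search",
)
```

To send the request, pass it to `urllib.request.urlopen` and decode the JSON response body.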

Configure web search

The web search tool accepts optional parameters to control result count, context size, domain filters, and timeout behavior.

{
  "type": "merge:web_search",
  "parameters": {
    "engine": "auto",
    "max_results": 5,
    "max_total_results": 20,
    "search_context_size": "medium",
    "allowed_domains": ["arxiv.org", "nature.com"],
    "excluded_domains": ["reddit.com"],
    "user_location": {
      "country": "US"
    },
    "timeout_seconds": 10
  }
}

Parameter	Type	Default	Description
`engine`	string	`auto`	Search provider selection. Use `auto` to let Gateway choose the provider, or set a supported provider explicitly
`max_results`	integer	`5`	Maximum results per search call. Must be between `1` and `25`
`max_total_results`	integer	`max_results * 5`	Maximum total results across all search calls in one request. Must be between `1` and `250`
`search_context_size`	string	Adaptive snippets	Amount of snippet context per result. Use `low`, `medium`, or `high`
`allowed_domains`	string[]	None	Only return results from these domains
`excluded_domains`	string[]	None	Exclude results from these domains
`user_location`	object	None	Optional location hint. Gateway currently uses `country` when provided
`timeout_seconds`	number	`10`	Timeout for the search provider request
`api_key`	string	Merge-managed key	Optional search-provider API key override for this request
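The documented ranges can be checked on the client before a request is sent. The sketch below is an illustrative pre-flight check, not Gateway's own validation: the function name `validate_parameters` is hypothetical, and it covers only the constraints stated in the table above (the `max_results` and `max_total_results` ranges, the `max_results * 5` default, and the three `search_context_size` values).

```python
CONTEXT_SIZES = {"low", "medium", "high"}


def validate_parameters(params):
    """Return a list of problems found in a web search `parameters` dict.

    Hypothetical client-side check based only on the documented ranges.
    """
    problems = []

    max_results = params.get("max_results", 5)  # documented default
    if not isinstance(max_results, int) or not 1 <= max_results <= 25:
        problems.append("max_results must be an integer between 1 and 25")
        max_results = 5  # fall back so the next default can be computed

    # Documented default for the cumulative cap is max_results * 5.
    max_total = params.get("max_total_results", max_results * 5)
    if not isinstance(max_total, int) or not 1 <= max_total <= 250:
        problems.append("max_total_results must be an integer between 1 and 250")

    size = params.get("search_context_size")
    if size is not None and size not in CONTEXT_SIZES:
        problems.append("search_context_size must be low, medium, or high")

    return problems


print(validate_parameters({"max_results": 5, "max_total_results": 20}))  # []
```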

Supported providers

Gateway currently supports the following web search providers:

exa

Use engine: "auto" to let Gateway choose from supported providers. This is the recommended default because Gateway can add more providers over time without requiring code changes in your application.

{
  "type": "merge:web_search",
  "parameters": {
    "engine": "auto"
  }
}

Set engine to a provider name only when you want to pin requests to that provider.

{
  "type": "merge:web_search",
  "parameters": {
    "engine": "exa"
  }
}

Because the provider is chosen through the engine parameter, Gateway can add providers later without changing the shape of the tool request.

Choose a context size

Gateway asks the selected provider for token-efficient snippets instead of full page text. The provider returns snippets selected as relevant to the search query.

If you omit search_context_size, Gateway uses the provider’s adaptive snippet behavior. Set a fixed cap when you want predictable maximum context per result.

With the current provider, these values map to the following maximum snippet sizes.

Value	Current provider snippet cap
`low`	5,000 characters
`medium`	15,000 characters
`high`	30,000 characters

Future providers may map these same values to provider-specific context controls.
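To see what these caps mean for context window usage, multiply the per-result cap by the number of results. The sketch below is a worst-case estimate under the current provider's caps listed above; the function name is hypothetical, and actual snippets may be shorter than the cap.

```python
# Per-result snippet caps for the current provider, from the table above.
SNIPPET_CAPS = {"low": 5_000, "medium": 15_000, "high": 30_000}


def max_snippet_characters(search_context_size, result_count):
    """Upper bound on snippet characters for a given number of results."""
    return SNIPPET_CAPS[search_context_size] * result_count


# One search with the default max_results of 5 at "medium":
print(max_snippet_characters("medium", 5))   # 75000

# A full request that reaches max_total_results = 20 at "high":
print(max_snippet_characters("high", 20))    # 600000
```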

Filter domains

Use allowed_domains and excluded_domains to control which domains can appear in search results. Domain filters accept hostnames, not full URL paths.

{
  "type": "merge:web_search",
  "parameters": {
    "allowed_domains": ["arxiv.org", "nature.com"],
    "excluded_domains": ["reddit.com"]
  }
}

You can use allowed_domains and excluded_domains together with the currently supported provider.
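Because the filters take hostnames rather than full URLs, it can help to normalize user-supplied values before building the parameters. The following sketch uses Python's standard `urllib.parse`; the helper names `to_hostname` and `domain_filters` are hypothetical. Whether Gateway treats subdomains such as `www.` specially is not stated here, so the sketch leaves them untouched.

```python
from urllib.parse import urlsplit


def to_hostname(value):
    """Reduce a URL or bare domain to a lowercase hostname."""
    value = value.strip()
    if "://" not in value:
        # Without a scheme, urlsplit would treat the host as a path.
        value = "//" + value
    host = urlsplit(value).hostname
    if not host:
        raise ValueError(f"no hostname found in {value!r}")
    return host


def domain_filters(allowed=(), excluded=()):
    """Build the domain-filter portion of the web search parameters."""
    params = {}
    if allowed:
        params["allowed_domains"] = [to_hostname(v) for v in allowed]
    if excluded:
        params["excluded_domains"] = [to_hostname(v) for v in excluded]
    return params


print(domain_filters(["https://arxiv.org/abs/1706.03762"], ["reddit.com"]))
# {'allowed_domains': ['arxiv.org'], 'excluded_domains': ['reddit.com']}
```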

Limit total results

The model can search multiple times in a single request. Use max_total_results to cap cumulative search results across the full request.

{
  "type": "merge:web_search",
  "parameters": {
    "max_results": 5,
    "max_total_results": 15
  }
}

When the cap is reached, Gateway returns a tool error to the model instead of running another search. This controls cost and context window usage during multi-step requests.
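As a rough planning aid, the two limits together bound how many searches the model can run. The sketch below assumes every search returns exactly `max_results` results and that the cap counts returned results; both are assumptions for illustration, and a search that returns fewer results would leave more room under the cap. The function name is hypothetical.

```python
def full_searches_before_cap(max_results=5, max_total_results=None):
    """Number of full-size searches that fit under max_total_results.

    Assumes each search returns exactly max_results results.
    """
    if max_total_results is None:
        max_total_results = max_results * 5  # documented default
    return max_total_results // max_results


print(full_searches_before_cap(5, 15))  # 3
print(full_searches_before_cap(5))      # 5
```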

Force a search

By default, the model decides whether to search. To require a search, set tool_choice to the server tool type.

{
  "model": "anthropic/claude-sonnet-4-5",
  "input": [
    {
      "type": "message",
      "role": "user",
      "content": "Find recent sources about GPU export control changes."
    }
  ],
  "tools": [
    { "type": "merge:web_search" }
  ],
  "tool_choice": { "type": "merge:web_search" }
}

Use tool_choice: "auto" when search is optional. Set tool_choice to the web search tool when the user explicitly asks for current sources.
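An application that decides per request whether to force a search can apply the setting to an existing request body. The sketch below is one way to do that; `force_web_search` is a hypothetical helper, and pointing `tool_choice` at the `openrouter:web_search` alias when that alias is the tool in the request is an assumption, since only the `merge:web_search` form is shown above.

```python
import copy

# Tool types that refer to the web search server tool.
WEB_SEARCH_TYPES = {"merge:web_search", "openrouter:web_search"}


def force_web_search(body):
    """Return a copy of a request body that requires a web search."""
    forced = copy.deepcopy(body)
    tools = forced.setdefault("tools", [])
    existing = [t for t in tools if t.get("type") in WEB_SEARCH_TYPES]
    if existing:
        tool_type = existing[0]["type"]
    else:
        # Add the tool if the caller did not include it.
        tool_type = "merge:web_search"
        tools.append({"type": tool_type})
    forced["tool_choice"] = {"type": tool_type}
    return forced


body = {
    "model": "anthropic/claude-sonnet-4-5",
    "input": [{"type": "message", "role": "user",
               "content": "Find recent sources about GPU export control changes."}],
}
print(force_web_search(body)["tool_choice"])  # {'type': 'merge:web_search'}
```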

Stream responses

Web search works with stream: true.

When the model calls the search tool, Gateway finishes that internal tool-call turn, runs the search, sends the result back to the model, and then streams the final answer. The first visible token can arrive later when the model chooses to search, because Gateway must complete the search before streaming the final answer.

{
  "model": "anthropic/claude-sonnet-4-5",
  "stream": true,
  "input": [
    {
      "type": "message",
      "role": "user",
      "content": "What changed in the latest Kubernetes release?"
    }
  ],
  "tools": [
    { "type": "merge:web_search" }
  ]
}

Read usage and citations

Web search usage appears under usage.server_tool_use.

{
  "usage": {
    "input_tokens": 105,
    "output_tokens": 250,
    "server_tool_use": {
      "web_search_requests": 2,
      "web_search_results": 10
    }
  }
}
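A small helper can pull these counters out of a parsed response. The sketch below assumes the response has already been decoded into a Python dict and that `server_tool_use` may be missing when the model did not search; that fallback to zero is an assumption, and the function name is hypothetical.

```python
def web_search_usage(response):
    """Extract web search counters from a parsed /v1/responses payload."""
    tool_use = response.get("usage", {}).get("server_tool_use", {})
    return {
        "requests": tool_use.get("web_search_requests", 0),
        "results": tool_use.get("web_search_results", 0),
    }


response = {
    "usage": {
        "input_tokens": 105,
        "output_tokens": 250,
        "server_tool_use": {"web_search_requests": 2, "web_search_results": 10},
    }
}
print(web_search_usage(response))  # {'requests': 2, 'results': 10}
```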

Gateway also adds URL citation annotations to text output when the provider returns search results.

{
  "type": "text",
  "text": "The latest release adds...",
  "annotations": [
    {
      "type": "url_citation",
      "url": "https://kubernetes.io/blog/release-notes",
      "title": "Kubernetes release notes",
      "content": "Relevant excerpt from the page"
    }
  ]
}
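To render a source list next to an answer, an application can collect the citation annotations from a text item. The sketch below works on a single text item shaped like the example above; where that item sits inside the full response is not shown here, so locating it is left to the caller. The function name `cited_sources` is hypothetical.

```python
def cited_sources(text_item):
    """Return unique (title, url) pairs from a text item's url_citation annotations."""
    seen = set()
    sources = []
    for annotation in text_item.get("annotations", []):
        if annotation.get("type") != "url_citation":
            continue  # skip any other annotation types
        url = annotation.get("url")
        if url and url not in seen:
            seen.add(url)
            # Fall back to the URL when no title is present.
            sources.append((annotation.get("title") or url, url))
    return sources


text_item = {
    "type": "text",
    "text": "The latest release adds...",
    "annotations": [
        {
            "type": "url_citation",
            "url": "https://kubernetes.io/blog/release-notes",
            "title": "Kubernetes release notes",
            "content": "Relevant excerpt from the page",
        }
    ],
}
for title, url in cited_sources(text_item):
    print(f"{title}: {url}")
```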

Understand pricing

Web search charges are separate from model token costs. The following rates apply to the currently supported provider.

Item	Price
Search request	`$0.007` per request
Included results	10 results per request
Additional results	`$0.001` per result above the included result count

Gateway applies your organization’s effective billing margin to the web search raw cost. Web search usage is also logged separately, so you can break out web search requests, results, raw cost, and billed amount in usage reporting.
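The rates above can be turned into a quick estimate of raw cost. The sketch below assumes the 10 included results apply to each request separately, as the table states, and it computes the raw cost before your organization's billing margin, since how the margin is applied is not specified here. The function name is hypothetical, and it needs the result count of each request, which the aggregate usage counters alone do not provide.

```python
from decimal import Decimal

# Rates for the currently supported provider, from the table above.
REQUEST_PRICE = Decimal("0.007")
INCLUDED_RESULTS = 10
EXTRA_RESULT_PRICE = Decimal("0.001")


def raw_search_cost(results_per_request):
    """Raw web search cost in USD for a list of per-request result counts."""
    total = Decimal("0")
    for count in results_per_request:
        extra = max(0, count - INCLUDED_RESULTS)
        total += REQUEST_PRICE + EXTRA_RESULT_PRICE * extra
    return total


# Two searches with 5 results each: 2 x $0.007
print(raw_search_cost([5, 5]))  # 0.014

# One search with 25 results: $0.007 + 15 x $0.001
print(raw_search_cost([25]))    # 0.022
```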

Availability and limits

Supported providers: exa
Web search is not available when zero data retention is enabled
Only one web search server tool may be included in a single request
Search loops are bounded by Gateway’s model-iteration limit and by max_total_results
Search provider failures are returned to the model as tool errors
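The one-tool limit is easy to check before sending a request. The sketch below counts web search entries in a `tools` array and rejects duplicates; treating the `openrouter:web_search` alias as counting toward the same limit is an assumption, and the function name is hypothetical.

```python
# Tool types that refer to the web search server tool.
WEB_SEARCH_TYPES = {"merge:web_search", "openrouter:web_search"}


def has_single_web_search_tool(tools):
    """Return True if web search is enabled; raise if it appears more than once."""
    count = sum(1 for tool in tools if tool.get("type") in WEB_SEARCH_TYPES)
    if count > 1:
        raise ValueError("only one web search server tool may be included per request")
    return count == 1


print(has_single_web_search_tool([{"type": "merge:web_search"}]))  # True
print(has_single_web_search_tool([{"type": "function", "name": "lookup"}]))  # False
```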

Next steps

Tool calling

Define your own application tools and return tool results to the model

Cost governance and savings

Track and control spend across model usage and server tools

Zero data retention

Understand which Gateway features are available with zero data retention