Merge API Documentation

Fusion sends a single prompt to several models at the same time, then a judge merges their answers into one consolidated response. Every model in the panel answers the whole prompt independently, and the judge reconciles them. Fusion does not split the prompt into sub-tasks.

Use Fusion when answer quality matters more than latency or cost. A panel of several mid-tier models, reconciled by a judge, often beats any single one of them.

Fusion calls every panel model plus the judge, so a single request bills for all of them. Fusion currently supports non-streaming responses only: the response is returned only after the judge has synthesized the answers from the panel models.

How fusion works

1. Gateway sends the full prompt to every model in analysis_models at the same time. Each returns a complete, independent answer.
2. A panel model that errors or is blocked by policy is dropped. Fusion still succeeds as long as at least one model returns a usable answer.
3. The surviving answers, plus the original conversation, go to the judge (synthesis_model), which reconciles agreements, resolves contradictions toward the best-supported claim, folds in unique insights, and writes one final answer.
4. You receive a standard /v1/responses result whose output is the merged answer, plus a fusion metadata block describing the run.
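
The flow above can be sketched locally with stand-in callables. This is an illustration of the fan-out, drop, and judge steps only, not Gateway's implementation: `run_fusion`, the `panel` mapping, and the `judge` callable are hypothetical names, and real panel models are remote calls rather than Python functions.

```python
from concurrent.futures import ThreadPoolExecutor


def run_fusion(prompt, panel, judge):
    """Fan the prompt out to every panel callable, drop failures, then judge.

    `panel` maps a model ID to a callable that takes the prompt; `judge`
    takes the prompt and the surviving answers. Both are local stand-ins.
    """
    def ask(item):
        model_id, model = item
        try:
            return model_id, model(prompt)
        except Exception:
            return model_id, None  # errored or blocked: dropped from the panel

    # Every panel model receives the whole prompt at the same time.
    with ThreadPoolExecutor(max_workers=max(1, len(panel))) as pool:
        results = list(pool.map(ask, panel.items()))

    survivors = {m: a for m, a in results if a is not None}
    if not survivors:
        # Mirrors the documented failure case: no usable answer at all.
        raise RuntimeError("every panel model failed")

    return {
        "output": judge(prompt, survivors),
        "models_requested": list(panel),
        "models_succeeded": len(survivors),
    }
```

With one healthy stand-in and one that raises, the run still succeeds and reports a single contributing model, which matches the "at least one usable answer" rule.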

Make a fusion request

Set model to fusion and include a fusion block on a normal POST /v1/responses request.

cURL

curl "https://api-gateway.merge.dev/v1/responses" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fusion",
    "input": [
      {"type": "message", "role": "user", "content": "Explain the CAP theorem simply."}
    ],
    "fusion": {
      "analysis_models": ["openai/gpt-4o-mini", "anthropic/claude-haiku-4-5"],
      "synthesis_model": "anthropic/claude-haiku-4-5"
    }
  }'
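
For callers working in Python, the same request can be assembled with the standard library. This is a minimal sketch: `build_fusion_request` is a hypothetical helper, not part of any Merge SDK, and it only builds the request so the payload can be inspected before anything is sent.

```python
import json
import urllib.request

GATEWAY_URL = "https://api-gateway.merge.dev/v1/responses"


def build_fusion_request(api_key, prompt, analysis_models, synthesis_model=None):
    """Build (but do not send) a POST /v1/responses request for fusion."""
    fusion = {"analysis_models": list(analysis_models)}
    if synthesis_model is not None:
        fusion["synthesis_model"] = synthesis_model
    payload = {
        "model": "fusion",
        "input": [{"type": "message", "role": "user", "content": prompt}],
        "fusion": fusion,
    }
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen` and decode the body with `json.load`. Because fusion is non-streaming, the call blocks until the judge has finished.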

The fusion block

Field	Type	Required	Description
`analysis_models`	array of strings	Yes	Parallel panel of at least two distinct model IDs (`provider/model` or a bare model name); duplicates are removed
`synthesis_model`	string	No	Judge that merges the candidates; defaults to the first entry in `analysis_models`
`synthesis_instructions`	string	No	Override the instruction given to the judge, for example to force a specific format
`preset`	string	No	Reserved for future curated panels; advisory in v1
`include_candidates`	boolean	No	Include each panel model’s answer in the response (default `true`); set `false` to return the merged answer only

Standard generation parameters on the request (max_tokens, temperature, top_p, stop, tags, project_id) are forwarded to each panel model. Note that max_tokens applies to each panel model and to the judge individually, so leave room for the judge to write the merged answer.
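
The rules in the table can be mirrored client-side to catch a bad panel before the request is sent. This is a sketch under assumptions: `normalize_fusion_block` is a hypothetical helper, Gateway performs its own validation regardless, and the sketch compares model IDs as exact strings, so it cannot tell whether a bare model name and its `provider/model` form refer to the same model.

```python
def normalize_fusion_block(fusion):
    """Apply the documented defaults and checks to a fusion block.

    Returns a cleaned copy; raises ValueError when the panel is too small.
    """
    models = fusion.get("analysis_models") or []
    distinct = list(dict.fromkeys(models))  # drop duplicates, keep order
    if len(distinct) < 2:
        raise ValueError("analysis_models needs at least two distinct model IDs")

    cleaned = dict(fusion, analysis_models=distinct)
    # Documented defaults: judge is the first panel entry, candidates are included.
    cleaned.setdefault("synthesis_model", distinct[0])
    cleaned.setdefault("include_candidates", True)
    return cleaned
```

Making the defaults explicit this way shows at a glance which model will act as judge when `synthesis_model` is omitted.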

Web search in fusion

Add the web search server tool to a fusion request and every panel model can ground its answer with live web results. Use this when the prompt needs current facts or source citations, such as research-style questions.

cURL

curl "https://api-gateway.merge.dev/v1/responses" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fusion",
    "input": [
      {"type": "message", "role": "user", "content": "What changed in EU AI regulation this month? Cite sources."}
    ],
    "fusion": {
      "analysis_models": ["deepseek/deepseek-v4-pro", "anthropic/claude-sonnet-5"],
      "synthesis_model": "anthropic/claude-sonnet-5"
    },
    "tools": [
      {"type": "merge:web_search"}
    ]
  }'

How it behaves:

- Each panel model runs its own web search loop, so panelists can search for different things and surface different sources.
- The judge does not search. It receives each candidate’s answer together with that candidate’s source list, and it is instructed to keep citations attached to the claims it keeps.
- Each entry in fusion.candidates includes an annotations array with the URL citations that panel model gathered.
- The final merged answer carries URL citation annotations for the sources the judge kept. A citation whose claim was dropped during synthesis is dropped with it.

Fusion accepts only web search server tools. Client-side function tools return a 400 with code fusion_unsupported_tools: the panel and the judge run entirely inside Gateway, so there is no way to return a tool call to your application mid-fusion. Web search configuration (engine, max_results, domain filters) works the same as on a single-model request.
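
A request's `tools` list can be screened before sending so a function tool does not surface as a 400. This is a sketch under one loud assumption: `merge:web_search` is the only tool type shown on this page, so the sketch treats it as the only accepted type. `unsupported_fusion_tools` is a hypothetical helper.

```python
# Assumption: the only web search server tool type shown in these docs.
WEB_SEARCH_TOOL_TYPES = {"merge:web_search"}


def unsupported_fusion_tools(tools):
    """Return the tool entries a fusion request would be expected to reject."""
    return [t for t in tools if t.get("type") not in WEB_SEARCH_TOOL_TYPES]
```

An empty result means the list contains only web search entries. Anything returned should be removed from the request, or the prompt should be sent as a single-model request instead.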

Read the response

The output is the judge’s merged answer, in the standard /v1/responses shape. A fusion object describes the run.

{
  "model": "fusion",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [{"type": "text", "text": "The merged answer."}]
    }
  ],
  "usage": {"input_tokens": 675, "output_tokens": 814, "total_tokens": 1489},
  "fusion": {
    "synthesis_model": "anthropic/claude-haiku-4-5",
    "models_requested": ["openai/gpt-4o-mini", "anthropic/claude-haiku-4-5"],
    "models_succeeded": 2,
    "candidates": [
      {
        "model": "openai/gpt-4o-mini",
        "vendor": "openai",
        "status": "ok",
        "text": "That model's full answer.",
        "usage": {"total_tokens": 241}
      },
      {
        "model": "anthropic/claude-haiku-4-5",
        "vendor": "anthropic",
        "status": "ok",
        "text": "That model's full answer.",
        "usage": {"total_tokens": 317}
      }
    ]
  }
}

Field	Description
`output`	Final merged answer, in the standard `/v1/responses` shape
`fusion.models_succeeded`	Number of panel models that contributed
`fusion.candidates`	Each panel model’s answer, vendor, `status` (`ok` or `error`), and usage; empty when `include_candidates` is `false`
`fusion.candidates[].annotations`	URL citations that panel model gathered when web search is enabled; omitted otherwise
`usage`	Rolled-up totals across every panel call and the judge
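
A parsed response can be reduced to the fields most callers need. This is a sketch: `summarize_fusion_response` is a hypothetical helper that relies only on fields shown in the sample response. Treating the remainder of `usage.total_tokens` as the judge's share is an inference from the rolled-up usage description, not a documented field, and it only holds when candidates are included.

```python
def summarize_fusion_response(response):
    """Extract the merged text and a per-role token split from a fusion result."""
    fusion = response.get("fusion", {})
    candidates = fusion.get("candidates", [])

    # Concatenate the text parts of every assistant message in `output`.
    merged = "".join(
        part.get("text", "")
        for item in response.get("output", [])
        if item.get("type") == "message"
        for part in item.get("content", [])
        if part.get("type") == "text"
    )

    panel_tokens = sum(c.get("usage", {}).get("total_tokens", 0) for c in candidates)
    total = response.get("usage", {}).get("total_tokens", 0)
    return {
        "text": merged,
        "ok_models": [c["model"] for c in candidates if c.get("status") == "ok"],
        "failed_models": [c["model"] for c in candidates if c.get("status") == "error"],
        "panel_tokens": panel_tokens,
        # Assumption: whatever the panel did not use was spent by the judge.
        "judge_tokens": total - panel_tokens,
    }
```

For the sample response, the two candidates account for 241 + 317 = 558 tokens, which leaves 931 of the 1489 total attributed to the judge under that assumption.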

Billing

You are billed for every underlying call: each panel model and the judge, metered the same way as any other request. The top-level usage is the sum across all of them, so a three-model panel is roughly four calls. Budget accordingly.
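
The call count and a worst-case output budget follow directly from the panel size. This is a rough sketch: `fusion_call_budget` is a hypothetical helper, it assumes one call per distinct panel model plus one judge call, and it ignores any extra work a web search loop may add.

```python
def fusion_call_budget(analysis_models, max_tokens):
    """Estimate billed calls and an upper bound on output tokens for one request."""
    panel = len(set(analysis_models))  # duplicates are removed from the panel
    calls = panel + 1                  # every panel model, plus the judge
    # max_tokens applies to each call individually, so the ceiling scales with calls.
    return {"calls": calls, "max_output_tokens": calls * max_tokens}
```

A three-model panel with `max_tokens` set to 1000 gives four calls and a ceiling of 4000 output tokens. Input tokens are billed on top of that, and the judge's input includes the surviving panel answers.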

Errors

Status	Code	Cause
400	`fusion_config_required`	`model` is `fusion` but no `fusion` block was provided
400	`fusion_stream_unsupported`	`stream` was `true`; fusion is non-streaming in v1
400	`empty_input`	`input` is empty
400	`fusion_unsupported_tools`	`tools` contains anything other than web search server tools
400	`web_search_unavailable`	Web search was requested but no search provider is configured
422	Validation error	`analysis_models` has fewer than two distinct models
502	Provider error	Every panel model failed, or the judge call failed
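
The table can be turned into a small decision helper. This is a sketch: `classify_fusion_error` is a hypothetical function, it takes the HTTP status and error code as arguments because the error body format is not shown here, and retrying on 502 is a suggested policy rather than documented behavior.

```python
# Remedies paraphrase the causes listed in the error table.
FUSION_ERROR_FIXES = {
    "fusion_config_required": "add a fusion block, or use a model other than fusion",
    "fusion_stream_unsupported": "remove stream or set it to false",
    "empty_input": "provide a non-empty input",
    "fusion_unsupported_tools": "remove every tool except web search server tools",
    "web_search_unavailable": "remove the web search tool or configure a search provider",
}


def classify_fusion_error(status, code=None):
    """Return an (action, hint) pair for a failed fusion request."""
    if status == 502:
        # Provider-side failure: the request itself may be fine.
        return ("retry", "every panel model failed, or the judge call failed")
    if status == 422:
        return ("fix_request", "list at least two distinct models in analysis_models")
    if status == 400:
        return ("fix_request", FUSION_ERROR_FIXES.get(code, "check the request body"))
    return ("unknown", "status not listed in the fusion error table")
```

Keeping 400 and 422 out of the retry path avoids paying again for a request that will fail the same way.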

Limitations

Fusion is non-streaming, requires an explicit analysis_models panel (there is no curated default), and treats preset as advisory. The only tool fusion supports is the web search server tool; client-side function tools are not available inside a fusion request.