Fusion

Send one prompt to several models at once and merge their answers into a single response

Fusion sends a single prompt to several models at the same time, then a judge merges their answers into one consolidated response. Every model in the panel answers the whole prompt independently, and the judge reconciles them. Fusion does not split the prompt into sub-tasks.

Use Fusion when answer quality matters more than latency or cost. A panel of several mid-tier models, reconciled by a judge, often produces a better answer than any single model on the panel would alone.

Fusion calls every panel model plus the judge, so a single request bills for all of them. It is also non-streaming in v1: the response returns once the judge has merged the candidates.

How fusion works

  1. The gateway sends the full prompt to every model in analysis_models at the same time. Each returns a complete, independent answer.
  2. A panel model that errors or is blocked by policy is dropped. Fusion still succeeds as long as at least one model returns a usable answer.
  3. The surviving answers, plus the original conversation, go to the judge (synthesis_model), which reconciles agreements, resolves contradictions toward the best-supported claim, folds in unique insights, and writes one final answer.
  4. You receive a standard /v1/responses result whose output is the merged answer, plus a fusion metadata block describing the run.
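The four steps above can be sketched as a small local simulation. This is not the gateway's implementation: run_fusion and the Model and Judge callables are hypothetical stand-ins used only to illustrate the fan-out, drop-failures, then merge flow.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins: a "model" is any callable that maps a prompt to text.
Model = Callable[[str], str]
Judge = Callable[[str, List[Tuple[str, str]]], str]


def run_fusion(prompt: str, panel: Dict[str, Model], judge: Judge) -> dict:
    """Fan the prompt out to every panel model, drop failures, then merge."""

    def call(item):
        name, model = item
        try:
            return name, "ok", model(prompt)
        except Exception as exc:  # an erroring model is dropped, not fatal
            return name, "error", str(exc)

    # Step 1: every panel model receives the full prompt in parallel.
    with ThreadPoolExecutor(max_workers=len(panel)) as pool:
        results = list(pool.map(call, panel.items()))

    # Step 2: keep only the usable answers.
    survivors = [(name, text) for name, status, text in results if status == "ok"]
    if not survivors:
        raise RuntimeError("every panel model failed")

    # Step 3: the judge sees the original prompt plus the surviving answers.
    merged = judge(prompt, survivors)

    # Step 4: the merged answer plus metadata about the run.
    return {
        "output": merged,
        "models_requested": list(panel),
        "models_succeeded": len(survivors),
    }
```

In this sketch a single surviving answer is enough for the run to succeed, which mirrors step 2.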

Make a fusion request

Set model to fusion and include a fusion block on a normal POST /v1/responses request.

cURL
curl "https://api-gateway.merge.dev/v1/responses" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fusion",
    "input": [
      {"type": "message", "role": "user", "content": "Explain the CAP theorem simply."}
    ],
    "fusion": {
      "analysis_models": ["openai/gpt-4o-mini", "anthropic/claude-haiku-4-5"],
      "synthesis_model": "anthropic/claude-haiku-4-5"
    }
  }'
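The same request can be built from Python with only the standard library. This is a minimal sketch: build_fusion_request and send are hypothetical helper names, and the 120-second timeout is an assumed value chosen because fusion waits for the whole panel and the judge before returning.

```python
import json
import urllib.request

GATEWAY_URL = "https://api-gateway.merge.dev/v1/responses"


def build_fusion_request(api_key, prompt, analysis_models, synthesis_model=None):
    """Build (but do not send) a POST /v1/responses request with model=fusion."""
    fusion = {"analysis_models": list(analysis_models)}
    if synthesis_model is not None:
        fusion["synthesis_model"] = synthesis_model
    body = {
        "model": "fusion",
        "input": [{"type": "message", "role": "user", "content": prompt}],
        "fusion": fusion,
    }
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=120):
    """Send the request and decode the JSON body (timeout is an assumption)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Calling send(build_fusion_request("YOUR_API_KEY", "Explain the CAP theorem simply.", ["openai/gpt-4o-mini", "anthropic/claude-haiku-4-5"])) would issue the request; omitting synthesis_model leaves the judge at its documented default.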

The fusion block

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| analysis_models | array of strings | Yes | Parallel panel of at least two distinct model IDs (provider/model or a bare model name); duplicates are removed |
| synthesis_model | string | No | Judge that merges the candidates; defaults to the first entry in analysis_models |
| synthesis_instructions | string | No | Override the instruction given to the judge, for example to force a specific format |
| preset | string | No | Reserved for future curated panels; advisory in v1 |
| include_candidates | boolean | No | Include each panel model’s answer in the response (default true); set false to return the merged answer only |
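The documented rules for the fusion block can be mirrored client-side to catch mistakes before a request is sent. The sketch below is an assumption about how one might do that; normalize_fusion_block is a hypothetical helper, and the gateway remains the authority on what is valid.

```python
def normalize_fusion_block(block: dict) -> dict:
    """Apply the documented defaults and checks to a fusion block."""
    models = block.get("analysis_models")
    if not isinstance(models, list) or not all(isinstance(m, str) for m in models):
        raise ValueError("analysis_models must be an array of strings")

    # Drop duplicates while keeping first-seen order.
    distinct = list(dict.fromkeys(models))
    if len(distinct) < 2:
        raise ValueError("analysis_models needs at least two distinct models")

    normalized = dict(block, analysis_models=distinct)
    # The judge defaults to the first panel entry.
    normalized.setdefault("synthesis_model", distinct[0])
    # Candidates are included unless explicitly disabled.
    normalized.setdefault("include_candidates", True)
    return normalized
```

A panel such as ["a/x", "a/x"] collapses to one distinct model after deduplication, so the helper rejects it, matching the validation error the gateway documents for that case.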

Standard generation parameters on the request (max_tokens, temperature, top_p, stop, tags, project_id) are forwarded to each panel model.
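As an illustration, the sketch below places the forwarded parameters at the top level of the request body, next to the fusion block rather than inside it. That placement is an assumption drawn from the phrase "on the request"; build_body is a hypothetical helper, and rejecting unlisted parameters is a choice made for this sketch, not documented gateway behavior.

```python
FORWARDED_PARAMS = ("max_tokens", "temperature", "top_p", "stop", "tags", "project_id")


def build_body(prompt: str, fusion: dict, **params) -> dict:
    """Build a fusion request body with generation parameters at the top level."""
    unknown = set(params) - set(FORWARDED_PARAMS)
    if unknown:
        raise ValueError(f"not a documented forwarded parameter: {sorted(unknown)}")
    body = {
        "model": "fusion",
        "input": [{"type": "message", "role": "user", "content": prompt}],
        "fusion": fusion,
    }
    # Assumed placement: siblings of "fusion", not keys inside it.
    body.update(params)
    return body
```

For example, build_body("Explain the CAP theorem simply.", {"analysis_models": ["openai/gpt-4o-mini", "anthropic/claude-haiku-4-5"]}, temperature=0.2, max_tokens=400) yields a body in which every panel model would receive the same temperature and token limit.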

Read the response

The output is the judge’s merged answer, in the standard /v1/responses shape. A fusion object describes the run.

{
  "model": "fusion",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [{"type": "text", "text": "The merged answer."}]
    }
  ],
  "usage": {"input_tokens": 675, "output_tokens": 814, "total_tokens": 1489},
  "fusion": {
    "synthesis_model": "anthropic/claude-haiku-4-5",
    "models_requested": ["openai/gpt-4o-mini", "anthropic/claude-haiku-4-5"],
    "models_succeeded": 2,
    "candidates": [
      {
        "model": "openai/gpt-4o-mini",
        "vendor": "openai",
        "status": "ok",
        "text": "That model's full answer.",
        "usage": {"total_tokens": 241}
      },
      {
        "model": "anthropic/claude-haiku-4-5",
        "vendor": "anthropic",
        "status": "ok",
        "text": "That model's full answer.",
        "usage": {"total_tokens": 317}
      }
    ]
  }
}

| Field | Description |
| --- | --- |
| output | Final merged answer, in the standard /v1/responses shape |
| fusion.models_succeeded | Number of panel models that contributed |
| fusion.candidates | Each panel model’s answer, vendor, status (ok or error), and usage; empty when include_candidates is false |
| usage | Rolled-up totals across every panel call and the judge |
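A client typically wants the merged text and a quick view of which panel models failed. The sketch below reads only the fields shown in the sample response; summarize_fusion_response is a hypothetical helper, and it assumes the content parts use the "text" type as in that sample.

```python
def summarize_fusion_response(response: dict) -> dict:
    """Pull the merged text and per-candidate status out of a fusion result."""
    merged_parts = [
        part["text"]
        for item in response.get("output", [])
        if item.get("type") == "message"
        for part in item.get("content", [])
        if part.get("type") == "text"
    ]
    fusion = response.get("fusion", {})
    candidates = fusion.get("candidates", [])
    return {
        "text": "".join(merged_parts),
        "judge": fusion.get("synthesis_model"),
        "succeeded": fusion.get("models_succeeded"),
        # Empty when include_candidates is false, since no candidates are returned.
        "failed": [c["model"] for c in candidates if c.get("status") == "error"],
    }
```

When include_candidates is false the failed list is always empty, so compare succeeded against the length of models_requested to detect dropped models in that case.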

Billing

You are billed for every underlying call: each panel model and the judge, metered the same way as any other request. The top-level usage is the sum across all of them, so a three-model panel means four calls (three panel calls plus the judge). Budget accordingly.
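Because the top-level usage is a sum, the judge's share can be estimated by subtracting the candidates' usage from the total. The sketch below assumes candidates were included in the response and that each successful candidate reports usage.total_tokens as in the sample; token_breakdown is a hypothetical helper.

```python
def token_breakdown(response: dict) -> dict:
    """Split the rolled-up total into panel tokens and the remainder."""
    total = response["usage"]["total_tokens"]
    candidates = response.get("fusion", {}).get("candidates", [])
    # Candidates without a usage block (for example, failed ones) count as zero.
    panel = sum(c.get("usage", {}).get("total_tokens", 0) for c in candidates)
    return {"total": total, "panel": panel, "remainder": total - panel}
```

For the sample response in "Read the response", the panel accounts for 241 + 317 = 558 tokens, which leaves 1489 - 558 = 931 tokens attributable to the judge under these assumptions.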

Errors

| Status | Code | Cause |
| --- | --- | --- |
| 400 | fusion_config_required | model is fusion but no fusion block was provided |
| 400 | fusion_stream_unsupported | stream was true; fusion is non-streaming in v1 |
| 400 | empty_input | input is empty |
| 422 | Validation error | analysis_models has fewer than two distinct models |
| 502 | Provider error | Every panel model failed, or the judge call failed |
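One way to act on these errors is to map each documented status and code to a suggested fix. The sketch below is illustrative only: classify_fusion_error is a hypothetical helper, the document does not specify where the code string appears in the error body, and the advice for 502 is an assumption rather than documented guidance.

```python
def classify_fusion_error(status: int, code: str = "") -> str:
    """Map a documented fusion error to a suggested client action."""
    if status == 400:
        fixes = {
            "fusion_config_required": "add a fusion block to the request",
            "fusion_stream_unsupported": "remove stream or set it to false",
            "empty_input": "provide a non-empty input",
        }
        return fixes.get(code, "fix the request before resending")
    if status == 422:
        return "list at least two distinct models in analysis_models"
    if status == 502:
        # Assumption: provider failures may be transient or panel-specific.
        return "retry later or change the panel or judge"
    return "unhandled status"
```

The 400 and 422 cases point at the request itself, so resending the same body unchanged would be expected to fail again.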

Limitations

Fusion v1 is non-streaming, requires an explicit analysis_models panel (there is no curated default), treats preset as advisory, and does not run per-panel web search or tools.