Fusion
Fusion sends a single prompt to several models at the same time, then a judge merges their answers into one consolidated response. Every model in the panel answers the whole prompt independently, and the judge reconciles them. Fusion does not split the prompt into sub-tasks.
Use Fusion when answer quality matters more than latency or cost. A panel of several mid-tier models, reconciled by a judge, often beats any single one of them.
Fusion calls every panel model plus the judge, so a single request bills for all of them. It is also non-streaming in v1: the response returns once the judge has merged the candidates.
How fusion works
- Gateway sends the full prompt to every model in analysis_models at the same time. Each returns a complete, independent answer.
- A panel model that errors or is blocked by policy is dropped. Fusion still succeeds as long as at least one model returns a usable answer.
- The surviving answers, plus the original conversation, go to the judge (synthesis_model), which reconciles agreements, resolves contradictions toward the best-supported claim, folds in unique insights, and writes one final answer.
- You receive a standard /v1/responses result whose output is the merged answer, plus a fusion metadata block describing the run.
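The flow above can be modeled locally to make the drop-and-merge behavior concrete. The following is a minimal sketch, not the gateway's implementation: run_fusion, the callable-based panel, and the returned tuple are all hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_fusion(prompt, panel, judge):
    """Local model of the fusion flow (illustrative only).

    panel: dict mapping a model name to a callable(prompt) -> str.
    judge: callable(prompt, answers_by_name) -> str.
    Returns (merged_answer, used_names, dropped_names).
    """
    if not panel:
        raise ValueError("panel must contain at least one model")

    def ask(item):
        name, model = item
        try:
            return name, model(prompt), None
        except Exception as exc:  # an erroring or blocked model is dropped
            return name, None, str(exc)

    # Every panel model receives the full prompt at the same time.
    with ThreadPoolExecutor(max_workers=len(panel)) as pool:
        results = list(pool.map(ask, panel.items()))

    answers = {name: text for name, text, err in results if err is None}
    dropped = [name for name, _, err in results if err is not None]

    # Fusion needs at least one usable answer to hand to the judge.
    if not answers:
        raise RuntimeError("no panel model returned a usable answer")

    # The judge sees the original prompt plus every surviving answer.
    return judge(prompt, answers), sorted(answers), dropped
```

In this sketch a single failing panel model does not fail the run. The call raises only when every model fails, which mirrors the "at least one usable answer" rule.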
Make a fusion request
Set model to fusion and include a fusion block on a normal POST /v1/responses request.
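A request along these lines could be assembled as follows. This is a sketch under stated assumptions: the base URL, the bearer-token header, the input field name, and the model identifiers are placeholders that do not come from this page. Only model, fusion, analysis_models, synthesis_model, and the /v1/responses path are taken from the text.

```python
import json
import urllib.request

BASE_URL = "https://gateway.example.com"  # hypothetical host


def build_fusion_request(prompt, analysis_models, synthesis_model, api_key):
    """Build (but do not send) a POST /v1/responses fusion request."""
    body = {
        "model": "fusion",
        "input": prompt,  # assumption: the prompt field is named "input"
        "fusion": {
            "analysis_models": list(analysis_models),
            "synthesis_model": synthesis_model,
        },
    }
    return urllib.request.Request(
        BASE_URL + "/v1/responses",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumption: bearer auth
        },
        method="POST",
    )


request = build_fusion_request(
    "Summarize the trade-offs of optimistic locking.",
    ["panel-model-a", "panel-model-b", "panel-model-c"],  # placeholder ids
    "judge-model",  # placeholder id
    api_key="YOUR_API_KEY",
)
# To send it: urllib.request.urlopen(request)
```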
The fusion block
Standard generation parameters on the request (max_tokens, temperature, top_p, stop, tags, project_id) are forwarded to each panel model.
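One way to picture the forwarding is as a function that derives each panel call from the fusion request. The sketch below is illustrative only: panel_requests and the input field name are assumptions, and the gateway's internal request shape is not documented here. This page also does not say whether the same parameters reach the judge, so the sketch covers panel calls only.

```python
# Parameters this page lists as forwarded to each panel model.
FORWARDED = ("max_tokens", "temperature", "top_p", "stop", "tags", "project_id")


def panel_requests(request):
    """Derive one per-model request from a fusion request (illustrative)."""
    shared = {key: request[key] for key in FORWARDED if key in request}
    return [
        # assumption: the prompt travels in a field named "input"
        {"model": model, "input": request.get("input"), **shared}
        for model in request["fusion"]["analysis_models"]
    ]


fusion_request = {
    "model": "fusion",
    "input": "Explain idempotency keys.",
    "temperature": 0.2,
    "max_tokens": 256,
    "fusion": {
        "analysis_models": ["panel-model-a", "panel-model-b"],  # placeholders
        "synthesis_model": "judge-model",  # placeholder
    },
}
per_model = panel_requests(fusion_request)
```

Each entry in per_model carries the same temperature and max_tokens, so every panel model answers under identical generation settings.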
Read the response
The output is the judge’s merged answer, in the standard /v1/responses shape. A fusion object describes the run.
Billing
You are billed for every underlying call: each panel model and the judge, metered the same way as any other request. The top-level usage is the sum across all of them, so a three-model panel is roughly four calls. Budget accordingly.
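The arithmetic behind the summed usage can be sketched as below. The token counts and the input_tokens and output_tokens key names are made up for illustration, and the real usage object may use different fields.

```python
def total_usage(call_usages):
    """Sum per-call usage dicts into one total, key by key."""
    total = {}
    for usage in call_usages:
        for key, value in usage.items():
            total[key] = total.get(key, 0) + value
    return total


# Three panel calls plus one judge call (illustrative numbers).
panel_calls = [
    {"input_tokens": 120, "output_tokens": 300},
    {"input_tokens": 120, "output_tokens": 280},
    {"input_tokens": 120, "output_tokens": 350},
]
# The judge reads the prompt plus every surviving answer, so its input is larger.
judge_call = {"input_tokens": 1100, "output_tokens": 400}

usage = total_usage(panel_calls + [judge_call])
# usage == {"input_tokens": 1460, "output_tokens": 1330}
```

Note that the judge's input grows with the panel size, since it receives every surviving answer. Adding a panel model therefore raises cost on both the panel call and the judge call.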
Errors
Limitations
Fusion v1 is non-streaming, requires an explicit analysis_models panel (there is no curated default), treats preset as advisory, and does not run per-panel web search or tools.
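A client-side check can catch some of these limitations before a request is sent. This is a hedged sketch: check_fusion_request is a hypothetical helper, the stream field name is an assumption, and this page does not say how the gateway itself responds to such requests.

```python
def check_fusion_request(request):
    """Return a list of problems found in a fusion request body (illustrative)."""
    problems = []
    fusion = request.get("fusion") or {}

    if request.get("model") != "fusion":
        problems.append('model must be "fusion"')
    # There is no curated default panel, so the list must be explicit.
    if not fusion.get("analysis_models"):
        problems.append("fusion.analysis_models must list at least one model")
    # assumption: streaming is requested through a "stream" field
    if request.get("stream"):
        problems.append("fusion v1 is non-streaming")
    return problems


issues = check_fusion_request({"model": "fusion", "stream": True, "fusion": {}})
# issues lists the missing panel and the streaming flag
```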