Guide

Best starting models for an agentic workflow, priced per call.

An orchestrator delegates repeated search and tool work to cheap subagents, then synthesises what they find. The fan-out is where the calls multiply.

The mismatch is the point: most of the money goes to the small, looping subagent step, while the capability is needed on the separate orchestrator. The step that drives cost and the step that needs a capable model are different steps, so they can run on different models.

Subagent fan-out multiplies the cheap calls.
The looping search step carries most of the spend.
The orchestrator runs less often but needs capability.

The pipeline

A feature is a chain of calls, each with a different job. Steps run top to bottom.

01

orchestrate / plan

decide the next move and synthesise subagent results

Tier: Frontier (capable-model step)

per-call shape: 3K sys + 9K in + 1.5K out (tokens)

cheap default: Claude Sonnet 4.6, ≈ $0.058 per call

step-up for quality: Claude Opus 4.8, ≈ $0.098 per call

open-weight option: Qwen 3.7 Max, ≈ $0.041 per call

See all frontier-tier models in the price table

02

subagent search

fan out cheap exploration/tool calls, repeated many times

Tier: Small (repeats; cost-driver step)

per-call shape: 800 sys + 4K in + 400 out (tokens)

cheap default: Claude Haiku 4.5, ≈ $0.0068 per call

step-up for quality: Gemini 3.5 Flash, ≈ $0.011 per call

open-weight option: Mistral Small 4, ≈ $0.0006 per call

See all small-tier models in the price table

03

final answer

produce the user-facing result from gathered evidence

Tier: Mid

per-call shape: 1K sys + 6K in + 700 out (tokens)

cheap default: Claude Haiku 4.5, ≈ $0.010 per call

step-up for quality: Claude Sonnet 4.6, ≈ $0.032 per call

open-weight option: Llama 4 Maverick, ≈ $0.0015 per call

See all mid-tier models in the price table
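
The per-call figures in the pipeline follow from each step's token shape and a model's per-token price. Below is a minimal sketch of that arithmetic, assuming system and input tokens are billed at one rate and output tokens at another, both quoted per 1M tokens. The function name `per_call_cost` is hypothetical, and the $1 / $5 rates are assumptions chosen because they reproduce the ≈ $0.0068 figure for step 02; they are not read from the price table.

```python
def per_call_cost(sys_tokens, in_tokens, out_tokens,
                  in_price_per_m, out_price_per_m):
    """Dollar cost of one call from its token shape and per-1M-token prices."""
    # System prompt and input are both billed as input tokens (assumption).
    input_cost = (sys_tokens + in_tokens) * in_price_per_m / 1_000_000
    output_cost = out_tokens * out_price_per_m / 1_000_000
    return input_cost + output_cost


# Step 02 shape from the guide: 800 sys + 4K in + 400 out.
# ASSUMPTION: $1 per 1M input tokens and $5 per 1M output tokens.
subagent_cost = per_call_cost(800, 4_000, 400,
                              in_price_per_m=1.0, out_price_per_m=5.0)
print(f"{subagent_cost:.4f}")  # 0.0068
```

Swapping in another step's shape and a model's current prices from the price table gives that step's per-call figure the same way.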

How to choose for an agentic workflow

An orchestrator delegates repeated work to cheap subagents, then a final step synthesises it. The mismatch is sharp: most of the spend sits on the small, looping subagent search step, while the capability is needed on the separate orchestrate / plan step.
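
To see how quickly the looping step overtakes the orchestrator, here is a rough sketch using the cheap-default per-call figures quoted in the pipeline. The function names are hypothetical. The defaults of one orchestrator call and one final-answer call per task are assumptions for illustration, not measurements of any real workload.

```python
import math

# Cheap-default per-call figures quoted in the pipeline.
ORCHESTRATOR_COST = 0.058   # orchestrate / plan
SUBAGENT_COST = 0.0068      # subagent search
FINAL_COST = 0.010          # final answer


def breakeven_fanout(orchestrator_cost, subagent_cost):
    """Smallest number of subagent calls that together cost more than
    one orchestrator call."""
    return math.floor(orchestrator_cost / subagent_cost) + 1


def subagent_share(fanout, orchestrator_calls=1, final_calls=1):
    """Fraction of one task's spend that goes to subagent search.
    ASSUMPTION: call counts are placeholders supplied by the caller."""
    subagent_total = fanout * SUBAGENT_COST
    total = (orchestrator_calls * ORCHESTRATOR_COST
             + subagent_total
             + final_calls * FINAL_COST)
    return subagent_total / total


print(breakeven_fanout(ORCHESTRATOR_COST, SUBAGENT_COST))  # 9
```

With these figures, nine subagent calls already cost more than one orchestrator call, and `subagent_share` shows how the share keeps growing as the fan-out widens.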

Put the capable model on orchestrate / plan and keep subagent search on a small model, because it loops many times and carries most of the spend. Cutting the number of fan-out calls moves the bill more than upgrading any single model. Reach for a bigger subagent only when cheap exploration keeps coming back wrong.
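
One way to encode this advice is a per-step routing table plus an escalation check for the subagent. This is a sketch only: the model names are the guide's display names rather than real API identifiers, and `pick_subagent_model`, the 30% failure threshold, and the 10-result minimum are hypothetical choices, not recommendations from the guide.

```python
# Display names as listed in this guide; real API identifiers will differ.
STEP_MODELS = {
    "orchestrate / plan": "Claude Sonnet 4.6",
    "subagent search": "Claude Haiku 4.5",
    "final answer": "Claude Haiku 4.5",
}
SUBAGENT_STEP_UP = "Gemini 3.5 Flash"


def pick_subagent_model(recent_outcomes, max_failure_rate=0.3, min_samples=10):
    """Choose the subagent model from recent results.

    recent_outcomes: list of bools, True meaning the result was usable.
    ASSUMPTION: threshold and sample minimum are illustrative values.
    """
    default = STEP_MODELS["subagent search"]
    if len(recent_outcomes) < min_samples:
        return default  # too little evidence to justify a pricier model
    failure_rate = recent_outcomes.count(False) / len(recent_outcomes)
    return SUBAGENT_STEP_UP if failure_rate > max_failure_rate else default
```

How a result is judged usable is left to the caller. The point of the check is that the step-up applies only after cheap exploration has failed repeatedly, not by default.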

The takeaway

The cost-driver step is subagent search. The capable-model step is orchestrate / plan. They are different, so put the capable model on orchestrate / plan and keep the rest small.

No fabricated bills, no rankings.

Go deeper

Explainer: See the full cost breakdown. What this task costs and why, worked through line by line with live prices.

Price table: Every model, priced per 1M tokens. Sort and filter the full catalog the options above link into.
