Best starting models for Agentic workflow, priced per call.
An orchestrator delegates repeated search and tool work to cheap subagents, then synthesises what they find. The fan-out is where the calls multiply.
The mismatch is the point: the money sits on the small looping subagent step, while the capability sits on the separate orchestrator. The cost-driver step and the capable-model step are different steps, so they call for different models.
- Subagent fan-out multiplies the cheap calls.
- The looping search step carries most of the spend.
- The orchestrator runs less often but needs capability.
The pipeline
A feature is a chain of calls, each with a different job. Steps run top to bottom.
- 01 orchestrate / plan: decide the next move and synthesise subagent results. Per-call shape: 3K sys + 9K in + 1.5K out.
- 02 subagent search: fan out cheap exploration/tool calls, repeated many times. Per-call shape: 800 sys + 4K in + 400 out.
- 03 final answer: produce the user-facing result from gathered evidence. Per-call shape: 1K sys + 6K in + 700 out.
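The per-call shapes above can be turned into token totals per run. The sketch below is illustrative only: the shapes come from the pipeline, but the call counts (3 orchestrator calls, 30 subagent calls, 1 final call) and the names `StepShape` and `run_tokens` are assumptions, not figures or identifiers from this page. It counts tokens, not money.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepShape:
    """Token shape of a single call to one pipeline step."""
    name: str
    sys_tokens: int
    in_tokens: int
    out_tokens: int


# Per-call shapes as listed in the pipeline.
ORCHESTRATE = StepShape("orchestrate / plan", 3_000, 9_000, 1_500)
SEARCH = StepShape("subagent search", 800, 4_000, 400)
FINAL = StepShape("final answer", 1_000, 6_000, 700)


def run_tokens(shape: StepShape, calls: int) -> tuple[int, int]:
    """Return (input tokens, output tokens) for `calls` calls of one step.

    Input tokens are system prompt plus user/context input.
    """
    per_call_input = shape.sys_tokens + shape.in_tokens
    return calls * per_call_input, calls * shape.out_tokens


# ASSUMPTION: hypothetical call counts for one run, chosen only to
# show how fan-out multiplies the small subagent call.
assumed_calls = {ORCHESTRATE: 3, SEARCH: 30, FINAL: 1}

for shape, calls in assumed_calls.items():
    tokens_in, tokens_out = run_tokens(shape, calls)
    print(f"{shape.name}: {calls} calls, {tokens_in} in, {tokens_out} out")
```

Under these assumed counts the subagent search step processes more tokens than the other two steps combined, even though its per-call shape is the smallest. Whether that also makes it the largest share of spend depends on the prices of the models assigned to each step.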
How to choose for Agentic workflow
An orchestrator delegates repeated work to cheap subagents, then a final step synthesises it. The cost-driver step and the capable-model step are different, and the mismatch is sharp: the money sits on the small, looping subagent search step, while the capability sits on the separate orchestrate / plan step.
Put the capable model on orchestrate / plan and keep subagent search on a small model, because it loops many times and carries most of the spend. Cutting the number of fan-out calls moves the bill more than upgrading any single model. Reach for a bigger subagent only when cheap exploration keeps coming back wrong.
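A rough way to check this for your own workload is to write the bill as a function of the fan-out count. The sketch below is a minimal example under stated assumptions: the prices are arbitrary units per token, not real rates for any model, and the call counts, `SMALL`, `LARGE`, `step_cost` and `run_cost` are hypothetical. Replace them with your own measured values before drawing any conclusion.

```python
def step_cost(sys_t, in_t, out_t, calls, in_price, out_price):
    """Cost of one step: calls x (input tokens x in_price + output tokens x out_price)."""
    return calls * ((sys_t + in_t) * in_price + out_t * out_price)


def run_cost(search_calls, plan_calls, small, large):
    """Cost per step for one run, with the capable model on orchestrate / plan only.

    `small` and `large` are (input_price, output_price) pairs per token.
    """
    return {
        "orchestrate / plan": step_cost(3_000, 9_000, 1_500, plan_calls, *large),
        "subagent search": step_cost(800, 4_000, 400, search_calls, *small),
        "final answer": step_cost(1_000, 6_000, 700, 1, *small),
    }


# ASSUMPTION: placeholder prices in arbitrary units per token.
# These are not real rates for any model.
SMALL = (1, 4)
LARGE = (3, 12)

# ASSUMPTION: 3 planning calls per run; fan-out of 40 versus 20.
baseline = run_cost(search_calls=40, plan_calls=3, small=SMALL, large=LARGE)
halved = run_cost(search_calls=20, plan_calls=3, small=SMALL, large=LARGE)

for label, costs in (("baseline", baseline), ("halved fan-out", halved)):
    total = sum(costs.values())
    print(label, costs, "total:", total)
```

With these placeholder inputs the subagent search line is the largest item in the baseline, and halving the fan-out removes half of that line while leaving the other two untouched. The same function lets you compare that change against swapping the price pair on any single step.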
The takeaway
The cost-driver step is subagent search. The capable-model step is orchestrate / plan. They are different, so put the capable model on orchestrate / plan and keep the rest small.
No fabricated bills, no rankings.