Learn › Feature anatomy · June 2026

What an AI feature is actually made of.

A feature is a chain of calls, each with a different job. The step that drives the bill and the step that needs the capable model are usually different ones.

The thing people picture as one model call is almost never one call. A support chatbot classifies the message, retrieves help-centre passages, then writes a reply. A coding agent plans, edits, runs tools, and re-checks across many steps. Each link in the chain has a different job, a different token shape, and a different right model.

Two of those steps matter most, and they pull in opposite directions. One step is the cost-driver step: where the tokens, and so the bill, concentrate. Another is the capable-model step: the one that actually needs a frontier model to get the job right. The common mistake is putting one frontier model across the whole chain because a single step needs it.

The fix is to read the chain step by step. Below are four common shapes, each rendered from the same data the guide uses. For each, the cost-driver step and the capable-model step are named from the chain itself.
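Reading a chain step by step can be sketched in a few lines. This is a minimal illustration, not the guide's own data model: the `Step` class, the token counts, and both helper functions are hypothetical. It assumes the cost-driver step is the one with the most total tokens, and that the capable-model step is the one on the highest tier above Small, which is one reading of how the shapes below are labelled.

```python
from dataclasses import dataclass
from typing import Optional

# Tier labels as they appear on the step cards, ranked for comparison.
TIER_RANK = {"Small": 0, "Mid": 1, "Frontier": 2}


@dataclass(frozen=True)
class Step:
    name: str
    tier: str            # "Small", "Mid", or "Frontier"
    input_tokens: int    # hypothetical tokens per call
    output_tokens: int   # hypothetical tokens per call
    calls: int = 1       # how many times the step runs per request

    @property
    def total_tokens(self) -> int:
        return (self.input_tokens + self.output_tokens) * self.calls


def cost_driver_step(chain: list[Step]) -> Step:
    """The step where the tokens concentrate (assumption: most total tokens)."""
    return max(chain, key=lambda s: s.total_tokens)


def capable_model_step(chain: list[Step]) -> Optional[Step]:
    """The step on the highest tier, or None if every step runs on Small."""
    top = max(chain, key=lambda s: TIER_RANK[s.tier])
    return top if TIER_RANK[top.tier] > TIER_RANK["Small"] else None


# An agentic chain with made-up token counts.
AGENTIC = [
    Step("orchestrate / plan", "Frontier", 3_000, 500),
    Step("subagent search", "Small", 2_000, 300, calls=20),
    Step("final answer", "Mid", 4_000, 800),
]

print(cost_driver_step(AGENTIC).name)
print(capable_model_step(AGENTIC).name)
```

Keeping the two questions in separate functions mirrors the point of the section: each one is answered from the chain itself, and the answers need not agree.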

Live · prices today

Output costs 4.0× input, on average

Output costs more than input across every provider. Across 283 models, though, the multiple ranges from 0.1× to 12.2×, so at least one model prices output below input.


Live from the index: the per-1M price spread that every chain below is multiplied against.

That spread sets the stakes for the capable-model step. Today the cheapest frontier model on the index is Llama 4 Maverick, at $0.15 in / $0.60 out per 1M tokens. Put that rate on the one step that needs it, not on the steps that don't.
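The per-1M arithmetic is short enough to write out. The sketch below uses the Llama 4 Maverick prices quoted above; the function names and the token counts in the example call are hypothetical.

```python
def output_multiple(input_per_m: float, output_per_m: float) -> float:
    """How many times more a token of output costs than a token of input."""
    return output_per_m / input_per_m


def call_cost(input_tokens: int, output_tokens: int,
              input_per_m: float, output_per_m: float) -> float:
    """Dollar cost of one call, given prices per 1M tokens."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000


# Prices quoted in the text: $0.15 in / $0.60 out per 1M tokens.
MAVERICK_IN, MAVERICK_OUT = 0.15, 0.60

multiple = output_multiple(MAVERICK_IN, MAVERICK_OUT)

# Hypothetical call: 2,000 tokens in, 500 tokens out.
cost = call_cost(2_000, 500, MAVERICK_IN, MAVERICK_OUT)
print(f"{multiple:.1f}x output multiple, ${cost:.6f} per call")
```

At a 4× multiple, 500 output tokens cost as much as 2,000 input tokens, which is why the shape of a step's tokens matters as much as their count.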


Support chatbot
A multi-turn conversation that routes, retrieves, and replies.

01 · intent / route · Small
classify the message and pick a path (FAQ, handoff, tool)

02 · retrieve · Small
pull relevant help-centre passages for grounding

03 · generate reply · Mid · cost-driver step, capable-model step
answer in context, re-sending the accumulating transcript each turn

Here the cost-driver step and the capable-model step are the same one: generate reply. Spend there, keep the rest small.
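Why generate reply drives the bill becomes clearer once the re-sent transcript is counted. The sketch below is a simplified model, not a measurement: it assumes a fixed system prompt and fixed-size user messages and replies, and every number and name in it is hypothetical.

```python
def transcript_input_tokens(turns: int, system_tokens: int,
                            user_tokens: int, reply_tokens: int) -> list[int]:
    """Input tokens sent on each turn when the whole transcript is re-sent."""
    per_turn = []
    history = system_tokens
    for _ in range(turns):
        history += user_tokens     # the new user message joins the transcript
        per_turn.append(history)   # everything so far goes in as input
        history += reply_tokens    # the reply joins the transcript for next turn
    return per_turn


# Hypothetical conversation: 5 turns, 200-token system prompt,
# 50-token user messages, 150-token replies.
per_turn = transcript_input_tokens(5, 200, 50, 150)
total_input = sum(per_turn)
total_output = 5 * 150
print(per_turn, total_input, total_output)
```

Under these assumptions the input per turn grows linearly, so the cumulative input grows roughly with the square of the number of turns, while output grows only linearly.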


RAG support bot
Answer questions over retrieved documents, grounded and citable.

01 · embed query · Small
turn the question into a vector to search the index

02 · retrieve / rerank · Small
score and order candidate passages so only the best go in

03 · generate answer · Mid · cost-driver step, capable-model step
read the retrieved context and answer without inventing

Here the cost-driver step and the capable-model step are the same one: generate answer. Spend there, keep the rest small.
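One way the rerank step keeps the generate step's input in check is to admit passages by score until a token budget is used up. This is a minimal sketch of that idea under assumed inputs: the tuple layout, the greedy rule, and the budget are all hypothetical, and a real reranker would produce the scores.

```python
def select_passages(passages: list[tuple[float, int, str]],
                    token_budget: int) -> tuple[list[str], int]:
    """Pick passages best-score-first while they still fit the budget.

    Each passage is (score, token_count, text). Returns the chosen texts
    and the number of tokens they use.
    """
    chosen, used = [], 0
    for _score, tokens, text in sorted(passages, key=lambda p: p[0], reverse=True):
        if used + tokens <= token_budget:   # skip anything that would overflow
            chosen.append(text)
            used += tokens
    return chosen, used


# Hypothetical candidates: (score, tokens, text).
candidates = [
    (0.9, 300, "refund policy"),
    (0.4, 500, "shipping times"),
    (0.7, 400, "refund exceptions"),
    (0.2, 100, "contact hours"),
]

texts, used = select_passages(candidates, token_budget=800)
print(texts, used)
```

The budget caps what the generate step has to read, so the cheap step upstream directly shapes the cost of the expensive one.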


Agentic workflow
An orchestrator delegating repeated search and tool work to cheap subagents.

01 · orchestrate / plan · Frontier · capable-model step
decide the next move and synthesise subagent results

02 · subagent search · Small · loops · cost-driver step
fan out cheap exploration/tool calls, repeated many times

03 · final answer · Mid
produce the user-facing result from gathered evidence

The cost-driver step is subagent search. The capable-model step is orchestrate / plan. They are different steps, so the capable model goes on orchestrate / plan and the rest stay small.
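The gap between routing by step and putting one frontier model everywhere can be estimated with a few lines of arithmetic. Everything in this sketch is an assumption made for illustration: the tier prices are invented round numbers, not index prices, and the token counts and the 20 subagent calls are hypothetical.

```python
# Hypothetical per-1M-token prices (input, output) for each tier, in dollars.
PRICES = {"small": (0.10, 0.40), "mid": (0.50, 2.00), "frontier": (3.00, 12.00)}

# (step name, tier it needs, input tokens, output tokens, calls per request)
CHAIN = [
    ("orchestrate / plan", "frontier", 3_000, 500, 1),
    ("subagent search", "small", 2_000, 300, 20),
    ("final answer", "mid", 4_000, 800, 1),
]


def chain_cost(chain, force_tier=None):
    """Dollar cost of one request; force_tier prices every step at one tier."""
    total = 0.0
    for _name, tier, tokens_in, tokens_out, calls in chain:
        price_in, price_out = PRICES[force_tier or tier]
        total += calls * (tokens_in * price_in + tokens_out * price_out) / 1_000_000
    return total


routed = chain_cost(CHAIN)
all_frontier = chain_cost(CHAIN, force_tier="frontier")
print(f"routed: ${routed:.4f}  all-frontier: ${all_frontier:.4f}")
```

With these made-up numbers almost all of the all-frontier bill comes from the looping subagent step, which is the step that least needs the capable model.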


Classification / extraction
A document in, a label or small JSON object out. Output is tiny.

01 · classify / extract · Small · cost-driver step
label the document or pull structured fields to JSON

The bill concentrates on classify / extract. No step here needs a capable model, so a small model runs the whole chain.
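Because the output is tiny, nearly all of the cost in this shape is input, even with output priced at a multiple of input. The sketch below works that out as a share; the document and label sizes are hypothetical, and the default multiple is the 4.0× average quoted earlier on this page.

```python
def input_cost_share(input_tokens: int, output_tokens: int,
                     output_multiple: float = 4.0) -> float:
    """Fraction of a call's cost that comes from input tokens.

    Prices cancel out: only the output/input price multiple matters.
    """
    input_cost = input_tokens                      # in units of one input token
    output_cost = output_tokens * output_multiple  # same units
    return input_cost / (input_cost + output_cost)


# Hypothetical job: a 3,000-token document in, a 20-token label out.
share = input_cost_share(3_000, 20)
print(f"{share:.1%} of the cost is input")
```

For this shape the lever is the size of the document that goes in, not the length of the label that comes out.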

The cost-driver step and the capable-model step are usually different.


Where this goes next

The shapes above are the skeleton. Two pages put numbers and choices on them.