Best starting models for Classification / extraction, priced per call.
A document goes in, a label or a small JSON object comes out. Moderation, routing, tagging, field extraction. The output is so short that its cost is a rounding error.
With output negligible, the input rate times volume is the whole bill, so tier choice is the only lever that moves it. Small models are purpose-built for this.
- Input rate times volume is the entire bill.
- Output is a rounding error.
- Tier choice is the only lever that moves the bill.
The pipeline
A feature is a chain of calls, each with a different job. Steps run top to bottom.
- 01 classify / extract: label the document or pull structured fields to JSON (per-call shape: 300 sys + 800 in + 5 out)
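To make the per-call shape concrete, here is a minimal Python sketch of the arithmetic. It assumes the shape numbers are token counts and that pricing is quoted per million tokens. The rates below are placeholders chosen for illustration, not any provider's real prices, and the names `CallShape` and `cost_per_call` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallShape:
    """Per-call shape, assumed to be measured in tokens."""
    system_tokens: int
    input_tokens: int
    output_tokens: int


def cost_per_call(shape, input_rate_per_mtok, output_rate_per_mtok):
    """Return (input_cost, output_cost) for one call.

    Rates are assumed to be quoted per million tokens.
    System and document tokens are both billed at the input rate.
    """
    billed_input = shape.system_tokens + shape.input_tokens
    input_cost = billed_input * input_rate_per_mtok / 1_000_000
    output_cost = shape.output_tokens * output_rate_per_mtok / 1_000_000
    return input_cost, output_cost


# Shape from the pipeline step: 300 sys + 800 in + 5 out.
shape = CallShape(system_tokens=300, input_tokens=800, output_tokens=5)

# Placeholder rates (NOT real prices): 1.00 per million input tokens,
# 5.00 per million output tokens.
input_cost, output_cost = cost_per_call(shape, 1.00, 5.00)
output_share = output_cost / (input_cost + output_cost)

print(f"input cost per call:  {input_cost:.6f}")
print(f"output cost per call: {output_cost:.6f}")
print(f"output share of bill: {output_share:.1%}")
```

Even with the output rate set to five times the input rate, output stays a small share of the per-call cost under these placeholder numbers, which is why the input rate times volume dominates.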
How to choose for Classification / extraction
One step, classify / extract, runs at volume: a document in, a label or small JSON object out. There is no capable-model step. Output is a rounding error, so the input rate times your volume is effectively the entire bill.
Tier choice is the only lever that moves the bill, and small models are purpose-built for this, so start small and let an eval tell you whether you can keep the small model. Watch cost per accepted result, not cost per call: a cheap model that mislabels and forces a human review is not cheap. These jobs rarely care about latency, so a batch-pricing discount, where one is offered, can often be applied on top of the tier saving.
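One way to put "cost per accepted result" into numbers is sketched below in Python. It assumes that every result the eval rejects goes to a human reviewer, who then produces the accepted result, so the expected cost per document is the model cost plus the review cost weighted by the rejection rate. The function name, the accept rates, the review cost, and the batch discount are all hypothetical placeholders, not measured figures.

```python
def cost_per_accepted_result(call_cost, accept_rate, review_cost,
                             batch_discount=0.0):
    """Expected cost to get one accepted result.

    Assumptions (illustrative, not from any provider):
    - accept_rate is the fraction of model outputs accepted as-is,
      taken from your own eval.
    - every rejected output costs one human review at review_cost.
    - batch_discount is a fractional discount on the model call only.
    """
    if not 0.0 < accept_rate <= 1.0:
        raise ValueError("accept_rate must be in (0, 1]")
    if not 0.0 <= batch_discount < 1.0:
        raise ValueError("batch_discount must be in [0, 1)")
    model_cost = call_cost * (1.0 - batch_discount)
    expected_review = (1.0 - accept_rate) * review_cost
    return model_cost + expected_review


# Placeholder inputs: a cheaper model that is accepted 90% of the time
# versus a pricier one accepted 99% of the time, with a 0.50 review cost.
cheap = cost_per_accepted_result(0.001, accept_rate=0.90, review_cost=0.50)
pricier = cost_per_accepted_result(0.010, accept_rate=0.99, review_cost=0.50)
cheap_batched = cost_per_accepted_result(0.001, 0.90, 0.50, batch_discount=0.5)

print(f"cheap model:         {cheap:.4f}")
print(f"pricier model:       {pricier:.4f}")
print(f"cheap model batched: {cheap_batched:.4f}")
```

With these made-up inputs the cheaper model ends up costing more per accepted result, because review dominates, and the batch discount barely helps. With a higher accept rate the comparison flips, which is the point of letting an eval decide.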
The takeaway
No step here needs a frontier model. The bill concentrates on the cost-driver step (classify / extract); a small model handles it.
No fabricated bills, no rankings.