Market events
Every market event and model launch we've tracked — 30 market events and 301 launches, going back to 2023.
2026
-
DeepSeek V4 adds peak-valley pricing
DeepSeek V4 introduces time-based peak/off-peak pricing, varying API token costs by demand.
-
DeepSeek V4 Pro 75% cut
DeepSeek makes a 75% promotional discount permanent, pricing V4 Pro at $0.435/$0.87.
-
Cheap Flash era ends
Gemini 3.5 Flash ships at $1.50/$9 per 1M — triple its predecessor — as Google reprices Flash from budget tier toward Pro territory.
-
GPT-5.5 raises frontier prices
GPT-5.5 launches at $5/$30 per 1M, a sharp step up from GPT-5's commodity pricing — frontier labs begin testing price tolerance.
2025
-
Mistral Large 3: 75% cheaper
Mistral's open-weight frontier flagship lands at $0.50/$1.50 per 1M — 75% below Large 2 — keeping open-model pressure on closed pricing.
-
Opus gets 67% cheaper
Anthropic drops Opus pricing from $15/$75 to $5/$25 with the Opus 4.5 release.
-
Qwen3 Max halved in price war
Alibaba cuts Qwen3 Max roughly 50% as China's AI price war reignites, pressuring domestic and global rivals alike.
-
GPT-5 sparks price war
GPT-5 launches at commodity pricing ($1.25/$10) — TechCrunch calls it a price-war trigger.
-
o3 price cut 80%
OpenAI slashes o3 by 80% ($10→$2/1M input), making frontier reasoning mainstream-affordable.
-
Long-context goes cheap
GPT-4.1 lands a 1M-token window at mid-tier pricing, matching Gemini on context economics.
-
GPT-4.5: ultra-premium experiment
OpenAI tests $75/$150 pricing with GPT-4.5, the most expensive API model it had offered at launch. It was quickly superseded by cheaper, better models.
-
The DeepSeek moment
DeepSeek R1 ships near-frontier reasoning at roughly 1/20th the price of competing reasoning models. Markets jolt; pricing pressure spikes industry-wide.
2024
-
DeepSeek V3: frontier for cents
DeepSeek releases a GPT-4o-class open model priced around $0.27/$1.10 per 1M, previewing the shock R1 would deliver a month later.
-
OpenAI caching: auto 50% off
DevDay brings automatic prompt caching: a 50% discount on recently seen input tokens across GPT-4o and o1 models, applied with no code changes.
-
Gemini 1.5 Pro cut 64%
Google cuts Gemini 1.5 Pro by 64% on input and 52% on output, to $1.25/$5.00 per 1M for prompts under 128K. The flagship tier joins the price war.
-
o1 creates reasoning price tier
OpenAI's o1-preview launches at $15/$60, a new pricing category where hidden chain-of-thought tokens, billed as output, push effective costs to 2–5× the list price.
-
Anthropic ships prompt caching
Cached input tokens cost 90% less on Claude, making long-system-prompt and RAG workloads dramatically cheaper.
-
Google slashes Flash 78%
Gemini 1.5 Flash drops 78% on input / 71% on output to $0.075/$0.30 per 1M, undercutting GPT-4o mini by half.
-
OpenAI cuts GPT-4o 50%
GPT-4o input drops from $5 to $2.50 per 1M, the first big shot in the frontier price war.
-
Llama 3.1 405B opens frontier
Meta releases a GPT-4-class model as open weights; hosted 405B undercuts proprietary frontier pricing and anchors expectations lower.
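The percentage cuts and caching discounts quoted in the entries above follow from simple arithmetic. A minimal sketch, assuming list prices in dollars per 1M tokens; the function names and the 80% cache-hit workload are hypothetical, chosen only for illustration:

```python
def percent_cut(old_price: float, new_price: float) -> float:
    """Percentage reduction when a price moves from old_price to new_price."""
    return (old_price - new_price) / old_price * 100.0


def cached_input_cost(tokens: int, price_per_million: float,
                      cached_fraction: float, cache_discount: float) -> float:
    """Input cost in dollars when a share of the tokens is billed at a cache discount.

    cached_fraction and cache_discount are both fractions in [0, 1].
    """
    full_price_share = 1.0 - cached_fraction
    discounted_share = cached_fraction * (1.0 - cache_discount)
    return tokens / 1_000_000 * price_per_million * (full_price_share + discounted_share)


if __name__ == "__main__":
    # Input prices quoted in the timeline entries (dollars per 1M tokens).
    print(f"Opus:   {percent_cut(15.00, 5.00):.0f}% cut")
    print(f"o3:     {percent_cut(10.00, 2.00):.0f}% cut")
    print(f"GPT-4o: {percent_cut(5.00, 2.50):.0f}% cut")

    # Hypothetical workload: 1M input tokens, 80% of them cache hits,
    # billed with a 50% cache discount at $2.50 per 1M.
    cost = cached_input_cost(1_000_000, 2.50, cached_fraction=0.8, cache_discount=0.5)
    print(f"Cached input cost: ${cost:.2f}")
```

The same cached_input_cost helper covers a 90% cache discount by passing cache_discount=0.9. Real bills also depend on output tokens and provider-specific cache rules, which this sketch leaves out.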