Market events

Every market event and model launch we've tracked — 30 market events and 301 launches, going back to 2023.

2026

Jun 29, 2026 Market

DeepSeek V4 adds peak-valley pricing

DeepSeek V4 introduces time-based peak/off-peak pricing, varying API token costs by demand.
Source
May 22, 2026 Market

DeepSeek V4 Pro 75% cut

DeepSeek makes a 75% promotional discount permanent, pricing V4 Pro at $0.435/$0.87.
May 19, 2026 Market

Cheap Flash era ends

Gemini 3.5 Flash ships at $1.50/$9 per 1M — triple its predecessor — as Google reprices Flash from budget tier toward Pro territory.
Apr 23, 2026 Market

GPT-5.5 raises frontier prices

GPT-5.5 launches at $5/$30 per 1M, a sharp step up from GPT-5's commodity pricing — frontier labs begin testing price tolerance.

2025

Dec 2, 2025 Market

Mistral Large 3: 75% cheaper

Mistral's open-weight frontier flagship lands at $0.50/$1.50 per 1M — 75% below Large 2 — keeping open-model pressure on closed pricing.
Nov 24, 2025 Market

Opus gets 67% cheaper

Anthropic drops Opus pricing from $15/$75 to $5/$25 with the Opus 4.5 release.
Nov 14, 2025 Market

Qwen3 Max halved in price war

Alibaba cuts Qwen3 Max roughly 50% as China's AI price war reignites, pressuring domestic and global rivals alike.
Aug 7, 2025 Market

GPT-5 sparks price war

GPT-5 launches at commodity pricing ($1.25/$10) — TechCrunch calls it a price-war trigger.
Jun 10, 2025 Market

o3 price cut 80%

OpenAI slashes o3 by 80% ($10→$2/1M input), making frontier reasoning mainstream-affordable.
Apr 14, 2025 Market

Long-context goes cheap

GPT-4.1 lands a 1M-token window at mid-tier pricing, matching Gemini on context economics.
Feb 27, 2025 Market

GPT-4.5: ultra-premium experiment

OpenAI tests $75/$150 pricing with GPT-4.5 — the most expensive API model ever offered. Quickly superseded by cheaper, better models.
Jan 20, 2025 Market

The DeepSeek moment

DeepSeek R1 ships near-frontier reasoning at ~1/20th the price. Markets jolt; pricing pressure spikes industry-wide.

2024

Dec 26, 2024 Market

DeepSeek V3: frontier for cents

DeepSeek releases a GPT-4o-class open model priced around $0.27/$1.10 per 1M, previewing the shock R1 would deliver a month later.
Oct 1, 2024 Market

OpenAI caching: auto 50% off

DevDay brings automatic prompt caching — a no-code 50% discount on recently seen input tokens across GPT-4o and o1 models.
Oct 1, 2024 Market

Gemini 1.5 Pro cut 64%

Google cuts Gemini 1.5 Pro to $1.25/$5.00 per 1M on prompts under 128K — the flagship tier joins the price war.
Sep 12, 2024 Market

o1 creates reasoning price tier

OpenAI's o1-preview launches at $15/$60 — a new pricing category where chain-of-thought tokens make effective costs 2–5× the list price.
Aug 14, 2024 Market

Anthropic ships prompt caching

Cached input tokens cost 90% less on Claude, making long-system-prompt and RAG workloads dramatically cheaper.
Aug 12, 2024 Market

Google slashes Flash 78%

Gemini 1.5 Flash drops 78% on input / 71% on output to $0.075/$0.30 per 1M, undercutting GPT-4o mini by half.
Aug 6, 2024 Market

OpenAI cuts GPT-4o 50%

GPT-4o input drops $5→$2.50/1M and prompt caching arrives — the first big frontier price war shot.
Jul 23, 2024 Market

Llama 3.1 405B opens frontier

Meta releases a GPT-4-class model as open weights; hosted 405B undercuts proprietary frontier pricing and anchors expectations lower.
Jul 18, 2024 Market

GPT-4o mini at $0.15

OpenAI replaces GPT-3.5 Turbo with GPT-4o mini at $0.15/$0.60 per 1M — over 60% cheaper and far more capable.
Jun 20, 2024 Market

Claude 3.5 Sonnet resets value

Anthropic ships a model beating Opus at one-fifth Opus's price ($3/$15), collapsing the gap between mid-tier price and frontier quality.
May 21, 2024 Market

China's LLM price war erupts

After DeepSeek-V2 launched at ~$0.14/1M input on May 6, Alibaba cuts Qwen prices up to 97% and Baidu makes Ernie Speed/Lite free hours later.
May 13, 2024 Market

GPT-4o: frontier at half price

GPT-4o launches at $5/$15 per 1M — half of GPT-4 Turbo's price with better performance, resetting the frontier price bar.
Mar 4, 2024 Market

Claude 3 brings $0.25 Haiku

The Claude 3 family ships with Haiku at $0.25/$1.25 per 1M, staking out the fast-and-cheap tier against GPT-3.5 Turbo.
Feb 15, 2024 Market

Gemini 1.5 Pro: 1M context

Google announces a 1M-token context window, an order of magnitude beyond rivals — long context becomes a $/token battleground.
Jan 25, 2024 Market

OpenAI cuts GPT-3.5 Turbo 50%

Third GPT-3.5 Turbo cut in a year: input drops 50% to $0.50/1M, output 25% to $1.50. New embedding models arrive 5x cheaper too.

2023

Dec 11, 2023 Market

Mixtral 8x7B goes open-weight

Mistral releases Mixtral 8x7B under Apache 2.0 — GPT-3.5-class quality anyone can host, setting a price floor under proprietary small models.
Nov 6, 2023 Market

GPT-4 Turbo: 3× cheaper

GPT-4 Turbo launches at $10/$30 with 128K context, cutting the frontier price by two-thirds and kicking off a year of rapid cuts.
Mar 14, 2023 Market

GPT-4 sets the frontier price

GPT-4 launches at $30/$60 per MTok — 10× the cost of GPT-3.5 Turbo, establishing the first frontier pricing baseline.

No more events.

2026

DeepSeek V4 adds peak-valley pricing

DeepSeek V4 Pro 75% cut

Cheap Flash era ends

GPT-5.5 raises frontier prices

2025

Mistral Large 3: 75% cheaper

Opus gets 67% cheaper

Qwen3 Max halved in price war

GPT-5 sparks price war

o3 price cut 80%

Long-context goes cheap

GPT-4.5: ultra-premium experiment

The DeepSeek moment

2024

DeepSeek V3: frontier for cents

OpenAI caching: auto 50% off

Gemini 1.5 Pro cut 64%

o1 creates reasoning price tier

Anthropic ships prompt caching

Google slashes Flash 78%

OpenAI cuts GPT-4o 50%

Llama 3.1 405B opens frontier

GPT-4o mini at $0.15

Claude 3.5 Sonnet resets value

China's LLM price war erupts

GPT-4o: frontier at half price

Claude 3 brings $0.25 Haiku

Gemini 1.5 Pro: 1M context

OpenAI cuts GPT-3.5 Turbo 50%

2023

Mixtral 8x7B goes open-weight

GPT-4 Turbo: 3× cheaper

GPT-4 sets the frontier price