Nemotron 3 Ultra

NVIDIA · Mid


NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE). Built on a hybrid Transformer-Mamba mixture-of-experts architecture, it...

Input / 1M tokens

$0.50

Output / 1M tokens

$2.50

Cached input / 1M tokens

$0.15

Context window

262K
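As a rough illustration of how the per-1M-token prices above turn into a per-request cost, here is a minimal Python sketch. The function name `estimate_cost` and the rule that cached tokens are a subset of input tokens billed at the cached rate are assumptions made for illustration; actual billing depends on the provider.

```python
from decimal import Decimal

# Listed prices in USD per 1M tokens, taken from the figures above.
INPUT_PER_M = Decimal("0.50")
OUTPUT_PER_M = Decimal("2.50")
CACHED_INPUT_PER_M = Decimal("0.15")
MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: cached_tokens counts the part of input_tokens that is
    served from cache and billed at the cached-input rate.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")

    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * INPUT_PER_M
        + cached_tokens * CACHED_INPUT_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / MILLION


if __name__ == "__main__":
    # 100K input tokens (80K of them cached) and 10K output tokens.
    print(estimate_cost(100_000, 10_000, cached_tokens=80_000))
```

Decimal arithmetic is used here so that small per-token amounts are not distorted by binary floating point.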

Where it sits

174th-cheapest mid model

by blended $/Mtok among 262 listed mid models

75% above the mid median

blended $/Mtok across 262 mid models

Output costs 5× input

$0.50 in / $2.50 out per 1M tokens

Cached input saves 70%

$0.15 cached vs $0.50 fresh per 1M tokens

Held flat since launch (Jun 2026)

no blended price change recorded

Computed live from current prices and this model's history — not hand-written, so it stays accurate as prices move.
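The output-to-input ratio and the cached-input saving quoted above can be checked directly from the listed prices. The sketch below is only an arithmetic check; the variable names are illustrative, and it does not attempt the blended $/Mtok figure, whose weighting is not stated on this page.

```python
from fractions import Fraction

# Listed prices in USD per 1M tokens.
input_price = Fraction("0.50")
output_price = Fraction("2.50")
cached_price = Fraction("0.15")

# How many times more expensive output tokens are than input tokens.
output_to_input = output_price / input_price

# Fraction of the fresh input price saved when input is served from cache.
cached_saving = 1 - cached_price / input_price
cached_saving_percent = cached_saving * 100

if __name__ == "__main__":
    print(f"output / input ratio: {output_to_input}")
    print(f"cached input saving: {cached_saving_percent}%")
```

Exact fractions avoid rounding noise, so the results match the 5× and 70% figures exactly.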

Price history

Only one price on record so far — the history chart appears once a price changes.

Snapshots

| Effective date | Input / 1M | Output / 1M | Cached input / 1M | Note | Source |
| --- | --- | --- | --- | --- | --- |
| 11 Jun 2026 | $0.50 | $2.50 | $0.15 | Imported from OpenRouter | openrouter.ai |
