Ling-2.6-flash

inclusionAI · Released Apr 2026

Ling-2.6-flash is a 104B-parameter model from inclusionAI with 7.4B active parameters, designed for fast inference and efficient token usage in agent applications.

Strengths: Because only 7.4B of its 104B parameters are active, the sparse architecture gives faster responses and lower compute overhead than dense models of similar total size.
Best for: Real-time agent systems and other applications where low latency and token efficiency are the priority.
Limitations: With only 7.4B active parameters, it may struggle with complex reasoning tasks or specialized domains that benefit from larger model capacity.

Input / 1M tokens: $0.01
Output / 1M tokens: $0.03
Cached input / 1M tokens: $0.002

Context window

Price history

Snapshots

Effective	Input	Output	Cached input	Note	Source
11 Jun 2026	$0.01	$0.03	$0.002	Imported from OpenRouter	openrouter.ai
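
To make the per-1M prices concrete, here is a minimal sketch of how a request's cost could be estimated from token counts. The function name `estimate_cost` and the billing rule are assumptions for illustration, not something the listing states: the sketch assumes cached input tokens are billed at the cached rate instead of the regular input rate, and that `input_tokens` counts only uncached tokens.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the listing above.
PRICE_PER_MILLION = {
    "input": Decimal("0.01"),
    "output": Decimal("0.03"),
    "cached_input": Decimal("0.002"),
}


def estimate_cost(input_tokens, output_tokens, cached_input_tokens=0):
    """Return the estimated cost in USD as a Decimal.

    Hypothetical helper. Assumes cached input tokens are billed at the
    cached rate and are not also counted in input_tokens.
    """
    million = Decimal(1_000_000)
    total = (
        Decimal(input_tokens) * PRICE_PER_MILLION["input"]
        + Decimal(output_tokens) * PRICE_PER_MILLION["output"]
        + Decimal(cached_input_tokens) * PRICE_PER_MILLION["cached_input"]
    )
    return total / million


if __name__ == "__main__":
    # 2M uncached input, 0.5M output, 1M cached input tokens.
    print(estimate_cost(2_000_000, 500_000, 1_000_000))
```

Under these assumptions, 2M uncached input tokens, 0.5M output tokens, and 1M cached input tokens come to $0.02 + $0.015 + $0.002 = $0.037. `Decimal` is used so that sub-cent prices do not pick up binary floating-point rounding error.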

More from inclusionAI

Ring-2.6-1T: in $0.075 · out $0.625
Ling-2.6-1T: in $0.075 · out $0.625

Data updated Jun 29, 2026