The Pragmatic CTO Podcast
Code Review Is the Bottleneck. Time to Rebuild It.

Code review is the bottleneck slowing down software delivery, and AI hasn’t fixed it—it’s made it worse. Developers can generate code faster than ever, but reviews still take days. The result: more code created but the same throughput. The review queue is the choke point.

The industry treats code review like a bug hunt, but data shows only 15% of review comments identify defects. The rest are style, structure, or understanding issues—things machines could catch. Senior engineers spend hours on style nits instead of architectural judgment, misallocating valuable expertise and compounding delays.

Beyond what reviewers focus on, how code changes are presented makes a huge difference. Pull requests list files alphabetically, forcing reviewers to mentally piece together related changes scattered across the diff. This increases context switching and fatigue, reducing review quality. Cognitive overload is the mechanism by which defects slip through, not just a minor inconvenience.

AI-generated code adds “verification debt.” When developers write code themselves, understanding comes as a byproduct of writing it. When AI writes it, reviewers must build that comprehension from scratch. That’s why AI-generated PRs have 1.7 times more issues than human-written ones. With more AI PRs flooding the pipeline and fewer reviewers around, the bottleneck is only widening.

Fixing this requires three shifts. First, change how code is presented—group files by business impact and execution flow so reviewers see the change’s story, not an alphabetical jumble. Second, shift human focus to what only humans can do: architecture, business logic, and design judgment. Automate style and obvious bugs. Third, restructure work into small, manageable batches. Studies show small PRs under 400 lines get reviewed faster and catch more defects. Stacked diffs let teams parallelize review and speed throughput.
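The small-batch shift is the easiest to enforce mechanically, with a pre-merge check that flags oversized diffs. A minimal sketch, assuming the 400-line threshold cited above and `git diff --numstat` output as input (the function names are illustrative, not any team's actual tooling):

```python
import subprocess

MAX_CHANGED_LINES = 400  # budget suggested by the small-PR studies cited above

def count_changed(numstat: str) -> int:
    """Sum added + deleted lines from `git diff --numstat` output."""
    total = 0
    for line in numstat.splitlines():
        if not line.strip():
            continue
        added, deleted, _path = line.split("\t", 2)
        if added != "-":  # binary files report "-" for both counts
            total += int(added) + int(deleted)
    return total

def check_pr_size(base: str, head: str) -> bool:
    """Return True if the diff between two refs fits the reviewable budget."""
    out = subprocess.run(
        ["git", "diff", "--numstat", f"{base}...{head}"],
        capture_output=True, text=True, check=True,
    ).stdout
    n = count_changed(out)
    if n > MAX_CHANGED_LINES:
        print(f"PR too large: {n} changed lines (limit {MAX_CHANGED_LINES}). Consider stacking.")
        return False
    return True
```

Wired into CI, a check like this turns the 400-line guideline from advice into a default, while still letting teams override it for the genuinely large changes noted below.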

AI code reviewers belong in the first automated layer—they catch common issues and speed up the mechanical part of review by 10-20%. But they don’t replace humans. Real gains come from combining AI with better code presentation and smaller PRs. Without all three, the bottleneck remains.

There are limits. Small teams can’t fully separate review layers. Legacy codebases may struggle with grouping changes by business impact. Cultural resistance to changing review’s purpose can slow adoption. And large, complex changes still need thoughtful handling beyond small batches.

I’m building StructPR, a GitHub app that tackles the presentation problem by automatically clustering changed files into logical groups like “Authentication” or “Database Migrations.” Reviewers see the story of a change, not an alphabetical list, reducing cognitive load. It addresses the most overlooked shift because alphabetical file lists have been the default for decades without question.
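StructPR's actual clustering logic isn't described here, but the core idea can be sketched with simple path heuristics. A rough illustration, assuming hypothetical keyword rules with a fallback to the top-level directory (real tools would use richer signals like import graphs and execution flow):

```python
from collections import defaultdict

# Illustrative keyword -> group rules; a real tool would learn or configure these.
GROUP_RULES = [
    ("migrations", "Database Migrations"),
    ("auth", "Authentication"),
    ("test", "Tests"),
]

def group_files(paths):
    """Cluster changed file paths into named logical groups (heuristic sketch)."""
    groups = defaultdict(list)
    for path in paths:
        lowered = path.lower()
        for keyword, label in GROUP_RULES:
            if keyword in lowered:
                groups[label].append(path)
                break
        else:
            # Fall back to the top-level directory as the group name
            top = path.split("/", 1)[0] if "/" in path else "(root)"
            groups[top].append(path)
    return dict(groups)
```

Even this naive pass replaces an alphabetical file list with groups a reviewer can read in order of business impact.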

Ask yourself: How long do your PRs sit before merging? Has that increased with AI code generation? How many review comments flag style issues linters could catch? Do reviewers understand the business intent before diving into code, or are they piecing it together from scattered files? Who’s absorbing the extra AI-generated review load, and what else are they not doing? Your review process either supports understanding or it doesn’t.
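The first of those questions is measurable today. A minimal sketch that computes median open-to-merge latency from PR timestamps (pulled from your code host's API; the timestamp pairs here are made-up sample data):

```python
from datetime import datetime
from statistics import median

def median_hours_to_merge(prs):
    """Median open-to-merge latency in hours from (opened, merged) ISO timestamps."""
    latencies = [
        (datetime.fromisoformat(merged) - datetime.fromisoformat(opened)).total_seconds() / 3600
        for opened, merged in prs
    ]
    return median(latencies)
```

Tracking this number before and after adopting AI code generation answers the second question with data instead of impressions.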

The bottleneck in software delivery isn’t typing speed anymore. AI proved that by making code creation almost instantaneous. The bottleneck is understanding—understanding what changed, why, and whether it should have. Code review is where that understanding either lives or dies. It’s time to rebuild it.

You can read the full article—with all the data and sources—on ThePragmaticCTO Substack.


