Written by

Kaia Colban

May 26, 2026


Faros AI's 2025 data shows that teams with heavy AI adoption merged 98% more pull requests than teams with low adoption. Same study, two more numbers: PR review time went up 91%, and 22% of merged PRs required post-merge fixes from conflicting concurrent changes.

The first number sells AI tools. The other two determine whether your team is actually better off.

What 98% more PRs is hiding

Pull requests are a poor proxy for engineering output, and they've been getting worse for years. They were always a measure of typing speed, not understanding. In 2026 the typing is mostly being done by a model.

So when a team ships 98% more PRs with AI, what is it doing?

It is moving faster. That is real and good. It is also shipping work that is harder for any one human to fully understand, because the work was produced through a back-and-forth between an engineer and a model that lasted 14 turns and is now closed.

The reviewer sees the diff. They do not see the conversation. They make a judgment call in 8 minutes (or about 15, given the 91% slowdown) and approve. They are not approving the same artifact they would have approved in 2022. They are approving a denser, less internalized one.

The perception gap

METR's 2025 study on AI-assisted development found a 39-point gap between perceived and actual speedup. Developers expected to get 24% faster with AI. Afterward they believed they had been 20% faster, while their measured task times were 19% slower.

Stack that with the Faros data: engineers think AI made them dramatically faster, and the merged PR count agrees with them. The review-time and post-merge-fix data say something else: the work changed shape, not just volume. Faster to ship. Slower to verify. More to fix later.
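To put rough numbers on "changed shape, not just volume," the Faros figures can be multiplied together. This is a back-of-the-envelope sketch, not something the study reports: it assumes the 91% review-time increase applies per PR and treats a low-adoption team as the baseline. The function names are illustrative.

```python
# Figures quoted from the Faros AI data above, expressed as multipliers.
PR_VOLUME_MULTIPLIER = 1.98    # 98% more merged PRs
REVIEW_TIME_MULTIPLIER = 1.91  # 91% longer review (ASSUMPTION: per PR)
POST_MERGE_FIX_RATE = 0.22     # 22% of merged PRs need a post-merge fix


def review_load_multiplier(volume: float, review_time: float) -> float:
    """Total review time relative to baseline, if review time is per PR."""
    return volume * review_time


def fix_prs_per_100_baseline(volume: float, fix_rate: float) -> float:
    """PRs needing a post-merge fix, per 100 PRs the baseline team merges."""
    return 100 * volume * fix_rate


if __name__ == "__main__":
    load = review_load_multiplier(PR_VOLUME_MULTIPLIER, REVIEW_TIME_MULTIPLIER)
    fixes = fix_prs_per_100_baseline(PR_VOLUME_MULTIPLIER, POST_MERGE_FIX_RATE)
    print(f"Total review load: {load:.2f}x baseline")
    print(f"PRs needing fixes per 100 baseline PRs: {fixes:.1f}")
```

Under those assumptions, total review load lands near 3.8 times the baseline, and roughly 44 PRs need a post-merge fix for every 100 PRs the baseline team merges.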

This is not an argument against AI. It is an argument against the 98%-more-PRs metric.

The four numbers I'd track instead

1. Time from PR-open to merge. Not the average: the median and the 90th percentile. AI pulls the median down (you ship the easy stuff faster) and pushes the 90th percentile up (the gnarly stuff takes longer to review). The shape of that distribution tells you whether your team is shipping more value or just more code.

2. Post-merge fix rate. What percentage of PRs need a follow-up commit within seven days to fix something the original missed? In 2022 this was a noisy number. In 2026, with 22% of high-AI-team PRs needing post-merge fixes, it's a primary signal.

3. Collision rate. How often are two engineers solving the same problem in parallel without realizing it? 47% of developers say they duplicated a teammate's work in the last six months (Stack Overflow 2025). High-AI teams produce collisions more easily because the parallel exploration that used to surface in a hallway now happens privately.

4. Reasoning recoverability. Pick a real architecture decision from 90 days ago. Ask someone who wasn't in the room to find out why it was made. Score them: did they recover it from artifacts in your system, or did they have to interrupt the engineer who made it? Repeat this for a sample of decisions. The percentage recovered without interrupting anyone is your team's institutional memory in the AI era.
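Three of these four numbers are plain arithmetic once the data is in hand. The sketch below is one way to compute them, assuming a hypothetical `PullRequest` record that you would populate from your own Git host's export. The field names, the seven-day default window, and the function names are all illustrative. Collision rate is left out because it depends on how your team detects duplicated work.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median, quantiles
from typing import Optional


@dataclass
class PullRequest:
    """Hypothetical record; map these fields from your own Git host's data."""
    opened_at: datetime
    merged_at: datetime
    # Time of the first follow-up commit that fixes this PR, if there was one.
    first_fix_at: Optional[datetime] = None


def open_to_merge_hours(prs):
    """Hours from PR-open to merge for each PR."""
    return [(pr.merged_at - pr.opened_at).total_seconds() / 3600 for pr in prs]


def median_and_p90(hours):
    """Return (median, 90th percentile). Needs at least two data points."""
    # quantiles(n=10) returns nine cut points; the last is the 90th percentile.
    p90 = quantiles(hours, n=10, method="inclusive")[-1]
    return median(hours), p90


def post_merge_fix_rate(prs, window=timedelta(days=7)):
    """Share of PRs whose first fix landed within `window` of the merge."""
    fixed = sum(
        1
        for pr in prs
        if pr.first_fix_at is not None
        and pr.first_fix_at - pr.merged_at <= window
    )
    return fixed / len(prs)


def recoverability(outcomes):
    """Share of sampled decisions recovered from artifacts alone.

    `outcomes` holds one bool per decision: True if nobody was interrupted.
    """
    return sum(outcomes) / len(outcomes)


if __name__ == "__main__":
    start = datetime(2026, 1, 5, 9, 0)
    # Made-up sample: eleven PRs merged 1 to 11 hours after opening.
    sample = [PullRequest(start, start + timedelta(hours=h)) for h in range(1, 12)]
    sample[0].first_fix_at = sample[0].merged_at + timedelta(days=2)

    med, p90 = median_and_p90(open_to_merge_hours(sample))
    print(f"median={med:.1f}h p90={p90:.1f}h")
    print(f"post-merge fix rate={post_merge_fix_rate(sample):.0%}")
    print(f"recoverability={recoverability([True, False, True, True]):.0%}")
```

Tracking the median and the 90th percentile side by side over time shows whether the tail is stretching while the middle speeds up, which a single average would hide.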

What the standard answer misses

The standard answer to too much review is to add more reviewers or relax the bar. Both compound the problem. More reviewers means more cycles to consensus. A lower bar means the post-merge fix rate climbs.

The real answer is to make the reasoning behind each PR visible at review time. If a reviewer can see the session that produced the PR (not just the diff), they can review faster and better. The 91% review-time penalty should shrink, and so should the 22% post-merge fix rate, while the 98% velocity holds.

This is the leverage point Lore was built for. Sessions get captured automatically. Reasoning gets indexed, not artifacts. Reviewers see the why alongside the what. The how is in the code; the why is in the substrate.

The honest summary

98% more PRs is real, but it's not the whole story
91% longer review and a 22% post-merge fix rate are the price
The trade is favorable if reasoning is recoverable; unfavorable if it isn't
Track PR-open-to-merge distribution, post-merge fix rate, collision rate, and reasoning recoverability instead of PR count

An engineering manager who reads only the 98% number is running a 2022 dashboard on 2026 work. Read the other two numbers in the same study. Then go look at how reasoning lives on your team.