Merge Queues Were Built for Humans. Agents Don't Wait.


Mitchell Hashimoto says merge queues fall apart under AI agents. I run a merge-queue company, and he's half right. The queue isn't the thing that breaks.

Mitchell Hashimoto thinks merge queues stop working once you point a swarm of AI agents at your repository. I sell merge queues for a living. He’s half right, and the dishonest move would be to pretend he isn’t.

He said it on the Pragmatic Engineer podcast, in a stretch about Git and monorepos. The setup is a line I’ve been repeating for years: hitting the merge button is the easiest step. Everything expensive happens around the merge, not at it. Getting the code reviewed, keeping main coherent, maintaining forever what you just shipped. Undergraduates can hit the button, as he puts it. The rest is the job.

Then he gets to the queue:

You constantly rebase. Merge queues solves that to a certain extent. I think merge queues works for humans at a certain scale, but merge queues could get quite deep. But then if you sort of 10x that, conservatively, I think 10x that, and then if you buy into hype cycles and you 100 or 1,000x that, I think it gets completely untenable.

(That’s the transcript, grammar and all.)

He’s right about the naive version. A merge queue that exists to serialize rebases (take one PR, rebase it on main, run CI, merge, repeat) was built around human throughput. Humans open a few pull requests a day. The queue could be deep and still drain by lunch. Multiply the inflow by ten, then by a hundred, and a queue that only knows how to go one at a time becomes the exact bottleneck it was supposed to remove.
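To put rough numbers on that, here is a back-of-the-envelope sketch. The 20-minute CI run and the inflow figures are illustrative assumptions, not measurements from the podcast or from any real repository, and the function names are made up for this example. It also assumes a strictly serial queue that runs CI around the clock and never sees a failure.

```python
def serial_capacity_per_day(ci_minutes: float, hours_per_day: float = 24.0) -> float:
    """Merges per day for a queue that tests and lands one PR at a time."""
    return hours_per_day * 60.0 / ci_minutes


def backlog_after(days: int, inflow_per_day: float, ci_minutes: float) -> float:
    """PRs still waiting after `days`, assuming steady inflow and no failures."""
    capacity = serial_capacity_per_day(ci_minutes)
    return max(0.0, (inflow_per_day - capacity) * days)


# Hypothetical numbers: a 20-minute CI run caps a serial queue at 72 merges/day.
human_backlog = backlog_after(days=5, inflow_per_day=30, ci_minutes=20)
agent_backlog = backlog_after(days=5, inflow_per_day=300, ci_minutes=20)
print(human_backlog, agent_backlog)  # 0.0 1140.0
```

Under those assumptions the human-scale inflow never builds a backlog, while ten times the inflow leaves more than a thousand PRs waiting after a working week.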

So yes. The human-scale queue strains under agent churn. Grant it and move on, because the interesting part is what you conclude from there.

What actually breaks

Hashimoto lists “a confluence of problems”: the merge queue, disk space, and branching/review. I’d pull them apart. Disk space is real (Git clones the world) and it’s a separate fight. The one I care about is the queue, and the thing that breaks there isn’t the queue. It’s the loop the queue sits inside.

He names the failure mode exactly:

Every time you pull, you can’t push, because every time you pull there’s another chain. Every time you push it’s rejected.

The host adds the obvious: there’s a lot of parallel work happening at once.

That’s not a queue failing. That’s contention. Between the moment you decide to integrate a change and the moment main accepts it, the tree moved underneath you. With humans the window was minutes and the traffic was light, so a serial queue could absorb it. With agents the window is unchanged but the traffic isn’t, and the one place every change has to pass through, main, is saturated. (The 100x and 1,000x numbers are his hype-cycle hypotheticals, not measurements. The direction is the point.)
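A rough way to see why the same window behaves so differently is to estimate how often main moves while a change is in flight. The sketch below assumes merges arrive as a Poisson process spread evenly over the day, which is a simplification, and the merge rates and the 10-minute window are hypothetical figures chosen only for illustration.

```python
import math


def p_main_moved(merges_per_day: float, window_minutes: float) -> float:
    """Probability that at least one other change lands on main during the window.

    Assumption: merges arrive as a Poisson process spread evenly over 24 hours.
    """
    rate_per_minute = merges_per_day / (24 * 60)
    return 1.0 - math.exp(-rate_per_minute * window_minutes)


# Same 10-minute window, hypothetical human-scale and agent-scale traffic.
human = p_main_moved(merges_per_day=5, window_minutes=10)
agents = p_main_moved(merges_per_day=500, window_minutes=10)
print(f"{human:.1%} vs {agents:.1%}")
```

With these made-up inputs the chance of a stale base goes from a few percent to well above ninety percent, without the window changing at all.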

People look at that and conclude the chokepoint is the problem, so rip it out. I’d argue the opposite. Serialization is a throughput constraint, not a law of physics, and the chokepoint is the only place left where you still get to decide what “main” means under that load. You don’t beat contention by deleting your one point of coordination. You beat it by making that point fast, cheap, and parallel. What actually broke is everything that assumed the loop around it was cheap: one PR, one CI run, one human looking at it, merge, move on. Your CI pipeline was sized for that loop at human volume, never for agent-scale traffic.

A queue that goes one at a time is the toy version

Talking about a stealth company he advises, Hashimoto names the real shape:

The amount of churn that these agents is causing is so much greater than humans. And it’s not an AI review problem. It’s really just like a release problem, like managing the merge queues, humans getting access to the right set of data in the repository.

A release problem. That’s exactly it, and release engineering already has answers for “serial doesn’t scale.” We’ve just never had to point them at ordinary feature work, because ordinary feature work never moved this fast.

A queue that takes PRs one at a time and rebases each on the last is the toy. The version that survives agent load does what every high-throughput build system already does:

It speculates. Don’t wait for #1 to land before testing #2. Build the optimistic chain and test #1+#2+#3 in parallel as if they’ll all pass, falling back only when one doesn’t. Zuul has done this for OpenStack for over a decade; GitHub’s own merge queue and Bors do a version of it.

It batches. Merge a run of green PRs as one unit and run CI once for the batch instead of once per PR.

It bisects on failure. When a batch goes red you don’t drop all of it. You find the offender and let the rest through. Uber’s SubmitQueue and Google’s TAP are built around exactly this.

It treats CI cost and latency as the real constraint. Under agent load the binding limit isn’t “can the queue order things.” It’s whether you can afford to run the suite hundreds of times a day, and how fast each run comes back. Speculation and batching pull against that directly: optimistic chains multiply CI spend, and a failure high in the chain throws away everything below it. Flaky tests, which explode under agent churn, wreck the bisection you were counting on. This is the unglamorous core of it, and it has no clever Git answer.
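The batching and bisection steps above can be sketched in a few lines. This is a toy under stated assumptions, not how Zuul, SubmitQueue, TAP, or any real merge queue implements it: `land_batch` and `ci_passes` are hypothetical names, CI is modeled as a deterministic function with no flaky tests, and each call to `ci_passes` stands in for one full suite run.

```python
from typing import Callable, List, Sequence, Tuple


def land_batch(
    prs: Sequence[str],
    ci_passes: Callable[[List[str]], bool],
) -> Tuple[List[str], List[str], int]:
    """Try to land `prs` as one batch, bisecting when the batch goes red.

    Returns (merged, rejected, ci_runs). `ci_passes` is a hypothetical stand-in
    for running the suite against everything already merged plus the candidates.
    """
    merged: List[str] = []
    rejected: List[str] = []
    ci_runs = 0

    def visit(candidates: List[str]) -> None:
        nonlocal ci_runs
        if not candidates:
            return
        ci_runs += 1
        # Test the candidates on top of what this batch has already landed.
        if ci_passes(merged + candidates):
            merged.extend(candidates)
            return
        if len(candidates) == 1:
            # A single PR that fails on its own is the offender.
            rejected.append(candidates[0])
            return
        mid = len(candidates) // 2
        visit(candidates[:mid])
        visit(candidates[mid:])

    visit(list(prs))
    return merged, rejected, ci_runs


# Hypothetical batch of eight PRs in which only pr-6 breaks the build.
batch = [f"pr-{i}" for i in range(1, 9)]
bad = {"pr-6"}
merged, rejected, ci_runs = land_batch(batch, lambda prs: not (bad & set(prs)))
print(merged, rejected, ci_runs)
```

In this toy model a fully green batch of eight costs one CI run instead of eight, while a single bad PR pushes it to seven. The saving depends on batches being mostly green, which is also why a spurious red from a flaky test is expensive: it sends the whole batch down the bisection path.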

Here’s the honest tension, and the thing Hashimoto’s stealth company is really selling against. These techniques are known. Running them without Google’s build infrastructure and Google’s CI budget is not. That gap is the actual frontier, and it’s why the problem lands on tooling rather than on a team of five wiring up speculative pipelines by hand. (Disclosure, if it wasn’t obvious: I run Mergify, which builds this kind of thing. Discount me accordingly.)

What the queue does not fix

I’m not going to pretend the queue solves all of it. Green is not coherent. Ten agent PRs can each pass CI alone, pass again as a batch, and still be a mess together: three flavors of the same helper, two designs pulling in opposite directions, a thousand lines no human read. A merge queue keeps main buildable. It does not keep main good. That half of Hashimoto’s worry survives untouched, and it isn’t a release problem, it’s a review problem, and review is exactly the thing agents have made scarcest.

There’s also a future where I lose this bet outright. If someone ships an optimistic, eventually-coherent model of main with no single serialization point to defend, the queue as we know it goes with it. I don’t think that’s where this lands. I think the forge itself bends first.

The most interesting release problem in fifteen years

The line that stuck with me wasn’t about queues at all. Asked whether Git survives the agent era, Hashimoto said this is the first time in twelve to fifteen years that anyone has asked the question without laughing.

That’s the signal. The primitives we treated as settled (Git, branches, the pull request, the merge queue) are load-bearing again, and some of them will bend. The merge queue is the one I’d bet on. Not because I sell it, but because the job it does, keeping a shared main coherent while many actors race to change it, gets harder under agents, not easier. You answer more contention by making coordination faster, not by deleting it.

Hitting the merge button is still the easiest step. Agents just turned everything around it into the most interesting problem on the board.
