A lot of conversations about AI coding agents focus on obvious failures: hallucinated APIs, broken tests, bad assumptions, or code that simply does not run.
But I think one of the bigger problems is quieter: the agent completes the task, the app still works, the tests may even pass, but the repo is now more disordered than it was before.
That is repo drift. Or, more specifically: repo entropy.
It shows up as bloated files, duplicate helpers, cosmetic modularity, stale scaffolding, local patches that solve one surface while creating inconsistency somewhere else, and custom under-the-hood code that works around the framework instead of working with it.
That last item, custom code that works around the framework, is one of the biggest sources of drift I see in AI-assisted development.
The agent is not automatically an expert in your software
Developers often assume that if an AI coding agent is working inside a repo, it understands the software stack the way an experienced maintainer would. But that is not always true.
The agent may know a framework in a general sense. It may have seen thousands of examples. It may be good at producing plausible code quickly. But that does not mean it knows the current version of the framework, the repo’s actual architecture, the project’s preferred patterns, the latest official guidance, the existing abstractions, which files are canonical, which patterns are deprecated, or how the software “wants” to be extended.
That last point matters more than people think. Every mature stack has a grain. There is a way the framework expects state, routing, forms, validation, assets, tests, configuration, and data flow to move through the system.
When an agent does not follow that grain, it often starts inventing custom code to bridge gaps it does not understand. That custom code may solve the immediate task, but it creates drift.
A coding agent usually does not create drift because it is trying to be reckless. It creates drift because it is trying to be helpful with incomplete grounding.
It sees a problem and patches around it. It sees a missing helper and creates one. It sees an awkward interface and adds another layer. It sees a failing test and adjusts the test. It sees a framework constraint and writes custom logic instead of checking whether the framework already has a native path for that problem.
Each move can look reasonable locally. The danger is the accumulation.
One helper becomes three. One workaround becomes a pattern. One bloated file becomes the place where everything gets added. One “temporary” scaffold becomes part of the architecture.
This is how a repo slowly stops matching its own design.
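The "one helper becomes three" pattern is one of the few forms of drift that is easy to check mechanically. Here is a minimal sketch, assuming a Python repo, that groups functions whose arguments and bodies are structurally identical even when their names differ. The function name `find_duplicate_helpers` and the file paths in the usage note are hypothetical, and the check is deliberately narrow: it will miss near-duplicates that rename a variable or reorder statements.

```python
import ast
from collections import defaultdict


def find_duplicate_helpers(sources: dict[str, str]) -> list[list[str]]:
    """Group functions that share the same arguments and body.

    `sources` maps a file path to its Python source text. Each returned
    group lists "path:function_name" entries that are structurally identical.
    """
    seen = defaultdict(list)
    for path, text in sources.items():
        for node in ast.walk(ast.parse(text)):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                # ast.dump omits line numbers by default, so the fingerprint
                # depends only on structure, not on where the code lives.
                body = "".join(ast.dump(stmt) for stmt in node.body)
                fingerprint = ast.dump(node.args) + "|" + body
                seen[fingerprint].append(f"{path}:{node.name}")
    return sorted(sorted(places) for places in seen.values() if len(places) > 1)
```

Run against a small in-memory example, it reports the two helpers that do the same thing under different names:

```python
sources = {
    "utils/text.py": "def slugify(s):\n    return s.strip().lower().replace(' ', '-')\n",
    "views/helpers.py": "def make_slug(s):\n    return s.strip().lower().replace(' ', '-')\n",
}
print(find_duplicate_helpers(sources))
```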
Better prompts help, but they are not enough. The bigger repair is forcing the agent to work from the repo’s actual baselines.
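Not vague instructions like "Use best practices", but concrete guidance: which framework version the repo targets, which files and abstractions are canonical, which patterns are deprecated, and which native framework path to check before writing custom logic.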
A lot of drift can be avoided simply by grounding the coding agent in the software’s actual baseline guidance, meaning the framework’s current official documentation and the repo’s own conventions, instead of letting it improvise.
In other words: do not let the agent invent the architecture if the software already has one.
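One way to make a baseline concrete is to write it down as data that lives in the repo, then use the same data both to instruct the agent and to check its output. The sketch below is an illustration under stated assumptions, not an established format: `RepoBaseline`, its field names, and the example framework and module names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RepoBaseline:
    """Hypothetical repo-local baseline; every field name is an assumption."""
    framework: str
    framework_version: str
    canonical_files: tuple[str, ...] = ()
    deprecated_imports: dict[str, str] = field(default_factory=dict)  # old -> new


def to_instructions(baseline: RepoBaseline) -> str:
    """Render the baseline as concrete guidance to hand to an agent."""
    lines = [
        f"Target {baseline.framework} {baseline.framework_version}; "
        "check for a native framework path before writing custom logic."
    ]
    lines += [
        f"Treat {path} as canonical; extend it rather than adding a parallel helper."
        for path in baseline.canonical_files
    ]
    lines += [
        f"Do not import {old}; use {new}."
        for old, new in baseline.deprecated_imports.items()
    ]
    return "\n".join(lines)


def check_added_lines(baseline: RepoBaseline, added_lines: list[str]) -> list[str]:
    """Return one message per added line that imports a deprecated module.

    Simplification: only the first module named on an import line is checked.
    """
    problems = []
    for line in added_lines:
        tokens = line.split()
        if len(tokens) < 2 or tokens[0] not in ("import", "from"):
            continue
        module = tokens[1].rstrip(",")
        for old, new in baseline.deprecated_imports.items():
            if module == old or module.startswith(old + "."):
                problems.append(f"'{line.strip()}' uses deprecated {old}; use {new}")
    return problems
```

The design choice worth noting is that the guidance and the check share one source, so the instructions the agent receives cannot quietly diverge from what the repo actually enforces.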
This is the distinction I keep coming back to. An agent can complete a task and still make the repo worse.
It can fix the bug and bloat the file. It can pass the test and weaken the design. It can add a feature and duplicate an existing abstraction. It can satisfy the prompt and violate the project’s baseline.
So the question should not only be:
Did the agent finish?
It should also be:
Did the repo become more trustworthy or more entropic after the agent touched it?
That is where I think the next layer of AI-assisted development has to go: not just more autonomous agents, longer context, or better code generation, but better repo-local supervision.
We need diagnostics that can see what changed, whether the change stayed inside the task boundary, whether verification actually ran, whether files became bloated, and whether the repo still matches its own documented design after the work is done.
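To show what such a diagnostic could look like at its simplest, here is a sketch that takes a summary of a finished change and reports three of the signals listed above: missing verification, edits outside the task boundary, and files that grew past a size limit. The names `FileChange` and `review_change`, the 400-line threshold, and the example paths are assumptions for illustration; in practice the inputs would come from the version control diff and the test runner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileChange:
    path: str
    lines_before: int
    lines_after: int


def review_change(
    changes: list[FileChange],
    allowed_prefixes: list[str],
    verification_ran: bool,
    max_lines: int = 400,  # assumed threshold; tune per repo
) -> list[str]:
    """Return human-readable findings about a completed change."""
    findings = []
    if not verification_ran:
        findings.append("verification: no test or build run was recorded")
    for change in changes:
        # A file is in bounds if its path starts with any allowed prefix.
        if not change.path.startswith(tuple(allowed_prefixes)):
            findings.append(f"boundary: {change.path} is outside the task boundary")
        # Flag files that are over the limit and grew during this change.
        if change.lines_after > max_lines and change.lines_after > change.lines_before:
            findings.append(
                f"bloat: {change.path} grew from {change.lines_before} "
                f"to {change.lines_after} lines"
            )
    return findings
```

For a task scoped to a hypothetical `app/billing/` directory, a change that also touches an auth file and skips the test run would produce three findings:

```python
changes = [
    FileChange("app/billing/invoice.py", 380, 520),
    FileChange("app/auth/session.py", 90, 95),
]
for finding in review_change(changes, ["app/billing/"], verification_ran=False):
    print(finding)
```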
Because the hardest AI coding failures are not always the ones that break immediately. Sometimes the agent succeeds — and leaves disorder behind.