{"slug": "i-stopped-blaming-the-model-when-my-ai-broke-working-code", "title": "I stopped blaming the model when my AI broke working code", "summary": "A developer argues that when AI coding agents break existing functionality while fixing bugs, the fault lies not with the model but with the development setup lacking memory of what previously worked. The developer identifies four common failure patterns—blast radius blindness, contract drift, silent safeguard removal, and verification gap—and recommends gating agent behavior through pre-existing protections rather than tuning prompts. The solution involves proving old functionality still works, surfacing blast radius at edit time, and auditing existing safeguards before adding new ones.", "body_md": "Yesterday my coding agent fixed a bug I had been circling for an hour.\n\nThen it quietly broke two things that were working fine before it touched anything.\n\nYou know the loop.\n\nYou ask for a fix. It lands. A test goes red.\n\nYou ask it to fix the test. It fixes the test by undoing the original fix.\n\nNow you are back where you started, more tired, and the agent sounds completely confident the whole way around.\n\nFor a long time I read that loop as the model being a bit dim.\n\nI do not believe that anymore.\n\nHere is the opinion I will defend in the comments.\n\nWhen your AI breaks a working feature while fixing another one, that is almost never the model being stupid. Your setup has no memory of what working meant.\n\nWhat I read this week said it plainly. The problem is configuration, not the model, and the real fix lives in the harness, not the prompt.\n\nThat one line reorganized how I think about the whole thing.\n\nOnce you stop blaming the model, you can name what actually breaks. I sat down and listed every distinct way \"fix one thing, break another\" shows up. It collapses into a handful of shapes.\n\n**Blast radius blindness.**\n\nYou change a shared function or a type, and nothing tells you who else was calling it. It lands three files away, in code the agent never opened.\n\n**Contract drift.**\n\nYou change the shape a function returns, or an API response, and the thing consuming it still expects the old shape. Nobody re-checks the handshake.\n\n**Silent safeguard removal.**\n\nThis one is quiet. Your agent deletes a test, or skips it, or strips a guard that was holding something together. A regression hides inside the very check that would have caught it.\n\n**Verification gap.**\n\nYour agent shows you the change and calls it done. That diff is real. But a diff is not behavior. Nothing re-ran the old feature to confirm it still works.\n\nLook at that list again and notice something.\n\nMost of these breaks share one trait. Something that used to protect you quietly went missing.\n\nThat distinction matters more than it sounds.\n\nEvery linter and scanner you own is good at spotting bad code that is there. They go blind when good code disappears. A deleted test does not trip an alarm. A removed guard reads as a smaller, cleaner diff.\n\nSo tuning the prompt harder will not save you. You can write the most careful instruction in the world and the agent still will not know that one function had four hidden callers, or that one test was the only thing protecting a payment path.\n\nThis fix is boring, and it works. Stop trying to make the model behave, and start gating the behavior around it.\n\nA few principles I now treat as non negotiable.\n\nProve the old thing still works before you call anything done. A green diff is not proof that behavior survived. Run the handful of tests that touch what you changed and read the result, every time.\n\nSurface the blast radius at the moment of the edit, not three hours later in production. If a shared thing changed, you want its callers in front of you right then, while the change is still in your head.\n\nAudit what already protects you before you add anything new.\n\nEveryone skips this step. When I built my own guard for this yesterday, what kept it small was spending the first hour not building. I mapped every one of those failure shapes against the protection I already had. Most were half covered. Tightening what existed beat stacking new tools on top.\n\nThat single habit, audit before you add, separates a system that gets sharper over time from one that turns into noise nobody trusts.\n\nNone of this is exotic. It is the same discipline good teams have always used against regressions, pointed at a new kind of teammate who types fast, sounds sure, and forgets everything between turns.\n\nThink of the model as a very fast junior who has no memory of yesterday. You would not let that person merge to main on confidence alone. Do not let the agent either.\n\nGive the work a way to prove it did not break the thing that was already working. That is the whole game.\n\nWhat is the last thing your AI agent broke while fixing something else?\n\nI work through this in public, the wins and the freezes both, mostly on LinkedIn and YouTube. If the real version of building in the open is useful to you, that is where it lives. [LinkedIn](https://www.linkedin.com/in/mirzajhanzaib/), YouTube and X under Mirza Iqbal, and the work at [next8n.com](https://next8n.com).", "url": "https://wpnews.pro/news/i-stopped-blaming-the-model-when-my-ai-broke-working-code", "canonical_source": "https://dev.to/mjmirza/i-stopped-blaming-the-model-when-my-ai-broke-working-code-46o8", "published_at": "2026-06-15 06:54:53+00:00", "updated_at": "2026-06-15 07:10:53.803286+00:00", "lang": "en", "topics": ["artificial-intelligence", "developer-tools", "ai-agents"], "entities": [], "alternates": {"html": "https://wpnews.pro/news/i-stopped-blaming-the-model-when-my-ai-broke-working-code", "markdown": "https://wpnews.pro/news/i-stopped-blaming-the-model-when-my-ai-broke-working-code.md", "text": "https://wpnews.pro/news/i-stopped-blaming-the-model-when-my-ai-broke-working-code.txt", "jsonld": "https://wpnews.pro/news/i-stopped-blaming-the-model-when-my-ai-broke-working-code.jsonld"}}