{"slug": "three-loops-no-ship", "title": "Three Loops, No Ship", "summary": "A developer spent three iterations building an auto-fix pipeline that still only works reliably on trivial tickets. The pipeline, which pulls tickets from Azure DevOps, runs them through a local model, and pushes fixes via a coding agent, succeeded 40% of the time initially and only reached 55% after improvements. The developer learned that local models hit working memory limits before quality degrades, and that adding features merely moves reliability gaps rather than fixing them.", "body_md": "I spent three iterations on an auto-fix pipeline that still doesn't work reliably. Here's what I learned.\n\nWrote a background script. Pull tickets from Azure DevOps, run them through a local model, hand to a coding agent, push the result.\n\nPoll → triage → fix → push.\n\nWorked 40% of the time on trivial tickets. Anything that crossed file boundaries or needed real context — stalled or hallucinated.\n\nI shipped it anyway. That was naive.\n\nMade it smarter. Pre-selected relevant files. Broke big tickets into subtasks. Turned complex edits into atomic steps with verification between each.\n\nGot it to 55% or so. But every fix created two new edge cases. The complexity was compounding faster than the reliability.\n\nWent all in. Embeddings for dedup. Multi-repo routing. Auto-revert. A learning loop that fed failures back into future runs.\n\nThe model server started dying. 890 memory errors in a day.\n\nRoot cause: two independent consumers hitting the same local model server, each with its own retry loop. When memory filled up, retries amplified instead of staggering. The system was making itself worse.\n\nFixes were simple in hindsight — stop retrying OOM, serialize access, use the local binary not npx. But the pattern kept repeating: add more to fix the last thing, break something else.\n\nThe pipeline still only works on easy tickets. Hard ones need a human. 
After three rounds, the main thing I learned is that local models hit a wall before your ambition does — not in quality, in working memory.\n\nAnd adding features doesn't fix reliability gaps. It just moves them around.\n\nThe 507 retry spiral taught me more than any successful deploy this year. Because it was entirely my fault. Not the model's, not the framework's. I built concurrent consumers with independent retry loops and expected them to coordinate. They didn't.\n\nI'll do a fourth loop. Smaller. A dedicated fast model for cheap work, the big model only for editing. One consumer at a time.\n\nMight work. Might be loop 5's prologue.\n\n**I'm looking for people building similar things.** Local agent pipelines, auto-fix loops, small-model orchestration — the stuff that's not quite working yet but you keep iterating on.\n\nNo Slack. No Discord. No newsletter. Just people who build this stuff and want to compare notes.\n\n**What medium would you gravitate toward?** A private GitHub org? A Telegram group? Occasional calls? Reply or find me — curious what works.\n\n*Failure post, not a success story. If you're building something similar — don't retry OOM, serialize your consumers, and measure what your model server can actually hold.*", "url": "https://wpnews.pro/news/three-loops-no-ship", "canonical_source": "https://dev.to/vystartasv/three-loops-no-ship-2pg0", "published_at": "2026-06-25 21:59:12+00:00", "updated_at": "2026-06-25 22:33:43.739048+00:00", "lang": "en", "topics": ["developer-tools", "machine-learning", "large-language-models", "ai-agents", "mlops"], "entities": ["Azure DevOps"], "alternates": {"html": "https://wpnews.pro/news/three-loops-no-ship", "markdown": "https://wpnews.pro/news/three-loops-no-ship.md", "text": "https://wpnews.pro/news/three-loops-no-ship.txt", "jsonld": "https://wpnews.pro/news/three-loops-no-ship.jsonld"}}