My AI agent built a feature that already existed 15 times

An engineer discovered that their AI agent built three new workflow rules when asked for one, only to find that fifteen hooks and four skills already performed the same function. The agent failed to inventory the existing system before acting, leading to redundant code. The fix was to force the agent to run 'ls' and 'grep' commands to check for existing mechanisms before writing new code.

I asked my AI agent to add one workflow rule. It built three. Then I ran one grep and found that fifteen hooks and four skills already did the exact same job.

TL;DR: An AI agent will confidently build something that already exists, because it does not inventory your system before acting. The fix is not a smarter model. It is forcing an ls and a grep into the agent's path before it writes anything.

The request was small: "when a task can't be checked right now, write it down as an issue so it isn't forgotten."

The agent did not start by looking at what was already there. It started by building. Within one turn it had written a new hook file and created two new labels.

It worked. The tests I asked it to write passed. It reported done.

Then I asked the question it should have asked itself first: did anything like this already exist?

    # the question the agent skipped
    ls ~/.claude/hooks/*.py | grep -iE "defer|completion|issue|residual|todo|handover|stop-"
    ls ~/.claude/skills/ | grep -iE "issue|todo|inventory|fresh"

The output was not empty. It was loud. Fifteen. Four.

    hooks: 15
    skills: 4

Fifteen hooks already dealt with deferral, completion claims, and issue handoff. Four skills (issue-inventory, github-issue, search-issues, fresh) already covered exactly the "write it down so it isn't forgotten" flow. One of them, github-issue, already had the due: field my agent had just "invented" two labels to approximate.

The agent had not extended the system. It had grown a second one next to it. And the two new labels it created, timed-followup and awaiting-reaction, were dead on arrival:

    # does anything actually read these labels?
    grep -rl "timed-followup" ~/.claude/
    # → only the files the agent itself just wrote.

Nothing consumes them. A label nobody reads is not a feature. It is litter with a colon in it.

A human engineer joining a codebase spends the first hour reading. They open the hooks directory. They notice the skill named issue-inventory and think, "that sounds like what I'm about to build." Curiosity and a fear of looking redundant push them to check first.

An agent has neither. It is optimized to produce a plausible, working change to the thing in front of it. The path of least resistance is to write new code, not to audit old code. So it builds, fluently and confidently, on top of work that already exists.

This is the same failure people mean when they say "agents write code, but they don't remember." The memory that is missing is not the conversation. It is the system. The agent never loaded a map of what was already there, so every request looks like a greenfield.

The fix that held was not "use a better model." It was a gate placed before the agent is allowed to write anything new of a given kind:

Before creating a new hook, skill, rule, or label, run ls and grep for the same concept. If something already does the job, extend it. If nothing does, then build, and say what you searched.

Concretely, the agent's own corrected output now looks like this:

    # inventory step: runs before "build a new mechanism" is allowed
    existing=$(ls ~/.claude/{hooks,skills} | grep -iE "<concept>")
    if [ -n "$existing" ]; then
      echo "extend, don't rebuild: $existing"
    fi

The judgment rule that survived was a single binary: is there an existing mechanism for this concept, yes or no? Not "is the new one nicer." If yes, the new build is the wrong move, even when it is cleaner.

What I kept from the agent's work was small: one genuinely new decision rule that had no prior owner. Everything else (the new hook file, the new labels, the reinvented due: field) got folded back into the systems that already existed.

If you run an AI agent against a system that has accreted over months, point it at one of its own past changes and ask: "Before this, what already existed that you could have extended instead?" Then run the ls / grep it skipped. The gap between what it built and what was already there is the size of the inventory step it never took.
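For anyone who wants the ls / grep step as something an agent can be made to call, here is a minimal sketch of a reusable gate. The function name inventory_gate and its exit-code convention are my own assumptions for illustration, not part of any agent framework: it searches the directories you pass for names matching a concept pattern, returns non-zero when something already exists, and otherwise reports what it searched.

```shell
# Hypothetical helper: a "search before you build" gate.
# Usage: inventory_gate CONCEPT_REGEX DIR...
# Returns 1 (block the build) if any entry name matches, 0 (allow) if none do.
inventory_gate() {
  local pattern="$1"
  shift
  local dir matches found=0

  for dir in "$@"; do
    # Skip directories that do not exist rather than failing.
    [ -d "$dir" ] || continue

    # grep exits non-zero when nothing matches; move on to the next directory.
    matches=$(ls "$dir" | grep -iE -- "$pattern") || continue

    found=1
    printf '%s\n' "$matches" | while IFS= read -r name; do
      printf "extend, don't rebuild: %s/%s\n" "$dir" "$name"
    done
  done

  if [ "$found" -eq 1 ]; then
    return 1
  fi

  # Nothing found: say what was searched, so the decision to build is auditable.
  echo "searched for '$pattern' in: $*; nothing found, ok to build"
  return 0
}
```

Called as inventory_gate "defer|issue|todo" ~/.claude/hooks ~/.claude/skills, a non-zero exit can be used to stop the "create a new mechanism" step. How that exit code is wired into a particular agent is left open here.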
What is the smallest "search before you build" check you can put in front of your agent, before its next confident new file?

Sho Naka (nomurasan). I run AI agents against my own tooling daily, and I spend most of my week deciding which of their confident outputs to keep and which to fold back into what already existed. This piece was adapted from a Japanese-language essay I wrote, with AI assistance for the cross-language rewrite. The work it describes, and the grep that caught it, are mine.