Every time I started a new agent session, I was re-explaining the same things.
The architecture rules. The patterns to avoid. The decisions I'd already made. The approaches that already failed.
The agent would forget everything and I'd be back to square one.
My first instinct was to write better prompts. Longer, more detailed, more explicit. But that just made the problem worse β now I had a 200-line prompt to maintain, and the agent still forgot it all next session.
The real problem isn't the prompt. Prompts are session-scoped. They disappear.
Instead of putting rules in prompts, what if we put them in the repository itself?
Prompting is temporary. Context is session-scoped. A harness is project-scoped.
The idea is simple: every time an agent makes a recurring mistake, convert it into a durable artifact in the repo instead of re-prompting.
docs/decisions/
docs/failures/
AGENTS.md
The repository gets smarter with every mistake. The agent reads the repo at the start of each session and picks up all the rules automatically β no prompting required.
1. AGENTS.md β Instruction Document
A file the agent reads at the start of every session. Contains project overview, directory rules, forbidden patterns, test commands, and PR behavior. Think of it as a permanent briefing document.
2. Architecture Constraints
Automated rules that block invalid code before it merges. Linters, type checks, import boundaries, pre-commit hooks. If AGENTS.md
says "no direct DB access from routes", add a lint rule that enforces it.
3. Feedback Loops
Signals the agent uses to self-correct. Test failures, CI failures, lint errors. A good harness gives the agent clear, actionable failure messages so it can fix its own mistakes.
4. Knowledge Store ( docs/)
Durable context that survives session resets:
docs/decisions/
β why certain architectural choices were madedocs/failures/
β approaches already tried and rejecteddocs/conventions/
β project-specific coding rulesdocs/domain/
β business terminology and domain knowledge5. Drift Checks
Scripts that detect when the harness itself goes stale. Is AGENTS.md
referencing files that no longer exist? Are there temporary files that never got cleaned up? Drift checks catch this automatically.
I built a starter kit that applies this pattern to any existing project.
Usage:
Clone it inside your target repo:
workspace/
βββ target-repo/
βββ harness-starter-kit/
βββ existing project files
Then give your coding agent this prompt:
Read ./harness-starter-kit first, then apply the harness engineering
starter kit to this repository. Preserve existing architecture, tools,
and conventions. Do not overwrite existing files without explaining why.
Finish with a short adoption report.
The agent inspects your repo, adapts to your existing tools (eslint, tsc, ruff, etc.), and installs only the missing harness files.
Key design decisions:
Better prompting is a local fix. Harness engineering is a systemic fix.
Every recurring agent failure should become at least one durable artifact β a clearer rule, an automated constraint, a test, a decision record, or a drift check. That's the core loop.
The goal isn't to write the perfect prompt. It's to build a repository that makes the same mistakes increasingly unlikely over time.
Have you been dealing with agents repeating the same mistakes? How are you handling it?