Source: dev.to

How to Keep AI Coding Agents from Hallucinating: A Guide to Harness Engineering

Developer Masih Moafi introduced a technique called Harness Engineering to prevent AI coding agents from hallucinating and losing focus. The approach uses structured, repository-local control layers with Markdown-based rules to guide agents, reducing context window pollution and scope drift. Moafi released a public repository of battle-tested configuration templates and demonstrated the method by building a machine learning research project autonomously.

2 min read · published Jun 14, 2026

AI coding agents (like Claude Code, Devin, or open-source equivalents like OpenClaw) are incredibly powerful. They can navigate directories, write tests, refactor modules, and submit PRs.

Yet, if you drop them into a raw repository without boundaries, they suffer from context window pollution, agent amnesia, and scope drift. A simple bug-fix refactor can trigger a 6-hour loop where the agent rewires half the project, deletes unrelated tests, and gets stuck in "process theater."

To fix this, we need Harness Engineering.

An Agent Harness is a structured, repository-local control layer designed to guide and verify the agent's work. Instead of feeding your LLM a monolithic prompt, you embed a lightweight system of record and physical feedback loops directly inside the workspace.
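As a minimal sketch of what "repository-local" can mean in practice, the snippet below gathers Markdown rule sheets from the repository root into one short context string instead of hard-coding a monolithic prompt. The function name `load_harness`, the per-file character cap, and the loading order are assumptions for illustration, not the author's tooling, and the file names are only examples of rule sheets.

```python
from pathlib import Path

# Example rule sheets to look for; adjust to whatever your repo contains.
RULE_SHEETS = ["AGENTS.md", "TERMINAL_AND_GIT_RULES.md", "SESSION_HANDOFF_RULES.md"]


def load_harness(repo_root, sheets=RULE_SHEETS, max_chars=4000):
    """Collect the rule sheets that exist under repo_root into one context string.

    Missing files are skipped and reported, so the agent (or a human) can see
    which parts of the harness are absent. Each sheet is truncated to
    max_chars (a hypothetical cap) to keep the context window small.
    """
    root = Path(repo_root)
    sections, missing = [], []
    for name in sheets:
        path = root / name
        if not path.is_file():
            missing.append(name)
            continue
        text = path.read_text(encoding="utf-8").strip()
        sections.append(f"## {name}\n{text[:max_chars]}")
    return "\n\n".join(sections), missing
```

Calling `load_harness(".")` at the start of a session returns the assembled context plus a list of any rule sheets that were not found, which keeps the system of record in the workspace rather than in the prompt.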

I have packaged the exact, battle-tested Markdown-based context rules I use to steer and constrain my local agents into a public repository: **MasihMoafi/harnesses-I-use**.

Rather than complex code, this repo shares raw configuration rule sheets:

- `AGENTS.md`
- `CODEX_CODING_GUIDELINES.md`
- `TERMINAL_AND_GIT_RULES.md`: […] ([…] `git add -A`), and change safety (using Ubuntu `pkexec` for root commands instead of raw CLI password prompts).
- `SESSION_HANDOFF_RULES.md`
- `ARTIFACT_RULES.md`
- `abbn.md`: […] (`ctu` = continue, `fmy` = familiarize, `ver` = verify) to save token count and maintain short, high-efficiency communication.

This harness approach is heavily inspired by Andrej Karpathy's open-source education repos (like **micrograd** and […]).

Karpathy’s projects are celebrated because they strip away bloat. They focus on clear, reproducible mathematical baselines and avoid over-engineering.
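In the same bloat-stripping spirit, the shorthand entries mentioned earlier (`ctu`, `fmy`, `ver`) can be pictured as a tiny lookup table. The sketch below is an illustration only: the `expand_shorthand` helper is hypothetical, and the table contains just the three entries the article names.

```python
import re

# Shorthand entries taken from the article; any others would be assumptions.
SHORTHAND = {"ctu": "continue", "fmy": "familiarize", "ver": "verify"}


def expand_shorthand(message, table=SHORTHAND):
    """Replace whole-word shorthand tokens with their full meaning."""
    def swap(match):
        word = match.group(0)
        # Only exact whole words are expanded, so "verbose" is left untouched.
        return table.get(word.lower(), word)

    return re.sub(r"[A-Za-z]+", swap, message)
```

A three-letter instruction such as `fmy` costs far fewer tokens than a sentence asking the agent to read the repository before acting, which is the efficiency argument behind the shorthand.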

We applied that same philosophy to agent-driven code generation. The core rules of our harness require:

To test this, I used this exact harness to build a comparative machine learning research project: Sensor Fault Diagnosis.

The agent was given a realistic synthetic sensor dataset and tasked with:

By restricting the agent to a single control surface (a structured manual) and enforcing strict keep/discard criteria, the agent completed the pipeline and wrote the final report entirely autonomously.
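One way to picture a strict keep/discard criterion is as a small gate that compares each candidate change against the current baseline and keeps it only if it clears a threshold. The sketch below is an assumption about how such a rule could look, not the criteria used in the Sensor Fault Diagnosis project; `Candidate`, `keep_or_discard`, and the `min_gain` threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    score: float        # validation metric, higher is better (assumed)
    tests_passed: bool  # outcome of the project's test run


def keep_or_discard(baseline_score, candidate, min_gain=0.01):
    """Return 'keep' only if tests pass and the metric beats the baseline by min_gain."""
    if not candidate.tests_passed:
        return "discard"
    if candidate.score - baseline_score < min_gain:
        return "discard"
    return "keep"
```

Because the gate is mechanical, the agent cannot argue its way into keeping a change that fails tests or does not measurably improve on the baseline.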

Without a harness, the agent would have bloated the repository with decorative dashboard scripts or fake performance metrics. The harness kept it grounded.

If you are building code with AI agents, stop writing 2,000-word system prompts. Start building repository harnesses. Check out the templates and configurations:

👉 [MasihMoafi/harnesses-I-use](https://github.com/MasihMoafi/harnesses-I-use)

For more of my work, experiments, and research, check out my website:

👉 [masihmoafi.tech](https://masihmoafi.tech)