Using Truthmark to Improve Loop Engineering: A Fact Layer for AI Coding Agents

A developer introduced Truthmark, a Git-native fact layer for AI-assisted coding, to address behavioral drift in AI development loops. Truthmark maintains human-reviewable documentation in the repository, enabling loop engines to compare code changes against documented truth rather than just tests. This approach makes unwanted behavioral drift visible and improves control over AI coding agents.

AI coding agents are very good at changing code. That is also the problem. A modern AI development loop often looks like this: prompt → code change → test → fix → refactor → test → add feature → fix again → handoff This loop can move fast. It can also quietly move the product. A loop engine may continuously modify and add code. Each individual change can look reasonable. Tests may still pass. The diff may not look dangerous. But across many iterations, the actual behavior of the project can shift in subtle ways: a timeout changes, an edge case disappears, a product constraint gets relaxed, an API starts accepting something it should reject, or an architectural boundary becomes unclear. This is where Truthmark is useful. Truthmark adds a repository-level “fact layer” for AI-assisted development. Instead of relying on chat history, hidden memory, or a reviewer’s ability to reconstruct intent from raw code, Truthmark keeps human-facing, Git-reviewable truth documents in the repository. Its stated goal is simple: agents write code; Truthmark maintains documentation that humans can review in Git. github.com https://github.com/merlinhu1/truthmark Loop engineering is not just “make the agent code until tests pass.” In real teams, a good loop must control three things: Most agent loops focus heavily on the first point. They run tests, read errors, patch code, and repeat. That is useful, but incomplete. Tests are not the full product truth. They usually encode known examples, not the whole behavioral contract. Code review also has limits, especially when the reviewer receives a large diff generated through several AI iterations. The reviewer can see what changed, but not always why it changed, whether the change was intended, or whether it conflicts with earlier product decisions. AI chat history does not solve this. Chat history is often private, ephemeral, long, or detached from the branch. A future agent session may not see it. A teammate reviewing the PR may not see it. CI certainly does not treat it as a source of truth. Truthmark addresses this gap by putting the project’s maintained truth inside the repo, routed to code areas, and reviewable as ordinary Git diffs. The repository describes Truthmark as Git-native, branch-scoped, local-first, and reviewable, with documentation that moves with the branch instead of living in a private session. github.com https://github.com/merlinhu1/truthmark The strongest argument for Truthmark in loop engineering is not “it generates documentation.” The stronger argument is: AI loops constantly change the codebase. Truthmark makes unwanted behavioral drift visible. A loop engine may modify implementation details many times. Some changes are intentional. Some are accidental. Some are “technically working” but product-wrong. Without a fact layer, the loop can only compare the new state against tests, type checks, lint rules, and the current prompt. With Truthmark, the loop also compares the new state against maintained repository truth. That changes the nature of the loop. Instead of: Did the tests pass? The loop can ask: Did the behavior still match the documented product and engineering truth? If it changed, was that change intentional? Should the truth docs change, or should the code be corrected? That is a much better control system for AI coding. Truthmark installs a workflow layer into the repo. The typical post-change flow is: agent modifies functional code agent runs relevant tests Truthmark checks mapped documentation agent updates truth docs if repository truth changed human reviews code diff + truth-doc diff This mirrors Truthmark’s documented finish-time workflow: code, test, check, document, review. github.com https://github.com/merlinhu1/truthmark The important point is that Truthmark is not meant to replace tests or code review. It gives those workflows a durable place to land in Git. The repo documentation explicitly says Truthmark does not replace prompts, memory, specs, tests, or code review; its lane is to make repository truth explicit, route it to code, install agent guidance around it, and keep the result reviewable in Git. github.com https://github.com/merlinhu1/truthmark That distinction matters. Truthmark is not a magic correctness engine. It is an observability and governance layer for AI-assisted development. A minimal setup looks like this: cd /path/to/your-repo npm install -g truthmark truthmark config Then configure the AI host you use: version: 2 platforms: - codex or: claude-code, github-copilot, opencode, antigravity, cursor truthmark: workspace: docs/truthmark generated: portal: enabled: false Then initialize and validate: truthmark init truthmark check git diff Truthmark’s README shows this as the quick-start path and notes that configured platforms can include Codex, Claude Code, GitHub Copilot, OpenCode, Antigravity, and Cursor. github.com https://github.com/merlinhu1/truthmark Once initialized, the repo contains the contract that agents should follow. This is important: the agent does not depend on a background Truthmark daemon. Truthmark’s guide says the CLI runs locally against the active Git worktree, while agent-run workflow surfaces are committed files that agent hosts can load later from repository state. github.com https://github.com/merlinhu1/truthmark/blob/main/docs/user-guide.md That makes the workflow branch-local and reviewable. For loop engineering, treat Truthmark as a finish-time gate and a review amplifier. A good AI loop becomes: 1. Ask the agent for a bounded change. 2. Agent edits code. 3. Agent runs relevant tests. 4. Agent performs Truth Sync before handoff. 5. Agent reports: - code diff - test evidence - truth-doc diff, if needed - routing changes, if needed - skipped truth changes, with reason 6. Human reviews code + truth together. Truthmark’s user guide describes this review shape: a good handoff should show code diff, test evidence, truth-doc diff if needed, routing changes if needed, and an agent report. Reviewers should be able to answer which truth docs own the code, whether those docs needed updates, whether the agent stayed within write boundaries, and whether verification evidence is included. github.com https://github.com/merlinhu1/truthmark/blob/main/docs/user-guide.md That is a higher-quality handoff than “tests pass, here is the diff.” Imagine your product truth says: Session Timeout Users are signed out after 30 minutes of inactivity. A warning is shown 2 minutes before automatic sign-out. Extending the session requires an explicit user action. Now an AI loop is asked to “simplify auth middleware and fix flaky timeout tests.” During the loop, the agent changes the timeout from 30 minutes to 60 minutes because it makes a test easier to stabilize. It also removes the warning modal because the modal path is awkward to mock. The tests pass. Without Truthmark, the reviewer may only see a middleware diff and some test changes. The product regression is easy to miss. With Truthmark, the relevant code area is routed to a canonical truth doc. The agent has to reconcile the code change against the truth layer before handoff. At that point, the loop must make the drift explicit: Truth Sync result: - Code changed session timeout from 30 minutes to 60 minutes. - Existing product truth says timeout is 30 minutes. - No user/product decision requested this behavior change. - Recommendation: revert code to 30 minutes or request product approval and update truth docs. That is the real value: not documentation for its own sake, but a fast signal that the AI loop moved the product. Product managers should care about this. AI agents do not only change implementation. They often make small product decisions while coding: In a human-only team, these decisions often surface in planning, tickets, design docs, or PR discussions. In AI-assisted loops, they can happen inside the code generation process. Truthmark helps by separating product truth from engineering truth. Its configuration model includes product truth under product/ and engineering truth under engineering/ , with routing that maps code surfaces to canonical truth docs. github.com https://github.com/merlinhu1/truthmark/blob/main/docs/user-guide.md This matters because not every implementation change is a product change. A cache refactor may only need engineering truth. A new user-visible capability boundary should update product truth. Truthmark’s installed workflow runtime says Truth Sync makes a product-truth decision before canonical truth writes: product truth should be updated or routed when a user-visible promise, capability boundary, API contract, acceptance criterion, or explicit user/product evidence changed; internal implementation changes default to engineering truth. github.com https://github.com/merlinhu1/truthmark/blob/main/docs/truthmark/engineering/workflows/installed-workflow-runtime.md That gives teams a useful rule: If the AI loop changes what users can observe, product truth must be considered. If it only changes how the system realizes the behavior, engineering truth may be enough. AI agents often struggle when a repo is large and ownership is implicit. They ask: where is the source of truth? Which docs matter? Which tests should I run? Is this behavior owned by frontend, backend, billing, auth, or platform? Truthmark’s routing system gives agents explicit ownership boundaries. Its guide says routes tell the agent which code surface belongs to an area, which truth docs own that area, when truth should be updated, and what kind of truth doc is involved. It also warns that good routing gives Truth Sync precise destinations, while bad routing makes agents guess. github.com https://github.com/merlinhu1/truthmark/blob/main/docs/user-guide.md That is a direct loop-quality improvement. When routing is clear, the agent can constrain its work: Changed files: - src/auth/session.ts - tests/auth/session.test.ts Mapped truth: - docs/truthmark/product/capabilities/session-timeout.md - docs/truthmark/engineering/behaviors/authentication.md Required loop closeout: - Run auth tests. - Check whether product timeout behavior changed. - Update engineering truth if implementation behavior changed. - Report if product truth was unchanged and why. This reduces random doc edits and prevents the agent from creating shadow documentation in the wrong place. A lot of AI engineering tooling optimizes for agent autonomy. That is useful, but teams also need human confidence. Truthmark improves the developer experience by making the handoff easier to review. Instead of asking reviewers to infer intent from a diff, it gives them a parallel truth diff. A reviewer can now ask: Does the code diff match the truth diff? Did the agent update docs because behavior really changed? Did the agent avoid changing docs when behavior did not change? Are product decisions dated and located in the owning doc? Are engineering claims backed by code or tests? That creates a better review experience for engineers, product managers, tech leads, and future agents. It also improves future AI sessions. The next agent does not need to rediscover the product from a stale chat. It can read the branch’s maintained truth docs and follow the repo-local workflow instructions. The most important commands are: truthmark config truthmark init truthmark check For more advanced loops: truthmark index truthmark impact --base main truthmark workflow status --workflow truthmark-sync --base main --json Truthmark’s user guide describes truthmark check as validation for configuration, authority, routing, frontmatter, links, branch scope, generated surfaces, freshness, and coverage diagnostics. It also describes truthmark impact --base <ref as a way to map changed files to routed truth docs, owning routes, and nearby tests. github.com https://github.com/merlinhu1/truthmark/blob/main/docs/user-guide.md That makes truthmark impact --base main especially useful in PR loops. It can help answer: What truth docs are affected by this branch? Which routes own the changed files? Which nearby tests should the agent consider? Do not start by documenting the entire repo. Start with one loop-sensitive area: Then ask your agent to document existing behavior: /truthmark-document document the implemented session timeout behavior across src/auth/session.ts, src/auth/middleware.ts, and tests/auth/session.test.ts Truthmark’s guide recommends using Truth Document when implementation already exists but repository truth is incomplete, and starting with one bounded feature or area at a time. It also states that Truth Document writes truth docs and routing only; it must not change functional code. github.com https://github.com/merlinhu1/truthmark/blob/main/docs/user-guide.md That is a clean adoption path. First make the current behavior explicit. Then let future loops keep it honest. Truthmark is not a replacement for tests. It does not prove the code is correct. It does not replace product judgment. It does not remove the need for code review. The project documentation is explicit about these boundaries: Truthmark is not a hosted service, MCP server, vector database, CI enforcement product, autonomous code rewrite engine, hidden memory layer, or replacement for tests, code review, or human judgment. github.com https://github.com/merlinhu1/truthmark/blob/main/docs/user-guide.md That limitation is also a product strength. The best AI engineering systems do not pretend one layer can solve everything. They combine layers: prompts for intent tests for executable correctness types and lint for mechanical correctness Truthmark for repository truth Git review for human judgment Truthmark’s job is narrow and valuable: keep repository truth visible, branch-scoped, and reviewable while agents keep changing code. The next phase of AI coding is not just faster generation. It is controlled iteration. The teams that win with AI agents will not be the teams that generate the most code. They will be the teams that build the best loops: fast enough to ship constrained enough to trust observable enough to review durable enough to continue Truthmark helps with the second, third, and fourth points. It turns “the agent changed some code” into “the agent changed code, ran checks, reconciled the change against repository truth, and produced a reviewable handoff.” For loop engineering, that is a major upgrade. Because the real failure mode is not that AI writes code. The failure mode is that AI quietly changes what the project means. Truthmark gives that meaning a place to live.