Source: github.com

Show HN: ANMA, boundary contracts for cheaper AI coding agents

ANMA, a new open-source tool, enforces software architecture boundaries for AI coding agents by turning YAML module contracts into Claude Code configuration files and CI checks. In a controlled benchmark, ANMA reduced boundary violations by a cheaper AI model from 13 out of 19 runs to 0 out of 20 runs, offering insurance for running cost-effective agents.

Published Jun 21, 2026

Boundary enforcement for AI coding agents. ANMA turns plain-YAML module contracts into the CLAUDE.md, hooks, and checks that keep Claude Code inside your architecture, and it measurably works where it matters most.

In a controlled benchmark (Python), a cheaper/faster model (Claude Haiku 4.5) violated a declared module boundary in 13 of 19 runs of a plain repo. With ANMA, across 20 runs of the same task, it violated the boundary 0 times (Fisher's exact p < 0.0001). See docs/BENCHMARKS.md for the full study, including the honest part: a frontier model (Opus 4.8) respected the boundary on its own, so ANMA's value is insurance for running cheaper agents plus a CI/governance guarantee, not making a frontier model smarter.

Languages: Python, Go, and TypeScript (language: in the root anma.yaml, one per project). Go and TypeScript enforce module→module dependencies; interface (public:) enforcement is Python-only today. The Go/TS adapters are validated (anma check and the hook both detect and block real cross-module violations). In a pre-registered follow-up (neutral prompt, harder scenario), TypeScript shows a measured effect: control 18/20 vs ANMA 0/20, Fisher's exact p < 0.00001. Go is directional and significant (10/30 → 0/30, p = 0.0004), but its control rate fell below our pre-registered 0.40 floor, so we report it as suggestive, not yet efficacy. The Python headline is not extrapolated to either language. Details: CONCEPTS § Languages and BENCHMARKS.
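The quoted p-values can be sanity-checked from the run counts alone. The sketch below computes a one-sided Fisher's exact test using only the standard library. The function name and the choice of a one-sided test are assumptions made for illustration; docs/BENCHMARKS.md remains the reference for how the study actually computed its statistics.

```python
from math import comb


def fisher_one_sided(viol_control: int, n_control: int,
                     viol_treated: int, n_treated: int) -> float:
    """P(control shows at least viol_control violations) with margins fixed.

    Under the null hypothesis the violations are spread across both groups
    at random, so the control count follows a hypergeometric distribution.
    """
    total_viol = viol_control + viol_treated
    total_runs = n_control + n_treated
    denom = comb(total_runs, total_viol)
    upper = min(n_control, total_viol)
    tail = sum(
        comb(n_control, k) * comb(n_treated, total_viol - k)
        for k in range(viol_control, upper + 1)
    )
    return tail / denom


# Run counts quoted in the text:
# (control violations, control runs, ANMA violations, ANMA runs).
python_p = fisher_one_sided(13, 19, 0, 20)
typescript_p = fisher_one_sided(18, 20, 0, 20)
go_p = fisher_one_sided(10, 30, 0, 30)

print(f"Python     p = {python_p:.2e}")
print(f"TypeScript p = {typescript_p:.2e}")
print(f"Go         p = {go_p:.4f}")
```

Under these assumptions the Python and TypeScript values fall below the quoted thresholds (0.0001 and 0.00001), and the Go value comes out at roughly 0.0004.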

You declare each module's public interface and what it may depend on. anma sync compiles that into everything else, so the architecture the agent reads can never drift from the rules CI enforces:

anma.yaml                       project config (schema_version, source_roots)
src/domains/billing/
  anma.yaml                     the module contract (see docs/CONCEPTS.md for all fields)
  CLAUDE.md          (generated) loads when Claude opens billing/
CLAUDE.md            (generated) architecture map, between markers
.claude/rules/boundaries.md (generated) always-loaded imperative
.claude/hooks/anma_pretooluse.py (generated) blocks a boundary-breaking edit (exit 2)
tach.toml            (generated) engine config (Go: .go-arch-lint.yml; TS: .dependency-cruiser.cjs)
.github/workflows/anma.yml (generated) CI: drift check + boundary check
DECISIONS.md         append-only: why each boundary exists

pip install anma[tach]      # tach backend recommended; works without it too
anma init                   # scaffolds contracts + a worked accounts/billing example
anma sync                   # generates CLAUDE.md, nested docs, hooks, tach.toml, CI
anma check                  # ✓ boundaries respected
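To make the contract idea concrete, here is a minimal sketch of what "public interface plus allowed dependencies" means as a check. It is not ANMA's engine: the in-memory CONTRACTS table, the depends_on key, the function names such as get_account, and check_import itself are all hypothetical, chosen to echo the accounts/billing example. The real field names are in docs/CONCEPTS.md.

```python
from typing import Optional

# Hypothetical in-memory contracts; the real schema lives in docs/CONCEPTS.md.
CONTRACTS = {
    "accounts": {"public": {"get_account"}, "depends_on": set()},
    "billing": {"public": {"create_invoice"}, "depends_on": {"accounts"}},
}


def check_import(importer: str, target: str, name: str) -> Optional[str]:
    """Return a violation message, or None if the import is allowed."""
    if importer == target:
        return None  # imports inside one module are always fine
    if target not in CONTRACTS[importer]["depends_on"]:
        return f"{importer} may not depend on {target}"
    if name not in CONTRACTS[target]["public"]:
        return f"{importer} uses non-public {target}.{name}"
    return None


# billing -> accounts is declared, and get_account is public: allowed.
print(check_import("billing", "accounts", "get_account"))
# accounts -> billing is not declared: a dependency violation.
print(check_import("accounts", "billing", "create_invoice"))
# Declared dependency, but the name is not in the public interface.
print(check_import("billing", "accounts", "_row_cache"))
```

The two failure cases mirror the two kinds of rule described above: module-to-module dependency rules, and interface (public:) rules.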

For Go or TypeScript, scaffold with anma init --language go / anma init --language typescript (the external backends, go-arch-lint and dependency-cruiser, are optional; a builtin scanner is the zero-dep fallback).

Full walkthrough: docs/QUICKSTART.md.

anma init             # scaffold contracts + a worked example
anma sync             # regenerate all artifacts from contracts
anma sync --check     # CI guard: fail if generated artifacts drifted from contracts
anma check            # enforce boundaries (hook / pre-commit / CI)
anma check --warn     # report violations but exit 0 (incremental adoption)
anma check --json     # machine-readable output for pipelines

Exit codes: 0 ok · 1 violations, contract errors, or drift.
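Because every failure mode maps to a non-zero exit code, a pipeline step only needs to stop at the first command that fails. The sketch below is one way to wire that up by hand. The run_gate helper and the ANMA_GATE list are hypothetical names, and the generated .github/workflows/anma.yml may be organised differently.

```python
import subprocess
import sys
from typing import Sequence


def run_gate(commands: Sequence[Sequence[str]]) -> int:
    """Run each command in order; return the first non-zero exit code, else 0."""
    for cmd in commands:
        result = subprocess.run(list(cmd))
        if result.returncode != 0:
            print(
                f"gate failed: {' '.join(cmd)} exited {result.returncode}",
                file=sys.stderr,
            )
            return result.returncode
    return 0


# The drift check followed by the boundary check, as listed above.
ANMA_GATE = [["anma", "sync", "--check"], ["anma", "check"]]

# In a CI script this would end with: sys.exit(run_gate(ANMA_GATE))
```

Running the drift check first means a stale generated file is reported before any boundary violations.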

ANMA works at two levels, and the benchmark shows they play different roles:

- Guidance: the generated root and per-module CLAUDE.md and .claude/rules put your architecture in the agent's context. This is what drove the 68% → 0 result: the model was steered to the correct design and didn't attempt a bad edit.
- Enforcement: the PreToolUse hook judges the proposed edit and returns exit 2 to block any new disallowed import before it lands; the same check runs at pre-commit and in CI. This is the guarantee that holds for the edits guidance doesn't catch, and regardless of which model or human wrote the diff.

The enforcement hook is verified to fire (feed it a forbidden edit → exit 2); in the benchmark it never needed to, because guidance pre-empted every bad edit. Both matter; see the benchmarks for exactly what each one is shown to do.
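The enforcement path can be pictured with a small sketch of a PreToolUse-style check. This is not the generated anma_pretooluse.py. It assumes the hook receives a JSON payload on stdin with a tool_input object carrying file_path plus either new_string (Edit) or content (Write). The FORBIDDEN table, module_of, and decide are hypothetical, and the import matching is deliberately crude.

```python
import json
import re
import sys
from typing import Optional, Tuple

# Hypothetical rule table: module -> package names it must not import.
FORBIDDEN = {"accounts": {"billing"}}

# Crude matcher for the first dotted name after "import" or "from".
IMPORT_RE = re.compile(r"^\s*(?:from|import)\s+([A-Za-z_][\w.]*)", re.MULTILINE)


def module_of(file_path: str) -> Optional[str]:
    """Map a path like src/domains/accounts/service.py to 'accounts'."""
    parts = file_path.replace("\\", "/").split("/")
    if "domains" in parts[:-1]:
        idx = parts.index("domains")
        if idx + 1 < len(parts) - 1:
            return parts[idx + 1]
    return None


def decide(payload: dict) -> Tuple[int, str]:
    """Return (exit_code, message); exit code 2 means 'block this edit'."""
    tool_input = payload.get("tool_input", {})
    module = module_of(tool_input.get("file_path", ""))
    # Assumed payload shape: Edit carries new_string, Write carries content.
    proposed = tool_input.get("new_string") or tool_input.get("content") or ""
    for imported in IMPORT_RE.findall(proposed):
        # Compare each dotted segment so "src.domains.billing" also matches.
        hit = FORBIDDEN.get(module, set()) & set(imported.split("."))
        if hit:
            return 2, f"{module} may not import {sorted(hit)[0]}"
    return 0, ""


def main() -> int:
    """Entry point for a hook script: read the payload, report, return the code."""
    code, message = decide(json.load(sys.stdin))
    if message:
        print(message, file=sys.stderr)
    return code
```

A real hook script would end with sys.exit(main()). Keeping the decision in a pure function means the same logic can run under a hook, pre-commit, or CI.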

- Teams running cheaper or faster agents (cost-sensitive pipelines, bulk tasks, non-frontier or non-Claude models) that don't reliably respect an architecture on their own. This is where ANMA's steering is decisive.
- Anyone who wants an enforced architecture: a guarantee in CI/pre-commit that module boundaries hold no matter who or what wrote the change.
- Teams that want architecture as governance: declared interfaces, ownership → CODEOWNERS, and docs that can't silently drift from the rules.

If you only ever drive a frontier model on small, well-described tasks, ANMA may add turns without changing outcomes, and the benchmarks say so plainly.

~800 lines, no runtime, no DSL, one small dependency (PyYAML): the builtin engine needs nothing more, and the faster external backends (tach for Python, go-arch-lint for Go, dependency-cruiser for TypeScript) are all optional. A security team can read the whole tool in an afternoon.

- Drift detection: anma sync --check fails CI if generated docs/config fall out of sync with the contracts.
- Incremental adoption: anma check --warn and per-module deprecated_deps let a large codebase adopt without a red build on day one.
- Governance: owners: per module generates CODEOWNERS; source_roots: supports monorepos.
- Supply chain: signed releases (PyPI Trusted Publishing + provenance + SBOM), pip-audit in CI, Apache-2.0. See SECURITY.md.
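Drift detection amounts to re-rendering the artifacts and comparing them with what is on disk. The sketch below shows that comparison in isolation. The find_drift helper and the sample file contents are hypothetical, and how ANMA actually renders and compares its artifacts is not shown here.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def find_drift(expected: Dict[str, str], root: Path) -> List[str]:
    """Return relative paths that are missing or differ from the rendered text."""
    drifted = []
    for rel_path, content in sorted(expected.items()):
        target = root / rel_path
        if not target.is_file() or target.read_text(encoding="utf-8") != content:
            drifted.append(rel_path)
    return drifted


# Demo in a throwaway directory: one stale file, one missing file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "CLAUDE.md").write_text("# map v1\n", encoding="utf-8")
    rendered = {"CLAUDE.md": "# map v2\n", "tach.toml": "modules = []\n"}
    drifted = find_drift(rendered, root)
    exit_code = 1 if drifted else 0  # non-zero on drift, matching the exit codes above

print(drifted, exit_code)
```

Treating a missing file as drift means a deleted generated artifact fails the check in the same way as an edited one.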

- docs/QUICKSTART.md: install to first blocked edit
- docs/CONCEPTS.md: the model, the contract schema reference, generated artifacts, the engine
- docs/BENCHMARKS.md: the with/without study, methodology, and honest limits
- CONTRIBUTING.md: dev setup, tests, the dogfood, the schema-stability rule
- SECURITY.md · RELEASE.md · CHANGELOG.md

Apache-2.0 · ANMA Labs LLC
