Compass: guardrails and a hard budget cap for AI coding agents

Compass, a local-first configuration layer for AI coding agents, introduces guardrails, cost routing, and a hard budget cap to prevent runaway spending and unsafe actions. The tool enforces a dollar limit that halts sessions before the next tool call, blocks catastrophic commands, and provides a measured guardrail policy scored 100/100 in CI. It aims to give developers verifiable control over AI agent behavior.

budget gate · guardrails 100/100 · ~61% cheaper routing · signed releases · 100% local · no telemetry · you always merge

Real session, no edits: the cost climbs to $0.35, then the next action is HALTED at the $0.05 cap, before it spends more.

compass is a local-first config layer for Claude Code, Codex & Gemini that stops your agent from doing three things it shouldn't: burning your budget, running unsafe commands, and merging unverified code. Set COMPASS_MAX_USD=5 and the session hard-stops at the cap; catastrophic commands are blocked before they run, and the guardrail policy is scored 100/100 in CI, not asserted. You install it once, and you always merge.

No curl|sh, fully reversible. Install, then just open any repo in your agent:

    git clone https://github.com/dshakes/compass ~/compass && cd ~/compass && ./quickstart.sh

or, inside Claude Code:

    /plugin marketplace add dshakes/compass

Open a pull request and compass reviews it, security-checks it, runs the tests, cross-audits it with a second model, then pushes its own fixes until it's green. You just merge.

The idea in one line: the loop is the unit of work. A one-shot agent stops at its first wrong answer.
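
To make the PR loop above concrete (check, push a fix, re-check, stop once green and wait for a human), here is a minimal shell sketch. It is an illustration under stated assumptions, not compass's implementation: `run_gate`, `apply_fix`, `MAX_ROUNDS`, and the `failures` counter are hypothetical stand-ins, and the cap of 3 rounds is an assumed value.

```bash
# Hypothetical sketch of a round-capped "fix until green" loop.
# None of these names are compass commands; they are placeholders.
set -u

MAX_ROUNDS=3   # assumed cap on automatic fix rounds
failures=2     # pretend the gate starts with two failing checks

# Gate: succeeds (exit 0) only when nothing is failing.
run_gate() {
  [ "$failures" -eq 0 ]
}

# Stand-in for "push a fix": resolves one failing check per round.
apply_fix() {
  failures=$((failures - 1))
}

# Re-run the gate after every fix; give up once the round cap is hit.
fix_until_green() {
  round=0
  until run_gate; do
    if [ "$round" -ge "$MAX_ROUNDS" ]; then
      echo "still red after $MAX_ROUNDS round(s); handing back to you"
      return 1
    fi
    apply_fix
    round=$((round + 1))
  done
  echo "green after $round fix round(s); waiting for you to merge"
}

fix_until_green
```

The point of the sketch is the shape: the gate decides when to stop, the cap bounds how long the agent may keep trying, and in both outcomes the last step belongs to a human.
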
compass loops (generate → test → critique → fix → repeat against a gate), so quality comes from iteration, not one lucky prompt. The same closed loop runs a single PR, or your whole fleet of repos overnight. Try it locally in 30s, no tokens; watch it below.

Every AI-agent config claims "safe" and "cheap." compass is the one that hands you the number and lets a skeptic reproduce it in 30 seconds. Everyone has the same models; the edge is configuration you can trust, not another feature list. Four claims, four commands:

🛡 Guardrails with a score. Catastrophic commands and secret writes are blocked before they run, and the policy is eval-gated, not asserted. In human terms: it won't let the agent delete your machine or leak your keys, and it can prove how well.

    compass bench   → guardrail 100% precision/recall (61-case corpus), router 96.9%, in CI

Then ask the agent to rm -rf / or write a .env → denied; rm -rf ./build → allowed.

📉 Cost routing that's measured. Cheap work goes to cheap models, scored against an eval set: ~61% cheaper than all-Opus at ~98% quality on a fair mix. In human terms: it stops paying Opus prices to fix a typo.

    compass route "redesign the auth model"   → opus
    compass route "fix a typo"   → haiku

💸 A budget ceiling that actually stops it. Usage trackers report spend; compass enforces it: set a dollar cap and the session is halted, live, before the next tool call once the cap is reached. In human terms: an agent can't quietly run up a $40 bill while you're away; it stops at your number.

    export COMPASS_MAX_USD=5   # this session hard-stops at $5: the agent is blocked, not just warned
    compass spend --max-usd 5   # the same ceiling on the ledger, for scheduled / fleet runs

🔏 Supply chain you can verify. Releases carry keyless SLSA provenance, so a tampered or look-alike download is rejected. In human terms: you can prove the code you installed is the code I shipped.

    compass verify v0.17.2   → ✓ provenance verified

🧪 Red-team resistance, measured.
Prompt-injection (direct/indirect/paste), CLAUDE.md poisoning, local safety-override, malware & insecure-code: all scored against a labeled corpus that gates in CI, with optional escalation to a managed guardrails service (webhook · Bedrock · Azure). In human terms: a poisoned repo or web page can't quietly turn your agent against you.

    compass redteam   → injection corpus 100% P/R, then scans THIS repo's CLAUDE.md/MCP/settings

No service, no telemetry, no --dangerously-skip-permissions; git pull to update. The work it can't safely own, it hands back; you keep the merge.

Smallest leap of faith first: the governance moment, then feel it, then see the proof, then see how it works.

0 · The budget ceiling, annotated: the same hard-stop as the hero clip, as a clean walkthrough ($1.80 ✓ → $4.10 ✓ → $5.00 HALTED). Usage trackers report spend; compass enforces it.

1 · The day-to-day feel: guardrails, the cost-aware status line, the loop, and the crew, in ~25 seconds.

2 · The headline, on a real PR: a Blocking bug and red tests, and it pushes its own fix until the PR is green (then waits for you).

3 · How that loop works: review · security · tests · Codex cross-audit run in parallel; Blocking findings get auto-fixed and re-reviewed (round-capped) until green, then it stops at you.

Run it locally in 30s with ~/compass/sdlc/orchestrate.sh "