Compass – guardrails and a hard budget cap for AI coding agents

Compass, a local-first configuration layer for AI coding agents, introduces guardrails, cost routing, and a hard budget cap to prevent runaway spending and unsafe actions. The tool enforces a dollar limit that halts sessions before the next tool call, blocks catastrophic commands, and provides a measured guardrail policy scored 100/100 in CI. It aims to give developers verifiable control over AI agent behavior.

budget gate · guardrails 100/100 · ~61% cheaper routing · signed releases · 100% local · no telemetry · you always merge

Real session, no edits: the cost climbs to $0.35, then the next action is HALTED at the $0.05 cap, before it spends more.

compass is a local-first config layer for Claude Code, Codex & Gemini that stops your agent from doing three things it shouldn't: burning your budget, running unsafe commands, and merging unverified code. Set `COMPASS_MAX_USD=5` and the session hard-stops at the cap; catastrophic commands are blocked before they run, and the guardrail policy is scored 100/100 in CI, not asserted. You install it once, and you always merge.

Install (no curl|sh, fully reversible), then just open any repo in your agent:

`git clone https://github.com/dshakes/compass ~/compass && cd ~/compass && ./quickstart.sh`

or, inside Claude Code: `/plugin marketplace add dshakes/compass`

Open a pull request and compass reviews it, security-checks it, runs the tests, cross-audits it with a second model, then pushes its own fixes until it's green. You just merge.

The idea in one line: the loop is the unit of work. A one-shot agent stops at its first wrong answer. compass loops (generate → test → critique → fix → repeat against a gate), so quality comes from iteration, not one lucky prompt. The same closed loop runs a single PR, or your whole fleet of repos overnight. Try it locally in 30s, no tokens.

Every AI-agent config claims "safe" and "cheap." compass is the one that hands you the number and lets a skeptic reproduce it in 30 seconds. Everyone has the same models; the edge is configuration you can trust, not another feature list. Five claims, each backed by a command:

- 🛡 Guardrails with a score. Catastrophic commands and secret writes are blocked before they run, and the policy is eval-gated, not asserted. In human terms: it won't let the agent delete your machine or leak your keys, and it can prove how well.
  - `compass bench` → guardrail 100% precision/recall (61-case corpus), router 96.9%, in CI
  - then ask the agent to `rm -rf /` or write a `.env` → denied; `rm -rf ./build` → allowed
- 📉 Cost routing that's measured. Cheap work goes to cheap models, scored against an eval set: ~61% cheaper than all-Opus at ~98% quality on a fair mix. In human terms: it stops paying Opus prices to fix a typo.
  - `compass route "redesign the auth model"` → opus
  - `compass route "fix a typo"` → haiku
- 💸 A budget ceiling that actually stops it. Usage trackers report spend; compass enforces it: set a dollar cap and the session is halted before the next tool call once it's reached, live. In human terms: an agent can't quietly run up a $40 bill while you're away; it stops at your number.
  - `export COMPASS_MAX_USD=5`: this session hard-stops at $5; the agent is blocked, not just warned
  - `compass spend --max-usd 5`: the same ceiling on the ledger, for scheduled / fleet runs
- 🔏 Supply chain you can verify. Releases carry keyless SLSA provenance, so a tampered or look-alike download is rejected. In human terms: you can prove the code you installed is the code I shipped.
  - `compass verify v0.17.2` → ✓ provenance verified
- 🧪 Red-team resistance, measured. Prompt-injection (direct/indirect/paste), CLAUDE.md poisoning, local safety-override, malware & insecure-code: scored against a labeled corpus that gates in CI, with optional escalation to a managed guardrails service (webhook · Bedrock · Azure). In human terms: a poisoned repo or web page can't quietly turn your agent against you.
  - `compass redteam` → injection corpus 100% P/R, then scans THIS repo's CLAUDE.md/MCP/settings

No service, no telemetry, no `--dangerously-skip-permissions`; `git pull` to update. The work it can't safely own, it hands back; you keep the merge.
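The guardrail and budget claims above both describe a check that runs before each tool call. The Python sketch below illustrates that idea only; it is not compass's implementation. `PreToolGate`, `command_denied`, and `write_denied` are hypothetical names, the two deny rules are deliberately tiny stand-ins for a real policy, and spend is tracked in integer cents purely to keep the arithmetic exact. The only behavior taken from the text is that `rm -rf /` and `.env` writes are denied, `rm -rf ./build` is allowed, and the next call is halted once spend reaches the cap.

```python
import shlex
from dataclasses import dataclass
from pathlib import PurePosixPath


def _is_recursive(flag: str) -> bool:
    """True for short flags containing r/R (-r, -rf, -fr) or for --recursive."""
    if flag.startswith("--"):
        return flag == "--recursive"
    return "r" in flag or "R" in flag


def command_denied(command: str) -> bool:
    """Illustrative rule: deny a recursive rm aimed at the filesystem root."""
    tokens = shlex.split(command)
    if not tokens or tokens[0] != "rm":
        return False
    flags = [t for t in tokens[1:] if t.startswith("-")]
    targets = [t for t in tokens[1:] if not t.startswith("-")]
    return any(_is_recursive(f) for f in flags) and any(
        t in ("/", "/*") for t in targets
    )


def write_denied(path: str) -> bool:
    """Illustrative rule: deny writes to .env or .env.* files."""
    name = PurePosixPath(path).name
    return name == ".env" or name.startswith(".env.")


@dataclass
class PreToolGate:
    """Hypothetical gate consulted before every tool call."""

    max_cents: int        # the cap in cents, e.g. 500 for a $5 ceiling
    spent_cents: int = 0  # running session spend, in cents

    def record_spend(self, cents: int) -> None:
        self.spent_cents += cents

    def check(self, tool: str, arg: str) -> str:
        # Budget first: once the cap is reached, no further tool call runs.
        if self.spent_cents >= self.max_cents:
            return "halted"
        if tool == "bash" and command_denied(arg):
            return "denied"
        if tool == "write" and write_denied(arg):
            return "denied"
        return "allowed"
```

Checking the budget before the command policy means a session at its cap is stopped regardless of what the agent asks for next, which is the difference between enforcing spend and merely reporting it.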
Smallest leap of faith first: the governance moment, then feel it, then see the proof, then see how it works.

0 · The budget ceiling, annotated. The same hard-stop as the hero clip, as a clean walkthrough ($1.80 ✓ → $4.10 ✓ → $5.00 HALTED). Usage trackers report spend; compass enforces it.

1 · The day-to-day feel. Guardrails, the cost-aware status line, the loop, and the crew, in ~25 seconds.

2 · The headline, on a real PR. A Blocking bug and red tests, and it pushes its own fix until the PR is green (then waits for you).

3 · How that loop works. review · security · tests · Codex cross-audit run in parallel; Blocking findings get auto-fixed and re-reviewed (round-capped) until green, then it stops at you.

Run it locally in 30s with `~/compass/sdlc/orchestrate.sh "<task>"` (no tokens), or wire the GitHub loop for every PR.

And the everyday status line quietly keeps score, so you watch it earn its keep:

`Opus 4.8 · myrepo · main · 45k ctx · $1.23 · 🧭 🛡1 🧹2 💡1 📉~$1.65`

That is session spend, then today's compass activity: 🛡 footguns blocked · 🧹 files formatted · 💡 policy nudges · 📉~$ estimated saved vs all-Opus. Each piece shows only when there's something to report; nothing leaves your machine.

Autonomy here isn't one big magic button; it's the same closed loop applied at four scales. Each runs until a gate says "done," then stops at a human. That's the whole trick: iteration under a gate beats a single confident guess.
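The closed loop described above (check against a gate, fix the blocking findings, re-check, stop at a human) can be sketched in a few lines of Python. This is an illustration under assumptions, not compass's orchestrator: `run_loop`, `LoopResult`, and the `gate` and `fix` callables are hypothetical names, the default cap of 3 rounds is an assumed value, and the toy gate at the bottom exists only to make the sketch runnable.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LoopResult:
    status: str     # "green" (gate passed) or "needs-human" (round cap reached)
    rounds: int     # how many fix rounds were spent
    candidate: str  # the last candidate produced


def run_loop(
    candidate: str,
    gate: Callable[[str], List[str]],      # returns blocking findings; [] means green
    fix: Callable[[str, List[str]], str],  # builds a new candidate from the findings
    max_rounds: int = 3,
) -> LoopResult:
    """Iterate fix -> re-check until the gate is green or the round cap is hit.

    Either way the loop only reports a status; merging stays with a human.
    """
    rounds = 0
    findings = gate(candidate)
    while findings and rounds < max_rounds:
        candidate = fix(candidate, findings)
        rounds += 1
        findings = gate(candidate)
    status = "green" if not findings else "needs-human"
    return LoopResult(status, rounds, candidate)


# Toy gate and fixer: a candidate is green once it no longer contains "BUG",
# and each fix round removes exactly one occurrence.
def toy_gate(text: str) -> List[str]:
    return ["contains BUG"] if "BUG" in text else []


def toy_fix(text: str, findings: List[str]) -> str:
    return text.replace("BUG", "", 1)
```

With the toy functions, a candidate containing two bugs goes green after two rounds, while one containing five still has findings after the cap and is returned as `needs-human` rather than retried indefinitely.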
| | Loop | What it drives | Where it stops |
|---|---|---|---|
| 🔁 | The task loop | generate → test → critique → fix → repeat: one change driven to green | when tests + review pass |
| 🔎 | The review loop | review → auto-fix the Blocking findings → re-review, round-capped ×3 | hands off to a human if still red |
| 🛰️ | The fleet loop | the whole pipeline, scheduled across every repo you own, overnight, test-gated | a PR per repo, approve from your phone |
| 👥 | The workflow loops | parallel agents that fan out, fact-check each other, and converge | one synthesized answer |

Every loop ends the same way: you merge. That gate never moves.

| You want… | You need | Tokens? |
|---|---|---|
| The config, guardrails, CLI, subagents (local) | Claude Code (or Codex/Gemini) + git | None |
| The autonomous PR loop (GitHub Actions) | A repo with Actions + `gh`, model auth (`CLAUDE_CODE_OAUTH_TOKEN` or `ANTHROPIC_API_KEY`), and `SDLC_BOT_TOKEN` (fine-grained PAT) so the loop can chain | Yes |
| Keyless loop (self-hosted runner) | A runner labeled `compass` + `SDLC_BOT_TOKEN` | PAT only |
| The fleet (every repo) | `FLEET_TOKEN` + `FLEET_MAINTAINER` | Yes |

One command wires the GitHub loop: `~/compass/sdlc/setup.sh --all` (labels + workflows + CODEOWNERS + secrets + branch protection). Without `SDLC_BOT_TOKEN` the loop still runs; it just won't auto-re-fire after a fix. → [full SDLC setup](/dshakes/compass/blob/main/docs/09-sdlc.md)

Pick the door that fits: all reversible, version-pinnable, no curl | sh. You need an AI assistant ([Claude Code](https://code.claude.com); Codex/Gemini optional) + git. No API keys to get the manual, guardrails, crew, and CLI.
🍺 Homebrew (managed & versioned)

- `brew tap dshakes/compass https://github.com/dshakes/compass`
- `brew install dshakes/compass/compass`: latest release (`--HEAD` to track main)
- `compass quickstart`: previews, asks, then wires it into ~/.claude

📦 Git clone (own & edit your config; recommended)

- `git clone https://github.com/dshakes/compass ~/compass && cd ~/compass`
- `git checkout v0.17.2`: optional, pin to a release instead of main
- `./quickstart.sh`: previews every change, asks first, fully reversible

🧩 Claude Code plugin (no terminal; ideal for a team)

- `/plugin marketplace add dshakes/compass`
- `/plugin install core@compass`

🛠️ By hand: `make dry-run` (preview) → `make install` → `make doctor`. Symlink install means `git pull` / `brew upgrade` updates everything; `make uninstall` removes only what it added. → [Team rollout](/dshakes/compass/blob/main/docs/05-plugin.md)

For every kind of user: a one-line marketplace/extension install (no terminal), or `make install` if you'd rather own the files. Same operating manual + MCP servers, the way each tool expects them:

| Agent | Native install (no terminal) | or own the files |
|---|---|---|
| Claude Code | `/plugin marketplace add dshakes/compass` → `/plugin install core@compass` | `make install` |
| Codex | `codex plugin marketplace add dshakes/compass` → `/plugin install` | `make install` (~/.codex/AGENTS.md + config.toml) |
| Gemini CLI | `gemini extensions install https://github.com/dshakes/compass` | `./install.sh --gemini` (~/.gemini/GEMINI.md) |
| Cursor · Copilot · OpenCode · Windsurf | read the repo's AGENTS.md | clone + `make install` |

CLAUDE.md · AGENTS.md · GEMINI.md are one file (symlinks), and the Claude/Codex plugin manifests + Gemini extension are generated from one source and CI-checked (`scripts/check-vendor.sh`), so a `git pull` updates every agent at once and a manifest can't drift. The marketplace/extension manifests match each vendor's documented schema and are structure-validated in CI.
The live install is manually verified (`gemini extensions install` on gemini 0.26.0 and `codex plugin marketplace add` on codex 0.130.0 both succeed against this repo) but isn't run in our CI (those CLIs aren't in the runner).

- `compass doctor`: validate the install; expect "0 error"
- `compass status`: is compass active here, and what's loaded?

Then just open Claude Code as usual; the manual, guardrails, subagents, commands, and status line are already loaded. Feel it in a minute: ask for a dangerous command (blocked), run `/review` on your diff, or `compass route "<task>"` to see the tier it picks. No tokens, no signup for any of it.

Everything below is on after one install or a single opt-in; the autonomous loops above sit on top of this. The README sells; the docs explain. Each row links to the detail.

| Capability | One line | Deep dive |
|---|---|---|
| 🔁 Autonomous SDLC | the review → security → tests → Codex audit → auto-fix → re-review loop; you merge | |
| The fleet | all your repos through a test gate; approve from your phone | [14-fleet](/dshakes/compass/blob/main/docs/14-fleet.md) |
| The crew + workflows | | [12](/dshakes/compass/blob/main/docs/12-every-agent.md) · [13](/dshakes/compass/blob/main/docs/13-workflows.md) |
| Guardrails & scanning | `compass scan`, auto-format, keep a JSONL audit log | [16-hardening](/dshakes/compass/blob/main/docs/16-hardening-and-frontier.md) |
| Red-team hardening | | [17-red-team](/dshakes/compass/blob/main/docs/17-red-team.md) |
| Cost-tier router | | [router/](/dshakes/compass/blob/main/router) |
| The compass CLI | onboard · impact · drift · scan · redteam · sandbox · verify · audit-log · spend · dashboard | [11-using](/dshakes/compass/blob/main/docs/11-using-compass.md) |
| MCP + LSP | version-pinned MCP servers (context7 · fetch · git) + opt-in language-server intelligence | [04](/dshakes/compass/blob/main/docs/04-mcp.md) · [06](/dshakes/compass/blob/main/docs/06-lsp.md) |
| Every agent, one source | [standard](https://agents.md/) AGENTS.md | [12-every-agent](/dshakes/compass/blob/main/docs/12-every-agent.md) |
| Live budget ceiling | `COMPASS_MAX_USD`: enforced, not just reported | [02-cost](/dshakes/compass/blob/main/docs/02-cost-and-models.md) |
| Cost discipline | `compass spend` / `impact` to see the $ | [02-cost](/dshakes/compass/blob/main/docs/02-cost-and-models.md) |

Built to be trusted before it's run, and honest about its limits.

- You own the irreversible. Agents prepare; humans push, merge, deploy. Required checks + a code-owner approval enforce it; there's no "merge to prod" button.
- Readable & reversible. No curl | sh. The installer backs up what it replaces, is idempotent, and `make uninstall` removes only what it added. Pin a tag, not main.
- Guardrails reduce footguns; they are not a security boundary. Keep least-privilege credentials and review your diffs. For untrusted code, `compass sandbox` is a real boundary.
- Red-team hardening is defense-in-depth, not immunity. It warns on prompt-injection (direct/indirect/paste), CLAUDE.md poisoning, and local safety-override, and refuses to grant project-level safety exceptions, but the cardinal rule (external content is data, not instructions) and the human gate are what actually hold. `compass redteam` measures it; see docs/17-red-team.md.
- What talks to the network. compass phones home to nothing. The auto-registered MCP servers reach non-Anthropic endpoints: context7 → Upstash (library docs), fetch → URLs you request; git is local. Hooks are short, commented shell scripts in claude/hooks/; disable any via claude/settings.json.
- Grounded, not invented. Every capability maps to a documented Claude Code / Codex primitive, cited in docs/07-practices.md.

Status: alpha. The core (manual, hooks, subagents, commands, MCP, plugin) is stable and dogfooded daily. The SDLC pipeline is newer: its logic is statically validated in CI and exercised via a smoke-test checklist you run on your own repo; treat it as early. The red-team layer is new: its detectors are eval-gated in CI (precision/recall on a labeled corpus) and resist obfuscation (`compass redteam --attack`), but pattern detection is best-effort defense-in-depth, not immunity, and the managed-guardrail adapters are response-parsing contract-tested, with the live Bedrock/Azure calls unverified in CI (need your creds) and no live third-party benchmark scores (see docs/17). Dynamic workflows are a Claude Code research preview. The human merge/deploy gate is permanent, by design.

Start here → [Using compass](/dshakes/compass/blob/main/docs/11-using-compass.md): install, the pieces in plain language, the daily workflow.

[Philosophy](/dshakes/compass/blob/main/docs/00-philosophy.md) · [Architecture](/dshakes/compass/blob/main/docs/01-architecture.md) · [Cost & models](/dshakes/compass/blob/main/docs/02-cost-and-models.md) · [Customize](/dshakes/compass/blob/main/docs/03-customize.md) · [MCP](/dshakes/compass/blob/main/docs/04-mcp.md) · [Plugin & team rollout](/dshakes/compass/blob/main/docs/05-plugin.md) · [LSP](/dshakes/compass/blob/main/docs/06-lsp.md) · [Practices](/dshakes/compass/blob/main/docs/07-practices.md) · [Defaults](/dshakes/compass/blob/main/docs/08-defaults.md) · [SDLC](/dshakes/compass/blob/main/docs/09-sdlc.md) · [Roadmap](/dshakes/compass/blob/main/docs/10-roadmap.md) · [Every agent](/dshakes/compass/blob/main/docs/12-every-agent.md) · [Dynamic workflows](/dshakes/compass/blob/main/docs/13-workflows.md) · [Fleet](/dshakes/compass/blob/main/docs/14-fleet.md) · [Competitive audit](/dshakes/compass/blob/main/docs/15-competitive-audit.md) · [Hardening + frontier](/dshakes/compass/blob/main/docs/16-hardening-and-frontier.md) · [Red-team](/dshakes/compass/blob/main/docs/17-red-team.md) · [Open benchmark](/dshakes/compass/blob/main/docs/18-benchmark.md) · [Provenance](/dshakes/compass/blob/main/docs/19-provenance.md) · [Router module](/dshakes/compass/blob/main/router/README.md) · [ADRs](/dshakes/compass/blob/main/docs/adr)

MIT · built to be shared · contributions welcome