{"slug": "mcp-tool-that-catches-ai-agent-scope-creep", "title": "MCP tool that catches AI-agent scope creep", "summary": "A new open-source tool called Overreach detects when AI coding agents add unauthorized code—such as endpoints, dependencies, or cron jobs—beyond the original prompt. The tool compares a developer's prompt with the resulting diff using LLM-based scope extraction and deterministic parsing, flagging scope creep with severity levels. It runs as a CLI or pre-commit hook, supporting multiple LLM providers or a fully offline mode.", "body_md": "A standalone MCP tool that catches AI-agent scope creep.\n\nYou give it the **prompt** you gave your coding agent, and the **diff** it produced.\nOverreach tells you whether the diff stayed inside what the prompt asked for — or\nwhether the agent quietly added an endpoint, a dependency, an env var, or a cron job\nthat you never asked for.\n\n\"turns out my ai assistant had been extremely making product decisions without me\"\n\n- **Node.js 18+** — [nodejs.org](https://nodejs.org). Verify with `node -v`.\n- **npm** comes with Node.js. Verify with `npm -v`.\n- **Git** — required for the pre-commit hook and `git diff` piping.\n\n```\nnpx -y -p overreach overreach-cli demo\n```\n\nRuns the real pipeline on a sample diff — no API key, no setup, costs nothing.\nExits `1` with a `HIGH` scope-creep finding (the demo prompt asks for a login form;\nthe diff smuggles in Stripe, an env var, an endpoint, and a cron job). 
That's the\nwhole product in one command.\n\nA diff is flagged when it adds something the prompt never authorized:\n\n| Finding kind | Caught when the diff adds… |\n|---|---|\n| `scope.dep` | a package/requirement the prompt didn't name |\n| `scope.env` | an env var (`process.env.X`, `os.environ`, `.env`) |\n| `scope.endpoint` | an HTTP route / handler / `route.ts` file |\n| `scope.cron` | a cron / scheduler job |\n| `scope.file` | edits to a file the prompt didn't mention |\n| `scope.feature` | a new top-level symbol/feature beyond the prompt |\n\nSeverity: env / endpoint / cron = **high** · dep / file = **medium** · feature = **low**.\nOverall `scope_creep_score`: `HIGH` if any high finding, `MEDIUM` if any medium, else `LOW`.\n\n- **Stage 1 — Scope extraction (LLM).** Reads your prompt and produces an `authorized scope` JSON: which files, features, deps, endpoints, env, and behaviors you actually asked for. Deciphers typos to the nearest real concept but **never invents scope**. This is the only stage that calls a model.\n- **Stage 2 — Diff parsing (deterministic, no LLM).** Regex-parses the diff into the set of things it actually adds — imports, deps, `process.env.X` references, route handlers, cron jobs, new symbols. Runs in milliseconds.\n- **Stage 3 — Comparison (deterministic).** Set arithmetic with fuzzy matching: `actual − authorized = findings`.\n\nStages 2 and 3 are pure functions — no inference, no opinion, fully auditable. That's what makes Overreach testable without spending a cent on inference.\n\n```\nnpm install -g overreach\n```\n\nOr use directly via `npx` (no install needed):\n\n```\nnpx -y -p overreach overreach-cli demo\n```\n\nFor best results, set one LLM provider key for Stage 1 scope extraction:\n\n| Provider | Env vars |\n|---|---|\n| Anthropic | `ANTHROPIC_API_KEY` |\n| OpenAI / OpenAI-compatible (OpenRouter, Groq, Together, LM Studio, …) | `OPENAI_API_KEY` + `OPENAI_BASE_URL` (e.g. 
`http://localhost:1234/v1` for LM Studio) |\n| Ollama (Cloud or self-hosted) | `OLLAMA_API_KEY` + `OLLAMA_BASE_URL` |\n\nPin a provider/model with `SCOPE_PROVIDER` and `OVERREACH_MODEL`.\n\n**No key? No problem.** Without an API key, Overreach falls back to\n**deterministic scope extraction** — it regex-parses your prompt for concrete\nitems (file paths, package names, `/api/...` routes, `SCREAMING_SNAKE_CASE` env\nvars, cron keywords) instead of calling an LLM. It won't understand vague\ninstructions as well as an LLM would, but it catches every concrete noun in\nyour prompt. Instant, free, fully offline.\n\n```\nnpx -y -p overreach overreach-cli init\n```\n\nThis creates three things:\n\n- `.overreach/prompt.md` — write the prompt you gave your agent here\n- `.git/hooks/pre-commit` — audits every commit against your prompt\n- `CLAUDE.md` — instructs AI agents to self-audit before committing\n\nEdit `.overreach/prompt.md` with the actual instruction you gave your AI agent:\n\n```\nAdd a login form to the settings page with email/password fields,\nform validation, and a submit button that calls /api/auth/login.\n```\n\nThen commit as usual:\n\n```\ngit add . && git commit -m \"add login form\"\n```\n\nThe pre-commit hook audits staged changes against your prompt:\n\n- **HIGH** scope creep → commit blocked (exit 1)\n- **MEDIUM / LOW** → commit allowed with findings printed\n- Template prompt (not yet edited) → skipped gracefully\n- No API key → deterministic fallback (extracts concrete items from prompt)\n\nSkip with `git commit --no-verify` when you know what you're doing. Update\n`.overreach/prompt.md` whenever you give the agent a new task.\n\n**Windows:** The pre-commit hook is a shell script. 
It works out of the box with Git Bash (included with Git for Windows).\n\n```\nnpx -y -p overreach overreach-cli --prompt \"add a login form to the settings page\" --diff my-changes.diff\n```\n\nOr pipe a diff:\n\n```\ngit diff | npx -y -p overreach overreach-cli --prompt \"add a login form to the settings page\"\n```\n\nExits `0` if clean, `1` if HIGH — usable as a CI gate.\n\nOptions:\n\n- `--prompt <text>` — the instruction that authorized the work\n- `--diff <path>` — diff file (default: read from stdin)\n- `--scope <path|json>` — inject authorized scope; skips the LLM entirely\n- `--json` — emit raw JSON instead of pretty terminal output\n- `--no-cache` — bypass the scope cache (force a fresh Stage 1 call)\n- `demo` — run the canonical demo (zero-key)\n- `init` — install pre-commit hook + CLAUDE.md\n\nOverreach is a stdio MCP server, so any MCP-capable client can connect:\n\n**Claude Code:**\n\n```\nclaude mcp add overreach -- npx -y overreach\n```\n\n**Claude Desktop / Cursor** — add to your MCP config:\n\n```\n{\n  \"mcpServers\": {\n    \"overreach\": { \"command\": \"npx\", \"args\": [\"-y\", \"overreach\"] }\n  }\n}\n```\n\n**Codex CLI** — add to `~/.codex/config.toml`:\n\n```\n[mcp_servers.overreach]\ncommand = \"npx\"\nargs = [\"-y\", \"overreach\"]\n```\n\nOr Streamable HTTP: set `PORT=8787` and POST to `http://localhost:8787/mcp`.\n\n**The HTTP endpoint has no auth.** It binds to `127.0.0.1` (loopback) by default — safe for local use. Do **not** expose it publicly (`OVERREACH_HOST=0.0.0.0`) without an authed reverse proxy in front: anyone who can reach it can call `check_overreach` and spend your LLM budget.\n\nTools exposed: `check_overreach(prompt, diff, options?)` and `health`.\n\n```\n# 1. Register the server with Claude Code (one time)\nclaude mcp add overreach -- npx -y overreach\n\n# 2. Restart your Claude Code session\n#    (a session already open won't see the new server until you quit and reopen it)\n\n# 3. 
Optionally set an API key (works without one via deterministic fallback)\nexport ANTHROPIC_API_KEY=sk-...     # or OPENAI_API_KEY / OLLAMA_API_KEY\n```\n\nAfter the restart, every new session has `check_overreach` available — no per-task\nsetup. The agent calls it when it decides it's relevant.\n\n**The key isn't passed through automatically.** The MCP server is a separate process; your agent does **not** hand it its own credentials. If you log in to Claude Code with `claude login` (OAuth / subscription), there's no `ANTHROPIC_API_KEY` in the environment — so export one (any provider works; local Ollama needs no key), or for Claude Desktop / Cursor add it to the server's `env`:\n\n```\n{ \"mcpServers\": { \"overreach\": { \"command\": \"npx\", \"args\": [\"-y\", \"overreach\"], \"env\": { \"ANTHROPIC_API_KEY\": \"sk-...\" } } } }\n```\n\n`overreach-cli init` adds a scope-audit instruction to your project's `CLAUDE.md` so\nAI agents self-audit their staged changes before committing — no user intervention\nneeded. The agent reads the instruction and runs Overreach on its own diff.\n\nYou can also have the agent call `check_overreach` directly via the MCP server\nwith its own task string + the diff it's about to commit:\n\n```\ngit diff --staged | overreach-cli --prompt \"<the task you just gave me>\"\n```\n\nThis is **best-effort** — an agent can skip the call or ignore the findings\n(fox guarding the henhouse). The hard backstop is the CI gate below.\n\nThe hard backstop. A workflow runs Overreach on every pull request and **fails\nthe PR** when `scope_creep_score=HIGH` — the diff adds a dep / env var /\nendpoint / cron / out-of-scope file the prompt didn't authorize.\n\nCopy [`.github/workflows/overreach.yml`](/Naveja00/OverReach/blob/main/.github/workflows/overreach.yml) into\nyour repo and add `ANTHROPIC_API_KEY` (or `OPENAI_API_KEY` / `OLLAMA_API_KEY`)\nas a repository secret. 
The prompt comes from `.overreach/prompt.md` in the repo,\nor the PR title + body if that file is absent. The job posts its findings as a PR\ncomment and fails the check on `HIGH`. Full setup + customization in\n[`docs/ci-gate.md`](/Naveja00/OverReach/blob/main/docs/ci-gate.md).\n\n```\n# .github/workflows/overreach.yml  (excerpt)\n- name: Run Overreach\n  run: |\n    npx -y -p overreach@latest overreach-cli \\\n      --prompt \"$(cat \"$RUNNER_TEMP/prompt.txt\")\" --diff \"$RUNNER_TEMP/pr.diff\"\n  env:\n    ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}\n- name: Gate — fail the PR on HIGH\n  if: steps.overreach.outputs.exit == '1'\n  run: exit 1\n```\n\nThis open-source Action is free to run (you bring your own LLM key).\n\n| Model | Result |\n|---|---|\n| Claude Sonnet 4.6 | 82/82 |\n| Claude Opus 4.6 | 65/65 |\n| GLM 5.2 | 82/82 |\n| Kimi K2.7-Code | 82/82 |\n| MiniMax M3 | 81/82 |\n\nThe deterministic fallback (no key) works with any prompt that contains concrete items — no model needed.\n\n```\nnpm test\n```\n\nRuns 56 assertions through the real pipeline with the scope injected via\n`scopeOverride`, so Stage 1 (the LLM) is never called. Covers overreach\ndetection, clean passes, Python/Express/Next.js parsers, deletion handling,\ndeterminism, chunking, and the trust contract invariant.\n\nOverreach is fully self-contained. It does **not** import or depend on any other\nproject. It reads only its own process environment. 
No telemetry, no call-home —\nit runs entirely on your machine.\n\nIf Overreach misses something it should flag, or flags something the prompt\nauthorized, open an issue with the **prompt + the smallest repro diff**:\n\n[https://github.com/Naveja00/OverReach/issues](https://github.com/Naveja00/OverReach/issues)\n\nThere's a bug-report template that asks for exactly that.\n\nMIT", "url": "https://wpnews.pro/news/mcp-tool-that-catches-ai-agent-scope-creep", "canonical_source": "https://github.com/Naveja00/OverReach", "published_at": "2026-06-20 11:51:53+00:00", "updated_at": "2026-06-20 12:08:17.871238+00:00", "lang": "en", "topics": ["ai-agents", "developer-tools", "ai-safety"], "entities": ["Overreach", "Node.js", "npm", "Git", "Anthropic", "OpenAI", "Ollama", "LM Studio"], "alternates": {"html": "https://wpnews.pro/news/mcp-tool-that-catches-ai-agent-scope-creep", "markdown": "https://wpnews.pro/news/mcp-tool-that-catches-ai-agent-scope-creep.md", "text": "https://wpnews.pro/news/mcp-tool-that-catches-ai-agent-scope-creep.txt", "jsonld": "https://wpnews.pro/news/mcp-tool-that-catches-ai-agent-scope-creep.jsonld"}}