{"slug": "beast-governed-output-gateway-for-ai-coding-agents", "title": "Beast – governed output gateway for AI coding agents", "summary": "Beast, a governed output gateway for AI coding agents, intercepts inputs and outputs between agents and LLM providers to enforce output contracts and repair non-compliant patches, achieving 100% task completion at under 400 tokens. In tests, Beast rescued 156 of 192 raw provider outputs that were non-compliant, malformed, or incomplete, preventing silent failures and code corruption.", "body_md": "**Governed output gateway for agentic coding tools.**\n\nBEAST sits between your AI coding agent (Cursor, Claude Code, VS Code Copilot) and any LLM provider. It governs what goes *in* and what comes *out*: enforcing output contracts, repairing non-compliant patches before they touch your filesystem, and learning which tool calls are worth making.\n\nAI coding agents are not careful. They read entire files when they need three lines. They write to paths they shouldn't. They spend your token budget on redundant lookups. When a provider returns malformed JSON, they fail silently or corrupt your code.\n\nBEAST intercepts both sides:\n\n- **Input governance**: context compression, tool laziness learning, budget enforcement, circuit breakers\n- **Output governance**: every model response is parsed against a typed output contract (`beast.action_intent.v1`) before anything touches disk. Non-compliant patches are repaired locally and verified. If verification fails, nothing is written.\n\n| Lane | Completed | Median tokens | vs raw |\n|---|---|---|---|\n| Raw (no BEAST) | 0 / 10 | 47,661 | n/a |\n| Context only | 0 / 10 | 44 | −99.9% |\n| RAG | 8 / 10 | 296 | −99.4% |\n| RAG + Tools | 10 / 10 | 326 | −99.3% |\n| Full BEAST | 10 / 10 | 390 | −99.2% |\n\nRaw context hits the token budget before the model can reason about the scoped problem. 
BEAST completes 100% of tasks at under 400 tokens, verified by passing pytest suites.\n\n| Result | Count |\n|---|---|\n| BEAST end-to-end completions | 192 / 192 |\n| Clean provider completions | 36 / 192 |\n| BEAST-rescued completions | 156 / 192 |\n\n81% of raw provider outputs (156 of 192) were non-compliant, malformed, or incomplete. BEAST rescued every one of them. Without output governance, those 156 tasks would have silently failed or written corrupted patches.\n\n| Rank | Provider | Role | Clean | Fitness | Latency |\n|---|---|---|---|---|---|\n| 1 | `ovhcloud` | candidate patch provider | 5/10 | 0.663 | 14s |\n| 2 | `puter_deepseek` | candidate patch (high latency) | 4/10 | 0.619 | 13s |\n| 3 | `cohere` | candidate patch provider | 4/10 | 0.614 | 6.7s |\n| 4 | `deepinfra` | candidate patch (high latency) | 4/10 | 0.612 | 32s |\n| 5 | `huggingface` | rescue-backed action IR | 3/10 | 0.583 | 1.6s |\n| 6 | `nscale` | rescue-backed action IR | 3/10 | 0.581 | 7.8s |\n| 7 | `mistral` | rescue-backed (Codestral) | 2/10 | 0.545 | 4.1s |\n| 8 | `openrouter` | fast rescue-backed action IR | 2/10 | 0.544 | 3.8s |\n| 9 | `sambanova` | fast rescue-backed action IR | 1/10 | 0.512 | 3.0s |\n| 10 | `cloudflare` | edge / microtask | 1/10 | 0.483 | 2.1s |\n| 11–14 | `cerebras`, `featherless`, `nvidia_nim`, `gemini` | scout / selector | 0–2/10 | 0.33–0.42 | varies |\n| 15–16 | `groq`, `llm7` | scout only | 0/10 | 0.23 | fast |\n| 17–18 | `aion_labs`, `novita` | rate-limited / rescue | 1/10 | 0.39–0.51 | varies |\n| 19–20 | `hyperbolic`, `fal` | do not use (auth/billing) | 0/10 | n/a | n/a |\n\n**Notable findings:**\n\n- **Puter-routed DeepSeek** achieved 4 clean passes on a free proxied route, matching paid providers. BEAST can make unconventional free routes production-viable through governance.\n- **LLM7** returned valid JSON on 100% of tasks but passed the output schema on only 10%. Without an output governor, it looks like it's working. 
It isn't.\n- **NVIDIA NIM** failed the output contract on every task. BEAST repaired and rescued both targeted tasks. Zero silent failures.\n- **DeepInfra** observed cost: ~$0.000332 per verified, governed code fix.\n\n```\nCoding agent (Cursor / Claude Code / VS Code)\n        │\n        ▼\n┌─────────────────────────────────────────┐\n│              BEAST Gateway              │\n│                                         │\n│  Input side          Output side        │\n│  ─────────           ───────────        │\n│  Context economy     Output contract    │\n│  Tool laziness       Local verifier     │\n│  Budget ledger       Patch compiler     │\n│  Circuit breakers    Anchor resolver    │\n│  Workspace graph     Repair engine      │\n│  MCP broker          Sandbox validator  │\n│                                         │\n│  Memory: L0 policy → L4 forensic archive│\n└─────────────────────────────────────────┘\n        │\n        ▼\n  Any LLM provider (20+ tested)\n```\n\nEvery model response passes through:\n\n- **Contract parse**: response must conform to `beast.action_intent.v1`\n- **Anchor resolution**: `anchor_ref` fields resolve to exact code locations; no copy-paste writes\n- **Path validation**: writes outside allowed paths are rejected before compilation\n- **Local patch compile**: `ActionIR` → `ResolvedAction` → staged file writes\n- **Sandbox verification**: compiled patches run against pytest before disk commit\n- **Repair**: if verification fails, the local verifier attempts repair before giving up\n- **Forensic record**: every outcome (clean, repaired, rejected) is written to the Chronicle\n\nProvider-specific output profiles handle model quirks: NVIDIA NIM gets `refs_only=True`; HuggingFace gets `repair_attempts=2`.\n\n| Layer | Name | Contents |\n|---|---|---|\n| L0 | Meta Rules | Spend caps, shell allowlists, blocked paths (immutable) |\n| L1 | Insight Index | Session state, cache handles, circuit state |\n| L2 | Workspace Graph | Symbol maps, dependency edges, 
semantic chunks |\n| L3 | Skill Tree | Promoted, verified workflows and route cards |\n| L4 | Forensic Archive | Append-only Chronicle — every request, every outcome |\n\n```\ngit clone https://github.com/Byron2306/EdgeK-BEAST\ncd EdgeK-BEAST\npip install -r requirements.txt\n```\n\nOptional (semantic RAG, large ML wheels):\n\n```\npip install -r requirements-semantic.txt\n```\n\nOptional (LiteLLM proxy support):\n\n```\npip install -r requirements-litellm.txt\n```\n\nStart the gateway:\n\n```\nuvicorn app.main:app --host 0.0.0.0 --port 8005\n```\n\nPoint your coding agent at BEAST instead of your provider directly:\n\n```\n# OpenAI-compatible (Cursor, Claude Code, etc.)\nexport OPENAI_BASE_URL=http://localhost:8005/v1\n\n# Anthropic-compatible\nexport ANTHROPIC_BASE_URL=http://localhost:8005\n```\n\nSet whichever providers you use:\n\n```\nexport HF_TOKEN='...'\nexport HF_INFERENCE_BASE_URL='https://router.huggingface.co/v1'\nexport OPENROUTER_API_KEY='...'\nexport GEMINI_API_KEY='...'\nexport NVIDIA_API_KEY='...'\nexport COHERE_API_KEY='...'\nexport MISTRAL_API_KEY='...'\n# Local\nexport LOCAL_NIM_BASE_URL='http://localhost:8000/v1'\n```\n\nBEAST will route, govern, and fall back across providers according to the fitness map. 
Providers you haven't configured are skipped cleanly.\n\n```\n# Gateway health\nGET  /health\nGET  /edgek/state\n\n# BEAST Cockpit (live ops dashboard)\nGET  /ui\n\n# Inference (drop-in replacements)\nPOST /v1/chat/completions          # OpenAI-compatible\nPOST /v1/messages                  # Anthropic-compatible\nPOST /hf/v1/chat/completions       # HuggingFace router\nPOST /litellm/v1/chat/completions  # LiteLLM proxy\n\n# Context and workspace\nPOST /edgek/tools/intercept        # Semantic tool-call interception\nGET  /edgek/workspace              # Workspace graph state\nPOST /edgek/workspace/index        # Index a repository\n\n# Budget and runtime\nGET  /edgek/runtime/state\nGET  /edgek/runtime/attempts\nPOST /edgek/runtime/circuit-breakers/{provider}/reset\n\n# MCP broker\nPOST /edgek/mcp/evaluate\nPOST /edgek/mcp/execute\nGET  /edgek/mcp/audit\n\n# Skills and promotion\nGET  /edgek/skills/promotion-candidates\nPOST /edgek/skills/promote\n\n# Enterprise\nPOST /edgek/enterprise/teams\nPOST /edgek/enterprise/virtual-keys\nGET  /edgek/enterprise/observability\n```\n\nFull endpoint reference in the [API docs](/Byron2306/EdgeK-BEAST/blob/main/docs/api.md).\n\n`policies/default.yaml` controls everything:\n\n- Spend caps and token budgets per provider and per team\n- Shell command allowlists and blocklists\n- File path write restrictions\n- MCP server trust levels\n- Circuit breaker thresholds\n- Tool laziness learning parameters\n\n```\n# Deterministic benchmark (no API calls needed)\nPYTHONPATH=. python3 benchmarks/run_benchmark.py --lanes all --tasks 10\n\n# Live provider benchmark\nPYTHONPATH=. python3 benchmarks/run_live_benchmark.py --providers hf,openrouter,cohere\n\n# Provider edge compare (cloud vs local NIM)\nPYTHONPATH=. python3 benchmarks/provider_edge_compare.py --repeats 3\n```\n\nResults are written to `benchmarks/results/`.\n\nBEAST generates LiteLLM and Nginx configs directly from your active policy:\n\n```\nPYTHONPATH=. 
python3 scripts/generate_deploy_configs.py --out deploy/generated\n```\n\nNginx routes `/tool-calls/*` into BEAST's semantic interceptor: file read requests return the top 3 relevant snippets instead of full source files.\n\nSee [deployment_integrations.md](/Byron2306/EdgeK-BEAST/blob/main/docs/deployment_integrations.md) for the full runbook including GitHub tool calls, Postgres integration, and prompt-cache keepalive setup.\n\n- It does not replace your LLM provider. It governs the traffic between your agent and your provider.\n- It does not add latency you'll notice for most tasks. Output governance adds microseconds locally; provider latency dominates.\n- It does not require a GPU. The entire governance and compilation pipeline runs on CPU.\n- It does not phone home. Everything (workspace graph, budget ledger, forensic archive, skill tree) is local SQLite and append-only files.\n\nMIT: see [LICENSE](/Byron2306/EdgeK-BEAST/blob/main/LICENSE).\n\nActive development. Core governance pipeline (input economy + output contracts + local verification) is stable and benchmarked. V2 roadmap focuses on the Chronicle engine, route cards, and skill promotion loop. 
See [BEAST_V2_ROADMAP.md](/Byron2306/EdgeK-BEAST/blob/main/docs/BEAST_V2_ROADMAP.md).\n\nContributions, issues, and provider benchmark results welcome.", "url": "https://wpnews.pro/news/beast-governed-output-gateway-for-ai-coding-agents", "canonical_source": "https://github.com/Byron2306/EdgeK-BEAST", "published_at": "2026-06-19 10:43:41+00:00", "updated_at": "2026-06-19 11:08:03.012028+00:00", "lang": "en", "topics": ["ai-agents", "ai-tools", "ai-safety", "large-language-models", "developer-tools"], "entities": ["Beast", "Cursor", "Claude Code", "VS Code Copilot", "DeepSeek", "NVIDIA NIM", "DeepInfra", "LLM7"], "alternates": {"html": "https://wpnews.pro/news/beast-governed-output-gateway-for-ai-coding-agents", "markdown": "https://wpnews.pro/news/beast-governed-output-gateway-for-ai-coding-agents.md", "text": "https://wpnews.pro/news/beast-governed-output-gateway-for-ai-coding-agents.txt", "jsonld": "https://wpnews.pro/news/beast-governed-output-gateway-for-ai-coding-agents.jsonld"}}