{"slug": "inside-agent-gov-architecture-of-an-agent-cost-governance-platform", "title": "Inside agent-gov: Architecture of an Agent Cost Governance Platform", "summary": "Agent-gov is an open-source reverse proxy that intercepts every tool call made by AI agents, enforcing budgets in real time and auto-pausing out-of-control agents. Built as a FastAPI service with SQLite persistence, the proxy uses a four-stage decision tree (authentication, pause check, budget reset, and cost verification) to approve or reject calls before they reach external tools. The system runs 45 tests in 0.3 seconds and relies on a tool registry to determine actual per-call costs rather than trusting agent-reported estimates.", "body_md": "AI agents orchestrate complex workflows: calling LLMs, scraping pages, querying databases, sending emails. Each call costs real money. Without a governance layer, a single buggy loop can burn through your budget before anyone notices.\n\n**agent-gov** is an open-source reverse proxy that intercepts every tool call your agents make, enforces budgets in real time, and auto-pauses out-of-control agents. It is built as a FastAPI service with SQLite persistence, and its 45-test suite runs in 0.3 seconds.\n\nThis post walks through the architecture: the proxy pattern, the four-stage decision tree, cost tracking with a tool registry, multi-tenancy via workspaces, and the lazy auto-reset pattern.\n\nEvery AI agent tool call passes through agent-gov before reaching the actual tool. The agent sends a `POST /proxy/call` with its API key, tool name, and estimated cost. agent-gov validates the key, checks the budget, and logs the call, then returns a 200 to approve or a 429 to reject (an unknown API key gets a 401).\n\n``` python\nfrom pydantic import BaseModel, Field\n\n\nclass ToolCall(BaseModel):\n    agent_key: str = Field(...)                # agent's API key (required)\n    tool_name: str = Field(...)                # tool the agent wants to call (required)\n    estimated_cost: float = Field(0.0, ge=0)   # agent's own estimate, must be >= 0\n```\n\nThe proxy doesn't execute the tool itself; it guards access. The agent only proceeds if the proxy returns 200. 
This is the **gatekeeper pattern**: a lightweight decision layer between the agent and the outside world.\n\n```\nAgent -> POST /proxy/call -> agent-gov -> 200/429 -> Agent decides\n                                                      |\n                                                 Calls actual tool\n                                                      |\n                                                      v\n                                               OpenAI / Browser / API\n```\n\nWhy a proxy instead of a library? A library can be monkey-patched, removed, or forgotten. A proxy is a network boundary that agents *must* cross, so it can't be bypassed.\n\nEvery proxy call runs through a four-stage pipeline (auth, pause check, lazy budget reset, and a budget check against the looked-up tool cost) before an approved call is recorded:\n\n``` python\n@app.post(\"/proxy/call\")\nasync def proxy_tool_call(call: ToolCall):\n    key_hash = db.hash_key(call.agent_key)\n    agent = await db.get_agent(key_hash)\n\n    # Stage 1: Auth\n    if agent is None:\n        raise HTTPException(status_code=401, detail=\"Invalid API key\")\n\n    # Stage 2: Paused check\n    if agent[\"paused\"]:\n        raise HTTPException(status_code=429,\n            detail=f\"Agent '{agent['name']}' is paused.\")\n\n    # Stage 3: Auto-reset budget if new day\n    agent = await db.check_and_reset_budget(agent)\n\n    # Stage 4a: Look up REAL tool cost (the agent's estimate is only a fallback)\n    registered_tool = await db.get_tool(call.tool_name)\n    actual_cost = (registered_tool[\"cost_per_call\"]\n                   if registered_tool else call.estimated_cost)\n\n    # Stage 4b: Budget check; exceeding the cap auto-pauses the agent\n    new_total = agent[\"spent_today\"] + actual_cost\n    if new_total > agent[\"daily_budget\"]:\n        await db.pause_agent(key_hash)\n        raise HTTPException(status_code=429,\n            detail=\"Budget exceeded — agent auto-paused.\")\n\n    # Approved: update spend and log the cost event\n    updated = await db.update_agent_spend(key_hash, actual_cost)\n    await db.log_cost_event(key_hash, agent[\"name\"], call.tool_name, actual_cost)\n    return {\"status\": 
\"approved\", ...}\n```\n\n| Stage | Check | Exit |\n|---|---|---|\nAuth |\nDoes the API key hash match? | 401 — Invalid key |\nPause |\nIs the agent paused? | 429 — Agent paused |\nReset |\nNew day since last call? | (silent) |\nBudget |\nWould this exceed the daily cap? | 429 + auto-pause |\nLog |\nINSERT cost event | 200 — Approved |\n\nThe trickiest design decision was cost determination. Trusting the agent's `estimated_cost`\n\nis fragile — agents can under-report.\n\nagent-gov uses a **tool registry**: an UPSERT-able table of known tools with real per-call costs.\n\n```\nregistered_tool = await db.get_tool(call.tool_name)\nactual_cost = (registered_tool[\"cost_per_call\"]\n               if registered_tool else call.estimated_cost)\n```\n\nIf the tool is registered, its **true cost** is used. The response includes a `cost_source`\n\nfield so clients know which path was taken.\n\nThe test proves an agent can't lie its way past governance: an agent with a $100 budget claiming a $1 estimate for a tool registered at $500/call gets blocked with 429.\n\nv0.5 introduced workspaces — isolated tenants with their own agents, tools, and cost events. Each workspace gets a unique ID and API key. Every database row carries a `workspace_id`\n\nFK column.\n\nSchema migration uses `PRAGMA table_info`\n\nto add columns only when missing — SQLite doesn't support `IF NOT EXISTS`\n\nfor `ALTER TABLE`\n\n.\n\nTests verify workspace isolation: two workspaces, agents in each, neither can see the other's data.\n\nInstead of a midnight cron job creating a thundering herd, agent-gov uses **lazy evaluation**: every proxy call checks if a reset is needed.\n\n``` php\nasync def check_and_reset_budget(agent: dict) -> dict:\n    today = date.today().isoformat()\n    if agent[\"last_reset\"] == today:\n        return agent\n    if agent[\"paused\"]:\n        return agent\n    return await reset_daily_budget(agent[\"key_hash\"])\n```\n\nAn agent that makes no calls doesn't need a reset. 
The thundering herd becomes a gentle trickle.\n\nThe next evolution: per-tool budget caps, webhook-based alerts, and a management API. But the foundation — a simple, testable, async governance proxy — is solid.\n\n*agent-gov is open source and MIT licensed. 45 tests. Zero database setup.*", "url": "https://wpnews.pro/news/inside-agent-gov-architecture-of-an-agent-cost-governance-platform", "canonical_source": "https://dev.to/sschelliah/inside-agent-gov-architecture-of-an-agent-cost-governance-platform-27jl", "published_at": "2026-05-31 13:21:01+00:00", "updated_at": "2026-05-31 13:41:57.150475+00:00", "lang": "en", "topics": ["ai-agents", "ai-infrastructure", "ai-tools", "ai-products", "mlops"], "entities": ["agent-gov", "FastAPI", "SQLite"], "alternates": {"html": "https://wpnews.pro/news/inside-agent-gov-architecture-of-an-agent-cost-governance-platform", "markdown": "https://wpnews.pro/news/inside-agent-gov-architecture-of-an-agent-cost-governance-platform.md", "text": "https://wpnews.pro/news/inside-agent-gov-architecture-of-an-agent-cost-governance-platform.txt", "jsonld": "https://wpnews.pro/news/inside-agent-gov-architecture-of-an-agent-cost-governance-platform.jsonld"}}