{"slug": "observability-told-me-exactly-how-much-money-my-agents-wasted-i-wanted-something", "title": "Observability told me exactly how much money my agents wasted. I wanted something that says no.", "summary": "A developer built Gatewards, an open-source proxy that enforces per-agent API spending caps in the request path without taking custody of API keys. The tool addresses the gap between observability and cost control, offering features like per-agent rate limits, cross-agent response deduplication, and automatic pipeline pausing on spend spikes. It is available at gatewards.com with an Apache-2.0 licensed SDK.", "body_md": "Most AI cost tooling is an autopsy. It tells you, in detail, what you already spent: token counts, per-call traces, a dashboard that turns red after the bill is locked in. None of it does the one thing I kept wanting: refuse the call before it goes out.\n\nI ran into this building agent tooling. Once I had more than a couple of agents hitting paid APIs on a schedule, two problems showed up that nothing off the shelf solved cleanly.\n\nProblem 1: observability is not control\n\nWatching spend and stopping spend are different systems, and every tool I tried lived on the watching side. I could reconstruct, after the fact, that agent 4 had a bad night. What I couldn't do was tell agent 4 \"you're done for today\" without a hard limit that fires before the request leaves.\n\nThe closest thing providers offer is per-key budgeting. That sounds right until you run more than one agent. Keys get shared, and the moment three agents share an API key a per-key cap can't tell them apart; you've lost the unit that actually matters, which is the agent.\n\nSo the cap I wanted was specific:\n\nProblem 2: I didn't want to hand over my keys\n\nPlenty of \"AI gateway\" products will do governance for you, by becoming the thing that holds your API keys and signs requests on your behalf. 
For a fleet that touches real money, handing custody of credentials to a third party is a hard no.\n\nI wanted enforcement without custody: keep my own keys, let something in front of the fleet enforce the rules.\n\nWhat I ended up building\n\nCouldn't find a drop-in that did per-agent, request-path enforcement without taking custody, so I built one. It's a proxy you point agents at. They keep their own keys. No rewrite, no framework lock-in: LangChain, CrewAI, or a raw script all talk to the same proxy.\n\nThe integration is boring on purpose:\n\n```js\nimport { createPaymentClient } from \"@gatewards/agent-sdk\";\n\nconst client = createPaymentClient({\n  apiKey: process.env.GATEWARDS_AGENT_KEY, // identifies THIS agent\n  proxy: true,\n});\n\n// your agent's calls go through the proxy unchanged\nconst res = await client.get(\"https://api.example.com/data\");\n```\n\nYou set the cap per agent (calls/day + max per call). When an agent goes over, the proxy returns a refusal in the request path: your call gets a 429, not a silent overage you discover tomorrow. When an agent's rate spikes into loop territory, the pipeline auto-pauses instead of grinding through your budget.\n\nBecause every call is already tagged by agent identity, attribution stops being a grep session. You get \"which agent spent what\" for free, as a side effect of the thing that enforces the caps.\n\nThe one that surprised me: cross-agent dedup\n\nThis one I didn't plan for. Several agents poll the same endpoints: same GET, same params, different agents. The proxy caches identical GET responses across the whole fleet, so five agents making the same call pay for one. 
On a polling-heavy fleet that turned out to be a bigger line-item win than the caps.\n\nWhat it deliberately doesn't do\n\nHonesty matters more than a clean pitch, so the limits up front:\n\nWhere it is\n\nIt's live at [gatewards.com](https://gatewards.com/), and the SDK is open source (Apache-2.0): `npm i @gatewards/agent-sdk`\n\nIf you're running a fleet and fighting the same thing, I'd genuinely like to compare notes, especially on the cap-primitive question. Is calls/day + max-per-call enough, or does the lack of a dollar cap break it for you? Tell me where this falls short.", "url": "https://wpnews.pro/news/observability-told-me-exactly-how-much-money-my-agents-wasted-i-wanted-something", "canonical_source": "https://dev.to/rtahabas/observability-told-me-exactly-how-much-money-my-agents-wasted-i-wanted-something-that-says-no-4176", "published_at": "2026-06-22 07:49:25+00:00", "updated_at": "2026-06-22 08:09:40.428140+00:00", "lang": "en", "topics": ["ai-infrastructure", "developer-tools", "ai-agents", "mlops"], "entities": ["Gatewards", "LangChain", "CrewAI", "Apache-2.0"], "alternates": {"html": "https://wpnews.pro/news/observability-told-me-exactly-how-much-money-my-agents-wasted-i-wanted-something", "markdown": "https://wpnews.pro/news/observability-told-me-exactly-how-much-money-my-agents-wasted-i-wanted-something.md", "text": "https://wpnews.pro/news/observability-told-me-exactly-how-much-money-my-agents-wasted-i-wanted-something.txt", "jsonld": "https://wpnews.pro/news/observability-told-me-exactly-how-much-money-my-agents-wasted-i-wanted-something.jsonld"}}