Source: dev.to

Observability told me exactly how much money my agents wasted. I wanted something that says no.

A developer built Gatewards, an open-source proxy that enforces per-agent API spending caps in the request path without taking custody of API keys. The tool addresses the gap between observability and cost control, offering features like per-agent rate limits, cross-agent response deduplication, and automatic pipeline pausing on spend spikes. It is available at gatewards.com with an Apache-2.0 licensed SDK.

3 min read · Published Jun 22, 2026

Most AI cost tooling is an autopsy. It tells you, in detail, what you already spent: token counts, per-call traces, a dashboard that turns red after the bill is locked in. None of it does the one thing I kept wanting: refuse the call before it goes out.

I ran into this building agent tooling. Once I had more than a couple of agents hitting paid APIs on a schedule, two problems showed up that nothing off the shelf solved cleanly.

Problem 1: observability is not control

Watching spend and stopping spend are different systems, and every tool I tried lived on the watching side. I could reconstruct, after the fact, that agent 4 had a bad night. What I couldn't do was tell agent 4 "you're done for today" without a hard limit that fires before the request leaves.

The closest thing providers offer is per-key budgeting. That sounds right until you run more than one agent. Keys get shared, and the moment three agents share an API key a per-key cap can't tell them apart. You've lost the unit that actually matters, which is the agent.

So the cap I wanted was specific:

Problem 2: I didn't want to hand over my keys

Plenty of "AI gateway" products will do governance for you by becoming the thing that holds your API keys and signs requests on your behalf. For a fleet that touches real money, handing custody of credentials to a third party is a hard no.

I wanted enforcement without custody: keep my own keys, let something in front of the fleet enforce the rules.
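To make "enforcement without custody" concrete, here is a minimal sketch of one way a proxy could keep the two kinds of credential apart. It is an assumption about the general shape, not a description of how Gatewards works internally: the `x-agent-key` header name, `HeaderMap`, and `planForward` are all hypothetical. The agent's identity is read for rule lookup, while the provider credential is copied through untouched and never stored.

```typescript
// Hypothetical header carrying the agent's identity to the proxy.
// Header names are assumed to be lowercased already.
const AGENT_HEADER = "x-agent-key";

type HeaderMap = Record<string, string>;

interface ForwardPlan {
  agentId: string;            // used for rule lookup and attribution only
  upstreamHeaders: HeaderMap; // what actually goes on to the paid API
}

// Returns null when the request carries no agent identity to enforce against.
function planForward(incoming: HeaderMap): ForwardPlan | null {
  const agentId = incoming[AGENT_HEADER];
  if (!agentId) return null;

  const upstreamHeaders: HeaderMap = {};
  for (const [name, value] of Object.entries(incoming)) {
    // Everything except the proxy's own header is copied as-is, so the
    // provider credential is forwarded but never persisted by the proxy.
    if (name !== AGENT_HEADER) upstreamHeaders[name] = value;
  }
  return { agentId, upstreamHeaders };
}
```

The design choice this illustrates is that the proxy only needs to know who is calling, not to hold what they call with.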

What I ended up building

Couldn't find a drop-in that did per-agent, request-path enforcement without taking custody, so I built one. It's a proxy you point agents at. They keep their own keys. No rewrite, no framework lock-in: LangChain, CrewAI, or a raw script all talk to the same proxy.

The integration is boring on purpose:

`import { createPaymentClient } from "@gatewards/agent-sdk";`

`const client = createPaymentClient({`

`  apiKey: process.env.GATEWARDS_AGENT_KEY, // identifies THIS agent`

`  proxy: true,`

`});`

`// your agent's calls go through the proxy unchanged`

`const res = await client.get("https://api.example.com/data");`

You set the cap per agent (calls/day + max per call). When an agent goes over, the proxy returns a refusal in the request path: your call gets a 429, not a silent overage you discover tomorrow. When an agent's rate spikes into loop territory, the pipeline auto-pauses instead of grinding through your budget.
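For readers who want to see the shape of a request-path cap, here is a minimal sketch of the decision described above: calls/day, max per call, and a pause when the rate looks like a loop. None of these names come from the Gatewards SDK; `AgentCap`, `AgentGate`, and the one-minute spike window are assumptions made for illustration, and "max per call" is read here as a cost ceiling in whatever unit you price calls.

```typescript
// All names are illustrative; this is not the Gatewards API.
interface AgentCap {
  maxCallsPerDay: number;      // daily call budget for one agent
  maxCostPerCall: number;      // ceiling for any single call, in your pricing unit
  spikeCallsPerMinute: number; // calls per minute treated as a runaway loop
}

interface Decision {
  status: 200 | 429;
  reason: string;
}

class AgentGate {
  private cap: AgentCap;
  private day = "";
  private callsToday = 0;
  private recent: number[] = []; // timestamps (ms) of calls admitted in the last minute
  private paused = false;

  constructor(cap: AgentCap) {
    this.cap = cap;
  }

  // Decide before the request leaves. `nowMs` is a parameter to keep this testable.
  check(costOfCall: number, nowMs: number): Decision {
    const today = new Date(nowMs).toISOString().slice(0, 10); // UTC calendar day
    if (today !== this.day) {
      this.day = today;
      this.callsToday = 0;
    }
    if (this.paused) return { status: 429, reason: "paused after rate spike" };
    if (costOfCall > this.cap.maxCostPerCall) return { status: 429, reason: "call exceeds max per call" };
    if (this.callsToday >= this.cap.maxCallsPerDay) return { status: 429, reason: "daily call cap reached" };

    this.recent = this.recent.filter((t) => nowMs - t < 60000);
    if (this.recent.length >= this.cap.spikeCallsPerMinute) {
      this.paused = true; // stays paused until someone calls resume()
      return { status: 429, reason: "paused after rate spike" };
    }
    this.recent.push(nowMs);
    this.callsToday += 1;
    return { status: 200, reason: "ok" };
  }

  resume(): void {
    this.paused = false;
    this.recent = [];
  }
}
```

One gate per agent identity is the whole trick: three agents sharing a provider key still get three separate counters.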

Because every call is already tagged by agent identity, attribution stops being a grep session. You get "which agent spent what" for free, as a side effect of the thing that enforces the caps.
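As a rough illustration of why attribution falls out of enforcement, suppose the proxy keeps a record for each call it admits. The `CallRecord` shape and `spendByAgent` helper below are hypothetical, not part of the SDK; the sketch only shows that once every record carries an agent identity, the per-agent report is a single pass over the log.

```typescript
// Hypothetical record written by the proxy for each admitted call.
interface CallRecord {
  agentId: string; // the identity the cap was checked against
  url: string;
  cost: number;    // in whatever unit you price calls
}

interface AgentTotal {
  calls: number;
  cost: number;
}

// Roll the call log up into per-agent totals.
function spendByAgent(log: CallRecord[]): Map<string, AgentTotal> {
  const totals = new Map<string, AgentTotal>();
  for (const rec of log) {
    const prev = totals.get(rec.agentId) ?? { calls: 0, cost: 0 };
    totals.set(rec.agentId, { calls: prev.calls + 1, cost: prev.cost + rec.cost });
  }
  return totals;
}
```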

The one that surprised me: cross-agent dedup

This one I didn't plan for. Several agents poll the same endpoints: same GET, same params, different agents. The proxy caches identical GET responses across the whole fleet, so five agents making the same call pay for one. On a polling-heavy fleet that turned out to be a bigger line-item win than the caps.
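Here is a minimal sketch of the dedup idea, under assumptions of my own: a fixed time-to-live, a cache key built from the URL with its query parameters sorted, and an injected `Fetcher` standing in for the real upstream call. `SharedGetCache` and `cacheKey` are hypothetical names, and how Gatewards decides freshness is not something the article states.

```typescript
// Stand-in for the real upstream GET.
type Fetcher = (url: string) => Promise<string>;

// Sort query params so ?a=1&b=2 and ?b=2&a=1 share one cache entry.
function cacheKey(rawUrl: string): string {
  const i = rawUrl.indexOf("?");
  if (i < 0) return rawUrl;
  const base = rawUrl.slice(0, i);
  const params = rawUrl.slice(i + 1).split("&").filter((p) => p.length > 0).sort();
  return params.length > 0 ? base + "?" + params.join("&") : base;
}

class SharedGetCache {
  upstreamCalls = 0; // how many requests actually reached the paid API
  private entries = new Map<string, { body: Promise<string>; expiresAt: number }>();
  private fetcher: Fetcher;
  private ttlMs: number;

  constructor(fetcher: Fetcher, ttlMs: number) {
    this.fetcher = fetcher;
    this.ttlMs = ttlMs;
  }

  // The key deliberately ignores which agent is asking.
  get(rawUrl: string, nowMs: number): Promise<string> {
    const key = cacheKey(rawUrl);
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > nowMs) return hit.body;

    this.upstreamCalls += 1;
    const body = this.fetcher(rawUrl);
    this.entries.set(key, { body, expiresAt: nowMs + this.ttlMs });
    // Do not keep serving a failure: evict so the next caller retries.
    body.catch(() => {
      if (this.entries.get(key)?.body === body) this.entries.delete(key);
    });
    return body;
  }
}
```

Storing the promise rather than the resolved body means agents that ask while the first request is still in flight also share it. Sharing like this is only safe when the response does not depend on who is asking.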

What it deliberately doesn't do

Honesty matters more than a clean pitch, so the limits up front:

Where it is

It's live at [gatewards.com](https://gatewards.com/), and the SDK is open source (Apache-2.0): **npm i @gatewards/agent-sdk**

If you're running a fleet and fighting the same thing, I'd genuinely like to compare notes, especially on the cap-primitive question. Is calls/day + max-per-call enough, or does the lack of a dollar cap break it for you? Tell me where this falls short.
