How I let AI agents pay for APIs per call (the HTTP 402 path)

A developer designed a system using HTTP 402 Payment Required to let AI agents pay for API calls per use. The approach involves a gateway proxy that verifies budget-capped tokens, checks spending limits, and forwards requests to the upstream API. The tokens are signed JWTs with mutable budgets stored in a database, and payments are settled via Stripe Checkout with prepaid bundles to avoid per-call transaction fees.

If you've built an MCP server or any API that costs you money to run (an LLM call, a paid data source, compute), you've probably hit the same wall I did: how do you get paid per call when the caller is an AI agent, not a human with a credit card form?

A human can sign up, enter a card, and get an API key. An autonomous agent can't fill out a Stripe checkout form mid-task. And you don't want to hand an agent a raw API key with no spending limit: one runaway loop and your bill explodes.

This post walks through the design I landed on. It's not the only way, but the pieces are reusable even if you build your own.

The core idea: HTTP 402

402 Payment Required has been a reserved HTTP status code since HTTP/1.1 and has gone basically unused. It turns out to be exactly the primitive we need.

The flow:

1. An agent calls your endpoint with no payment.
2. You respond 402 with a small JSON body describing how to pay (price, where to top up, what token format you accept).
3. The agent or its owner tops up once, getting a budget-capped token.
4. The agent retries with the token in the Authorization header. Now it works, and it keeps working until the budget runs out; then it gets 402 again.

    HTTP/1.1 402 Payment Required
    Content-Type: application/json

    {
      "accepts": {
        "scheme": "lemoncake-pay-token",
        "price": "0.01",
        "currency": "USD",
        "mintUrl": "https://.../buy/",
        "gatewayUrl": "https://.../g/"
      }
    }

This is the shape the x402 spec standardizes. You don't strictly need the spec to do it, but following it means agent frameworks that already understand 402 can pay you without custom glue.

The budget cap is the important part

The naive version, "give the agent an API key", is dangerous because there's no ceiling. The whole point of an agent paying autonomously is that you stop watching it. So the token has to carry its own limits:

    {
      "budget": 5.00,
      "spent": 0.06,
      "max_calls": 50,
      "calls_used": 6,
      "expires_at": "2026-07-01T00:00:00Z"
    }

The gateway checks these on every call before forwarding upstream.
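
A minimal sketch of that per-call check, in Python. The `TokenLimits` class and `check_limits` function are hypothetical names for illustration, not the actual gateway code; the fields are snake_case versions of the keys in the token record above. Mapping the call cap to 429 is my assumption; budget exhaustion and expiry both answer 402 so the agent knows to top up. Amounts use `Decimal` so repeated one-cent drawdowns do not pick up float rounding error.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal


@dataclass
class TokenLimits:
    """Mutable limits for one pay token (hypothetical record layout)."""
    budget: Decimal
    spent: Decimal
    max_calls: int
    calls_used: int
    expires_at: datetime


def check_limits(limits: TokenLimits, price: Decimal, now: datetime) -> int:
    """Return the HTTP status to answer with before forwarding upstream.

    200 means the call may be forwarded; anything else is returned to the agent.
    """
    if now >= limits.expires_at:
        return 402  # expired: a new token has to be bought
    if limits.spent + price > limits.budget:
        return 402  # this call would exceed the prepaid budget
    if limits.calls_used >= limits.max_calls:
        return 429  # assumption: the call cap is reported as a rate limit
    return 200


# The example record from the text, checked for a 0.01 USD call.
limits = TokenLimits(
    budget=Decimal("5.00"),
    spent=Decimal("0.06"),
    max_calls=50,
    calls_used=6,
    expires_at=datetime(2026, 7, 1, tzinfo=timezone.utc),
)
status = check_limits(limits, Decimal("0.01"), datetime(2026, 1, 1, tzinfo=timezone.utc))
```

In a real gateway the check and the later increment of `spent` would need to happen atomically (for example in one database transaction), otherwise two concurrent calls could both pass the check and overspend the budget.
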
Budget exhausted → 402. Rate limit hit → 429. Expired → 402. The agent can never spend more than the token allows, even if it goes haywire.

I encode the token as a signed JWT (HS256), so the gateway can reject a forged or malformed token without a DB round-trip on the hot path; for a valid token it then checks the live spend counter in Postgres. The JWT carries the token id, endpoint id, and owner id; the mutable budget lives in the DB.

The gateway pattern

The key architectural move: a proxy in front of the real endpoint. The agent never calls your upstream directly. It calls a gateway URL like /g/. The gateway:

1. Verifies the pay token.
2. Checks budget / rate limit / expiry.
3. Forwards the request to your real upstream with your upstream auth attached server-side (so the agent never sees your real keys).
4. Records the call + cost in a ledger.
5. Returns the upstream response.

    agent ──► /g/ gateway
                ├─ verify token
                ├─ check budget
                ├─ forward ──► your real API (with your secret key)
                ├─ record usage + cost
                └─ return response

This decouples two things that are usually tangled: who can call (the agent's pay token) and how you authenticate upstream (your secret, never exposed). It also means you can put a per-endpoint price on any existing API without touching its code.

Settling the money

Minting a budget-capped token means someone paid up front. I use Stripe Checkout as a Direct Charge on the provider's connected account (Stripe Connect), so the money lands in the provider's balance and the platform takes a small application fee once, at payment time, not per call. The per-call cost is just a ledger figure that draws down the prepaid budget.

This matters because a fee charged on every tiny call would be eaten by Stripe's per-transaction minimums. Prepaid bundle + ledger drawdown sidesteps that entirely.

What I'd tell you before you build your own

A few things that bit me:

- Don't put the fee on each call. Stripe's minimum charge makes per-call charges of a cent or less impossible. Prepay a budget, draw it down in your own ledger.
- The token must be re-displayable. Agents lose context. The buyer needs a way to recover the same token (I key the success page off the Stripe session id, which is single-use and unguessable).
- Scope the token to one endpoint. A token minted for endpoint A should be rejected at endpoint B. Otherwise a leaked token is a blank check.
- Forward upstream auth server-side only. The agent should never be able to read your real upstream key. The gateway attaches it after auth.

The result

I packaged this up as a project called LemonCake: you wrap any API/MCP endpoint, set a price per call, and get a gateway URL an agent can pay through autonomously. There's a live demo (no signup) if you want to see the 402 → top-up → call loop run end to end: https://www.lemoncake.xyz/demo

But honestly, even if you never touch it, the pattern stands on its own: 402 to advertise the price → budget-capped token → gateway that verifies, forwards, and meters.

I'm still genuinely unsure whether autonomous per-call payment is something agent builders need today or whether I'm a year or two early. If you've hit the "how do I charge an agent" problem from the other side, I'd love to hear how you solved it.