# How I let AI agents pay for APIs per call (the HTTP 402 path)

> Source: <https://dev.to/lemoncake/how-i-let-ai-agents-pay-for-apis-per-call-the-http-402-path-cp>
> Published: 2026-06-17 18:07:56+00:00

If you've built an MCP server or any API that costs you money to run (an LLM call, a paid data source, compute), you've probably hit the same wall I did:

How do you get paid per call — when the caller is an AI agent, not a human with a credit card form?

A human can sign up, enter a card, and get an API key. An autonomous agent can't fill out a Stripe checkout form mid-task. And you don't want to hand an agent a raw API key with no spending limit: one runaway loop and your bill explodes.

This post walks through the design I landed on. It's not the only way, but the pieces are reusable even if you build your own.

## The core idea: HTTP 402

402 Payment Required has been in the HTTP specification since the 1990s, marked "reserved for future use" and basically unused. It turns out to be exactly the primitive we need.

The flow:

1. An agent calls your endpoint with no payment.
2. You respond 402 with a small JSON body describing how to pay (price, where to top up, what token format you accept).
3. The agent (or its owner) tops up once, getting a budget-capped token.
4. The agent retries with the token in the Authorization header. Now it works, and keeps working until the budget runs out; then it gets 402 again.
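From the agent's side, the loop above can be sketched roughly as follows. This is a minimal illustration, not LemonCake's client: the `send` transport, the `top_up` callback, the `Bearer` prefix, and the fake gateway are all assumptions made up for the example.

```python
from typing import Callable, Optional

# Hypothetical transport: takes (url, headers) and returns (status, body).
Transport = Callable[[str, dict], tuple[int, dict]]


def call_with_payment(
    url: str,
    send: Transport,
    top_up: Callable[[dict], str],
    token: Optional[str] = None,
) -> tuple[int, dict, Optional[str]]:
    """Call `url`; on a 402, top up once and retry with the new token."""
    headers = {"Authorization": f"Bearer {token}"} if token else {}
    status, body = send(url, headers)
    if status != 402:
        return status, body, token

    # The 402 body says how to pay. `top_up` stands in for the agent or its
    # owner buying a budget-capped token from the advertised mint URL.
    token = top_up(body)
    status, body = send(url, {"Authorization": f"Bearer {token}"})
    return status, body, token


# Fake gateway for demonstration only: accepts exactly one known token.
def fake_send(url: str, headers: dict) -> tuple[int, dict]:
    if headers.get("Authorization") == "Bearer demo-token":
        return 200, {"result": "ok"}
    return 402, {"accepts": [{"scheme": "lemoncake-pay-token", "price": "0.01"}]}


status, body, token = call_with_payment(
    "/g/", fake_send, top_up=lambda offer: "demo-token"
)
print(status, body, token)
```

The helper retries only once on purpose: if the second attempt still returns 402, the caller decides whether to top up again rather than looping on payment.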

```http
HTTP/1.1 402 Payment Required
Content-Type: application/json

{
  "accepts": [
    {
      "scheme": "lemoncake-pay-token",
      "price": "0.01",
      "currency": "USD",
      "mintUrl": "https://.../buy/",
      "gatewayUrl": "https://.../g/"
    }
  ]
}
```

This follows the shape the x402 spec standardizes: a 402 response whose body lists the payment options the server accepts. You don't strictly need the spec to do it, but following it means agent frameworks that already understand 402 can pay you without custom glue.
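On the server side, producing that response is little more than serializing a dict. Here is a rough sketch; the `payment_required` helper and the `example.invalid` URLs are placeholders invented for illustration, since the real mint and gateway URLs are not shown in this post.

```python
import json


def payment_required(price: str, mint_url: str, gateway_url: str) -> tuple[int, dict, str]:
    """Return (status, headers, body) for a 402 that advertises one payment option."""
    body = {
        "accepts": [
            {
                "scheme": "lemoncake-pay-token",
                "price": price,  # decimal string, which avoids float rounding
                "currency": "USD",
                "mintUrl": mint_url,
                "gatewayUrl": gateway_url,
            }
        ]
    }
    return 402, {"Content-Type": "application/json"}, json.dumps(body)


# Placeholder URLs for illustration only.
status, headers, body = payment_required(
    "0.01", "https://example.invalid/buy/", "https://example.invalid/g/"
)
print(status, headers["Content-Type"])
print(body)
```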

## The budget cap is the important part

The naive version — "give the agent an API key" — is dangerous because there's no ceiling. The whole point of an agent paying autonomously is that you stop watching it. So the token has to carry its own limits:

```json
{
  "budget": 5.00,
  "spent": 0.06,
  "max_calls": 50,
  "calls_used": 6,
  "expires_at": "2026-07-01T00:00:00Z"
}
```

The gateway checks these on every call before forwarding upstream. Budget exhausted → 402. Rate limit hit → 429. Expired → 402. The agent can never spend more than the token allows, even if it goes haywire.
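A pre-forward check along these lines might look like the sketch below. It is one reading of the rules, not the actual gateway code: the `check_token` name is made up, amounts are held as `Decimal` rather than JSON floats, and mapping the `max_calls` cap to 429 is my assumption about which limit "rate limit" refers to.

```python
from datetime import datetime, timezone
from decimal import Decimal


def check_token(state: dict, price: Decimal, now: datetime) -> int:
    """Return the HTTP status the gateway should answer with before forwarding."""
    expires_at = datetime.fromisoformat(state["expires_at"].replace("Z", "+00:00"))
    if now >= expires_at:
        return 402  # expired
    if state["calls_used"] >= state["max_calls"]:
        return 429  # assumption: the call cap is what maps to 429
    if state["spent"] + price > state["budget"]:
        return 402  # this call would push spend past the budget
    return 200


state = {
    "budget": Decimal("5.00"),
    "spent": Decimal("0.06"),
    "max_calls": 50,
    "calls_used": 6,
    "expires_at": "2026-07-01T00:00:00Z",
}
now = datetime(2026, 6, 17, tzinfo=timezone.utc)
print(check_token(state, Decimal("0.01"), now))
```

Checking `spent + price` against the budget, rather than `spent` alone, is what keeps the last call from overshooting the cap.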

I encode the token as a signed JWT (HS256), so the gateway can reject forged or malformed tokens on the hot path without a DB round-trip. Only tokens with a valid signature reach the second step, which checks the live spend counter in Postgres. The JWT carries the token id, endpoint id, and owner id; the mutable budget lives in the DB.
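To make the stateless half concrete, here is a small HS256 sign-and-verify sketch using only the standard library. In practice you would likely reach for an established JWT library; the claim names (`token_id`, `endpoint_id`, `owner_id`), the demo secret, and the choice to fold the endpoint check into verification are all assumptions for illustration.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that the JWT encoding strips.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims: dict, secret: bytes) -> str:
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode("ascii"), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url(signature)}"


def verify_token(token: str, secret: bytes, endpoint_id: str) -> Optional[dict]:
    """Return the claims if the signature is valid and the token is for this endpoint."""
    try:
        header, payload, signature = token.split(".")
        # HMAC-SHA256 is always used here, whatever the header claims.
        expected = hmac.new(secret, f"{header}.{payload}".encode("ascii"), hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(signature)):
            return None
        claims = json.loads(_b64url_decode(payload))
    except ValueError:
        return None  # wrong number of parts, bad base64, or bad JSON
    if not isinstance(claims, dict) or claims.get("endpoint_id") != endpoint_id:
        return None  # a token minted for endpoint A is rejected at endpoint B
    return claims


secret = b"demo-secret"  # hypothetical; load from configuration in practice
token = sign_token({"token_id": "tok_1", "endpoint_id": "ep_A", "owner_id": "own_1"}, secret)
print(verify_token(token, secret, "ep_A"))
print(verify_token(token, secret, "ep_B"))
```

Nothing in this step touches the database, which is the point: the budget lookup only happens for tokens that pass it.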

## The gateway pattern

The key architectural move: a proxy in front of the real endpoint. The agent never calls your upstream directly. It calls a gateway URL like /g/. The gateway:

1. Verifies the pay token.
2. Checks budget / rate limit / expiry.
3. Forwards the request to your real upstream (with your upstream auth attached server-side, so the agent never sees your real keys).
4. Records the call + cost in a ledger.
5. Returns the upstream response.

```
agent ──► /g/ (gateway)
            ├─ verify token
            ├─ check budget
            ├─ forward ──► your real API (with your secret key)
            ├─ record usage + cost
            └─ return response
```

This decouples two things that are usually tangled: who can call (the agent's pay token) and how you authenticate upstream (your secret, never exposed). It also means you can put a per-endpoint price on any existing API without touching its code.
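Put together, a gateway handler could be sketched like this. It is a toy with in-memory state, not the real service: the token store, `UPSTREAM_KEY`, the fixed `PRICE`, the `Bearer` prefix, and the `forward` callable are hypothetical, and it charges for a call regardless of the upstream status, which is a policy you may want to change.

```python
from decimal import Decimal
from typing import Callable

UPSTREAM_KEY = "sk-upstream-secret"  # hypothetical; never leaves the server
PRICE = Decimal("0.01")

# In-memory stand-ins for the token store and the usage ledger.
tokens = {"demo-token": {"budget": Decimal("0.02"), "spent": Decimal("0.00")}}
ledger: list[tuple[str, Decimal]] = []

Forward = Callable[[dict, str], tuple[int, str]]


def handle(request_headers: dict, body: str, forward: Forward) -> tuple[int, str]:
    # 1. Verify the pay token (a plain lookup here; a signed token in practice).
    scheme, _, token = request_headers.get("Authorization", "").partition(" ")
    state = tokens.get(token) if scheme == "Bearer" else None
    if state is None:
        return 402, "payment required"
    # 2. Check the budget before doing any paid work.
    if state["spent"] + PRICE > state["budget"]:
        return 402, "budget exhausted"
    # 3. Forward with the upstream secret attached server-side; the agent's
    #    own headers are not passed through.
    status, response = forward({"Authorization": f"Bearer {UPSTREAM_KEY}"}, body)
    # 4. Record the call and its cost.
    state["spent"] += PRICE
    ledger.append((token, PRICE))
    # 5. Return the upstream response.
    return status, response


seen_upstream_headers: list[dict] = []


def fake_upstream(headers: dict, body: str) -> tuple[int, str]:
    seen_upstream_headers.append(headers)
    return 200, body.upper()


# A budget of 0.02 covers two calls at 0.01; the third is refused.
results = [
    handle({"Authorization": "Bearer demo-token"}, "ping", fake_upstream)[0]
    for _ in range(3)
]
print(results)
```

Note that the upstream only ever sees the gateway's credentials, and the agent only ever sees its pay token.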

## Settling the money

Minting a budget-capped token means someone paid up front. I use Stripe Checkout as a Direct Charge on the provider's connected account (Stripe Connect), so the money lands in the provider's balance and the platform takes a small application fee — once, at payment time, not per call. The per-call cost is just a ledger figure that draws down the prepaid budget.

This matters because charging a fee on every tiny call would get eaten by Stripe's minimum charge amount and fixed per-transaction fee. Prepaid bundle + ledger drawdown sidesteps that entirely.
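Some back-of-the-envelope arithmetic shows the gap. The fee figures below are illustrative assumptions (a fixed fee plus a percentage, in the style of typical card pricing), not numbers quoted from this post or from Stripe's current price list, and the comparison ignores the fact that a minimum charge amount would block one-cent charges outright.

```python
from decimal import Decimal

price_per_call = Decimal("0.01")
budget = Decimal("5.00")

# Illustrative processor pricing only; check your processor's actual rates.
fixed_fee = Decimal("0.30")
percent_fee = Decimal("0.029")

# How many calls a single top-up covers.
calls = int(budget / price_per_call)

# Processing cost if every call were its own card charge.
fee_if_charged_per_call = calls * (fixed_fee + price_per_call * percent_fee)

# Processing cost if the whole budget is charged once up front.
fee_if_prepaid_once = fixed_fee + budget * percent_fee

# Ledger drawdown: six calls consume six cents of the prepaid budget.
spent_after_six_calls = 6 * price_per_call

print(calls)
print(fee_if_charged_per_call)
print(fee_if_prepaid_once)
print(spent_after_six_calls)
```

Under these assumed rates, charging per call would cost far more in processing fees than the calls themselves earn, while the single prepaid charge costs well under a dollar.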

## What I'd tell you before you build your own

A few things that bit me:

- Don't put the fee on each call. Stripe's minimum charge makes sub-cent per-call billing impossible. Prepay a budget, draw it down in your own ledger.
- The token must be re-displayable. Agents lose context. The buyer needs a way to recover the same token (I key the success page off the Stripe session id, which is single-use and unguessable).
- Scope the token to one endpoint. A token minted for endpoint A should be rejected at endpoint B. Otherwise a leaked token is a blank check.
- Forward upstream auth server-side only. The agent should never be able to read your real upstream key. The gateway attaches it after auth.

## The result

I packaged this up as a project called LemonCake — you wrap any API/MCP endpoint, set a price per call, and get a gateway URL an agent can pay through autonomously. There's a live demo (no signup) if you want to see the 402 → top-up → call loop run end to end: [https://www.lemoncake.xyz/demo](https://www.lemoncake.xyz/demo)

But honestly, even if you never touch it, the pattern stands on its own:

402 to advertise the price → budget-capped token → gateway that verifies, forwards, and meters.

I'm still genuinely unsure whether autonomous per-call payment is something agent builders need today or whether I'm a year or two early. If you've hit the "how do I charge an agent" problem from the other side, I'd love to hear how you solved it.
