Cloudflare AI Gateway now supports spend limits

wpnews.pro

Source: developers.cloudflare.com · Published: 2026-06-05T07:37Z · Topic: ai-infrastructure

Cloudflare has introduced spend limits for its AI Gateway, allowing users to set cost-based budgets that block further requests with a 429 response when cumulative spending reaches a defined threshold within a time window. The feature tracks actual dollar costs per request based on model pricing and can be scoped by model, provider, or custom metadata dimensions like user ID or team. This gives organizations real-time cost control over AI API usage, with options to block requests or automatically fall back to cheaper models when budgets are exceeded.

Spend limits let you set cost-based budgets on your AI Gateway. When cumulative spend reaches the limit within a time window, AI Gateway blocks further requests with a 429 response until the window resets.

Unlike rate limiting, which caps the number of requests, spend limits track actual dollar cost per request based on model pricing. You can scope limits to any combination of model, provider, or custom metadata dimensions like user ID, team, or application.

Spend limits apply to both Unified Billing requests and BYOK requests for models with known pricing. Each spend limit rule defines a budget (in dollars) over a rolling or fixed time window. AI Gateway calculates the cost of each request based on token usage and model pricing, then tracks cumulative spend against the limit in real time.
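As a rough illustration of the per-request cost calculation described above, the sketch below multiplies token counts by per-million-token prices. The `ModelPrice` type, the `request_cost` helper, and the prices themselves are hypothetical; they are not Cloudflare's pricing data or internals.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class ModelPrice:
    """Hypothetical pricing entry, in dollars per one million tokens."""
    input_per_million: Decimal
    output_per_million: Decimal


def request_cost(price: ModelPrice, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the dollar cost of one request from its token usage."""
    million = Decimal(1_000_000)
    return (
        price.input_per_million * input_tokens / million
        + price.output_per_million * output_tokens / million
    )


# Made-up prices, for illustration only.
example_price = ModelPrice(Decimal("2.00"), Decimal("8.00"))
cost = request_cost(example_price, input_tokens=1_500, output_tokens=500)
print(cost)
```

With these made-up numbers, the request costs $0.003 for input plus $0.004 for output. That per-request figure is what would be added to the cumulative spend for the window.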

Before sending a request to the provider, AI Gateway evaluates all applicable spend limit rules at once. If any individual rule is over budget, the request is blocked with a 429 response.
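The "any rule over budget blocks the request" behavior can be sketched as follows. `RuleState` and `check_request` are hypothetical names, and treating "spend has reached the budget" as over budget is an assumption based on the description above.

```python
from dataclasses import dataclass


@dataclass
class RuleState:
    """Hypothetical view of one rule: its budget and the spend recorded so far."""
    name: str
    budget_usd: float
    spent_usd: float


def check_request(applicable_rules: list[RuleState]) -> int:
    """Return 429 if any applicable rule has reached its budget, else 200."""
    for rule in applicable_rules:
        # Assumption: a rule blocks once recorded spend reaches the budget.
        if rule.spent_usd >= rule.budget_usd:
            return 429
    return 200


rules = [
    RuleState("global", budget_usd=100.0, spent_usd=40.0),
    RuleState("per-user", budget_usd=5.0, spent_usd=5.0),
]
print(check_request(rules))  # 429
```

The global rule still has headroom here, but the exhausted per-user rule is enough to block the request.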

Spend limits are eventually consistent. The current request's cost is recorded after completion, so a burst of concurrent requests can briefly exceed the limit before enforcement catches up.
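To make the overshoot concrete, here is a toy simulation in which a burst of requests is checked before any of their costs are recorded. The budget, the per-request cost, and the `run_burst` helper are all hypothetical; this models the behavior described above and is not how AI Gateway is implemented.

```python
BUDGET_CENTS = 100  # hypothetical $1.00 budget
COST_CENTS = 5      # hypothetical cost of each request


def run_burst(spent_cents: int, concurrent_requests: int) -> tuple[int, int]:
    """Admit a burst of requests that are all checked before any cost is recorded.

    Returns (number admitted, spend after the burst completes).
    """
    # Every request in the burst sees the same, not-yet-updated spend figure.
    admitted = sum(1 for _ in range(concurrent_requests) if spent_cents < BUDGET_CENTS)
    # Costs are only added once the requests complete.
    return admitted, spent_cents + admitted * COST_CENTS


admitted, spent_after = run_burst(spent_cents=90, concurrent_requests=5)
print(admitted, spent_after)       # 5 115
print(spent_after < BUDGET_CENTS)  # False: the next request would be blocked
```

All five requests pass the check at $0.90, so recorded spend ends at $1.15 against a $1.00 budget. Only later requests are blocked.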

Each rule can be scoped by model, provider, or custom metadata dimensions. Each dimension can be configured in one of two modes:

| Mode | Behavior | Example |
| --- | --- | --- |
| Split by value | Each distinct value gets its own independent budget bucket. | Splitting by `metadata.user_id` gives every user their own budget. |
| Filter by value | The rule applies only when the dimension equals a specific value. | Filtering `metadata.team` to `engineering` limits only requests from the engineering team. |

If a dimension is not configured on a rule, all values share one budget bucket. For example, a rule without a provider dimension tracks spend across all providers together.

Given a request with `model=openai/gpt-5.5` and `metadata.user_id=u_42`:

| Scenario | Dimensions | Budget bucket |
| --- | --- | --- |
| Global budget for everyone | None | One shared bucket |
| Per-user budget | `metadata.user_id`: split by value | Separate bucket per user |
| Per-provider, per-user | `metadata.user_id`: split by value, `provider`: split by value | Separate bucket per user+provider combination |
| Specific model only | `model`: filter by value `openai/gpt-5.5` | Only applies to `openai/gpt-5.5` requests |
| Per-user, per-model | `metadata.user_id`: split by value, `model`: split by value | Separate bucket per user+model combination |
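One way to think about the two modes is as a function from a rule's dimensions and a request to a bucket key. The sketch below is a hypothetical model of that mapping. The `bucket_key` helper, the dictionary shapes, and the `provider` value of `openai` are assumptions for illustration, not AI Gateway's configuration schema.

```python
from typing import Optional

SPLIT = "split"
FILTER = "filter"


def bucket_key(rule_dims: dict, request: dict) -> Optional[tuple]:
    """Return the budget bucket a request falls into, or None if the rule does not apply.

    rule_dims maps a dimension name to ("split", None) or ("filter", value).
    """
    parts = []
    for dim, (mode, value) in sorted(rule_dims.items()):
        actual = request.get(dim)
        if mode == FILTER:
            if actual != value:
                return None  # filter mismatch: the rule ignores this request
        elif mode == SPLIT:
            parts.append((dim, actual))  # each distinct value gets its own bucket
    # Unconfigured and filtered dimensions add nothing, so matching requests share a bucket.
    return tuple(parts)


request = {"model": "openai/gpt-5.5", "provider": "openai", "metadata.user_id": "u_42"}

print(bucket_key({}, request))
print(bucket_key({"metadata.user_id": (SPLIT, None)}, request))
print(bucket_key({"model": (FILTER, "openai/gpt-5.5")}, request))
```

An empty key means one shared bucket for the rule, which covers both the global scenario and the model filter. A split dimension adds its value to the key, so each user (or user+model combination) accumulates spend separately.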

Spend limits are configured on the gateway via the dashboard or the API. You can define up to 20 rules per gateway.

To scope spend limits by custom dimensions like user ID or team, attach custom metadata to your requests.
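A minimal sketch of attaching metadata to a request follows. It assumes the JSON-encoded `cf-aig-metadata` request header described in Cloudflare's custom metadata documentation. The `build_headers` helper, the placeholder API key, and the metadata keys are hypothetical, so check the current documentation before relying on the header name or format.

```python
import json


def build_headers(api_key: str, metadata: dict) -> dict:
    """Build request headers carrying custom metadata for the gateway."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        # Assumed header name; the value is the metadata serialized as JSON.
        "cf-aig-metadata": json.dumps(metadata),
    }


headers = build_headers("PROVIDER_API_KEY", {"user_id": "u_42", "team": "engineering"})
print(headers["cf-aig-metadata"])
```

A rule that splits by `metadata.user_id` would then track this request under the `u_42` bucket.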

When a spend limit is exceeded, AI Gateway returns a 429 Too Many Requests response. You have two options:

- **Block requests** (default) - The request is rejected until the budget window resets.
- **Fall back to a cheaper model** - Create a Dynamic Route with a primary model and a fallback (for example, `anthropic/claude-opus-4.7` with a fallback to `@cf/moonshotai/kimi-k2.6`). Then set a spend limit on the primary model using this feature. When the primary model's budget is exceeded, AI Gateway automatically routes requests to the fallback model instead of blocking them.
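If you keep the default blocking behavior, the client has to decide what to do with the 429. The sketch below shows a client-side fallback, which is a different technique from the gateway-side Dynamic Route described above. The `send` callable and `call_with_budget_fallback` helper are hypothetical, and a 429 can also come from ordinary rate limiting, so real code may need to distinguish the two.

```python
from typing import Callable, Tuple

Response = Tuple[int, str]  # (HTTP status, body)


def call_with_budget_fallback(
    send: Callable[[str], Response], primary: str, fallback: str
) -> Tuple[str, Response]:
    """Try the primary model; if the gateway answers 429, try the fallback model once."""
    response = send(primary)
    if response[0] != 429:
        return primary, response
    return fallback, send(fallback)


def fake_send(model: str) -> Response:
    """Stand-in for a real HTTP call: pretend the primary model is over budget."""
    if model == "anthropic/claude-opus-4.7":
        return 429, "spend limit exceeded"
    return 200, "ok"


used_model, result = call_with_budget_fallback(
    fake_send, "anthropic/claude-opus-4.7", "@cf/moonshotai/kimi-k2.6"
)
print(used_model, result)
```

Retrying the primary model in a tight loop does not help here, because the budget only frees up when the window resets.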

You can track your spend per model, provider, or any custom metadata attribute on the Analytics dashboard. Use this to understand usage patterns and set informed budgets.

Cost tracking is a best-effort estimation based on token counts and model pricing. Refer to your provider's dashboard for exact billing amounts.
A maximum of 20 spend limit rules can be configured per gateway.

source & further reading

developers.cloudflare.com — original article
