[ARTICLE · art-14484] src=dev.to pub= topic=ai-tools verified=true sentiment=↑ positive

I got a $100 AI bill. Then I found the $80,000 ones. So I built a kill switch. (2026)

A solo developer built Loopers, an open-source reverse proxy that enforces hard dollar caps on AI API spending, after discovering developers were facing bills as high as $80,000 from runaway LLM costs. The tool intercepts requests between applications and providers like OpenAI, blocking calls when budgets are hit and handling streaming token accounting in real-time to sever connections mid-generation. Loopers uses Redis Lua scripts for atomic budget enforcement under concurrent load and is available on GitHub under an MIT license.

read 2 min · published May 26, 2026

A few weeks ago I woke up to a $100 charge from my AI provider.

For a lot of people that's nothing. For me, a solo dev who obsessively keeps infrastructure costs near zero, it genuinely stung. But that wasn't even the part that got me.

The part that got me was what I found when I went looking for answers.

Reddit threads. Developer forums. People waking up to $10,000. $30,000. $80,000 bills.

Three root causes, over and over:

That last one is what really broke my brain. The alerts are just dashboards with email attachments. They don't stop anything. You still get the bill.

I built Loopers: a reverse proxy that sits between your application and your LLM provider and enforces a hard dollar cap.

Not a soft alert. A kill switch.

curl http://localhost:8080/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer lp-your-key" \
  -H "X-Loopers-Provider-Key: sk-your-openai-key" \
  -d '{"model": "gpt-4o-mini", "messages": [...]}'

If your budget is hit, the request dies right there. The provider is never called. No tokens burned. No bill.
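To make the pre-call check concrete, here is a minimal sketch of the idea, not the actual Loopers code. The `Budget` type, the `Gate` and `Probe` helpers, and the 402 status code are all assumptions for illustration; the real proxy keeps its budget in Redis rather than in memory.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// Budget is a hypothetical in-memory stand-in for the real budget store.
type Budget struct {
	mu         sync.Mutex
	spentCents int64
	capCents   int64
}

// Exhausted reports whether recorded spend has reached the cap.
func (b *Budget) Exhausted() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.spentCents >= b.capCents
}

// Gate rejects a request before the upstream handler runs when the budget is
// exhausted. The 402 status is an assumption, not documented Loopers behavior.
func Gate(b *Budget, upstream http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if b.Exhausted() {
			http.Error(w, "budget exceeded", http.StatusPaymentRequired)
			return
		}
		upstream.ServeHTTP(w, r)
	})
}

// Probe sends one request through the gate and reports the status code and
// whether the stand-in upstream handler was reached.
func Probe(b *Budget) (status int, upstreamCalled bool) {
	upstream := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		upstreamCalled = true
		w.WriteHeader(http.StatusOK)
	})
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/openai/v1/chat/completions", nil)
	Gate(b, upstream).ServeHTTP(rec, req)
	return rec.Code, upstreamCalled
}

func main() {
	status, called := Probe(&Budget{spentCents: 500, capCents: 500})
	fmt.Println(status, called) // 402 false
}
```

The point of the ordering is that the budget check happens before any outbound call exists, so a blocked request costs nothing at the provider.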

The hard part isn't blocking pre-call requests. That's easy. The hard part is streaming.

With SSE streaming, the provider is already sending you tokens by the time you realize the cost is climbing. So Loopers intercepts the stream in real time, counts tokens chunk by chunk, and severs the connection the moment the running cost crosses the reservation.
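A rough sketch of that relay loop, under loud assumptions: `RelayCapped` and `Relay` are hypothetical names, costs are in arbitrary integer units, and tokens are estimated as whitespace-separated words in each `data:` line, which is far cruder than a real tokenizer. In a real proxy, the returned error would be the signal to close the upstream response body.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"strings"
)

// ErrReservationExceeded tells the caller to close the upstream connection.
var ErrReservationExceeded = errors.New("reservation exceeded")

// RelayCapped forwards SSE lines from src to dst, adds an estimated cost for
// every "data: " line, and stops as soon as the running cost reaches the
// reservation. Chunks already received are counted as spent, because the
// provider has already generated them.
func RelayCapped(dst io.Writer, src io.Reader, perToken, reservation int64) (int64, error) {
	var spent int64
	sc := bufio.NewScanner(src)
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, "data: ") {
			payload := strings.TrimPrefix(line, "data: ")
			// Crude estimate (assumption): one word counts as one token.
			spent += int64(len(strings.Fields(payload))) * perToken
		}
		if _, err := io.WriteString(dst, line+"\n"); err != nil {
			return spent, err
		}
		if spent >= reservation {
			return spent, ErrReservationExceeded
		}
	}
	return spent, sc.Err()
}

// Relay runs RelayCapped over an in-memory stream and returns what the
// client would have received.
func Relay(stream string, perToken, reservation int64) (string, int64, error) {
	var out strings.Builder
	spent, err := RelayCapped(&out, strings.NewReader(stream), perToken, reservation)
	return out.String(), spent, err
}

func main() {
	stream := "data: one two\n\ndata: three four\n\ndata: five six\n\n"
	out, spent, err := Relay(stream, 1, 3)
	fmt.Printf("%q spent=%d err=%v\n", out, spent, err)
}
```

With a reservation of 3 units, the third chunk is never forwarded: the loop stops right after the second chunk pushes the running cost to 4.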

And when a client disconnects mid-generation (dropped connection, timeout, whatever), Loopers captures the exact token count generated up to that millisecond and refunds the remainder of the reservation back to Redis. No phantom charges.
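The reserve-then-settle bookkeeping described above can be sketched like this. `Ledger`, `Reservation`, and the micro-dollar unit are assumptions for illustration, and this in-memory version is not safe for concurrent use; the real thing lives in Redis. In Go's `net/http`, a client disconnect shows up as cancellation of the request's context, which is one natural place to trigger the settle step.

```go
package main

import "fmt"

// Reservation is a hypothetical record of budget held for one in-flight request.
type Reservation struct {
	HeldMicros int64 // amount set aside before the provider call
}

// Ledger is an in-memory stand-in for the Redis-backed spend counter.
// SpentMicros includes amounts that are currently held.
type Ledger struct {
	SpentMicros int64
}

// Reserve holds the worst-case cost up front, before any tokens are generated.
func (l *Ledger) Reserve(maxCost int64) Reservation {
	l.SpentMicros += maxCost
	return Reservation{HeldMicros: maxCost}
}

// Settle runs when the stream ends for any reason (completion, client
// disconnect, timeout). It replaces the held amount with the actual cost and
// returns how much was refunded to the budget.
func (l *Ledger) Settle(r Reservation, actual int64) (refund int64) {
	delta := r.HeldMicros - actual // positive: refund, negative: overage
	l.SpentMicros -= delta
	if delta < 0 {
		return 0
	}
	return delta
}

func main() {
	l := &Ledger{}
	r := l.Reserve(1000)               // hold the worst case before the call
	refund := l.Settle(r, 250)         // client dropped after 250 units of tokens
	fmt.Println(refund, l.SpentMicros) // 750 250
}
```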

The budget enforcement itself runs through Redis Lua scripts: each check-and-reserve is a single atomic operation, so there are no TOCTOU race conditions, even under heavy concurrent load.

The concurrent Lua atomicity holds up in tests (100 goroutines, same key), but I'd genuinely love a second pair of eyes on the scripts from anyone who's done serious Redis work.
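For anyone who wants to reason about the pattern without opening the repo, here is a sketch of what a check-and-reserve script and a 100-goroutine test could look like. The Lua below is a hypothetical script, not the one Loopers ships, and the Go code never talks to Redis: `memStore` mimics the script's semantics in memory, with a mutex standing in for Redis's guarantee that a script runs to completion before any other command is served.

```go
package main

import (
	"fmt"
	"sync"
)

// reserveScript is a hypothetical Lua script, not taken from the Loopers
// repository. KEYS[1] is the spend counter, ARGV[1] the cap, ARGV[2] the
// amount to reserve. It is shown for reference and is not executed here.
const reserveScript = `
local spent = tonumber(redis.call('GET', KEYS[1]) or '0')
if spent + tonumber(ARGV[2]) > tonumber(ARGV[1]) then
  return 0
end
redis.call('INCRBY', KEYS[1], ARGV[2])
return 1
`

// memStore mirrors the script's check-then-increment in memory.
type memStore struct {
	mu    sync.Mutex
	spent map[string]int64
}

// reserve grants the amount only if it fits under the limit, as one
// indivisible step.
func (s *memStore) reserve(key string, limit, amount int64) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.spent[key]+amount > limit {
		return false
	}
	s.spent[key] += amount
	return true
}

// Hammer fires concurrent reservations at a single key and reports how many
// were granted and the final recorded spend.
func Hammer(workers int, limit, amount int64) (granted int, spent int64) {
	s := &memStore{spent: make(map[string]int64)}
	var wg sync.WaitGroup
	var mu sync.Mutex
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if s.reserve("budget:demo", limit, amount) {
				mu.Lock()
				granted++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return granted, s.spent["budget:demo"]
}

func main() {
	granted, spent := Hammer(100, 250, 10)
	fmt.Println(granted, spent) // 25 250
}
```

If the check and the increment were two separate round trips instead of one atomic step, several goroutines could all read the same "spent" value and all be granted, which is exactly the overspend the script is meant to rule out.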

As for the streaming reconciliation pattern, I'm curious whether others have solved mid-stream token accounting differently.

go run github.com/loopers-oss/loopers/cmd/loopers init
docker-compose up -d

→ github.com/CURSED-ME/loopers-oss

This is my first major Go project. I'd love brutal, honest feedback on the architecture, the code, the README clarity, anything. Drop it in the comments.

The core is fully MIT. I'm building a managed cloud version to fund continued OSS work, but nothing is held back from the community repo.
