A few weeks ago I woke up to a $100 charge from my AI provider.
For a lot of people that's nothing. For me, a solo dev who obsessively keeps infrastructure costs near zero, it genuinely stung. But that wasn't even the part that got me.
The part that got me was what I found when I went looking for answers.
Reddit threads. Developer forums. People waking up to $10,000. $30,000. $80,000 bills.
Three root causes, over and over:
That last one is what really broke my brain. The alerts are just dashboards with email attachments. They don't stop anything. You still get the bill.
I built Loopers, a reverse proxy that sits between your application and your LLM provider and enforces a hard dollar cap.
Not a soft alert. A kill switch.
curl http://localhost:8080/openai/v1/chat/completions \
  -H "Authorization: Bearer lp-your-key" \
  -H "X-Loopers-Provider-Key: sk-your-openai-key" \
  -d '{"model": "gpt-4o-mini", "messages": [...]}'
If your budget is hit, the request dies right there. The provider is never called. No tokens burned. No bill.
The hard part isn't blocking pre-call requests. That's easy. The hard part is streaming.
With SSE streaming, the provider is already sending you tokens by the time you realize the cost is climbing. So Loopers intercepts the stream in real time, counts tokens chunk by chunk, and severs the connection the moment the cost crosses the reservation.
And when a client disconnects mid-generation (dropped connection, timeout, whatever), Loopers captures the exact token count generated up to that millisecond and refunds the remainder of the reservation back to Redis. No phantom charges.
The budget enforcement itself runs through Redis Lua scripts: a single atomic transaction, no TOCTOU race conditions, even under heavy concurrent load.
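For readers who haven't used this pattern, here is a minimal sketch of a check-and-reserve step. The Lua script is a hypothetical illustration of the technique, not the script shipped in the repo, and the key and argument layout is an assumption. Because running it needs a live Redis, the Go code mirrors the same logic in memory (`Ledger`, `Reserve`, and `HammerReserve` are made-up names), with a mutex playing the role of Redis executing a script without interleaving other commands.

```go
package main

import "sync"

// reserveScript is a hypothetical check-and-reserve script.
// KEYS[1] = spent counter, ARGV[1] = cap, ARGV[2] = amount to reserve.
// Read, compare, and increment happen inside one script execution, so no
// other client can slip in between the check and the write.
const reserveScript = `
local spent  = tonumber(redis.call('GET', KEYS[1]) or '0')
local limit  = tonumber(ARGV[1])
local amount = tonumber(ARGV[2])
if spent + amount > limit then
  return 0
end
redis.call('INCRBY', KEYS[1], amount)
return 1
`

// Ledger is an in-memory stand-in for the Redis counter (assumption:
// amounts are integers, e.g. micro-dollars).
type Ledger struct {
	mu    sync.Mutex
	spent map[string]int64
}

func NewLedger() *Ledger { return &Ledger{spent: make(map[string]int64)} }

// Reserve performs the same check-then-increment as the script, holding
// the lock across both steps so they cannot be separated.
func (l *Ledger) Reserve(key string, limit, amount int64) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.spent[key]+amount > limit {
		return false
	}
	l.spent[key] += amount
	return true
}

// Spent returns the total reserved so far for key.
func (l *Ledger) Spent(key string) int64 {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.spent[key]
}

// HammerReserve fires n concurrent reservations against one key and
// returns how many were granted.
func HammerReserve(l *Ledger, key string, n int, limit, amount int64) int {
	var wg sync.WaitGroup
	results := make(chan bool, n)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			results <- l.Reserve(key, limit, amount)
		}()
	}
	wg.Wait()
	close(results)
	granted := 0
	for ok := range results {
		if ok {
			granted++
		}
	}
	return granted
}
```

With a cap of 250 and 100 goroutines each asking for 10, exactly 25 reservations should be granted no matter how the goroutines are scheduled. If the check and the increment were separate steps, some runs would overshoot the cap.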
The concurrent Lua atomicity holds up in tests (100 goroutines, same key), but I'd genuinely love a second pair of eyes on the scripts from anyone who's done serious Redis work.
As for the streaming reconciliation pattern, I'm curious whether others have solved mid-stream token accounting differently.
go run github.com/loopers-oss/loopers/cmd/loopers init
docker-compose up -d
github.com/CURSED-ME/loopers-oss
This is my first major Go project. I'd love brutal, honest feedback on the architecture, the code, the README clarity, anything. Drop it in the comments.
The core is fully MIT-licensed. I'm building a managed cloud version to fund continued OSS work, but nothing is held back from the community repo.