Helicone is in maintenance mode. So I built the lightweight alternative I wanted.

Source: dev.to · Published 2026-06-12T14:23Z · Topic: ai-tools

A developer built TokenWatch, an open-source tool for tracking LLM costs, after Helicone entered maintenance mode following its acquisition by Mintlify in March. The lightweight alternative uses SQLite and requires no proxy in the request path, offering a budget kill-switch that throws an exception when spending limits are reached. TokenWatch provides cost attribution by feature and customer as its default view, addressing the developer's frustration with incumbent tools that require complex self-hosting setups or fail to stop runaway agent loops.

If you were using Helicone to track your LLM costs, you've probably seen the news: after the Mintlify acquisition in March, it's officially in maintenance mode. Feature development has stopped. 16,000+ organizations are quietly looking around. Langfuse — the other indie-friendly option — was acquired by ClickHouse in January, and self-hosting it means running ClickHouse + Postgres + Redis + S3. To look at your own API bill.

Meanwhile the problem is getting worse, not better. We're all running agents now, and agents have a special talent: an uncapped recursive loop can turn a $4k/month budget into an $11.2k bill in three weeks (real story). The provider dashboards tell you what you spent. Not where, not which feature, not which customer.

So I built TokenWatch — the tool I wanted as a solo AI builder:

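```bash
npx tokenwatch-sdk serve   # dashboard on localhost:4318. That's the whole setup.
```

```js
import Anthropic from '@anthropic-ai/sdk';
import { wrapAnthropic, init } from 'tokenwatch-sdk';

// Wrap the client once; calls made through it are attributed to this feature and customer.
const claude = wrapAnthropic(new Anthropic(), { feature: 'summarize', customerId: 'acme' });
// Turn on budget enforcement.
init({ enforceBudget: true });
```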

Every call — streaming included — is now tracked: model, tokens, cost, latency, errors, attributed to features and customers.

Design decisions (a.k.a. my complaints about the incumbents)

No proxy in your request path. Your calls go straight to OpenAI/Anthropic; telemetry ships async on the side. A monitoring tool should never be the reason your product is down.
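
To make the "telemetry ships async on the side" idea concrete, here is a minimal sketch of a fire-and-forget wrapper. It is not TokenWatch's actual code: `withTelemetry`, `callProvider`, and `report` are hypothetical names, and the record shape is an assumption. The point it illustrates is that the caller waits only for the provider, never for the reporter.

```javascript
// Hypothetical sketch: none of these names come from the TokenWatch source.
// The wrapper awaits the provider call, then hands a usage record to the
// reporter without awaiting it, so a slow or failing reporter cannot delay
// or break the caller.
function withTelemetry(callProvider, report) {
  return async function tracked(...args) {
    const startedAt = Date.now();
    let error = null;
    try {
      return await callProvider(...args);
    } catch (err) {
      error = err;
      throw err; // provider errors still reach the caller unchanged
    } finally {
      const record = { latencyMs: Date.now() - startedAt, ok: error === null };
      // Fire and forget: start the report and swallow any failure it has.
      Promise.resolve()
        .then(() => report(record))
        .catch(() => {});
    }
  };
}

// Usage with a stand-in provider call and a reporter that always fails.
const fakeProvider = async (prompt) => `echo: ${prompt}`;
const brokenReporter = async () => {
  throw new Error('telemetry endpoint down');
};
const trackedCall = withTelemetry(fakeProvider, brokenReporter);
trackedCall('hello').then((reply) => console.log(reply)); // the caller still gets its reply
```

A proxy sits between you and the provider, so its outage becomes yours. In this shape the worst case for a telemetry outage is a missing data point.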

One process, SQLite, zero native deps. It uses Node's built-in node:sqlite module (available since Node 22.5). No Docker Compose file with four services. Your usage data stays on your machine.

A budget kill-switch, not just a budget chart. Set a monthly budget: at 80% your webhook fires, at 100% wrapped calls throw BudgetExceededError instead of spending more. Watching a dashboard doesn't stop an agent loop at 3am — an exception does.
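
The 80% / 100% behavior described above can be sketched in a few lines. This is an illustration under assumptions, not TokenWatch's implementation: only the `BudgetExceededError` name comes from the post, while `createBudgetGuard`, `assertCanSpend`, `record`, and `onWarning` are hypothetical, and the webhook is stood in for by a plain callback.

```javascript
// Hypothetical sketch; TokenWatch's real internals may look nothing like this.
class BudgetExceededError extends Error {
  constructor(spentUsd, budgetUsd) {
    super(`Monthly budget of $${budgetUsd} reached (spent $${spentUsd.toFixed(2)})`);
    this.name = 'BudgetExceededError';
  }
}

function createBudgetGuard({ budgetUsd, onWarning, warnAt = 0.8 }) {
  let spentUsd = 0;
  let warned = false;
  return {
    // Call before each provider request: refuses to start a call once the budget is used up.
    assertCanSpend() {
      if (spentUsd >= budgetUsd) throw new BudgetExceededError(spentUsd, budgetUsd);
    },
    // Call after each provider response with that call's cost.
    record(costUsd) {
      spentUsd += costUsd;
      if (!warned && spentUsd >= budgetUsd * warnAt) {
        warned = true; // fire the warning only once per budget period
        onWarning({ spentUsd, budgetUsd });
      }
    },
  };
}

// Usage: a runaway loop is stopped by an exception instead of running all night.
const guard = createBudgetGuard({
  budgetUsd: 10,
  onWarning: ({ spentUsd }) => console.log(`warning: $${spentUsd} spent`),
});
let calls = 0;
try {
  while (true) {
    guard.assertCanSpend();
    calls += 1;
    guard.record(1.5); // pretend every call costs $1.50
  }
} catch (err) {
  console.log(err.name, 'after', calls, 'calls');
}
```

Because the cost of a call is only known after it returns, a check-before-spend guard like this one can overshoot the budget by at most one call. Real enforcement would also need to persist `spentUsd` and reset it each month, which the sketch leaves out.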

Margin attribution, not traces. Tracing UIs are built for debugging. Most of the time I have a simpler question: which feature is losing money and which customer is profitable? Cost by feature and by customer is the default view, not a saved query.
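
As a rough illustration of what "cost by feature and by customer" means in practice, here is a small aggregation over tagged usage records. Everything in it is made up for the example: the record fields, the `totalBy` helper, and the cost and revenue figures are assumptions, not TokenWatch's schema or real data. Costs are kept in integer cents to avoid floating-point drift.

```javascript
// Hypothetical usage records; field names and numbers are invented for illustration.
const records = [
  { feature: 'summarize', customerId: 'acme', costCents: 120 },
  { feature: 'summarize', customerId: 'globex', costCents: 40 },
  { feature: 'chat', customerId: 'acme', costCents: 900 },
  { feature: 'chat', customerId: 'globex', costCents: 15 },
];
// Hypothetical monthly revenue per customer, also in cents.
const revenueCents = { acme: 800, globex: 500 };

// Sum costCents grouped by any record field ('feature' or 'customerId').
function totalBy(rows, key) {
  const totals = new Map();
  for (const row of rows) {
    totals.set(row[key], (totals.get(row[key]) ?? 0) + row.costCents);
  }
  return totals;
}

const byFeature = totalBy(records, 'feature');
const byCustomer = totalBy(records, 'customerId');

// Margin per customer: what they pay minus what their LLM calls cost.
for (const [customerId, costCents] of byCustomer) {
  console.log(customerId, 'margin (cents):', revenueCents[customerId] - costCents);
}
```

With these invented numbers, the customer who generates the most usage is also the one running at a loss, which is the kind of answer a provider's total-spend dashboard cannot give.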

Python SDK with literally zero dependencies. Standard library only. wrap_openai(client) and you're done.

What it's not

It's not a tracing platform, it's not an eval suite, and it won't replace Langfuse for a 50-person team that lives in traces. It's the 80% tool for the solo builder and small team: it shows where the money is going and whether quality is degrading, and it stops the bleeding automatically.

It's MIT-licensed and v0.1 — built in public, partly with AI agents (Claude Code wrote a lot of it, which felt appropriately recursive for a tool that monitors AI spend). Feedback, issues, and brutal honesty welcome.

GitHub: https://github.com/jkhusanovpn/tokenwatch

What's your current setup for tracking LLM costs — and has an agent ever surprised you with a bill?


Read original on dev.to → dev.to/jkhusanov/helicone-is-in-maintenance-mode…
