You use Claude Code, or ChatGPT, or both, every day. Quick question: how many messages did you send last month? Which model ate most of your budget? How much did prompt caching actually save you?
You don't know. I didn't either.
That's a weird gap. We instrument everything else (git activity, deploy frequency, test coverage), but the tool we now spend the most hours inside is a black box. The vendor dashboard, if it exists, is a billing page, not a mirror.
So I built four tiny tools to fix that for myself. They all run 100% locally. No accounts, no API keys, no telemetry, no network calls. They read files that are already on your disk and print something you can look at. All four are open source on github.com/greymoth-jp, and because that's a real claim, the only thing I'll ask is that you grep the source yourself before you trust me.
Here's the privacy point up front, because it's the whole design: these read your data, but your data never leaves your machine. That's not a feature I'm bolting on for a marketing line. It's the reason the tools are small enough to audit in one sitting.
Before the tools, here's what I assumed: my Claude Code bill is dominated by the prompts I write, so to spend less I should write tighter prompts. Compress the context. Trim the system message. The usual advice.
I ran the numbers on my own `~/.claude` transcripts and got this:
| component | share of cost |
|---|---|
| cacheRead | 72% |
| cacheWrite | ~19% |
| output | the rest |
| input | ~0.3% |
Input, the thing everyone tells you to compress, was 0.3% of my spend. Compressing my prompts to save money would've been optimizing the rounding error. Worse: compressing a static prompt changes its bytes, which busts the prefix cache, which can make the bill go up.
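If you want to check a breakdown like this by hand, the arithmetic is small. Here's a minimal sketch, assuming you already have total token counts per component. The per-million-token rates and the token counts below are illustrative placeholders, not my data and not a current price list, and `cost_shares` is a hypothetical helper, not part of any of the tools.

```python
# Placeholder per-million-token rates in USD (assumption: substitute the
# published prices for the model you actually use).
RATES_PER_MTOK = {
    "input": 3.00,
    "output": 15.00,
    "cacheWrite": 3.75,
    "cacheRead": 0.30,
}


def cost_shares(tokens: dict, rates: dict = RATES_PER_MTOK) -> dict:
    """Return each component's fraction of the total estimated cost."""
    costs = {
        name: tokens.get(name, 0) / 1_000_000 * rate
        for name, rate in rates.items()
    }
    total = sum(costs.values())
    if total == 0:
        return {name: 0.0 for name in costs}
    return {name: cost / total for name, cost in costs.items()}


# Made-up token counts, for illustration only.
example_tokens = {
    "input": 1_000_000,
    "output": 2_000_000,
    "cacheWrite": 20_000_000,
    "cacheRead": 900_000_000,
}

# Print components from largest share to smallest.
for name, share in sorted(cost_shares(example_tokens).items(), key=lambda kv: -kv[1]):
    print(f"{name:<11}{share:6.1%}")
```

The point of the sketch is the shape of the calculation: a cheap per-token rate multiplied by an enormous token count can still dominate the total.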
The real cost center was cache reads: long sessions dragging a fat context forward, turn after turn. That points at completely different levers: cache hygiene (`/compact` at milestones, `/clear` before the context balloons, keeping `CLAUDE.md` static so it doesn't bust the cache), and routing a whole mechanical session to a cheaper tier at the boundary, never mid-session.
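To see why a cache bust can cost more than it saves, here's a deliberately simplified sketch. It assumes a constant-size cached context, one cache write on the first turn, one extra write every time the cached bytes change, and cache reads on every other turn. The rates are placeholders and `session_cost` is a hypothetical helper, so treat the output as a direction, not a bill.

```python
def session_cost(
    context_tokens: int,
    turns: int,
    cache_busts: int,
    read_rate: float = 0.30,   # placeholder USD per million cached tokens read
    write_rate: float = 3.75,  # placeholder USD per million tokens written to cache
) -> float:
    """Estimate the cost of carrying a cached context through a session.

    Simplification: the context size stays constant, and every cache bust
    forces one full re-write of the context instead of a cheap read.
    """
    if turns < 1 or not 0 <= cache_busts <= turns - 1:
        raise ValueError("need turns >= 1 and 0 <= cache_busts <= turns - 1")
    mtok = context_tokens / 1_000_000
    writes = 1 + cache_busts   # initial write plus one per bust
    reads = turns - writes     # every remaining turn is a cache read
    return mtok * (writes * write_rate + reads * read_rate)


stable = session_cost(100_000, turns=50, cache_busts=0)
busted = session_cost(100_000, turns=50, cache_busts=3)
print(f"stable: ${stable:.2f}  with 3 busts: ${busted:.2f}")
```

Under these assumptions, each bust swaps one cheap read for one expensive write, which is why editing a static file mid-session works against you.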
Important honesty caveat: that 72% is my number, from my usage, and the dollar figures are estimates against published rates. Yours will be different. If you're on a Max/Pro plan it's "value extracted," not literal spend. The point isn't the specific percentage; it's that you can't reason about a cost you've never measured. Measure first, then optimize. The tool below does the measuring.
All open source, all local, all MIT-licensed. Two are npm CLIs; one is a browser extension; one reads public data only.
`tokenops`: Claude Code cost truth
The one that produced the table above. It reads `~/.claude`, breaks your spend down by component and by model, and then gives you data-validated advice: not generic tips, but actions ranked by the dollars your profile says are on the table.

    npx @greymoth/tokenops demo   # synthetic data: try it with zero risk first
    npm i -g @greymoth/tokenops
    tokenops report               # cost by component + by model
    tokenops advise               # prioritized, $-quantified actions
    tokenops card                 # a shareable Before→After card (--anon hides project names)
→ **github.com/greymoth-jp/tokenops** · npm: `@greymoth/tokenops`
Start with `tokenops demo`: it runs on synthetic data so you can see exactly what it does before pointing it at your own transcripts.
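For a feel of what "reading transcripts" means, here's a rough sketch of summing token usage from JSONL lines. This is not how `tokenops` is implemented. It assumes each transcript line is a JSON object with a `message.usage` dict that uses the Anthropic API's usage field names; both the record shape and the `sum_usage` helper are assumptions for illustration.

```python
import json
from collections import Counter

# Assumed mapping from API-style usage fields to the component names
# used in the table earlier in this post.
FIELDS = {
    "input_tokens": "input",
    "output_tokens": "output",
    "cache_creation_input_tokens": "cacheWrite",
    "cache_read_input_tokens": "cacheRead",
}


def sum_usage(jsonl_lines) -> dict:
    """Sum token counts per component across an iterable of JSONL lines."""
    totals = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or corrupt lines
        if not isinstance(record, dict):
            continue
        message = record.get("message")
        usage = message.get("usage") if isinstance(message, dict) else None
        if not isinstance(usage, dict):
            continue  # records without usage data contribute nothing
        for field, label in FIELDS.items():
            totals[label] += int(usage.get(field) or 0)
    return dict(totals)


# Synthetic lines, not real transcript data.
sample = [
    '{"message": {"usage": {"input_tokens": 12, "output_tokens": 340, "cache_read_input_tokens": 50000}}}',
    '{"type": "summary"}',
    "not json",
]
print(sum_usage(sample))
```

Nothing here touches the network: it takes strings in and returns a dict, so you can feed it the lines of whatever transcript files you find on your own disk.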
`ccwrapped`: your Claude Code "Wrapped" card
Same `~/.claude` data, different job. Where `tokenops` is the spreadsheet, `ccwrapped` is the poster: messages, estimated value, top model, top project, and how much caching saved you, rendered as a self-contained SVG you can screenshot and share.

    npm i -g @greymoth/ccwrapped
    ccwrapped --wrapped          # writes an SVG: open in a browser, screenshot, share
    ccwrapped --wrapped --anon   # same, with project names hidden for a clean public share
→ **github.com/greymoth-jp/ccwrapped** · npm: `@greymoth/ccwrapped`
My own card said ~194,379 messages and a prompt-caching figure in the six figures of estimated equivalent value. Again, that's my year, not a benchmark. The fun part is that yours is a surprise even to you.
`inkdex`: ChatGPT + Claude usage, in the browser
Not everyone lives in a terminal. `inkdex` is a Manifest V3 browser extension that tracks your ChatGPT and Claude web usage locally and prints a risograph Wrapped card. No account, and nothing leaves your browser: it's all in extension storage.
→ github.com/greymoth-jp/inkdex
`ghwrapped`: any public GitHub profile → a shareable card
The odd one out: it doesn't read your private data because it doesn't need to. Feed it any public GitHub username and it renders a risograph Wrapped card from public data only. Good for a year-in-review, a profile README, or sizing up a repo you're about to depend on.
→ github.com/greymoth-jp/ghwrapped
I keep saying these send nothing. You shouldn't take that on faith from a stranger on the internet; that's the entire point of shipping the source. So:
Open `bin/` and run `grep -rEi 'fetch|http|net\.|request|axios' .` in the cloned repo. If a usage analyzer is opening a socket, you'll see it.
That's the difference between "trust me" and "verify me", and it's the only kind of privacy claim worth making.
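If you'd rather have file and line numbers in a form you can post in an issue, the same audit can be sketched in a few lines of Python. It reuses the regex from the grep command. The `find_network_hints` helper and the choice to skip `.git` and `node_modules` are my assumptions here, not something the repos ship, and plain `grep -r` does not skip those directories.

```python
import re
from pathlib import Path

# Same pattern as the grep command, case-insensitive like `-i`.
PATTERN = re.compile(r"fetch|http|net\.|request|axios", re.IGNORECASE)

# Assumption: skip VCS metadata and installed dependencies to cut noise.
SKIP_DIRS = {".git", "node_modules"}


def find_network_hints(root) -> list:
    """Return (relative_path, line_number, line) for every matching line."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        if not path.is_file() or SKIP_DIRS & set(rel.parts):
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # binary or unreadable file
        for lineno, line in enumerate(text.splitlines(), start=1):
            if PATTERN.search(line):
                hits.append((str(rel), lineno, line.strip()))
    return hits


# Usage, from inside a cloned repo:
#   for file, lineno, line in find_network_hints("."):
#       print(f"{file}:{lineno}: {line}")
```

A match is a hint, not a verdict: a URL in a comment will show up too, so read each hit in context.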
Here's the actual ask, and it's the fun one:

    npm i -g @greymoth/ccwrapped
    ccwrapped --wrapped --anon

(The `--anon` flag hides your project names, so it's safe to post.)
I genuinely want to see the spread. My caching savings looked absurd; maybe yours dwarf mine, maybe you barely cache at all and that itself is the finding. Either way you'll learn something about a tool you use every day and have never actually looked at.
And if you find a bug, or a place where one of these does touch the network when it shouldn't, open an issue. Catching that is the best possible outcome, because it proves the "grep it yourself" model works.
Repos, one more time, all under one roof at **github.com/greymoth-jp**: `ccwrapped`, `tokenops`, `inkdex`, `ghwrapped`.
Go measure the thing you can't see. Then come back and show me the card.