Claude Code and Codex are logging your token usage locally. Here is how to read it.

[ARTICLE · art-31979] src=dev.to ↗ pub=2026-06-18T02:40Z topic=developer-tools verified=true sentiment=· neutral

Claude Code and Codex store detailed token usage logs locally, including prompt cache hit rates, which developers can analyze without API calls. A developer created ModelMeter, a tool that reads these logs and displays cache efficiency metrics in a dashboard.

3 min read · 30 views · published Jun 18, 2026

Your AI coding agent's token data is already on your machine. You just haven't looked at it yet.

Claude Code and Codex both write local logs after every session. Those logs include detailed token breakdowns: uncached input, cache hits, cache writes, output. No API call needed, no provider dashboard, no guessing. The number that matters most, your prompt cache hit rate, has been sitting on your disk every time you wondered why you were burning through your weekly limit so fast.

Claude Code writes a JSONL transcript for every session under ~/.claude/projects/. Each assistant message carries a usage block:

{
  "type": "assistant",
  "uuid": "f0c8...",
  "message": {
    "model": "claude-opus-4-...",
    "usage": {
      "input_tokens": 137,
      "cache_read_input_tokens": 815193,
      "cache_creation_input_tokens": 5521,
      "output_tokens": 4260
    }
  }
}

That split is the whole game. input_tokens is uncached input. cache_read_input_tokens is context served from the prompt cache. cache_creation_input_tokens is context written to cache. output_tokens is the response. On a long agent session, cache_read_input_tokens should dwarf input_tokens. If it does not, you are re-paying for the same context on every single turn.

Codex writes rollouts under ~/.codex/sessions/. It emits token_count events with a cumulative running total per session:

{ "type": "token_count", "info": { "total_token_usage": {
  "input_tokens": 0, "cached_input_tokens": 0,
  "output_tokens": 0, "reasoning_output_tokens": 0
}}}

Because Codex counts are cumulative, you take the delta between events rather than summing them.
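
The delta step can be sketched as below. This is a minimal illustration rather than the author's collector: usageDeltas is a hypothetical helper, the field names are taken from the sample event above, the event values are made up, and treating a shrinking counter as a reset is an assumption.

```javascript
// Sketch: turn Codex's cumulative token_count events into per-event deltas.
// `usageDeltas` is a hypothetical helper; field names follow the sample event.
const FIELDS = [
  'input_tokens',
  'cached_input_tokens',
  'output_tokens',
  'reasoning_output_tokens',
]

function usageDeltas(events) {
  const deltas = []
  let prev = null // last cumulative total seen in this session
  for (const e of events) {
    const total = e?.info?.total_token_usage
    if (e?.type !== 'token_count' || !total) continue
    const d = {}
    for (const f of FIELDS) {
      const cur = total[f] ?? 0
      const before = prev ? (prev[f] ?? 0) : 0
      // Assumption: a counter that goes down means a reset, so count from zero.
      d[f] = cur >= before ? cur - before : cur
    }
    deltas.push(d)
    prev = total
  }
  return deltas
}

// Made-up running totals for one session, with an unrelated event in between.
const events = [
  { type: 'token_count', info: { total_token_usage: {
    input_tokens: 100, cached_input_tokens: 0,
    output_tokens: 20, reasoning_output_tokens: 5 } } },
  { type: 'other' },
  { type: 'token_count', info: { total_token_usage: {
    input_tokens: 250, cached_input_tokens: 120,
    output_tokens: 50, reasoning_output_tokens: 5 } } },
]

const deltas = usageDeltas(events)
console.log(deltas)
```

Summing the deltas for a session gives back its final cumulative total, which is a quick way to check that nothing was counted twice.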

A few lines of Node walk the JSONL, sum usage per model per day, and dedupe by message uuid for Claude and by session delta for Codex, so you never double-count:

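import { readFileSync } from 'node:fs'

// Path to one session transcript, passed on the command line
const file = process.argv[2]
const seen = new Set() // message uuids already counted
const totals = {}      // model name -> summed token counts

for (const line of readFileSync(file, 'utf8').split('\n')) {
  if (!line.trim()) continue
  const o = JSON.parse(line)
  const u = o.message?.usage
  if (!u || seen.has(o.uuid)) continue
  seen.add(o.uuid)
  const t = (totals[o.message.model] ??= { input: 0, cacheRead: 0, cacheWrite: 0, output: 0 })
  t.input += u.input_tokens ?? 0
  t.cacheRead += u.cache_read_input_tokens ?? 0
  t.cacheWrite += u.cache_creation_input_tokens ?? 0
  t.output += u.output_tokens ?? 0
}
console.log(totals)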
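
The loop above groups by model only. A per-day split might look like the sketch below. It assumes each transcript entry carries an ISO 8601 timestamp field, which the sample entry earlier does not show, so check your own logs before relying on it. sumByModelAndDay and the sample lines are hypothetical.

```javascript
// Sketch: sum Claude Code usage per model per day from JSONL lines.
// Assumption: entries have an ISO 8601 `timestamp` string (not shown in the
// article's sample). Entries without one are grouped under "unknown".
function sumByModelAndDay(lines) {
  const seen = new Set()   // message uuids already counted
  const totals = new Map() // key: "YYYY-MM-DD model"
  for (const line of lines) {
    if (!line.trim()) continue
    let o
    try {
      o = JSON.parse(line)
    } catch {
      continue // skip a partially written line
    }
    const u = o.message?.usage
    if (!u || seen.has(o.uuid)) continue
    seen.add(o.uuid)
    // Slicing an ISO string ending in "Z" yields the UTC calendar day.
    const day = typeof o.timestamp === 'string' ? o.timestamp.slice(0, 10) : 'unknown'
    const key = `${day} ${o.message.model}`
    const t = totals.get(key) ?? { input: 0, cacheRead: 0, cacheWrite: 0, output: 0 }
    t.input += u.input_tokens ?? 0
    t.cacheRead += u.cache_read_input_tokens ?? 0
    t.cacheWrite += u.cache_creation_input_tokens ?? 0
    t.output += u.output_tokens ?? 0
    totals.set(key, t)
  }
  return totals
}

// Made-up lines: one entry, a duplicate of it, a second entry, a broken line.
const first = JSON.stringify({
  uuid: 'a', timestamp: '2026-06-18T02:40:00Z',
  message: { model: 'model-x', usage: {
    input_tokens: 10, cache_read_input_tokens: 900,
    cache_creation_input_tokens: 50, output_tokens: 40 } },
})
const second = JSON.stringify({
  uuid: 'b', timestamp: '2026-06-18T09:00:00Z',
  message: { model: 'model-x', usage: {
    input_tokens: 5, cache_read_input_tokens: 100, output_tokens: 10 } },
})
const sample = [first, first, second, '{"truncated":']

const byDay = sumByModelAndDay(sample)
console.log(Object.fromEntries(byDay))
```

The duplicate entry is counted once and the broken line is skipped, so the two distinct messages collapse into a single bucket for that day and model.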

Notice what you do not need: the prompt text, the response text, or any API key. Model names and token counts are enough to compute everything useful. A usage tool should never have to read what you typed, and this one does not.

Once you aggregate, one metric matters more than the rest: your prompt cache hit rate.

hit_rate = cache_read / (cache_read + cache_creation + uncached_input)

On a flat plan, this is your real efficiency lever. A high hit rate means you are reusing context instead of resending it. A low one means you are burning tokens, and your usage limit, on the same context over and over. The fix is usually structural: stabilize the front of your prompt so the cache prefix stays intact, keep tool definitions lean, and stop reshuffling system context between turns.
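
As a worked instance of the formula, here is a small sketch that applies it to the counts from the sample usage block earlier in the article. cacheHitRate is a hypothetical helper name, and returning 0 when there is no input at all is an assumption.

```javascript
// Sketch: prompt cache hit rate as defined in the article.
// `cacheHitRate` is a hypothetical helper, not part of any tool's API.
function cacheHitRate({ cacheRead, cacheWrite, input }) {
  const denom = cacheRead + cacheWrite + input
  // Assumption: report 0 rather than NaN when no input tokens were recorded.
  return denom === 0 ? 0 : cacheRead / denom
}

// Counts from the sample Claude Code usage block:
// cache_read_input_tokens, cache_creation_input_tokens, input_tokens.
const rate = cacheHitRate({ cacheRead: 815193, cacheWrite: 5521, input: 137 })
console.log((rate * 100).toFixed(1) + '%') // prints "99.3%"
```

A message like that one is almost entirely served from cache. The number to watch is the same ratio computed over a whole day or week of totals.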

One honest caveat: on a subscription you do not pay per token, so any dollar figure is an API list-price equivalent, not your actual cost. It is a useful sense of scale, nothing more. The signals that genuinely matter are token volume and cache hit rate. Any tool that flashes a "you spent $X this month" number at a flat-plan user is being a little loose with what that number means.

I wrapped all of this into ModelMeter. A one-line collector reads those local logs and sends the token counts, and only the token counts, to a dashboard that shows your cache hit rate, ranks where your tokens are going, and labels every figure by how it was derived: computed from real tokens, a gated estimate, or "coming" when it needs request-level data the logs do not contain.

npx modelmeter-collect init <your-token>
npx modelmeter-collect

Add a Claude Code Stop hook or a 60-second cron job and it stays live, updating after each prompt. It works for Claude Code, Codex, or both. It also accepts usage from a metered API key via a copy-paste snippet, or from a CSV export if you would rather not run the collector at all.

Free to try at modelmeter.dev.

Whether or not you use ModelMeter: you are not flying blind. Your subscription coding tool has been writing detailed usage data to your local disk after every session. Go read it. You will almost certainly find that your biggest efficiency lever is a single number you have never once looked at.

Read original on dev.to → dev.to/newtorob/claude-code-and-codex-are-loggin…
