{"slug": "show-hn-claude-code-s-200-plan-is-a-17x-subsidy-on-the-raw-api", "title": "Show HN: Claude Code's $200 plan is a 17× subsidy on the raw API", "summary": "A developer reverse-engineered their own Claude Code logs and found that 64% of the $3,371 monthly bill at Opus 4.7 list rates goes to re-reading the same context every turn, not generating new content. Hidden reasoning tokens account for 84% of output costs and about 60% of all re-read data. Caching already serves 98% of input, yet re-reading still dominates the bill. At list rates the month costs roughly 17× the price of Claude Code's $200 plan ($3,371 / $200); without caching, the same workload would list at approximately $22,630.", "body_md": "Reverse-engineer a month of your own local Claude Code logs (`~/.claude/projects/*/*.jsonl`) into where the **tokens, time, and cost** actually go, and run it on yours. Reads **only local logs**; nothing is sent anywhere.\n\nWhat it found (one month of my own logs: 181 sessions, 25,564 model calls):\n\n- You don't pay to generate, you pay to re-read. ~29M unique tokens → 4.35B billed (~150×), because every turn re-sends the whole ~173K-token context.\n- The bill is 84% input / 16% output, and re-reading the same context is 64% of it.\n- The biggest line is the one you never see: hidden reasoning is 84% of output and ~60% of everything re-read.\n- ~$3,371 for the month at Opus 4.7 list rates. Caching already serves 98% of input, and re-reading is still 64% of the bill.\n\nFull write-up (all the tables, the why, the main-thread-vs-subagent split) →\n\n[coralbricks.ai/blog/claude-code-token-xray]\n\n```\npip install -r requirements.txt   # just tiktoken\npython3 token_time_breakdown.py\npython3 cost.py\npython3 main_vs_sidecar.py\npython3 reread_breakdown.py\n```\n\ntiktoken is OpenAI's tokenizer, not Claude's, so token proportions are reliable to ~±15%, not Claude-exact. 
The billed-token counts in `cost.py` come straight from the API `usage` blocks and are exact.\n\nFrom `cost.py` on my logs, priced at Opus 4.7 list rates:\n\n| Line item | Cost | Share |\n|---|---|---|\n| Input: re-reading context (cache reads) | $2,176 | 64% |\n| Input: cache writes | $682 | 20% |\n| Input: fresh (uncached) | $2 | 0% |\n| Output: reasoning | $429 | 13% |\n| Output: tool calls + summaries | $82 | 2% |\n| Total | $3,371 | 100% |\n\nCaching is the only thing keeping it sane: without it the same work lists at **~$22,630** (~7×). Your numbers will differ; that's the point. Run it on yours.\n\n- `token_time_breakdown.py`: the headline table: tokens (marked input/output) **and** wall-clock time per activity (reasoning, running commands, writing tool calls, subagents, summaries, reading/searching, editing) plus the passive-context rows (system prompt + tools, attachments, the typed prompt, injected reminders). One pass, so tokens and time stay consistent. Reasoning isn't stored in plaintext (only an encrypted signature), so it's recovered by subtraction: `output − tool_calls − summaries`. Time is reconstructed from event timestamps.\n- `cost.py`: billed token totals (cache reads / cache writes by TTL / fresh input / output) priced at Opus 4.7 list rates, plus the no-caching counterfactual.\n- `main_vs_sidecar.py`: splits the human-driven main thread from spawned subagents (logged under nested `*/subagents/*.jsonl`); reports billed tokens, per-model mix, cache-hit rate, turns per agent (per session for the main thread, per subagent for the sidecar), and cost for each, plus the combined total.\n- `reread_breakdown.py`: per-activity *cumulative* input: replays each session's context growth to show what each kind of context costs once it's re-read every turn. Reports `unique` vs `re-read` tokens per activity (reasoning is the biggest re-read line). 
The replay is scaled to the measured billed input (exact); the per-activity split is a model.\n\n- One person's month on one machine — directional, not a benchmark. Claude Code is dynamic, so your split will differ. That's the point: run it on yours.\n- A generation-time gap also includes the model reading its context before it writes; Bash time is real execution (commands auto-approved), but code run in the background or a separate terminal isn't counted.\n- The system-prompt row is estimated from each session's first cache write.\n\nIf this helped you see where your Claude Code tokens, time, and cost actually go,\nplease ⭐ [the repo](https://github.com/Coral-Bricks-AI/coral-ai) — it helps others\nfind it. Curious what your re-read share comes out to.\n\nApache 2.0 — see the repository [LICENSE](/Coral-Bricks-AI/coral-ai/blob/main/LICENSE).", "url": "https://wpnews.pro/news/show-hn-claude-code-s-200-plan-is-a-17x-subsidy-on-the-raw-api", "canonical_source": "https://github.com/Coral-Bricks-AI/coral-ai/tree/main/claude-code-token-xray", "published_at": "2026-05-27 17:25:53+00:00", "updated_at": "2026-05-27 17:46:35.581421+00:00", "lang": "en", "topics": ["large-language-models", "ai-tools", "ai-products", "ai-infrastructure", "artificial-intelligence"], "entities": ["Claude Code", "Opus 4.7", "Anthropic", "OpenAI", "tiktoken", "coralbricks.ai"], "alternates": {"html": "https://wpnews.pro/news/show-hn-claude-code-s-200-plan-is-a-17x-subsidy-on-the-raw-api", "markdown": "https://wpnews.pro/news/show-hn-claude-code-s-200-plan-is-a-17x-subsidy-on-the-raw-api.md", "text": "https://wpnews.pro/news/show-hn-claude-code-s-200-plan-is-a-17x-subsidy-on-the-raw-api.txt", "jsonld": "https://wpnews.pro/news/show-hn-claude-code-s-200-plan-is-a-17x-subsidy-on-the-raw-api.jsonld"}}