{"slug": "i-ran-a-single-claude-code-session-for-1270-turns-it-cost-1278-here-s-the", "title": "I ran a single Claude Code session for 1,270 turns. It cost $1,278. Here's the breakdown.", "summary": "A single Claude Code session logged approximately 1,270 model turns and cost $1,278, with two-thirds of that bill — roughly $843 — going to re-sending context the model had already seen on every turn. The developer found that only 14% of the cost ($179) went to the model actually writing code, while the rest was consumed by re-transmitting the entire accumulated conversation history, even at a discounted cache-read rate of 0.1× the normal input price. The session's peak context reached nearly 998,000 tokens, and the cost structure reveals that in long sessions, the expense of re-sending context dominates because it is paid on every turn against a growing context size.", "body_md": "n=1 note. This is the anatomy of one real session, not an average or benchmark. The specific numbers are this session's actual measured figures. The mechanic — re-sent context dominating long sessions — is general. Your numbers will differ with your workflow. The full dataset (percentile tables, methodology, charts) for n=66 sessions lives at the [benchmark page]; this article is about the one session that broke my mental model.\n\nThere's a session in my logs that cost $1,278.\n\nNot $12.78. Not a typo. $1,278, across approximately 1,270 model turns, in a single Claude Code coding session. When I measured it properly, two things became obvious:\n\nHere's the full honest breakdown.\n\n| Metric | Value |\n|---|---|\n| Total session cost | ~$1,278 |\n| Model turns | ~1,270 |\n| Cost per turn (average) | ~$1.01 |\n| Peak context (tokens) | ~998,000 |\n| Cache efficiency | ~98% |\n\nThe cost-per-turn number already tells you something is off. $1.01 per back-and-forth exchange sounds like the model is doing tremendous amounts of work on every turn. It wasn't. 
The session was long debugging and build work — not dramatically different from any other session in kind, only in *length*.\n\n| Line item | Share of cost |\n|---|---|\n| Re-sent context (cache-read) | 66% (~$843) |\n| New context written to cache (cache-write) | 20% (~$256) |\n| Output — the model actually writing | 14% (~$179) |\n| Fresh (uncached) input | ~0% |\n\n**The model writing code was 14% of the bill.** Two-thirds of the session's cost was paying to re-send context the model had *already seen*, on every single turn, over and over.\n\nThat was my mental model failure. I'd thought of each turn as: *ask → model thinks → model writes → charge*. The actual cost structure is more like: *ask → re-send the entire conversation → model writes → charge for all of it*. And the re-sending part, even at the discounted cache-read rate, dominates in a long session.\n\nClaude is stateless. It has no memory between turns. So on every single turn, the client sends the entire accumulated conversation — every prior message, every file you've read, every tool output — to give the model its context. This is the architectural reality, not a Claude quirk.\n\nPrompt caching softens the blow: if the server has seen that exact prefix before, the re-send is billed at the cache-read rate, which is roughly **0.1× the normal input price**. That's the \"98% cache efficiency\" number — almost all the re-sent tokens hit the cache and got the cheap rate.\n\nHere's the problem: **cheap per token, but paid on every turn, on the entire context**.\n\nThe cost of re-sending context on a single turn is roughly:\n\n```\ncache_read_cost ≈ context_size × input_price × 0.1\n```\n\nAnd the *session* cost of re-sending is:\n\n```\ntotal_cache_read ≈ context_size × turns × input_price × 0.1\n```\n\nIn this session, the ceiling on that product is ~998,000 peak context tokens × ~1,270 turns × input price × 0.1; the real total is lower, because early turns carried far less than the peak context. Even at 10% of the input rate, that's a huge number. 
A 98% cache hit rate means you're efficiently paying a small amount — on an enormous volume, repeatedly. The efficiency sounds impressive until you realize the denominator is \"the whole accumulated history of a 20-hour session.\"\n\nMeanwhile, the model's *output* — the code it actually writes, the explanations it gives — was ~14% of the bill. Output is priced *higher* per token than input. But the model writes far fewer tokens per turn than the context it re-reads. The output is the productive work; it's just not the biggest cost center.\n\nContext doesn't grow linearly. Early in the session: small context, cheap re-sends. As the session continues:\n\nSo you're multiplying a growing per-turn cost by a growing turn count. That's why cost in a long session doesn't grow proportionally — it accelerates. By the time context peaks near 998k tokens, every single turn is a substantial re-send. There were turns in this session where the re-send cost alone was more than the cost of the model's entire response.\n\n**A 98% cache hit rate is not the same as \"caching solved this.\"** High cache efficiency means you're efficiently re-sending context at the cheap rate. It says nothing about *how much* you're re-sending. You can have near-perfect efficiency and still be burning money, because you're re-sending an enormous context a thousand times. Cache efficiency is a per-token metric; the bill is a product of that rate × total tokens re-sent × turns.\n\n**Output optimization is the wrong lever.** If you want to reduce a long session's cost, targeting output tokens gets you to 14% of the bill at most. Everything below that line — model-side improvements, prompt compression tricks, fewer words in each response — doesn't move the cost much. 
The 66% (re-sent context) and 20% (cache-write) are where the money is.\n\n**The fix is mundane.** The answer isn't a clever optimization — it's just keeping the context small: run `/compact` earlier in long sessions.\n\nThis is **n=1 — one real session**. The $1,278 and the 66/20/14/0 split are the actual measured numbers for this session. Your sessions will differ depending on how long you run, how large your context gets, and what you're working on.\n\nWhat generalizes is the mechanic: context size × turns drives re-sent context cost, and in any session long enough that context has grown large and turns have compounded, re-sent context will be the dominant line. The specific percentages will shift; the structure won't.\n\nThe session I measured was an extreme case — nearly a million tokens peak context, over a thousand turns. Most sessions aren't this long. In my set of 66 sessions, the median peaked at ~45k tokens and had ~29 turns; in the median session, re-sent context was a *minority* of spend (24%). The lesson of the n=1 study and the n=66 benchmark together: **typical sessions are fine; it's the long ones you need to watch, and in those the re-sent context is the bill.**\n\nI measured this session with a small open-source CLI that reads Claude Code's local logs:\n\n```\nnpx @wartzar-bee/tokenscope\n```\n\nIt runs locally — **read-only, nothing uploaded, no telemetry** — and shows the same breakdown: output vs. cache-read vs. cache-write, the per-turn context-growth curve, and which percentile your sessions land in against the 66-session reference set. `--share` emits a privacy-safe summary card (aggregate numbers only, no content or file paths) you can paste into a thread.\n\nThe full study — with the hand-coded SVG cost-split chart, context-growth curve, complete data table, and methodology — is at: [https://tokenscope.pages.dev/study/](https://tokenscope.pages.dev/study/)\n\n(Disclosure: I maintain tokenscope. 
I'm linking it because it's the tool that measured the session, not because you need it — Claude Code's JSONL logs are in `~/.claude/projects/` and the math is straightforward once you know to look at `cache_read_input_tokens`.)\n\nThe $1,278 session wasn't a bug or an accident. It was a long session working on a complex project with a large, growing context. Every mechanism that drove its cost is documented, predictable, and — once you know it — partly avoidable. The model writing code was 14% of it. The rest was infrastructure: paying to re-send, again and again, the memory the model doesn't have.", "url": "https://wpnews.pro/news/i-ran-a-single-claude-code-session-for-1270-turns-it-cost-1278-here-s-the", "canonical_source": "https://dev.to/wartzarbee/i-ran-a-single-claude-code-session-for-1270-turns-it-cost-1278-heres-the-breakdown-554c", "published_at": "2026-05-31 10:44:57+00:00", "updated_at": "2026-05-31 11:12:39.223252+00:00", "lang": "en", "topics": ["large-language-models", "ai-tools", "ai-infrastructure", "ai-research", "ai-products"], "entities": ["Claude Code", "Anthropic"], "alternates": {"html": "https://wpnews.pro/news/i-ran-a-single-claude-code-session-for-1270-turns-it-cost-1278-here-s-the", "markdown": "https://wpnews.pro/news/i-ran-a-single-claude-code-session-for-1270-turns-it-cost-1278-here-s-the.md", "text": "https://wpnews.pro/news/i-ran-a-single-claude-code-session-for-1270-turns-it-cost-1278-here-s-the.txt", "jsonld": "https://wpnews.pro/news/i-ran-a-single-claude-code-session-for-1270-turns-it-cost-1278-here-s-the.jsonld"}}