
Token Efficiency in Claude Code

SectorFlow Engineering reduced Claude Code context overhead by 60% by replacing a monolithic 400-line CLAUDE.md file with task-specific skill files. The team also cut token waste by having engineers paste acceptance criteria instead of using a Linear MCP, and by using grep and node --check instead of reading entire files or running visual checks for non-visual changes.

4 min read · Published Jun 16, 2026

SectorFlow Engineering Series · Part 1 of 3 · Parent article

Notes on where our context budget was actually going, and what we did about it.

June 2026 · SectorFlow Engineering

In this series:
Part 2: The Skills File Pattern (fixing CLAUDE.md bloat with imports).
Part 3: Picking Models and Tools (the MCPs we tried, refused, and why).

Claude Code can do a lot. The catch is that all of it runs on context, and you don't get much of that.

When we started SectorFlow we did the obvious thing. Kept a CLAUDE.md at the repo root, and every time something went wrong (a wrong model string, a cache TTL that didn't match, a chart that came out looking off) we'd write a rule and stick it on the end. The file kept growing. We didn't really clock it as a problem until it was one.

By about week six it was 400 lines. Every session loaded the whole thing. Frontend rules sitting next to deployment runbooks sitting next to database decisions, none of it sorted. And because we'd added the rules one at a time over weeks, some of them flatly disagreed with each other. Claude would follow the new one, or the old one, or try to split the difference. We got something wrong either way.

I want to be clear this isn't a Claude Code problem. It's on us, and it's fixable. But fixing it meant we had to stop treating CLAUDE.md like a junk drawer.

The thing that actually hurts isn't the per-token price. It's that every token spent on context is a token you don't get back for the work. Burn 30,000 on setup and you've got far less room to write code than if you'd burned 5,000. You hit the ceiling partway through a file and whatever you were in the middle of is just gone.

Here's the shift, and it's simple once you see it: anything Claude reads at the start of a session is something it can't use later for code. Most projects pile everything into CLAUDE.md on the theory that the model might need it someday. We flipped the question. What does the model need for this task? Load that. Skip the rest.

Two rules came out of it:

Those turned into three actual practices, and each one gets its own article in this series. This one is just the overview: what we measured and why it matters.

Every session loads CLAUDE.md plus whatever it imports. Before, that was the one 400-line file, every time, regardless of the task. After we split it into separate skill files, a UI task pulls in core.md (the constraints) and design.md (the visual stuff) and nothing else. An infra task gets core.md and infrastructure.md. Startup cost dropped about 60%.
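A minimal sketch of that selection logic, assuming a simple lookup from task kind to skill files. The file names core.md, design.md, and infrastructure.md come from the text above; the `SKILLS_BY_TASK` table, the task labels, and the `skill_files_for` helper are hypothetical illustrations, not Claude Code features.

```python
# Hypothetical mapping from task kind to the skill files worth loading.
# core.md, design.md and infrastructure.md are the files named in the text;
# the task labels ("ui", "infra") and this helper are illustrative assumptions.
SKILLS_BY_TASK = {
    "ui": ["core.md", "design.md"],
    "infra": ["core.md", "infrastructure.md"],
}


def skill_files_for(task_kind: str) -> list[str]:
    """Return the skill files for a task, falling back to core.md alone."""
    return list(SKILLS_BY_TASK.get(task_kind, ["core.md"]))
```

With this table, `skill_files_for("ui")` returns `["core.md", "design.md"]`, and an unrecognised task kind gets only the core constraints rather than everything.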

We had the Linear MCP hooked up so Claude could read tickets itself. Nice in theory. But one list_issues call runs about 3,500 tokens, and the whole read-it / mark-done / comment loop is around 9,000. So now the engineer just pastes the acceptance criteria. That's maybe 400 tokens. The 8,600 difference doesn't sound like much until you multiply it across 60-plus tickets. That's something like 7 or 8 full context windows handed back to the actual work.
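The arithmetic behind that, as a short sketch. The per-ticket figures and the ticket count are the ones quoted above; the implied window size at the end is an inference from "7 or 8 full context windows", not a number the text states.

```python
# Figures quoted in the text above.
MCP_TOKENS_PER_TICKET = 9_000   # read / mark-done / comment loop via the Linear MCP
PASTE_TOKENS_PER_TICKET = 400   # pasted acceptance criteria
TICKETS = 60                    # "60-plus" in the text, so treat this as a lower bound


def tokens_saved(tickets: int) -> int:
    """Total tokens saved by pasting criteria instead of running the MCP loop."""
    return (MCP_TOKENS_PER_TICKET - PASTE_TOKENS_PER_TICKET) * tickets


total = tokens_saved(TICKETS)
print(total)  # 516000

# Assumption: if the saving equals 7 to 8 full windows, the working window
# is roughly total / 8 to total / 7 tokens. This size is inferred, not stated.
implied_window_low = total // 8
implied_window_high = total // 7
print(implied_window_low, implied_window_high)  # 64500 73714
```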

Left alone, Claude reads files to get its bearings, sometimes three or four of them before it writes a line. So we made a rule: only read files the task names. Need to find a function? grep for it, then view just those lines. Don't open a file to soak up "context." If something's actually missing, ask. Saves 2,000–4,000 tokens on a complex task.

Verifying a change by eye means starting the server, waiting, navigating, screenshotting, evaluating: a whole chain of calls. For anything you can't see in a browser, like server logic or data contracts or route handlers, that chain tells you nothing. So we only do the visual check when the change is something a person could actually see in a browser. For syntax we run node --check. One Bash call.
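Both rules can be sketched in a few lines, under stated assumptions. `grep_lines` stands in for "grep for it, then view just those lines": it returns only the matching lines plus a little surrounding context, so only that slice needs to reach the model. `needs_visual_check` and `VISUAL_EXTENSIONS` are hypothetical heuristics for "something a person could see in a browser"; the extension list is a guess, not a rule from the text. `syntax_check_command` only builds the `node --check` invocation and does not run it.

```python
import re
from pathlib import Path

# Assumption: file extensions treated as "visible in a browser" for this sketch.
VISUAL_EXTENSIONS = {".css", ".scss", ".html", ".jsx", ".tsx", ".vue", ".svg"}


def grep_lines(path: str, pattern: str, context: int = 2) -> list[tuple[int, str]]:
    """Return (line_number, text) for each match plus `context` lines either side."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    regex = re.compile(pattern)
    keep: set[int] = set()
    for i, line in enumerate(lines):
        if regex.search(line):
            start = max(0, i - context)
            stop = min(len(lines), i + context + 1)
            keep.update(range(start, stop))
    # Line numbers are 1-based, matching what grep -n would report.
    return [(i + 1, lines[i]) for i in sorted(keep)]


def needs_visual_check(changed_paths: list[str]) -> bool:
    """True if any changed file is of a kind a person could see in a browser."""
    return any(Path(p).suffix.lower() in VISUAL_EXTENSIONS for p in changed_paths)


def syntax_check_command(path: str) -> list[str]:
    """Build the `node --check` command for one file, without running it."""
    return ["node", "--check", path]
```

For a change that touches only a route handler, `needs_visual_check` returns `False` and the single command from `syntax_check_command` replaces the server-and-screenshot chain.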

| Source of overhead | Before | After | Saving |
| --- | --- | --- | --- |
| Session context load | ~400 lines, every session | 60–120 lines, task-specific | ~60% |
| Ticket ingestion (per ticket) | ~9,000 tokens via MCP | ~400 tokens via paste | ~8,600 tokens |
| File reads per task | 3–5 files speculatively | Named files only | 2,000–4,000 tokens |
| Verification overhead | Dev server + screenshot | node --check only | 4–6 tool calls |

Each of these on its own is fine, nothing dramatic. Put together they change what fits in a session. Stuff that used to take two or three sessions now usually fits in one. That's the whole point.

The other two articles each take one piece of this:

Part 2: The Skills File Pattern
Part 3: Picking Models and Tools

Read this one first for the why. Then either of the others for the how.

Claude Code does its best work when the context is small, accurate, and honest about what's actually known versus what you're hoping for. Vague in, vague out. And a context file that tries to cover everything ends up covering nothing properly.
