Save 60-90% of Your Claude Code Tokens With Two Tools

wpnews.pro

TL;DR: Two tools cut Claude Code token usage at two different layers. RTK is a shell proxy that compresses command output before it ever reaches the context window. context-mode is a Claude Code plugin that does heavy tool work in a sandbox and hands back only the answer. They stack cleanly on top of each other, and a single skill installs both. This article explains how each one works and how to wire them in.

Two commands into a session, my context window was already a third full, and I hadn't written a line of code yet. A `pnpm install` had dumped its entire dependency tree, a `git log` paid out two hundred commits, then a stack trace landed in full. None of that was work I'd asked for - it just sat there in the context window eating tokens on every turn.

Most of the token budget goes on that boring output - the installs, the logs, the traces - which piles up and gets re-read on every single turn, never on the clever reasoning you actually wanted. Two tools attack that pile from two directions. Here's how they work, and how to install both in one command.

This is the last article in the series, and it builds on the skill pattern from the third. You can pass this article URL straight to Claude Code and follow along.

Picture the context window as a desk. Everything Claude needs stays on the desk so it can glance at it: your prompts, its replies and the output of every command it ran. The desk has a size limit, and once something is on it, it gets re-read on every turn until it falls off the edge.
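To put a rough number on the desk analogy, here is a minimal sketch of how one large output keeps costing tokens while it stays in the window. The turn sizes are made-up figures, and the model is deliberately naive: it assumes every turn re-reads the whole window and ignores prompt caching and compaction.

```python
def session_input_tokens(turn_sizes):
    """Sum the tokens read across a session in which every turn
    re-reads everything already sitting in the context window."""
    context = 0  # tokens currently in the window
    total = 0    # tokens read across all turns so far
    for size in turn_sizes:
        context += size   # this turn's new content lands on the desk
        total += context  # and the whole desk is read again
    return total


# Hypothetical four-turn session; the second turn carries a big dump.
noisy = session_input_tokens([500, 20_000, 500, 500])
trimmed = session_input_tokens([500, 2_000, 500, 500])
print(noisy, trimmed)
```

With these invented numbers the noisy session reads 63,500 tokens and the trimmed one 9,500. The dump is paid for again on every turn after it lands, not just once.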

Two kinds of clutter land there:

The two tools map onto those two problems. RTK trims the output before it ever reaches the desk. context-mode keeps the heaviest work off the desk altogether.

RTK is a shell-level proxy. It sits between Claude and the commands it runs, intercepts the output and compresses it before it reaches the context window. The claim is 60 to 90 percent savings on typical dev operations, and it ships a `rtk gain` command so you can check your real number instead of taking the claim on faith.
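To make "compresses it before it reaches the context window" concrete, here is a toy filter in Python. It is not RTK's algorithm, which the article doesn't describe. The function name and the three rules (drop blank lines, collapse consecutive repeats, cap the length) are assumptions chosen for illustration.

```python
def compress_output(text, max_lines=20):
    """Toy output filter: drop blank lines, collapse consecutive
    duplicate lines, and cap the number of lines kept."""
    lines = [line.rstrip() for line in text.splitlines() if line.strip()]

    # Collapse runs of identical lines into [line, count] pairs.
    collapsed = []
    for line in lines:
        if collapsed and collapsed[-1][0] == line:
            collapsed[-1][1] += 1
        else:
            collapsed.append([line, 1])

    out = [f"{line} (x{n})" if n > 1 else line for line, n in collapsed]

    # Keep the head of the output and say how much was left out.
    if len(out) > max_lines:
        omitted = len(out) - max_lines
        out = out[:max_lines] + [f"... {omitted} more lines omitted"]
    return "\n".join(out)


print(compress_output("warning: deprecated\nwarning: deprecated\n\ndone\n"))
```

A real proxy would presumably apply command-specific rules, but the shape is the same: the full output exists on the shell side, and only the short version is handed to the model.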

The mechanism is a hook. After you set up the Claude Code integration, every Bash command Claude runs gets transparently rewritten to route through RTK. `git status` becomes `rtk git status` behind the scenes, with no change to how you or Claude write commands and no token overhead for the rewrite itself.
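The rewrite step can be pictured with a small function. This is a sketch of the idea only, not the hook RTK installs: the `SUPPORTED` set, the function name and the rule of leaving compound commands alone are all assumptions.

```python
import shlex

# Hypothetical set of commands the proxy knows how to compress.
SUPPORTED = {"git", "ls", "grep", "find"}


def rewrite(command, proxy="rtk"):
    """Prefix a simple supported command with the proxy binary.
    Anything else is returned unchanged."""
    # Leave pipelines, redirects and substitutions alone in this sketch.
    if any(ch in command for ch in "|&;<>$`"):
        return command
    parts = shlex.split(command)
    if not parts or parts[0] == proxy or parts[0] not in SUPPORTED:
        return command
    return shlex.join([proxy, *parts])


print(rewrite("git status"))  # rtk git status
print(rewrite("echo hello"))  # echo hello
```

Because the rewrite happens in the hook, outside the conversation, neither the original command nor the proxied one adds anything to the context beyond what Claude typed in the first place.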

Install is two moves: drop in the binary, then wire the integration.

rtk init -g

Then confirm it's the right tool and it's working:

rtk --version
rtk gain

One trap worth naming: there's a second, unrelated project that also ships a binary called `rtk` (a Rust type toolkit). If `rtk gain` comes back "command not found" after a clean install, you almost certainly have the other one. Check with `which rtk` and grab the token-killer from its own repo.
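The same check can be scripted. The sketch below uses only the two things the article names, the binary `rtk` and its `gain` subcommand. It assumes a binary that doesn't recognise `gain` exits with a non-zero status, which is a guess about the other project's behaviour rather than something I've verified.

```python
import shutil
import subprocess


def check_rtk(binary="rtk"):
    """Return 'missing', 'ok' or 'wrong-binary?' for the given name."""
    path = shutil.which(binary)  # same lookup `which rtk` performs
    if path is None:
        return "missing"
    # Assumption: the unrelated tool rejects `gain` with a non-zero exit.
    result = subprocess.run([path, "gain"], capture_output=True, text=True)
    return "ok" if result.returncode == 0 else "wrong-binary?"


print(check_rtk())
```

If this prints `wrong-binary?`, look at where the binary on your PATH came from before reinstalling anything.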

context-mode comes at the problem from the other side. It's a Claude Code plugin, MCP server plus hooks, and instead of trimming output it relocates the heavy work.

When Claude needs to process something large - a big log, or a sprawling JSON payload - context-mode runs that work in a sandbox and returns only the derived answer. The raw bytes never enter the context window. You asked how many errors are in a 10,000-line log, you get back "47" and the breakdown, not the log. The desk stays clear because the bulky part of the job happened in the magic drawer.
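The log example reduces to a few lines of Python. This shows the principle, not context-mode's API: the log format (`ERROR <Kind> ...`) and the function name are invented for the sketch.

```python
import re
from collections import Counter


def summarize_errors(log_text):
    """Scan a log and return only the error count and a breakdown,
    never the raw lines themselves."""
    counts = Counter()
    for line in log_text.splitlines():
        # Hypothetical format: "... ERROR <Kind> free-form message"
        match = re.search(r"\bERROR\b\s+(\w+)", line)
        if match:
            counts[match.group(1)] += 1
    return {"total": sum(counts.values()), "by_kind": dict(counts.most_common())}


sample = "INFO started\nERROR Timeout upstream\nERROR Timeout db\nERROR Auth token\n"
print(summarize_errors(sample))
```

However long the log gets, the value that comes back stays a handful of tokens. That is the whole trade: the reading happens where tokens are free, and only the conclusion reaches the model.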

Because it registers an MCP server and hooks, context-mode only activates after a full restart of Claude Code. Installing it is the usual two plugin commands:

/plugin marketplace add mksglu/context-mode
/plugin install context-mode@context-mode

Restart, then verify it came up:

/context-mode:ctx-doctor

Installing two tools by hand, across platforms, with steps that drift as the projects evolve, is exactly the kind of chore Article 3 argued you should automate. So this is a skill too.

The catch is that install steps go stale. Hardcode them today and the article rots the moment either project changes a flag. So the skill doesn't hardcode anything. It fetches the current README for each tool at run time and follows whatever the install section says right now.

---
name: setup-token-savings
description: Install RTK and context-mode to cut token usage
disable-model-invocation: true
---

## Instructions

### Step 1 - Fetch latest install instructions

Use WebFetch to read the current README for each tool, and follow its
Installation section to check the steps below:

- RTK: https://github.com/rtk-ai/rtk
- context-mode: https://github.com/mksglu/context-mode

### Step 2 - Install RTK

Follow the RTK README for the current platform (detect with `uname -s` /
`uname -m`), then run `rtk init -g`. Verify with `rtk --version` and `rtk gain`.

### Step 3 - Install context-mode

Follow the context-mode README. Typically:
claude plugin marketplace add mksglu/context-mode
claude plugin install context-mode@context-mode

### Final step

Tell the user to fully restart Claude Code, since context-mode's MCP server and
hooks only activate on restart. Then suggest /context-mode:ctx-doctor to verify.

Drop that into `skills/setup-token-savings/SKILL.md`, and a fresh machine gets both tools with one command:

/your-plugin:setup-token-savings

A slightly more verbose version lives in the reference repo, in the base plugin's skills: github.com/Nagell/claude-marketplace.

The point of `rtk gain` is that you don't have to trust the headline number. It reports your actual savings from real usage:

rtk gain            # total savings so far
rtk gain --history  # per-command breakdown

RTK Token Savings (Global Scope)
════════════════════════════════════════════════════════════
Total commands:    1539
Input tokens:      1.3M
Output tokens:     257.6K
Tokens saved:      1.0M (80.0%)
Total exec time:   57m58s (avg 2.3s)
Efficiency meter: ███████████████████░░░░░ 80.0%

By Command
────────────────────────────────────────────────────────────────────────
────────────────────────────────────────────────────────────────────────
 1.  rtk find                     95  284.0K   62.5%   33.2s  ██████████
 2.  rtk lint eslint               4  211.9K   99.5%    8.0s  ███████░░░
 3.  rtk curl -s https://r...      1  185.3K   99.9%   277ms  ███████░░░
 4.  rtk read                    135  154.0K   13.1%     0ms  █████░░░░░
 5.  rtk git diff HEAD -- ...      1   42.3K   85.1%    17ms  █░░░░░░░░░
 6.  rtk grep                    217   32.8K   18.6%    79ms  █░░░░░░░░░
 7.  rtk ls                      203   24.0K   62.1%     2ms  █░░░░░░░░░
 8.  rtk curl -s -X POST h...      2   11.1K   96.5%   653ms  ░░░░░░░░░░
 9.  rtk git log --all --o...      1    9.5K   94.7%    10ms  ░░░░░░░░░░
10.  rtk curl -fsSL https:...      1    7.2K   96.5%   387ms  ░░░░░░░░░░
────────────────────────────────────────────────────────────────────────
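To sanity-check the headline percentage, you can redo the arithmetic from the summary lines. The sketch below assumes "Input tokens" is the raw command output and "Output tokens" is what actually reached the context. That is my reading of the report, not something it states. The helper names are made up.

```python
UNITS = {"K": 1_000, "M": 1_000_000}


def parse_tokens(text):
    """Turn a rounded figure such as '1.3M' or '257.6K' into an integer."""
    text = text.strip()
    if text and text[-1] in UNITS:
        return round(float(text[:-1]) * UNITS[text[-1]])
    return int(text)


def savings_percent(input_tokens, output_tokens):
    """Share of the raw output that never reached the context window."""
    return (input_tokens - output_tokens) / input_tokens * 100


raw = parse_tokens("1.3M")
kept = parse_tokens("257.6K")
print(f"{savings_percent(raw, kept):.1f}%")
```

The displayed figures are already rounded, so this lands near the reported 80.0% rather than exactly on it, which is close enough to confirm the summary is internally consistent.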

`rtk discover` goes one step further and scans your Claude Code history for commands that would benefit from routing through RTK but aren't yet, so you can widen the net over time.

That's the whole setup. Your marketplace holds the plugins, your safety hooks catch the dangerous commands, one command installs everything on a new machine, and two tools keep the token bill down. None of these pieces is big on its own. But wired together in a marketplace you own, the next new laptop costs you a couple of commands instead of a lost afternoon. If you've been following along, the starter template is the place to put it all: github.com/Nagell/claude-marketplace-template. If you missed any of the steps, check out the other articles in the series.

source & further reading

dev.to (original article)