# Do MCP's use more tokens than CLI's?

> Source: <https://nmm.ee/token-usage-mcp-vs-cli>
> Published: 2026-06-21 07:30:51+00:00

## Do MCP’s use more tokens than CLI’s?

There’s a long-held belief out there that MCP’s are bad, and that you should actually use CLI’s instead, if you want to save on token spend. To me that argument has never really made any sense, because you still have to provide the LLM context so it knows what CLI to run, how to run it and how to interpret the result of it. Things that the MCP does for you. How come the CLI takes fewer tokens?

Sure, there’s probably *some* communication overhead, but I can’t see how it would be something significant enough to entirely forgo the benefits of a standardized communication protocol that are MCP’s and to instead favor the wild west of self-made CLI integrations, which are more costly to create and to maintain, because *you* have to make those, whereas MCP’s are ready-made for LLM integration.

### Talk is cheap: experiment time

Instead of talking in hypotheticals, let’s run an actual experiment, shall we? At work I help build the [CodeScene CodeHealth MCP](https://github.com/codescene-oss/codescene-mcp-server), and it has a tool called `code_health_review`

. Simply put, it takes a file and analyses its code health.

What it does is not all that important, but what is important is that this MCP tool is just a thin wrapper over the CodeScene CLI’s `cs review {file}`

call, which makes it ideal to use for our experiment.

I will be using the exact same context for the CLI test as is in the MCP’s tool description, that way the context the LLM has is identical in knowing what input it needs to give and what output it can expect, and the only difference we should see is what we’re actually interested in - the communication overhead difference between the MCP and the CLI.

For this test I’ve created a [little script](https://gist.github.com/askonomm/5c8c0ec519afd8ec43fd1fc5036655e3) that uses Claude Sonnet 4.6.

**The MCP approach** is to run the `code_health_review`

tool with full description + JSON schema `input_schema`

in the `tools`

array, have the LLM emit a `tool_use`

block with `{"file_path": "/path/to/repo/calculator.py"}`

, return the tool result and have the LLM summarize.

**The CLI approach** is to use the equivalent instructions of the MCP in the `system`

prompt (how to run `cs review`

, what it returns, score interpretation), have a generic `bash`

tool in `tools`

array for command execution, have the LLM emit a `tool_use`

block with `{"command": "cs review --output-format=json /path/to/repo/calculator.py"}`

, return the tool result and have the LLM summarize.

### Experiment result: no difference

Based on this little experiment it seems my gut feeling was right - there’s no significant difference between the MCP and the CLI when it comes to direct tool calling.

**The MCP used 5 fewer input tokens** on the initial request (900 vs 905). It seems Anthropic’s internal tool schema representation is approximately the same cost as equivalent free-form text in a system prompt.

**The per-call overhead is negligible.** The assistant’s tool call + the tool result being appended to the context took 2 additional tokens over the CLI (93 for the MCP vs 91 for the CLI).

The MCP tool definition (name + description + JSON schema) seems to tokenize to roughly the same as equivalent system prompt instructions + a bash tool definition. The JSON schema structure of the MCP (`type`

, `properties`

, `required`

) is offset by the CLI approach needing both the instructional text AND a generic bash tool definition.

The LLM summary output token spend is inconclusive because it differs on each run. Sometimes the CLI takes more tokens, sometimes the MCP does. Generally the difference here doesn’t seem to exceed more than ~70 tokens, and this is irrelevant to our experiment anyway as the summary is based on the prompt given, which in our case is identical.

The only real argument that could be made against the MCP’s in my eyes is that MCP’s will load all their tools to the context window at all times, whereas with the CLI you can either pick and choose what you add to global context, or use SKILL’s, which can be deferred and thus loaded on-demand.

Except that argument also falls flat on its face, because [MCP tools are also deferred](https://www.anthropic.com/engineering/advanced-tool-use). By now most mainstream clients use some sort of MCP tool deferral mechanism. Thus, I conclude that the CLI supremacy claim had *some* merit pre-2026 in that the MCP tools were eager-loaded into global context, but by now they are pretty much the same in token spend.

I’d give an extra point to MCP’s simply because manual CLI integrations are inconvenient and time-consuming, so with all things otherwise being equal, why would I bother with that?
