MCP vs Skills: Why Skills Save Context Tokens

A developer argues that skills, not MCP (Model Context Protocol), are the better default for saving context tokens in AI agent setups. Skills use progressive disclosure, loading detailed instructions only when needed, while MCP servers dump full tool catalogs into every session, wasting context. The developer recommends using skills for operational knowledge and treating MCP as optional rather than foundational.

MCP is useful, but most of the time you do not actually need it. It gives an agent a clean way to discover tools, call APIs, and work with external systems. In practice, a skill file can describe the same usage path without dragging the whole MCP surface into context.

MCP is not free. The real issue is not MCP itself but the habit of loading a big MCP surface into every session, no matter what the session is actually about. Once a Claude Code or Codex run pulls in a bunch of servers, the model sees those tool definitions right away, even if the job is just writing docs or fixing a small bug. That is where the waste starts.

Every MCP server brings metadata with it: tool names, descriptions, argument schemas, nested parameters, enums, examples, and sometimes prompts or resources. All of it is useful, but it is still context. If you connect a handful of lightweight tools, the overhead is annoying but manageable. If you connect a real stack of services, the cost compounds fast. In practice, you end up paying for all of that metadata in every session, and it leaves less room for the work itself.

That last point matters more than people think. Context acts as the active working set the model uses to reason. The more of it you burn on static tool catalogs, the less room you have for the user request, the repo state, prior reasoning, and the actual answer.

Anthropic has already written about this problem directly in the context of MCP. Their engineering post on code execution with MCP calls out tool-definition bloat and shows how direct tool calls can consume a lot of context before the model even starts doing the real job. The tool list is not just setup noise; it is part of the session cost.

Skills take a different path. A skill file keeps the always-loaded portion tiny. Usually that means just the skill name and a short description in the frontmatter. The detailed instructions stay in the body of SKILL.md and only load when the model actually needs them.
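The split between the always-loaded frontmatter and the on-demand body can be sketched in a few lines of Python. This is a minimal illustration, not how any particular agent runtime implements skills: the `Skill` class, `parse_skill`, `always_loaded`, and the example skill text are all hypothetical, and the frontmatter parser only handles simple `key: value` lines.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    description: str
    body: str  # detailed instructions, meant to be loaded only on demand


def parse_skill(text: str) -> Skill:
    """Split a SKILL.md-style document into frontmatter fields and body.

    Assumes frontmatter delimited by '---' lines with simple 'key: value' pairs.
    """
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("expected frontmatter opening '---'")
    try:
        end = next(
            i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"
        )
    except StopIteration:
        raise ValueError("expected frontmatter closing '---'") from None
    fields = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return Skill(fields["name"], fields["description"], body)


def always_loaded(skill: Skill) -> str:
    """Return the only text that sits in context for every session."""
    return f"{skill.name}: {skill.description}"


# Hypothetical skill file used only for illustration.
EXAMPLE = """---
name: ticket-triage
description: How to create and label support tickets
---
1. Look up the user by email.
2. Create the ticket with the required fields.
"""

skill = parse_skill(EXAMPLE)
print(always_loaded(skill))  # the short summary the model always sees
print(skill.body)            # the detail, read only when the skill is relevant
```

The summary line is the fixed per-session cost; the body costs nothing until the model decides the skill applies.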
This progressive disclosure is the whole trick: the short summary is always visible, and the detail arrives only on demand. For repeated operational knowledge, that is a much better tradeoff than dumping a full MCP tool surface into every session. You get the guidance when it matters, and you do not spend tokens on it when it does not.

This is why skills are a better default for reusable instructions and procedures. They are not trying to be live integrations. They are trying to be cheap, reusable context. Skills are for instructions, decision-making, and the actual usage pattern, while MCP is usually just extra protocol surface.

In practice, that means skills can replace MCP for the part humans actually interact with. The model does not need a full tool catalog in context just to know how to use a service. If the agent needs to use a database, hit a SaaS API, or make authenticated requests in real time, the skill can still describe the flow clearly and keep the model on the narrow path it needs. If the agent just needs to know how your team wants it to behave, a skill is the better shape. Most of the time, that is the whole job.

The mistake is to keep a heavy protocol layer around when a skill file can do the same job with far less context. Use skills by default. Treat MCP as optional, not foundational. That sounds obvious, but a lot of agent setups blur the line. They stuff every possible tool into every session, then wonder why the model gets slower, more expensive, and harder to steer.

If you have a service that exposes 40 or 50 MCP tools, it might be fine for a developer who uses it every day. But most sessions do not need all 50 tools. A lot of the time, the agent just needs one narrow procedure, such as looking up a user, updating a record, creating a ticket, or formatting a request safely. The skill can tell the model exactly how to handle the task, what fields matter, what not to do, and which edge cases to watch for. The model does not need a giant always-on MCP tool catalog to do that well. That is the real token saving.
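To get a feel for the size of that saving, here is a rough back-of-the-envelope sketch. Everything in it is an assumption made for illustration: the 50 tool definitions are synthetic, the skill summary is made up, and the four-characters-per-token ratio is only a crude heuristic, since real tokenizers and real tool schemas vary widely.

```python
import json

CHARS_PER_TOKEN = 4  # crude heuristic (assumption); real tokenizers differ


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def make_tool(index: int) -> dict:
    """Build a hypothetical tool definition with a name, description, and schema."""
    return {
        "name": f"service_tool_{index}",
        "description": "Hypothetical tool that reads or updates one record type.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "id": {"type": "string", "description": "Record identifier"},
                "mode": {"type": "string", "enum": ["read", "write"]},
            },
            "required": ["id"],
        },
    }


# A service exposing 50 tools, all of which are visible in every session.
tools = [make_tool(i) for i in range(50)]
catalog_tokens = sum(estimate_tokens(json.dumps(tool)) for tool in tools)

# A skill only exposes its name and description until it is actually needed.
skill_summary = "crm-lookup: How to look up a user and update a record safely"
summary_tokens = estimate_tokens(skill_summary)

print(f"always-loaded MCP catalog: ~{catalog_tokens} tokens")
print(f"always-loaded skill summary: ~{summary_tokens} tokens")
```

The absolute numbers are not meaningful, but the ratio shows why a large always-on catalog dominates the fixed cost of a session.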
You stop paying for the full runtime surface when all you needed was the operating playbook.

If you have an MCP server that mostly behaves like a reusable API wrapper, you should turn the useful parts into a skill. The easiest way to inspect what you actually need is to use the MCPViewer tool (https://mcpview.teamcopilot.ai). Here is the workflow: inspect the server's tool surface, use the SKILL.md file as the skill's content reference, and name the skill How to use APIs for <service name> service. This flow extracts the useful service knowledge into a lighter, reusable skill that the model can load only when needed, rather than trying to preserve every tool forever.

If the service changes often, keep the skill narrow and update it when the API changes. If the service is stable, the skill becomes a better long-term home for the instructions than the full MCP surface.

For most teams, the best setup is skills everywhere, using skill files for the things that must be remembered, such as procedures, checklists, and team rules. If a service still needs live execution, the skill can describe that path without dragging its whole protocol surface into every session. This keeps the agent lean and makes the system easier to maintain, because procedural knowledge is no longer spread across a large tool registry.

It is also easier to reason about failure. If the skill is wrong, you update instructions. If you need to change how a service is used, you update the skill. Those are different jobs, and it helps to keep them separate.

The problem is not just token cost in the billing sense. It is context waste. Every extra tool definition you stuff into a session is one more thing the model has to carry around while solving the actual task. Skills let you defer that cost until the model really needs the information. They are a good fit for repeated workflows, company knowledge, and reusable operating rules. If MCP is the transport, skills are the memory.

MCP is not the main problem.
The problem is loading it into sessions that do not need it when a skill file would do the job with far less context.

Can a skill replace MCP? Yes, for most practical cases. If the goal is to teach the agent how to use a service, a skill can replace MCP and keep the context much smaller.

Why do skills save context tokens? Because the always-loaded part is small: the model sees the skill name and description first, then loads the full SKILL.md only when the skill is relevant.

What belongs in a skill? Reusable instructions, procedures, checklists, formatting rules, and team-specific guidance. If the content is mostly about how to behave, it belongs in a skill.

What should stay in MCP? Very little, unless you have a special case, as the same operational knowledge usually fits better in a skill.

Can you use skills and MCP together? Yes. That is often the best setup. MCP handles the runtime connection. The skill handles the playbook for using it well.

Why use MCPViewer for the conversion? Because it lets you inspect the actual MCP surface before you decide what should stay as MCP and what should become a lighter skill. That takes much of the guesswork out of the conversion.

What happens when the API changes? Update the skill the same way you would update any other documentation or wrapper. If the API changes often, keep the skill narrow so maintenance stays easy.

What should the skill be called? Something specific and boring, such as How to use APIs for <service name> service. This pattern tells the model exactly what the skill is for without wasting words.
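The conversion and naming pattern described above can be sketched as a small script that turns a handful of tool entries into a SKILL.md draft. This is a hedged illustration, not MCPViewer's output or any official format: `render_skill`, `skill_name`, the "Acme" service, and the tool entries are hypothetical, and the entries only assume the `name`, `description`, and `inputSchema` fields that MCP tool definitions carry. Some agent runtimes restrict which characters a skill name may contain, so treat the name line as illustrative.

```python
def skill_name(service: str) -> str:
    """Follow the naming pattern suggested in the text."""
    return f"How to use APIs for {service} service"


def render_skill(service: str, tools: list[dict], keep: set[str]) -> str:
    """Render a SKILL.md draft covering only the tools the team actually uses."""
    selected = [tool for tool in tools if tool["name"] in keep]
    lines = [
        "---",
        f"name: {skill_name(service)}",
        f"description: Procedures for the {service} operations this team uses",
        "---",
        "",
        "Operations:",
    ]
    for tool in selected:
        schema = tool.get("inputSchema", {})
        required = ", ".join(schema.get("required", [])) or "none"
        lines.append(
            f"- {tool['name']}: {tool['description']} (required fields: {required})"
        )
    return "\n".join(lines) + "\n"


# Hypothetical tool entries, shaped like MCP tool definitions.
TOOLS = [
    {
        "name": "lookup_user",
        "description": "Find a user by email",
        "inputSchema": {"type": "object", "required": ["email"]},
    },
    {
        "name": "delete_everything",
        "description": "Remove all records",
        "inputSchema": {"type": "object"},
    },
]

draft = render_skill("Acme", TOOLS, keep={"lookup_user"})
print(draft)
```

A generated draft like this is only a starting point; the edge cases, field rules, and "what not to do" guidance still have to be written by someone who knows the service.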