MCP Is Dead


[ARTICLE · art-18186] src=quandri.io ↗ pub=2026-05-29T22:56Z topic=large-language-models verified=true sentiment=↓ negative


A backend engineer at Quandri has declared MCP "dead" after measuring that connecting four MCP servers consumes 10.5% of a model's context window on tool definitions alone, with Linear's server alone occupying over 12,800 tokens for 42 tools. The engineer found MCP was 3x slower per call than direct API usage and consumed roughly 65x more tokens to look up the same Linear issue, arguing that existing CLI tools and API calls are more efficient and reliable. The post recommends replacing MCP with "Skills" that embed CLI usage instructions, loading only the commands needed into context rather than carrying all tool definitions at all times.

4 min read · published May 29, 2026

Backend Engineer @ Quandri

TL;DR: MCP eats context, has low reliability, and overlaps with existing CLI/API.

Reference: MCP is dead. Long live the CLI

After reading the above article, we ran the experiments on our actual stack. This document covers the original argument, additional research, and our measurements.

Update: Since these measurements were taken, Claude Code has rolled out Tool Search with deferred loading, which loads MCP tool schemas on demand and reduces context usage by 85%+. The context bloat described in Problem 1 is largely addressed for users on current Claude Code versions. The performance, debugging, and architectural arguments below still apply.

MCP (Model Context Protocol) connects LLMs to external tools (GitHub, Linear, Notion, Slack, etc.).

Since its launch in late 2024, it's been called "the USB-C of the AI ecosystem." But developers actually using it day-to-day are starting to think differently.

The context window is the LLM's desk. When you connect MCP servers, tool definitions alone take up a significant chunk of that desk.

Restaurant analogy:

We extracted and measured the actual tool definitions from the MCP servers connected in our environment. With all 4 servers connected, 10.5% of the context window is consumed by tool definitions alone.

Linear alone accounts for over 12,800 tokens. That's 42 tool definitions always loaded, even if you only ever use get_issue and save_issue.
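To put those figures in proportion, here is a rough back-of-the-envelope sketch. The token counts and the 10.5% share come from the measurements in this article; the 200,000-token window is an assumption (the window size is not stated here), chosen because it lines up with the ~21K tokens reported as freed up later on.

```python
# Figures measured in this article.
LINEAR_TOOL_TOKENS = 12_807   # Linear's tool definitions
LINEAR_TOOL_COUNT = 42        # number of Linear tools
ALL_SERVERS_SHARE = 0.105     # 10.5% of the window with all 4 servers connected

# Assumption: a 200K-token context window (not stated in the article).
ASSUMED_WINDOW = 200_000

tokens_per_tool = LINEAR_TOOL_TOKENS / LINEAR_TOOL_COUNT
all_servers_tokens = ALL_SERVERS_SHARE * ASSUMED_WINDOW
linear_share = LINEAR_TOOL_TOKENS / ASSUMED_WINDOW

print(f"Average cost per Linear tool: {tokens_per_tool:.0f} tokens")
print(f"All 4 servers: about {all_servers_tokens:,.0f} tokens")
print(f"Linear alone: {linear_share:.1%} of the assumed window")
```

Under that assumption, each Linear tool definition costs roughly 300 tokens whether or not it is ever called.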

Performance is a known issue. The author of the original article benchmarked Jira MCP against its REST API directly and found MCP was 3x slower per call, and 9.4x slower on the first call including initialization. This isn't Jira-specific; it's architectural: every MCP server adds a process layer between the LLM and the underlying API. The same overhead applies to the Linear, Notion, and Slack servers in our stack.
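As a rough illustration of how the two slowdown figures combine over a session, the sketch below averages them across n calls. It assumes that initialization is paid once per session and that every direct API call takes the same time; both are simplifications, and amortized_slowdown is a hypothetical helper, not part of the original benchmark.

```python
FIRST_CALL_SLOWDOWN = 9.4  # first MCP call, including initialization (cited benchmark)
STEADY_SLOWDOWN = 3.0      # each later MCP call (cited benchmark)


def amortized_slowdown(n_calls: int) -> float:
    """Average MCP slowdown versus the direct API over n_calls in one session.

    Assumes one initialization per session and a constant direct-API latency.
    """
    if n_calls < 1:
        raise ValueError("n_calls must be at least 1")
    total = FIRST_CALL_SLOWDOWN + STEADY_SLOWDOWN * (n_calls - 1)
    return total / n_calls


for n in (1, 10, 100):
    print(f"{n:>3} calls: {amortized_slowdown(n):.2f}x slower on average")
```

The initialization penalty fades as a session gets longer, but under these assumptions the average never drops below the 3x per-call floor.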

How many tokens does it cost to look up the same Linear issue? MCP consumes ~65x more tokens than the CLI approach.

[ CLI approach: ~200 tokens ]
curl -s -H "Authorization: Bearer $LINEAR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query":"{ issue(id: \"ISSUE-ID\") { title state { name } assignee { name } } }"}' \
  https://api.linear.app/graphql

-> Prompt (curl command): ~50 tokens
-> Response: ~150 tokens

[ MCP approach: ~12,957 tokens ]
-> Tool definitions (always loaded): ~12,807 tokens (42 tools)
-> Tool call + response: ~150 tokens
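The ~65x figure follows directly from the numbers listed above. A minimal worked check, using only those figures:

```python
# Token counts as listed in the comparison above.
TOOL_DEFINITIONS = 12_807      # MCP: always loaded (42 tools)
MCP_CALL_AND_RESPONSE = 150    # MCP: tool call + response
CLI_PROMPT = 50                # CLI: the curl command
CLI_RESPONSE = 150             # CLI: the JSON response

mcp_total = TOOL_DEFINITIONS + MCP_CALL_AND_RESPONSE
cli_total = CLI_PROMPT + CLI_RESPONSE
ratio = mcp_total / cli_total

print(f"MCP: {mcp_total:,} tokens, CLI: {cli_total} tokens, ratio: {ratio:.1f}x")
```

Almost all of the gap is the always-loaded tool definitions; the call and response themselves cost about the same on both paths.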

Provide CLI -> API -> docs, in that order. LLMs have already learned CLI usage from man pages and Stack Overflow.

Using existing CLI directly:

If MCP is "spreading all menus on the table upfront", Skills are "asking the librarian for only the book you need".

The key is embedding CLI usage instructions inside Skills. Combined with Alternative 1's CLI-first strategy, this is the most efficient option. For example, a Linear skill:

- Linear API: https://api.linear.app/graphql
- Auth: Bearer Token ($LINEAR_TOKEN env var)
- Get issue: curl -s -H "Authorization: Bearer $LINEAR_TOKEN" -H "Content-Type: application/json" -d '{"query":"{ issue(id: \"ISSUE-ID\") { title state { name } assignee { name } } }"}' https://api.linear.app/graphql
- Search issues (GraphQL): adjust the query field for JQL-like filtering
- Results are JSON, parse with jq

This way, the LLM only loads the above into context when the skill is invoked. No need to carry 42 tool definitions at all times. Just the CLI commands it needs.
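The "Get issue" command in the skill needs nested quote escaping inside the shell string. As one possible alternative, the sketch below builds the same request in Python without sending it. The endpoint, headers, and query text are taken from the skill above; build_issue_request is a hypothetical helper name, and using json.dumps to quote the issue id is an assumption that holds for simple ids like the placeholder shown.

```python
import json
import os
import urllib.request

LINEAR_URL = "https://api.linear.app/graphql"  # endpoint quoted in the skill above


def build_issue_request(issue_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the request that the curl command makes."""
    # json.dumps turns the id into a quoted, escaped string literal.
    query = (
        "{ issue(id: %s) { title state { name } assignee { name } } }"
        % json.dumps(issue_id)
    )
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        LINEAR_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Falls back to a dummy token so the sketch runs without credentials.
    req = build_issue_request("ISSUE-ID", os.environ.get("LINEAR_TOKEN", "dummy"))
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the returned object to urllib.request.urlopen would perform the actual call; that step is left out here so the example runs offline.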

Not entirely. MCP is still valid when:

Short answer: it depends.

DBs are just query execution at the end of the day. LLMs already know SQL and MongoDB queries well. Put DB info and CLI usage in a skill, and it works fine without MCP. Just give it the schema and it writes the queries.

- Host: postgres://localhost:5432/myapp
- Tables: users (id, name, email), orders (id, user_id, status)
- CLI: psql -h localhost -d myapp -c "SELECT * FROM users WHERE ..."

However, MCP has advantages for databases:

DROP TABLE

But for most developer workflows, MCP is over-engineering.

These days, every SaaS landing page has "MCP supported" in the feature list. Whether the MCP server is stable or how much context it eats doesn't matter - the goal is checking the "we do MCP too" box. Same pattern as "AI-powered" and "blockchain-based" marketing from years past. When users actually connect, they get dozens of tool definitions loaded, initialization failures, and mid-session crashes.

At Quandri we use all three approaches side by side, picking what fits each service:

Existing CLIs (gh, psql, aws). Zero context cost, full flexibility, debugs straight in the terminal.

We don't force one path. If a CLI already exists and authenticates locally, that's usually the lightest option. If a service has no CLI or we need uniform auth across the team, MCP earns its keep.

Teaching well matters more than connecting everything.

For us, replacing MCP servers with Skills that wrap existing CLIs freed up ~21K tokens of context, removed init failures from our daily workflow, and kept debugging in the terminal where it belongs.

Load only the tools you need, only when you need them, with CLI instructions baked in. MCP might evolve to solve these problems, but right now, Skills win.

Measurement methodology:


Read original on quandri.io → www.quandri.io/engineering-blog/mcp-is-dead
