Build MCP Servers that don't suck...tokens.


[ARTICLE · art-586] src=dev.to ↗ pub=2026-05-19T00:27Z topic=developer-tools verified=true sentiment=↑ positive


First-generation MCP servers, which wrapped REST APIs for tools like Jira and GitHub, caused excessive token usage and context bloat. The article introduces the "ultra-mcp-toolkit," which trims response fields, consolidates multiple tools into one, and offers a CLI-based mode to cut token consumption. For example, a Jira ticket payload drops from 270 KB to 15.5 KB, and tool listings drop from ~10k tokens to ~100 tokens. The toolkit also provides a Claude Code skill to automate these optimizations, making MCP servers more cost-efficient and reducing hallucinations.

5 min read · published May 19, 2026

First-generation MCP servers were great. They gave AI agents access to a ton of external apps and data: Jira, Confluence, GitHub, Linear, you name it. But most of them just wrapped REST APIs, and that causes context bloat, hallucinations, and wasted tokens.

Combining a few strategies from the ultra-mcp-toolkit, you can reduce that bloat dramatically — and save money.

Generating a cost-efficient MCP server is easy. Just install the skill and off you go.

Here's what "dramatically" looks like

Real benchmark, live Jira instance, reproducible:

Per-call response size

scenario	naive	with toolkit	savings
fetch 1 simple ticket	20.3KB	1.2KB	17.5×
investigate rich ticket	270.7KB	15.5KB	17.5×
JQL search ~10 tickets	20.5KB	3.5KB	5.8×

That rich-ticket row is the one that hurts. 270 KB → 15.5 KB. ~67k tokens down to ~3.9k tokens. Same content; the full payload still lands on disk and the agent can fetch it via a ref: path only if it actually needs the detail.
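The token figures above are consistent with a rough four-bytes-per-token rule of thumb. That ratio is an assumption here; the benchmark does not say how tokens were counted. A minimal sketch of the arithmetic, with `estimateTokens` and `savingsFactor` as hypothetical helper names:

```typescript
// Rough heuristic only: ~4 bytes of JSON per token. Real tokenizers vary,
// so treat the results as order-of-magnitude estimates.
const BYTES_PER_TOKEN = 4; // assumed ratio, not stated in the benchmark

function estimateTokens(bytes: number): number {
  return Math.round(bytes / BYTES_PER_TOKEN);
}

function savingsFactor(naiveBytes: number, trimmedBytes: number): number {
  return naiveBytes / trimmedBytes;
}

// Rich-ticket row from the table, treating 1 KB as 1,000 bytes.
const naiveBytes = 270700;
const trimmedBytes = 15500;

console.log(estimateTokens(naiveBytes)); // 67675
console.log(estimateTokens(trimmedBytes)); // 3875
console.log(savingsFactor(naiveBytes, trimmedBytes).toFixed(1)); // 17.5
```

The same heuristic maps the 401-byte code-api listing to about 100 tokens.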

Tool-list cost (paid every conversation)

approach	bytes	~tokens	savings
naive (one tool per op)	38.9KB	9,947	1×
consolidated tools	25.1KB	6,427	1.5×
consolidated + filtered	~6 KB	~1,600	5×
code-api mode	401B	100	99×

You read that right. Tool listings drop from ~10k tokens to ~100 tokens. On every. single. conversation.

Why MCP servers leak tokens

Four anti-patterns show up almost everywhere:

- Returning raw API JSON. A Jira issue carries iconUrls, nested self URLs, schema metadata, expand hints, three different shapes of the same status field. The agent needs none of it.
- One MCP tool per endpoint. A typical CRM has ~80 endpoints → 80 tool descriptions in the listing → ~10k tokens before the user types anything.
- Asking the LLM to filter or paginate. The model can't reliably page through huge structures, and the chunking logic itself costs tokens. Filtering belongs server-side.
- No discipline on what gets kept. Denylist trimming (delete result.iconUrl) silently breaks the day the API adds a new noisy field. Allowlists keep the contract stable.
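The denylist problem is easy to see in a small sketch. The helpers and payloads below are hypothetical illustrations, not toolkit code: the same issue is trimmed both ways before and after the upstream API starts returning an extra field.

```typescript
type Json = Record<string, unknown>;

// Denylist: delete the fields known to be noisy today.
function trimByDenylist(raw: Json, deny: string[]): Json {
  const out: Json = { ...raw };
  for (const key of deny) delete out[key];
  return out;
}

// Allowlist: copy only the fields the agent actually needs.
function trimByAllowlist(raw: Json, allow: string[]): Json {
  const out: Json = {};
  for (const key of allow) {
    if (key in raw) out[key] = raw[key];
  }
  return out;
}

// Hypothetical payloads: v2 is the same issue after the API adds a bulky field.
const v1: Json = { key: "PROJ-1", summary: "Fix login", iconUrl: "https://example.invalid/i.png" };
const v2: Json = { ...v1, renderedFields: { description: "<p>...</p>" } };

// Denylist output keys: key, summary, renderedFields (the new field leaks through).
console.log(Object.keys(trimByDenylist(v2, ["iconUrl"])));
// Allowlist output keys: key, summary (the contract is unchanged).
console.log(Object.keys(trimByAllowlist(v2, ["key", "summary"])));
```

The denylist version needs a code change every time the upstream payload grows; the allowlist version does not.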

The fix, in three strategies

1. Allowlist-style trim projections

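import { pick } from "ultra-mcp-toolkit/trim";

// Allowlist projection: only the fields named here reach the model.
const issueSummary = (raw: unknown) => {
  // Narrow the untyped API response to the shape we read from.
  const r = raw as { key: string; fields: Record<string, unknown> };
  return {
    key: r.key,
    ...pick(r.fields, ["summary", "status", "priority", "assignee"]),
  };
};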

Register the trim once. Every response routes through it. New API fields are dropped by default. The model sees what it needs; the full response lives on disk as a ref: the agent can dereference on demand.
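To make the "register once, route everything through it" idea concrete, here is a minimal sketch. `TrimRegistry`, `contentId`, and the `ref:<hash>` format are all hypothetical; the toolkit's real registry API is not shown in this article. The sketch stores full responses in memory and uses an FNV-1a style hash purely as a dependency-free stand-in for the on-disk store and a real content hash.

```typescript
type Trim = (raw: unknown) => unknown;

// FNV-1a style hash: a stand-in for a real content hash such as sha256.
function contentId(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return ("00000000" + h.toString(16)).slice(-8);
}

class TrimRegistry {
  private trims = new Map<string, Trim>();
  // In-memory stand-in for the on-disk store described above.
  private store = new Map<string, string>();

  register(operation: string, trim: Trim): void {
    this.trims.set(operation, trim);
  }

  // Every response goes through here: keep the full body, return the trimmed view.
  respond(operation: string, raw: unknown): { ref: string; data: unknown } {
    const trim = this.trims.get(operation);
    if (!trim) throw new Error(`no trim registered for ${operation}`);
    const body = JSON.stringify(raw);
    const ref = `ref:${contentId(body)}`;
    this.store.set(ref, body);
    return { ref, data: trim(raw) };
  }

  // The agent asks for the full payload only when it needs the detail.
  dereference(ref: string): unknown {
    const body = this.store.get(ref);
    if (body === undefined) throw new Error(`unknown ref ${ref}`);
    return JSON.parse(body);
  }
}

const registry = new TrimRegistry();
registry.register("issue.get", (raw) => {
  const r = raw as { key: string; fields: { summary: string } };
  return { key: r.key, summary: r.fields.summary };
});

const fullIssue = { key: "PROJ-1", fields: { summary: "Fix login", iconUrl: "..." } };
const reply = registry.respond("issue.get", fullIssue);
console.log(reply.data); // trimmed view: key and summary only
console.log(registry.dereference(reply.ref)); // the untrimmed payload
```

Because the ref is derived from the content, identical responses map to the same entry.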

2. Consolidated tools (action-discriminated)

Instead of 80 tools, expose ~15, each taking an action arg:

{ action: "get", issueIdOrKey: "PROJ-1" }
{ action: "create", projectKey: "PROJ", summary: "..." }
{ action: "transition", issueIdOrKey: "PROJ-1", transition: "Done" }

Same operations at about one-fifth the tool-list cost once the listing is also filtered (consolidation alone gives 1.5× in the table above). The toolkit's dispatcher handles per-action Zod validation, manifest routing, and a full: true escape hatch when the model genuinely needs the raw response.
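An action-discriminated tool can be sketched with a TypeScript discriminated union. The toolkit's dispatcher uses Zod for validation according to the text above; this sketch substitutes plain checks to stay dependency-free, and `parseIssueArgs`, `dispatchIssue`, and the stub handlers are hypothetical.

```typescript
// One consolidated "issue" tool; the action field selects the operation.
type IssueArgs =
  | { action: "get"; issueIdOrKey: string }
  | { action: "create"; projectKey: string; summary: string }
  | { action: "transition"; issueIdOrKey: string; transition: string };

// Required string fields per action (a hand-rolled stand-in for Zod schemas).
const requiredFields: Record<IssueArgs["action"], string[]> = {
  get: ["issueIdOrKey"],
  create: ["projectKey", "summary"],
  transition: ["issueIdOrKey", "transition"],
};

function parseIssueArgs(input: Record<string, unknown>): IssueArgs {
  const action = input["action"];
  if (action !== "get" && action !== "create" && action !== "transition") {
    throw new Error(`unknown action: ${String(action)}`);
  }
  for (const field of requiredFields[action]) {
    if (typeof input[field] !== "string") {
      throw new Error(`${action}: missing string field "${field}"`);
    }
  }
  return input as unknown as IssueArgs;
}

// Stub handlers: a real server would call the upstream API here.
function dispatchIssue(args: IssueArgs): string {
  switch (args.action) {
    case "get":
      return `GET issue ${args.issueIdOrKey}`;
    case "create":
      return `CREATE in ${args.projectKey}: ${args.summary}`;
    case "transition":
      return `MOVE ${args.issueIdOrKey} to ${args.transition}`;
  }
}

console.log(dispatchIssue(parseIssueArgs({ action: "get", issueIdOrKey: "PROJ-1" })));
// GET issue PROJ-1
```

The model sees one tool description instead of three, and a malformed call fails with a message that names the action and the missing field.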

3. Code-api mode (the 99× lever)

Expose a single MCP tool that hands the agent a path to a bundled CLI plus a socket address:

node <cli-path> issue.get --issueIdOrKey=PROJ-1

The agent drives the whole API from its shell. Tool list stays at one tool forever, no matter how many operations exist. For shell-capable agents (Claude Code, Cursor, anything with bash), it's a pure win.
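On the CLI side, the command shape above (`<operation> --name=value`) needs only a small parser. This is a hypothetical sketch of that step, not the toolkit's CLI code; `parseCliArgs` and `CliCall` are illustrative names.

```typescript
interface CliCall {
  operation: string; // e.g. "issue.get"
  params: Record<string, string>;
}

// Parses: <operation> --name=value --name=value ...
// In a real CLI the input would be the arguments after the script path.
function parseCliArgs(argv: string[]): CliCall {
  const [operation, ...flags] = argv;
  if (!operation || operation.startsWith("--")) {
    throw new Error("expected an operation name such as issue.get");
  }
  const params: Record<string, string> = {};
  for (const flag of flags) {
    const eq = flag.indexOf("=");
    // Require "--", at least one character of name, then "=".
    if (!flag.startsWith("--") || eq < 3) {
      throw new Error(`expected --name=value, got: ${flag}`);
    }
    params[flag.slice(2, eq)] = flag.slice(eq + 1);
  }
  return { operation, params };
}

const parsed = parseCliArgs(["issue.get", "--issueIdOrKey=PROJ-1"]);
console.log(parsed.operation); // issue.get
console.log(parsed.params); // { issueIdOrKey: 'PROJ-1' }
```

Adding an operation then changes what the CLI accepts, not what the MCP tool listing advertises.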

Quick start

npm install ultra-mcp-toolkit

The toolkit ships a Claude Code skill that auto-loads when you work on an MCP server. Install it:

npm run install-skill

That's it. The skill walks the agent through manifest design, trim projections, dispatcher wiring, and server boot — the patterns that produce the numbers above.

Working from a non-Claude agent (Codex CLI, Cursor, Aider, Continue, Zed)? Point it at the skill markdown directly — AGENTS.md shows you how.

What's in the box

- Operation manifest: declare endpoints as pure data; powers MCP tools, CLI, and code-api bridge from one source of truth.
- Trim registry: type-safe allowlist projections.
- Content-addressed sandbox: full responses land on disk; the model sees a ref: only.
- Page cache: versioned-id disk cache for stable keys (PR diffs by SHA, Confluence pages by version).
- Pooled retry-aware HTTP transport: undici, with 429-aware retry honoring Retry-After.
- Atomic streaming downloads: sha256-verified, path-traversal-safe.
- Consolidated tool dispatcher: Zod-validated, action-discriminated.
- CLI scaffolding: bridge mode + direct mode, free with createCli.
- Bundled Claude Code skill: installs in one command.
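As one example from the list, honoring Retry-After on a 429 comes down to a small delay calculation. The header can carry either a number of seconds or an HTTP date. The sketch below shows only that calculation; it does not use undici's API, and `retryDelayMs` and the one-second fallback are assumptions.

```typescript
// Retry-After may be delay-seconds or an HTTP-date.
function retryDelayMs(retryAfter: string | null, nowMs: number, fallbackMs = 1000): number {
  if (retryAfter === null) return fallbackMs; // header absent
  const value = retryAfter.trim();
  if (/^\d+$/.test(value)) return Number(value) * 1000; // delay-seconds form
  const dateMs = Date.parse(value); // HTTP-date form
  if (Number.isNaN(dateMs)) return fallbackMs; // unparseable: fall back
  return Math.max(0, dateMs - nowMs); // never wait a negative time
}

console.log(retryDelayMs("120", 0)); // 120000
console.log(retryDelayMs(null, 0)); // 1000

const requestTime = Date.parse("Wed, 21 Oct 2015 07:28:00 GMT");
console.log(retryDelayMs("Wed, 21 Oct 2015 07:28:30 GMT", requestTime)); // 30000
```

Passing the current time in as a parameter keeps the function pure and easy to test.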

Production proof

Used in ultra-jira-mcp and ultra-bitbucket-mcp. The benchmark numbers above come from the Jira server running against a real Jira Cloud instance — every byte measured is one a production agent would actually receive.

If you're building an MCP server for any enterprise API — Jira, Confluence, GitHub, Linear, Notion, ServiceNow, Salesforce, whatever — and your token bill or context window is starting to bite, give it a try.

⭐ github.com/scottlepp/ultra-mcp-toolkit: issues, PRs, and benchmark contributions welcome.

What's the most token-bloated MCP server you've shipped or seen? Drop it in the comments — I'm collecting horror stories.


Read original on dev.to → dev.to/scottlepp/build-mcp-servers-that-dont-suc…
