[ARTICLE · art-24964] src=dev.to pub= topic=ai-agents verified=true sentiment=· neutral

Your MCP server can't take a file as an argument — here's why, and the fix

A developer building an MCP server for publishing HTML files discovered that large files cannot be passed as tool arguments because the model must emit the entire argument token by token, hitting output token limits. A 1.4 MB dashboard failed where a 5 KB test succeeded, revealing that tool arguments are bounded by the model's maximum output tokens. The fix replaces file contents with a file path reference, reducing the tool call to roughly 50 tokens regardless of file size.

4 min read · published Jun 12, 2026

I built an MCP server that publishes HTML files, and I hit a wall I haven't seen documented anywhere: you can't pass a large file as an MCP tool argument. Not "it's slow" or "it's awkward"; the model is physically unable to emit it.

Here's the failure, why it happens, and the one-line design change that fixes it.

My agents (Claude Code, mostly) generate a lot of interactive HTML: dashboards with Chart.js, data-heavy reports, PRDs. I wanted them to publish those files to the web with one tool call, so I did the obvious thing first:

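import { z } from "zod";

// Naive version: the whole file travels as the `html` argument.
// `server` is the MCP server instance; `upload` is the author's helper (not shown).
server.tool(
  "publish_html",
  { html: z.string(), title: z.string() },
  async ({ html, title }) => {
    const url = await upload(html, title);
    return { content: [{ type: "text", text: url }] };
  }
);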

The agent generates the HTML and passes it as the html argument, and the server uploads it. It demoed beautifully on a 5 KB "hello world" page.

Then I tried it on a real artifact, a 1.4 MB dashboard with inlined data, and it fell apart.

When a model calls a tool, the arguments aren't a file handle or a pointer. They are text the model emits, token by token, inside its response. A tool call's arguments are part of the model's output, which means they're bounded by the model's maximum output tokens.

Do the math: 1 MB of HTML is roughly 250k–350k tokens. Typical max output is far below that. The model literally cannot finish "saying" the argument. In practice you get one of:

And even when a file fits, you're paying output-token prices (the expensive ones) to make the model retype a file that already exists on disk, byte for byte, with a nonzero chance of it "fixing" something along the way.

This isn't an MCP bug. It's the nature of tool calling: arguments are model output. Any MCP tool designed to receive bulk content as an argument has a hard size ceiling: the model's output limit.
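To make the arithmetic concrete, here is a rough sketch of the estimate. The 3 to 4 bytes-per-token ratio is only what the "1 MB is roughly 250k–350k tokens" figure implies, and the output cap is an illustrative placeholder rather than the limit of any particular model; both function names are hypothetical.

```typescript
// Rough token estimate for a file that would have to be emitted as a tool argument.
// ASSUMPTION: about 3-4 bytes per token. Real tokenizers vary with content,
// so treat this as an order-of-magnitude check only.
function estimateTokens(bytes: number, bytesPerToken: number): number {
  return Math.ceil(bytes / bytesPerToken);
}

// ASSUMPTION: illustrative output cap, not the limit of any specific model.
const HYPOTHETICAL_MAX_OUTPUT_TOKENS = 32_000;

function fitsInOneToolCall(
  bytes: number,
  maxOutputTokens: number = HYPOTHETICAL_MAX_OUTPUT_TOKENS
): boolean {
  // Use the optimistic ratio (4 bytes/token); if even that overflows,
  // the model cannot finish emitting the argument.
  return estimateTokens(bytes, 4) <= maxOutputTokens;
}

const testPageBytes = 5 * 1024;    // the 5 KB "hello world" page
const dashboardBytes = 1_400_000;  // the 1.4 MB dashboard

console.log(estimateTokens(testPageBytes, 4), fitsInOneToolCall(testPageBytes));
console.log(estimateTokens(dashboardBytes, 4), fitsInOneToolCall(dashboardBytes));
```

Under these assumptions the test page is a rounding error and the dashboard overshoots the cap by roughly an order of magnitude, which matches what happened in practice.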

The server runs locally over stdio. It has the same filesystem the agent is working in. So the tool takes a path:

import * as fs from "node:fs/promises";

server.tool(
  "publish_file",
  {
    path: z.string().describe("Absolute path to the HTML file to publish"),
    title: z.string(),
  },
  async ({ path, title }) => {
    const html = await fs.readFile(path, "utf8"); // server reads from disk
    const url = await uploadMultipart(html, title); // server does the upload
    return { content: [{ type: "text", text: url }] };
  }
);

Now the model's tool call is ~50 tokens regardless of file size:

{ "path": "/Users/me/reports/q2-dashboard.html", "title": "Q2 Dashboard" }

The agent never carries the bytes. It writes the file with its normal file tools (which stream and don't have this constraint the same way), then hands the reference to the MCP server, which reads from disk and does a multipart upload itself. 1.4 MB or 14 MB, the model's job is the same size.

This one design decision is the difference between a demo that works on toy files and a tool that's useful on real artifacts.
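Once a tool accepts a path, the server is the one touching the filesystem, so it may be worth checking the argument before reading. The following is a minimal sketch, not the actual stelaspace-mcp code: readPublishableFile and the 50 MB cap are hypothetical, and it uses only Node's standard library.

```typescript
import { stat, readFile } from "node:fs/promises";
import { isAbsolute, extname } from "node:path";

// ASSUMPTION: illustrative cap; use whatever your upload backend accepts.
const MAX_BYTES = 50 * 1024 * 1024;

// Hypothetical helper: validate a path argument, then read the file server-side.
async function readPublishableFile(
  path: string,
  maxBytes: number = MAX_BYTES
): Promise<string> {
  // The tool description asks for an absolute path, so reject anything else
  // instead of resolving it against the server's working directory.
  if (!isAbsolute(path)) {
    throw new Error(`Expected an absolute path, got: ${path}`);
  }
  const ext = extname(path).toLowerCase();
  if (ext !== ".html" && ext !== ".htm") {
    throw new Error(`Expected an .html file, got: ${path}`);
  }
  const info = await stat(path); // rejects if the file does not exist
  if (!info.isFile()) {
    throw new Error(`Not a regular file: ${path}`);
  }
  if (info.size > maxBytes) {
    throw new Error(`File is ${info.size} bytes; limit is ${maxBytes}`);
  }
  return readFile(path, "utf8");
}
```

Inside the publish_file handler, a call like `await readPublishableFile(path)` could stand in for the bare `fs.readFile`. Any error message it throws is short, so a failed call stays cheap for the model too.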

If you're building an MCP server, audit every tool argument and ask: could this be big? If yes, take a reference instead:

Instead of accepting…            Accept…
file contents                    a file path
a big dataset                    a path, URL, or query the server executes
an image/binary blob             a path or URL
"the whole document" to edit     a path + edit instructions

The corollary for outputs is the same: a tool that returns a huge payload fills the model's context window. Return a reference (a URL, a path, a summary + handle) and let the model fetch slices if it needs them.
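One way to apply this on the output side is sketched below. It assumes the handler has already written the full payload somewhere the agent can read; toToolResult, the 2,000-character threshold, and the 200-character preview are all illustrative choices, not part of any SDK.

```typescript
// Shape of a text-only tool result, matching the handlers shown earlier.
type ToolResult = { content: { type: "text"; text: string }[] };

// ASSUMPTION: illustrative threshold, in characters.
const MAX_INLINE_CHARS = 2_000;

// Hypothetical helper: small payloads go back inline; large ones go back as
// a reference plus a short preview. The caller must already have written the
// full payload to `savedTo`.
function toToolResult(
  payload: string,
  savedTo: string,
  maxInline: number = MAX_INLINE_CHARS
): ToolResult {
  if (payload.length <= maxInline) {
    return { content: [{ type: "text", text: payload }] };
  }
  const preview = payload.slice(0, 200);
  const text = [
    `Result is ${payload.length} characters; full output saved to ${savedTo}`,
    `Preview (first ${preview.length} characters):`,
    preview,
  ].join("\n");
  return { content: [{ type: "text", text }] };
}
```

The model then sees a few hundred characters plus a handle, and can read slices of the saved file with its own file tools if it needs more.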

Local stdio servers are perfect for this pattern because they share a filesystem with the agent. For remote MCP servers you'd reach for the same idea with URLs or pre-signed uploads: anything but the bytes-in-arguments trap.

The server is stelaspace-mcp on npm (MIT). It publishes HTML files to StelaSpace, which gives each artifact a permanent, sandboxed, access-controlled link (I built it; free tier exists). One-line setup with Claude Code:

claude mcp add stelaspace --scope user \
  --env STELASPACE_API_KEY=ss_sk_... \
  -- npx -y stelaspace-mcp

Here's a live dashboard published this way: 1 MB+ of interactive Chart.js HTML that no model could ever have passed as a tool argument.

If you've hit other walls building MCP servers, I'd genuinely like to hear them. I'm collecting these gotchas.
