I built an MCP server that publishes HTML files, and I hit a wall I haven't
seen documented anywhere: you can't pass a large file as an MCP tool argument. Not "it's slow" or "it's awkward" — the model is physically
Here's the failure, why it happens, and the one-line design change that fixes it.
My agents (Claude Code, mostly) generate a lot of interactive HTML — dashboards
with Chart.js, data-heavy reports, PRDs. I wanted them to publish those files to
the web with one tool call, so I did the obvious thing first:
server.tool(
"publish_html",
{ html: z.string(), title: z.string() },
async ({ html, title }) => {
const url = await upload(html, title);
return { content: [{ type: "text", text: url }] };
}
);
The agent generates the HTML, passes it as the html
argument, the server
uploads it. It demoed beautifully on a 5 KB "hello world" page.
Then I tried it on a real artifact — a 1.4 MB dashboard with inlined data —
and it fell apart.
When a model calls a tool, the arguments aren't a file handle or a pointer.
They are text the model emits, token by token, inside its response. A tool
call's arguments are part of the model's output, which means they're bounded by
the model's maximum output tokens.
Do the math: 1 MB of HTML is roughly 250k–350k tokens. Typical max output is
far below that. The model literally cannot finish "saying" the argument. In
practice you get one of:
And even when a file fits, you're paying output-token prices (the expensive
ones) to make the model retype a file that already exists on disk, byte for
byte, with a nonzero chance of it "fixing" something along the way.
This isn't an MCP bug. It's the nature of tool calling: arguments are model output. Any MCP tool designed to receive bulk content as an argument has a
The server runs locally over stdio. It has the same filesystem the agent is
working in. So the tool takes a path:
server.tool(
"publish_file",
{
path: z.string().describe("Absolute path to the HTML file to publish"),
title: z.string(),
},
async ({ path, title }) => {
const html = await fs.readFile(path, "utf8"); // server reads from disk
const url = await uploadMultipart(html, title); // server does the upload
return { content: [{ type: "text", text: url }] };
}
);
Now the model's tool call is ~50 tokens regardless of file size:
{ "path": "/Users/me/reports/q2-dashboard.html", "title": "Q2 Dashboard" }
The agent never carries the bytes. It writes the file with its normal file
tools (which stream and don't have this constraint the same way), then hands
the reference to the MCP server, which reads from disk and does a multipart
upload itself. 1.4 MB or 14 MB — the model's job is the same size.
This one design decision is the difference between a demo that works on toy
files and a tool that's useful on real artifacts.
If you're building an MCP server, audit every tool argument and ask: could this be big? If yes, take a reference instead:
| Instead of accepting… | Accept… |
|---|---|
| file contents | a file path |
| a big dataset | a path, URL, or query the server executes |
| an image/binary blob | a path or URL |
| "the whole document" to edit | a path + edit instructions |
The corollary for outputs is the same: a tool that returns a huge payload
fills the model's context window. Return a reference (a URL, a path, a
summary + handle) and let the model fetch slices if it needs them.
Local stdio servers are perfect for this pattern because they share a
filesystem with the agent. For remote MCP servers you'd reach for the same
idea with URLs or pre-signed uploads — anything but the bytes-in-arguments
trap.
The server is stelaspace-mcp
on npm (MIT) — it publishes HTML files to
StelaSpace, which gives each artifact a permanent,
sandboxed, access-controlled link (I built it; free tier exists). One-line
setup with Claude Code:
claude mcp add stelaspace --scope user \
--env STELASPACE_API_KEY=ss_sk_... \
-- npx -y stelaspace-mcp
Here's a live dashboard published this way —
1 MB+ of interactive Chart.js HTML that no model could ever have passed as a
tool argument.
If you've hit other walls building MCP servers, I'd genuinely like to hear
them — I'm collecting these gotchas.