{"slug": "when-mcp-tools-break-isolate-the-server-before-blaming-the-agent", "title": "When MCP tools break, isolate the server before blaming the agent", "summary": "A developer debugging an MCP (Model Context Protocol) integration found that an LLM agent's self-report of unavailable tools was misleading. The agent claimed that new stdio MCP servers were not connected, but the real issue was a malformed test invocation, not server failure. The developer advises isolating the server layer and using direct connectivity checks like the MCP Inspector instead of relying on the agent's narration.", "body_md": "You add a new MCP server, or a new tool to an existing one. You ask the agent to use it and it says the tool isn't available — or worse, it confidently reports that \"only the HTTP server is connected\" and the new stdio servers aren't live. The instinct is to blame the MCP server or bounce the runtime. Resist it. The agent's narration about its own toolset is not ground truth, and a malformed test invocation will happily manufacture a fake \"MCP unreachable\" symptom. Here's the broke → tried → fixed of a debugging session that burned time on exactly this.\n\nWe'd just added a second stdio MCP server and a new tool. To smoke-test it, we hand-typed a quick probe at the agent runtime and asked it to use the new capability. The agent came back with this:\n\n```\nThose tools aren't in my live toolset right now — only the\nHome Assistant (HTTP/SSE) server is connected. I'll read the\nfiles directly instead.\n```\n\nIt then shelled out to read files off disk to answer the question — being resourceful, which made it *look* like the tools genuinely weren't wired up. Two new stdio servers, both apparently invisible to the session, while the one HTTP/SSE server was fine. That sure reads like the stdio MCP servers aren't connecting.\n\nThat self-report is the trap. Treating \"the agent says the tools aren't available\" as a diagnosis — instead of a symptom — is what sent us down the wrong path.\n\nThere are two failure modes here and they look identical from the agent's chair:\n\nThe agent can't tell these apart for you, and it will narrate either one as \"tools not available.\" Worse, an LLM describing its own capabilities is itself a confabulation risk — it pattern-completes a plausible story (\"only the HTTP server is live\") around the fact that some tools didn't show up. That story is not evidence about server health.\n\nSo the discipline is: **don't ask the agent whether the server is up. Ask the server.** Isolate the layer before you touch anything.\n\nWe took the self-report at face value. The agent said the stdio servers weren't connected, so we assumed they weren't connected. This is the root mistake every later dead-end inherits from — we let the agent's narration stand in for a connectivity check. It is not one.\n\n\"Tools didn't load? Refresh them — restart the runtime.\" So we bounced the long-running agent gateway daemon, the shape of which is:\n\n```\n# generic shape — restart the long-running agent runtime/gateway\n<runtime> gateway restart\n```\n\nThen re-ran the same probe. No change:\n\n```\nStill only the Home Assistant server is connected — the other\ntools aren't in my toolset.\n```\n\nRestarting a daemon to \"refresh tools\" is a guess, not a diagnosis. If the server were genuinely stale you'd want evidence of that *first*. We had none — we were rebooting on a hunch, and the hunch was wrong. (It also cost real time: a gateway restart drops every in-flight session.)\n\nWe re-ran the hand-typed probe a few more times, tweaking the wording, still getting \"tools not available.\" At this point it looks airtight: server's been restarted, agent still can't see the tools, therefore the server must be broken. Every signal pointed at the MCP server — and every signal was coming through the same malformed probe, so they were all the same false negative wearing different hats.\n\nStop reasoning through the agent. **Isolate the layer**, then reproduce the *exact* production invocation.\n\nSpawn the server fresh, outside any agent session, and ask it to enumerate its tools. Most runtimes ship a connectivity check that does exactly this; the vendor-neutral tool is the [MCP Inspector](https://modelcontextprotocol.io/docs/tools/inspector).\n\nIf your runtime has a built-in check (generic shape — it spawns the server fresh, bypassing the session):\n\n```\n<runtime> mcp test <server>\n```\n\nOr talk to the server directly with the MCP Inspector — no runtime involved at all. The CLI mode is perfect for a connectivity assertion:\n\n```\n# list the tools a stdio server actually exposes, runtime out of the picture\nnpx @modelcontextprotocol/inspector --cli \\\n  <your-server-launch-command> --method tools/list\n```\n\n(`npx @modelcontextprotocol/inspector`\n\nwith no `--cli`\n\nopens the browser UI at `http://localhost:6274`\n\n— connect, open the **Tools** tab, click **List Tools**.)\n\nThe output settles it immediately:\n\n```\nConnected.\nserver: signalk          → 7 tools discovered: read_sensor, battery_state,\n                            get_route, get_active_alarms, ...\nserver: vessel_knowledge → 4 tools discovered: find_equipment, get_equipment, ...\n```\n\nBoth servers connect and enumerate every tool. **The servers were fine the entire time** — through all three dead-ends. That one command would have saved the gateway restart and every ad-hoc re-probe.\n\nIf the server lists its tools standalone, the bug (if any) is in *how the session loads them*, not the server. So stop hand-typing variants and run the agent the way production runs it — same skill/profile selector, same flags, same transport.\n\nOur production path (a voice/MQTT bridge) launches the agent in oneshot mode with a **namespaced skill selector** and the query passed via the query flag. The malformed probe had used a bare, non-namespaced selector and the wrong flag — which loads a different toolset (or none of the stdio toolsets). The shape that matters:\n\n```\n# WRONG — ad-hoc probe: bare/non-namespaced selector, wrong flag.\n# Loads a different profile; stdio toolsets never attach.\n<runtime> chat -m <model> -z \"use the new tool\"\n\n# RIGHT — the exact production invocation: namespaced skill selector\n# (profile/<skill>) + the real query flag, oneshot.\n<runtime> chat -Q -s profile/<skill> -q \"use the new tool\"\n```\n\nRun the right one and the agent calls the tools on the first try — no restart, no file-reading workaround:\n\n```\n[tool] get_active_alarms → none active\n[tool] find_equipment(\"Bellmarine\") → bellmarine-ddw-10\n... rated temperature zones: ...\n```\n\nThe fix was a one-line change to the invocation. The server never moved.\n\n**The one rule that would have short-circuited all of it:** don't restart a daemon to \"refresh tools\" before the server is *proven* stale. Prove it with a standalone `mcp test`\n\n/ Inspector run first.\n\nThe order is the whole lesson: **server (standalone) → invocation (exact production form) → only then the runtime.** Don't bounce the daemon until the first two are ruled out.\n\nThis came out of wiring MCP servers into a ship's-computer agent for an all-electric charter catamaran — SignalK, vessel-knowledge, and tide/weather tools behind a local LLM. When the agent said it couldn't see them, the servers had been fine all along; the probe was wrong. The MCP servers behind it are open source: [github.com/sailingnaturali/signalk-mcp](https://github.com/sailingnaturali/signalk-mcp).", "url": "https://wpnews.pro/news/when-mcp-tools-break-isolate-the-server-before-blaming-the-agent", "canonical_source": "https://dev.to/clarkbw--/when-mcp-tools-break-isolate-the-server-before-blaming-the-agent-30bd", "published_at": "2026-06-24 12:03:08+00:00", "updated_at": "2026-06-24 12:09:10.045325+00:00", "lang": "en", "topics": ["large-language-models", "ai-agents", "developer-tools"], "entities": ["MCP", "Home Assistant", "MCP Inspector"], "alternates": {"html": "https://wpnews.pro/news/when-mcp-tools-break-isolate-the-server-before-blaming-the-agent", "markdown": "https://wpnews.pro/news/when-mcp-tools-break-isolate-the-server-before-blaming-the-agent.md", "text": "https://wpnews.pro/news/when-mcp-tools-break-isolate-the-server-before-blaming-the-agent.txt", "jsonld": "https://wpnews.pro/news/when-mcp-tools-break-isolate-the-server-before-blaming-the-agent.jsonld"}}