When MCP tools break, isolate the server before blaming the agent A developer debugging an MCP (Model Context Protocol) integration found that an LLM agent's self-report of unavailable tools was misleading. The agent claimed that new stdio MCP servers were not connected, but the real issue was a malformed test invocation, not server failure. The developer advises isolating the server layer and using direct connectivity checks like the MCP Inspector instead of relying on the agent's narration. You add a new MCP server, or a new tool to an existing one. You ask the agent to use it and it says the tool isn't available — or worse, it confidently reports that "only the HTTP server is connected" and the new stdio servers aren't live. The instinct is to blame the MCP server or bounce the runtime. Resist it. The agent's narration about its own toolset is not ground truth, and a malformed test invocation will happily manufacture a fake "MCP unreachable" symptom. Here's the broke → tried → fixed of a debugging session that burned time on exactly this. We'd just added a second stdio MCP server and a new tool. To smoke-test it, we hand-typed a quick probe at the agent runtime and asked it to use the new capability. The agent came back with this: Those tools aren't in my live toolset right now — only the Home Assistant HTTP/SSE server is connected. I'll read the files directly instead. It then shelled out to read files off disk to answer the question — being resourceful, which made it look like the tools genuinely weren't wired up. Two new stdio servers, both apparently invisible to the session, while the one HTTP/SSE server was fine. That sure reads like the stdio MCP servers aren't connecting. That self-report is the trap. Treating "the agent says the tools aren't available" as a diagnosis — instead of a symptom — is what sent us down the wrong path. There are two failure modes here and they look identical from the agent's chair: The agent can't tell these apart for you, and it will narrate either one as "tools not available." Worse, an LLM describing its own capabilities is itself a confabulation risk — it pattern-completes a plausible story "only the HTTP server is live" around the fact that some tools didn't show up. That story is not evidence about server health. So the discipline is: don't ask the agent whether the server is up. Ask the server. Isolate the layer before you touch anything. We took the self-report at face value. The agent said the stdio servers weren't connected, so we assumed they weren't connected. This is the root mistake every later dead-end inherits from — we let the agent's narration stand in for a connectivity check. It is not one. "Tools didn't load? Refresh them — restart the runtime." So we bounced the long-running agent gateway daemon, the shape of which is: generic shape — restart the long-running agent runtime/gateway