{"slug": "an-mcp-server-can-vanish-from-your-ai-agent-mid-conversation-here-s-the-30-that", "title": "An MCP server can vanish from your AI agent mid-conversation. Here's the 30-second timeout that did it to me.", "summary": "A developer discovered that their MCP server for Safari browser control could silently disappear from an AI agent's tool catalog mid-conversation due to a 30-second initialization timeout. The `safari-mcp` server's top-level await for profile detection sometimes exceeded the 30-second handshake deadline, causing the client to kill the server without warning the user or the agent, which then continued operating with an incomplete tool catalog. The failure was invisible to users, as the agent simply reimplemented missing functionality with alternative tools like `Bash` and `curl` instead of reporting the missing capabilities.", "body_md": "The bug report was: \"the browser tools are gone.\"\n\nI'd been running the same Claude Code session for an hour, calling `safari_navigate`, `safari_click`, `safari_read_page`: the usual flow. Then I opened a new conversation in the same project and the safari tools weren't in the catalog at all. The agent didn't say \"I tried to use safari-mcp and it's not available.\" It just… didn't use them. It re-implemented half of what I needed with `Bash` and `curl`.\n\nThat second part is the worst part. The agent doesn't *know* that the tool catalog is incomplete. It only knows what's in front of it. If a tool is missing, it makes do with what it has, and the user has no idea their last release broke discoverability.\n\nThis post is about the 30-second timeout that caused it, the diagnosis path, and the three-line fix. But more than that, it's about a failure mode in stdio MCP that I think every MCP author needs to know about and most don't.\n\n`safari-mcp` is an MCP server that drives the real macOS Safari. When the user wants their agent to use a separate browser profile (e.g. 
\"Work\" vs \"Personal\"), they launch the server with `SAFARI_PROFILE=work` and the server scopes every tool call to that profile's window. That means at startup the server has to find the window: call AppleScript, enumerate Safari's open windows, match by profile name, cache the window ref.\n\nHere's what the startup code used to do:\n\n``` js\nif (SAFARI_PROFILE) {\n  await new Promise(r => setTimeout(r, 50));\n  await refreshTargetWindow(true);   // <-- this line\n  if (_targetWindowRef) {\n    _logProfile(`Startup: Profile \"${SAFARI_PROFILE}\" → ${_targetWindowRef}`);\n  } else {\n    _logProfile(`WARNING: Profile \"${SAFARI_PROFILE}\" window NOT found`);\n  }\n}\n```\n\nES module top-level await. Looks fine. Profile detection runs once, the server knows which window to target, life is good.\n\nIn testing this took ~50–200ms. In production it sometimes took longer than 30 seconds.\n\nWhen Claude Code launches an MCP server it expects an `initialize` response within 30 seconds. That's the handshake: the server announces its protocol version and capabilities, the client says \"ok, here's my session,\" and only then asks for the tool catalog via `tools/list`. Until that handshake completes, the server's tools don't enter the conversation's tool catalog.\n\nIf your top-level await runs `>30s` before the stdio loop gets a chance to respond, the handshake misses the deadline. The client gives up. The server is killed. No retry. No warning surfaced to the user, just a log entry deep in the Claude Code internals that says \"MCP server failed to initialize in 30s.\"\n\nAnd critically: **the conversation continues**. The agent's tool catalog is whatever responded in time. Safari tools just aren't there. The agent has no way to know they were *supposed* to be there.\n\nI want to underline this: the failure was completely invisible to me as a user. I didn't see a stack trace. 
I didn't see a \"your tools didn't load.\" I saw an agent that didn't reach for the tools I'd just shipped a fix for.\n\n`refreshTargetWindow(true)` calls into a Swift helper that runs:\n\n```\ntell application \"Safari\"\n  return name of every window\nend tell\n```\n\nOn a fresh Safari with three tabs this returns in 12ms. On a real user's Safari, it does any of the following:\n\n- `name of every window` waits for each window's title to settle. 5–20s.\n- `~/Library/Containers/com.apple.Safari/`. The TCC privacy subsystem reverifies your bundle's automation permission. Anywhere from instant to \"until the user moves their mouse.\"\n\nNone of those are bugs. They are normal macOS behavior. They were not in the 99th percentile when I tested; they showed up in the 99.9th percentile when the server hit my actual user base.\n\nThe first thing I did was assume the bug was in the MCP protocol layer. I went and looked at the stdio framing code, the JSON-RPC parser, the request dispatcher. None of it was the problem.\n\nThe second thing I did was look at the `refreshTargetWindow` call and think \"well, it works in my testing.\" Which is the most expensive sentence in software.\n\nThe actual diagnostic, which took me about 20 minutes to find, was to read the Claude Code MCP debug logs:\n\n```\n[MCP] safari-mcp: spawned (pid 47192)\n[MCP] safari-mcp: sending initialize request\n[MCP] safari-mcp: initialize timed out after 30000ms, killing process\n```\n\nThat's it. That's the only signal. The MCP client doesn't tell you *what* the server was doing. It doesn't ask the server \"are you stuck?\" It just kills it.\n\nOnce I had that line, the rest was obvious: the only thing that runs before stdio is `refreshTargetWindow`. If `refreshTargetWindow` is slow, stdio never gets a chance. 
Therefore: don't block stdio on it.\n\n``` js\nif (SAFARI_PROFILE) {\n  // Fire-and-forget: module init no longer awaits the probe, so the stdio loop can bind first.\n  (async () => {\n    await new Promise(r => setTimeout(r, 50));\n    await refreshTargetWindow(true);\n    if (_targetWindowRef) {\n      _logProfile(`Startup: Profile \"${SAFARI_PROFILE}\" → ${_targetWindowRef}`);\n    } else {\n      _logProfile(`WARNING: Profile \"${SAFARI_PROFILE}\" window NOT found`);\n    }\n  })();\n}\n```\n\nWrap the whole startup probe in a fire-and-forget IIFE. Module init returns immediately. Stdio loop binds. Initialize handshake responds in ~5ms. By the time the first `safari_*` tool call arrives, the profile window probe has usually finished, and if it hasn't, `getTargetWindowRef()` already has a lazy-refresh path that handles a missing cache by running the probe inline.\n\nThe correctness story is: the probe is a cache warm-up, not a hard prerequisite. The tool call path already knows how to handle a cold cache. So there is no reason to make module init wait.\n\nThree lines changed. The bug is gone.\n\nIf your MCP server does *anything* at startup that touches an external process, an external API, the filesystem outside your bundle, or a system service, **you cannot let it block initialize**.\n\n- `tools/list` or first tool call, not at import.\n- `xprop`? You have no SLA on those. Defer.\n- `~/.config/<your-tool>` file? Pretty safe. But still: if it's gone or corrupted, log and continue; don't crash module init.\n\nThe asymmetric cost matters here. If your slow probe blocks `initialize`, the failure mode is the worst possible kind: silent absence of your tools, no error surfaced, agent doesn't know to retry. If your slow probe runs in the background and a tool call arrives before it finishes, the failure mode is at most a single slow tool call, and you can return a clear error message that the agent can see and act on.\n\nThere is no version of this trade where blocking is the right answer.\n\nThis bug shouldn't be possible to ship. 
Some specific changes I'd love to see:\n\n- `initialize` should not block on tool catalog completeness.\n- `tools/list?ttl=lazy`, ask me later.\"\n- `package.json` will be unavailable this session.\" The agent reads that, the user reads that, everyone can act on it.\n- `initialize` timeouts should generate a retry, not a kill.\n\nNone of those are the server author's problem to fix, but all of them would have caught this bug for me before a user did.\n\nWe are heading toward agents that ship with dozens of MCP servers. The probability that at least one of them silently fails to initialize on any given launch goes from \"small\" to \"nearly certain\" once you're stacking 10+ servers. If the failure mode is \"agent silently lacks tools it should have,\" the user experience for AI agents becomes \"sometimes the AI is dumber than usual and we don't know why.\"\n\nThat's not a future I want. It's a future where users blame the model for missing capabilities that the harness didn't even surface.\n\nIf you're shipping an MCP server: audit your top-level await. If it touches anything that could stall, move it off the critical path. Today. Before someone files the bug report I just filed against myself.\n\n*The fix shipped in safari-mcp v2.11.9 today. The full diff is here. 
The project is on GitHub if you want to see how an MCP server scopes itself to a Safari profile, or to file the next bug.*", "url": "https://wpnews.pro/news/an-mcp-server-can-vanish-from-your-ai-agent-mid-conversation-here-s-the-30-that", "canonical_source": "https://dev.to/achiya-automation/an-mcp-server-can-vanish-from-your-ai-agent-mid-conversation-heres-the-30-second-timeout-that-did-3khe", "published_at": "2026-05-28 08:56:52+00:00", "updated_at": "2026-05-28 09:23:40.171390+00:00", "lang": "en", "topics": ["ai-agents", "ai-tools", "ai-products", "ai-infrastructure"], "entities": ["Claude Code", "Safari", "MCP", "AppleScript", "safari-mcp"], "alternates": {"html": "https://wpnews.pro/news/an-mcp-server-can-vanish-from-your-ai-agent-mid-conversation-here-s-the-30-that", "markdown": "https://wpnews.pro/news/an-mcp-server-can-vanish-from-your-ai-agent-mid-conversation-here-s-the-30-that.md", "text": "https://wpnews.pro/news/an-mcp-server-can-vanish-from-your-ai-agent-mid-conversation-here-s-the-30-that.txt", "jsonld": "https://wpnews.pro/news/an-mcp-server-can-vanish-from-your-ai-agent-mid-conversation-here-s-the-30-that.jsonld"}}