{"slug": "mcp-solves-the-plug-not-the-trust-boundary", "title": "MCP Solves the Plug, Not the Trust Boundary", "summary": "The Model Context Protocol (MCP) standardizes how AI applications connect to tools and data, but it fails to solve the problem of tool selection when systems scale to dozens or hundreds of tools. As AI assistants connect to more servers, the protocol's `tools/list` command forces models to load every tool definition into their context window, consuming tokens and increasing the likelihood of selecting a plausible but wrong tool. The industry now needs a `tools/search` primitive that filters tools by relevance, intent, and context, rather than relying on the current approach of dumping all available tools into the model's active context.", "body_md": "# MCP Has a Tool-Selection Problem\n\nThe Model Context Protocol is often explained as a standard way for AI applications to connect to tools, data, and external systems.\n\nThat part is useful. Before MCP, every AI application had to invent its own way to talk to files, APIs, databases, calendars, issue trackers, logs, and internal services. A shared protocol is a real improvement.\n\nBut once you connect more than a few servers, a different problem appears.\n\nThe hard part is no longer “can the model call a tool?”\n\nThe hard part becomes:\n\nWhich tool should the model even see?\n\nThat sounds like a small detail. It is not.\n\n## The naive version works\n\nThe simple MCP workflow is easy to understand.\n\nA client connects to a server. The client asks for the tools. The server returns a list of tools. Each tool has a name, a description, and an input schema. The model can then decide which tool to call.\n\nFor a demo, this works well.\n\nA server with three tools is easy:\n\n`get_weather`\n\n`search_files`\n\n`create_ticket`\n\nThe model sees all three. The user asks a question. 
The right tool is usually obvious.\n\nThis is the happy path most examples show.\n\nThe problem starts when the system succeeds.\n\n## Success means too many tools\n\nA useful AI assistant does not stay connected to one tiny server.\n\nIt gets connected to GitHub. Then Slack. Then Google Drive. Then Jira. Then a database. Then internal logs. Then billing. Then a deployment system. Then a few company-specific tools nobody outside the team understands.\n\nSuddenly the model is not choosing between three tools.\n\nIt is choosing between 30, 80, or 200.\n\nEach of those tools needs a name. Each needs a description. Each needs a schema. Many need nested properties, enums, examples, constraints, and warnings.\n\nAll of that must be represented somehow for the model.\n\nThis creates a context tax.\n\nBefore the model answers the user, before it reads the actual task, before it reasons about anything useful, the context window is already filled with tool definitions that may not matter for the current turn.\n\nMost of the time, the user does not need 200 tools.\n\nThey need one.\n\n## Tool definitions are not free\n\nTool definitions look small when viewed one at a time.\n\nA name here. A description there. A JSON schema. A few parameters.\n\nBut LLMs do not experience tool definitions one at a time. They experience the whole available tool surface as context.\n\nThat matters for two reasons.\n\nFirst, tool definitions consume tokens. The model has less room for the actual conversation, project files, logs, code, or documents.\n\nSecond, too many tools make selection harder. A larger tool list does not only cost more. 
It also creates more opportunities for the model to choose a plausible but wrong tool.\n\nThis is especially painful because many real tools are semantically close.\n\nIs the right tool `search_documents`, `search_files`, `query_knowledge_base`, `find_resource`, `list_pages`, or `get_record`?\n\nA human can ask a clarifying question or inspect the system. A model often picks the tool that sounds closest.\n\nSometimes that is enough. Sometimes it is not.\n\n## MCP standardizes listing, not relevance\n\nThis is the core weakness.\n\nMCP gives clients a standard way to ask a server what tools it has.\n\nIt does not give the client a standard way to ask:\n\n- Which tools are relevant to this user request?\n- Which tools should be shown for this intent?\n- Which tool schemas can be loaded later?\n- Which tools are commonly confused with each other?\n- Which tools are safe to expose in this context?\n- Which tools should be hidden unless specifically requested?\n\nThe protocol has `tools/list`.\n\nWhat many LLM applications need is closer to `tools/search`.\n\nNot search as a product feature. Search as a context-management primitive.\n\nA model should not need to carry every possible tool definition in its active context just because the user might need one of them later.\n\n## Anthropic already worked around this\n\nA good sign that something is missing at the protocol layer is when people solve it outside the protocol.\n\nAnthropic’s Tool Search/deferred-tools pattern is exactly that kind of signal.\n\nInstead of loading every tool definition upfront, the model sees a small search tool. When it needs a capability, it searches for relevant tools. 
Only then are the matching tool definitions loaded into context.\n\nThat is a sensible design.\n\nIt treats tool discovery as its own step instead of assuming every tool must be present from the beginning.\n\nBut this also highlights the gap: if every host or model provider solves tool discovery differently, MCP becomes a standard for exposing tools, while relevance remains vendor-specific.\n\nThat is not fatal. It is normal for young protocols.\n\nBut it is the next problem to solve.\n\n## Tool search should be boring\n\nA protocol-level tool search primitive does not need to be complicated.\n\nIt could start with something simple:\n\n```\n{\n  \"method\": \"tools/search\",\n  \"params\": {\n    \"query\": \"create a GitHub issue for this bug\",\n    \"limit\": 5\n  }\n}\n```\n\nThe server could return a ranked subset of tools:\n\n```\n{\n  \"tools\": [\n    {\n      \"name\": \"github.create_issue\",\n      \"title\": \"Create GitHub issue\",\n      \"description\": \"Create a new issue in a repository\"\n    },\n    {\n      \"name\": \"github.search_issues\",\n      \"title\": \"Search GitHub issues\",\n      \"description\": \"Search existing issues before creating a duplicate\"\n    }\n  ]\n}\n```\n\nThen the client could request full schemas only for the tools it wants to expose to the model.\n\nThere are many ways to design this. 
The exact shape matters less than the principle:\n\nTool discovery should be lazy, relevant, and explicit.\n\nThe current default is too eager.\n\n## Lazy schemas would help too\n\nThere is another version of the same idea: deferred schema loading.\n\nThe model may need to know that a tool exists before it needs the entire input schema.\n\nFor example, the model might only need this at first:\n\n```\n{\n  \"name\": \"billing.explain_invoice\",\n  \"description\": \"Explain an invoice status for a customer\"\n}\n```\n\nOnly if the model chooses that tool does it need the full schema:\n\n```\n{\n  \"customer_id\": \"string\",\n  \"invoice_id\": \"string\",\n  \"include_line_items\": \"boolean\"\n}\n```\n\nThat distinction matters.\n\nHumans do this all the time. We scan names first. We inspect details only when something looks relevant.\n\nLLM tools should work the same way.\n\n## This is not only about cost\n\nIt is tempting to frame this as a token-cost problem.\n\nThat is part of it, but not the most interesting part.\n\nThe bigger issue is reliability.\n\nA model with fewer irrelevant tools has fewer chances to make the wrong call. A model that sees only the relevant schema has less noise to interpret. 
A model that discovers tools intentionally can explain why a tool was selected.\n\nThis also helps debugging.\n\nIf an agent fails today, you may not know whether the tool was bad, the description was bad, the schema was confusing, or the model simply got distracted by a different tool that looked similar.\n\nA relevance step makes the process more inspectable.\n\nThe system can log:\n\n- the user request\n- the tool search query\n- the candidate tools returned\n- the tool definitions loaded\n- the final tool call\n\nThat is much easier to reason about than one giant prompt containing every tool the user might ever need.\n\n## The problem is worse for voice agents\n\nTool selection latency is annoying in chat.\n\nIn voice, it is brutal.\n\nA chat assistant can pause for a few seconds and still feel usable. A voice agent that waits silently feels broken.\n\nIf the model has to process a large tool surface before deciding what to do, every turn becomes heavier. If the selected tool then returns a large result inline, the assistant may wait even longer before it can speak.\n\nRealtime systems need partial progress.\n\nThey need the assistant to say something useful while work continues.\n\nThey need tool discovery and tool results to avoid dumping unnecessary context into the model.\n\nMCP is moving in that direction with discussions around better result types, streaming, and reference-based results. But tool relevance is the earlier problem. Before a tool returns too much data, the model first has to select the right tool.\n\n## A protocol is allowed to be incomplete\n\nNone of this means MCP is bad.\n\nMCP solved a real integration problem. 
It gave the ecosystem a common shape for exposing tools, resources, and prompts to AI applications.\n\nThat is valuable.\n\nBut protocols often expose the next layer of problems after they solve the first one.\n\nHTTP made web integration easier, then caching, authentication, compression, security headers, and content negotiation became important. SQL standardized querying, then indexing, planning, permissions, migrations, and replication became important.\n\nMCP standardized the tool boundary.\n\nNow tool discovery needs to grow up.\n\n## What I would like to see\n\nThe smallest useful addition would be a relevance-aware tool discovery primitive.\n\nSomething like:\n\n- `tools/search`\n- intent-filtered `tools/list`\n- deferred schema loading\n- ranked tool subsets\n- tool groups or namespaces\n- a standard way to ask for “tools matching this task”\n\nThe exact method name is not important.\n\nThe important part is that clients should not have to choose between two bad defaults:\n\n- Load every tool definition into the model context.\n- Invent a vendor-specific relevance layer outside MCP.\n\nA protocol for model-context should care deeply about what enters the model context.\n\nTool definitions are context.\n\nThat means tool discovery is not a secondary feature.\n\nIt is part of the core design problem.\n\n## The future MCP assistant should not see everything\n\nThe best AI assistants will not be the ones connected to the most tools.\n\nThey will be the ones that can find the right tool at the right moment, load only the context they need, call it safely, and explain what happened.\n\nMCP already gives us a standard way to expose tools.\n\nThe next step is giving models a standard way to not see all of them at once.", "url": "https://wpnews.pro/news/mcp-solves-the-plug-not-the-trust-boundary", "canonical_source": "https://vectoralix.com/blog/mcp-has-a-tool-selection-problem", "published_at": "2026-06-12 07:11:23+00:00", "updated_at": "2026-06-12 07:49:56.028456+00:00", "lang": "en", "topics": ["ai-tools", "large-language-models", "ai-agents", "ai-infrastructure"], "entities": ["Model Context Protocol", "MCP", "GitHub", "Slack", "Google Drive", "Jira"], "alternates": {"html": "https://wpnews.pro/news/mcp-solves-the-plug-not-the-trust-boundary", "markdown": "https://wpnews.pro/news/mcp-solves-the-plug-not-the-trust-boundary.md", "text": "https://wpnews.pro/news/mcp-solves-the-plug-not-the-trust-boundary.txt", "jsonld": "https://wpnews.pro/news/mcp-solves-the-plug-not-the-trust-boundary.jsonld"}}