Honestly, I didn't think tool discovery would be a problem.
I've built 10+ MCP servers now, I'm 1,800 hours into my knowledge base project, and after 85 production outages I can tell you this: bad tool discovery will kill your MCP server before bad code ever does.
Let me explain.
MCP is pretty straightforward, right? You implement `tools/list` and `tools/call`, and you're done. The AI figures it out, the user is happy.
Except that's not what actually happens.
I spent three days debugging this one issue: my AI client knew I had a "search knowledge base" tool, but it kept calling the wrong tool name. It kept calling `search_knowledgebase` instead of `search_papers`. I added aliases. I improved descriptions. Nothing worked.
Then I realized the real problem: the LLM hallucinates tool names. It doesn't read your tool list; it guesses based on what it thinks the name should be. And when it guesses wrong, your server returns a `tool not found` error, the AI panics, and the entire conversation dies.
I learned this the hard way. Three separate production outages, all because of bad discovery. Today I want to share what I fixed, the code I wrote, and how you can avoid the same pain.
Let me count the ways I've messed this up:
This is the big one. My tool is called `search_papers`. The LLM thinks "search papers" → `search_paper` (singular). Or `search_knowledge` → wrong again. Or `find_paper` → nope.
Every time it happens, you get:
Error: Tool not found: search_paper
Available tools: search_papers, get_paper, ...
Then the LLM tries again, gets it wrong again, and eventually gives up. User experience: zero stars.
When I started, my tool descriptions were:
"description": "Searches papers"
That's useless. The LLM doesn't know when to use it. It doesn't know what kind of search it is. It doesn't know what it returns. I've had my AI client use my "search" tool when it should have used "get paper by id" because the description was bad.
It's not just tool names; it's parameter names too. My parameter is called `query`, but the LLM keeps passing `question` or `search_term`. Same problem: error, retry, failure.
When I first started, if the tool name was wrong, I just returned an error. That's the spec, right? Well, the spec doesn't say you can't help the LLM find the right tool.
After three outages, I built a fuzzy discovery layer into my MCP server. Here's what it looks like in Java Spring Boot:
// Imports assume Spring Boot 3 (jakarta.*); on Spring Boot 2, use javax.servlet.* instead.
import com.fasterxml.jackson.databind.ObjectMapper;
import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.http.HttpStatus;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;

@Component
public class McpDiscoveryFilter extends OncePerRequestFilter {

    // OncePerRequestFilter already has a commons-logging "logger" field,
    // so the SLF4J logger gets its own name.
    private static final Logger log = LoggerFactory.getLogger(McpDiscoveryFilter.class);

    private final List<ToolDefinition> tools;
    private final ObjectMapper objectMapper;
    private final FuzzyMatcher fuzzyMatcher;

    public McpDiscoveryFilter(List<ToolDefinition> tools, ObjectMapper objectMapper) {
        this.tools = tools;
        this.objectMapper = objectMapper;
        this.fuzzyMatcher = new FuzzyMatcher();
    }

    @Override
    protected void doFilterInternal(
            HttpServletRequest request,
            HttpServletResponse response,
            FilterChain filterChain
    ) throws ServletException, IOException {
        // Only intercept tools/call
        if (!request.getRequestURI().equals("/mcp/tools/call")) {
            filterChain.doFilter(request, response);
            return;
        }

        // Read the body. The reader can only be consumed once, so the proceedWith...
        // helpers (not shown) have to pass a wrapped request down the chain.
        var body = request.getReader().lines().collect(Collectors.joining());
        var callRequest = objectMapper.readValue(body, CallToolRequest.class);
        String requestedName = callRequest.getName();

        Optional<ToolDefinition> exactMatch = findExactMatch(requestedName);
        if (exactMatch.isPresent()) {
            // Exact match found, proceed normally
            proceedWithRequest(request, response, filterChain, body, exactMatch.get());
            return;
        }

        // No exact match: try fuzzy match
        List<MatchResult> matches = fuzzyMatcher.findBestMatches(requestedName, tools);
        if (matches.isEmpty() || matches.get(0).score() < 0.7) {
            // Still no good: return helpful error
            writeHelpfulError(response, requestedName, matches);
            return;
        }

        // We found a close match: rewrite the request and proceed!
        ToolDefinition bestMatch = matches.get(0).tool();
        log.info("Fuzzy matched: {} -> {} (score: {})",
                requestedName, bestMatch.getName(), matches.get(0).score());

        // Rewrite with correct name and proceed
        callRequest.setName(bestMatch.getName());
        String rewrittenBody = objectMapper.writeValueAsString(callRequest);
        proceedWithRewrittenBody(request, response, filterChain, rewrittenBody, bestMatch);
    }

    private Optional<ToolDefinition> findExactMatch(String name) {
        return tools.stream()
                .filter(t -> t.getName().equals(name))
                .findFirst();
    }

    private void writeHelpfulError(
            HttpServletResponse response,
            String requestedName,
            List<MatchResult> matches
    ) throws IOException {
        Map<String, Object> error = new HashMap<>();
        error.put("result", null);
        error.put("error", Map.of(
                "message", String.format("Tool '%s' not found. Did you mean one of these? %s",
                        requestedName,
                        matches.stream()
                                .limit(3)
                                .map(m -> m.tool().getName())
                                .collect(Collectors.joining(", "))
                ),
                "code", "TOOL_NOT_FOUND"
        ));
        response.setStatus(HttpStatus.OK.value());
        response.setContentType("application/json");
        objectMapper.writeValue(response.getWriter(), error);
    }
}
And the fuzzy matching implementation is simple—I use the Levenshtein distance algorithm:
import java.util.List;
import java.util.stream.Collectors;

public class FuzzyMatcher {

    // Scores every tool name against the input and returns candidates above 0.5, best first.
    public List<MatchResult> findBestMatches(String input, List<ToolDefinition> tools) {
        return tools.stream()
                .map(tool -> {
                    int distance = levenshteinDistance(input.toLowerCase(), tool.getName().toLowerCase());
                    // 1.0 means identical; each edit costs 1 / (length of the longer name)
                    double score = 1.0 - (double) distance / Math.max(input.length(), tool.getName().length());
                    return new MatchResult(tool, score);
                })
                .filter(mr -> mr.score() > 0.5)
                .sorted((a, b) -> Double.compare(b.score(), a.score()))
                .collect(Collectors.toList());
    }

    // Classic dynamic-programming edit distance (insert, delete, substitute each cost 1).
    private int levenshteinDistance(String s1, String s2) {
        int[][] dp = new int[s1.length() + 1][s2.length() + 1];
        for (int i = 0; i <= s1.length(); i++) {
            dp[i][0] = i;
        }
        for (int j = 0; j <= s2.length(); j++) {
            dp[0][j] = j;
        }
        for (int i = 1; i <= s1.length(); i++) {
            for (int j = 1; j <= s2.length(); j++) {
                int cost = (s1.charAt(i - 1) == s2.charAt(j - 1)) ? 0 : 1;
                dp[i][j] = Math.min(
                        Math.min(dp[i - 1][j] + 1, dp[i][j - 1] + 1),
                        dp[i - 1][j - 1] + cost
                );
            }
        }
        return dp[s1.length()][s2.length()];
    }
}

// A record exposes its components as tool() and score(), not getTool()/getScore().
public record MatchResult(ToolDefinition tool, double score) {}
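To get a feel for what the 0.7 threshold lets through, here is a small standalone sketch that applies the same scoring formula to a few of the guessed names from earlier. `ScoreDemo` is a made-up class for illustration, and it works on plain strings rather than `ToolDefinition`.

```java
final class ScoreDemo {

    // Same dynamic-programming Levenshtein distance as in FuzzyMatcher.
    static int distance(String a, String b) {
        int[][] dp = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) dp[i][0] = i;
        for (int j = 0; j <= b.length(); j++) dp[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                dp[i][j] = Math.min(Math.min(dp[i - 1][j] + 1, dp[i][j - 1] + 1),
                                    dp[i - 1][j - 1] + cost);
            }
        }
        return dp[a.length()][b.length()];
    }

    // score = 1 - distance / (length of the longer name), compared case-insensitively
    static double score(String requested, String actual) {
        String a = requested.toLowerCase();
        String b = actual.toLowerCase();
        int longest = Math.max(a.length(), b.length());
        return longest == 0 ? 1.0 : 1.0 - (double) distance(a, b) / longest;
    }

    public static void main(String[] args) {
        String[] guesses = {"search_paper", "Search_Papers", "search_knowledgebase"};
        for (String guess : guesses) {
            System.out.printf("%s -> %.2f%n", guess, score(guess, "search_papers"));
        }
    }
}
```

`search_paper` is a single edit away from `search_papers`, so it scores 1 - 1/13 ≈ 0.92 and clears the threshold. `search_knowledgebase` lands well below 0.7, so a guess like that is not something edit distance can rescue; it needs an explicit alias or a better description.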
After deploying this, I checked my logs: Tool Not Found errors dropped from 14 a day to 2 (the full table is at the end of this post).
That's massive. Most of the hallucinated tool names are close, with just the pluralization wrong or a slight spelling variation. The fuzzy matching fixes them automatically, and the user never even knows.
When it can't fix it automatically, it gives the LLM suggestions: "Did you mean one of these?" That's way better than just "tool not found"—the LLM can correct itself instead of giving up.
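MCP messages are JSON-RPC 2.0, where an error object carries an integer `code` and a string `message`, and a response has either `result` or `error` but not both. If you want the suggestion error to follow that shape, a minimal sketch with plain maps could look like this. `ToolNotFoundError` and its `build` method are hypothetical names, and using `-32602` (invalid params) for an unknown tool is an assumption worth checking against the spec version you target.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ToolNotFoundError {

    // JSON-RPC 2.0 "invalid params"; assumed here to be the right code for an unknown tool.
    static final int INVALID_PARAMS = -32602;

    // requestId is the "id" of the incoming JSON-RPC request, echoed back unchanged.
    static Map<String, Object> build(Object requestId, String requestedName, List<String> suggestions) {
        // Only add the "did you mean" part when there is something to suggest.
        String message = suggestions.isEmpty()
                ? String.format("Tool '%s' not found.", requestedName)
                : String.format("Tool '%s' not found. Did you mean one of these? %s",
                        requestedName, String.join(", ", suggestions));

        Map<String, Object> error = new LinkedHashMap<>();
        error.put("code", INVALID_PARAMS);
        error.put("message", message);

        Map<String, Object> envelope = new LinkedHashMap<>();
        envelope.put("jsonrpc", "2.0");
        envelope.put("id", requestId);
        envelope.put("error", error);
        return envelope;
    }

    public static void main(String[] args) {
        System.out.println(build(7, "search_paper", List.of("search_papers", "get_paper")));
    }
}
```

The message text is what the LLM actually reads on its retry, so the suggestions matter more than the exact error code.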
I used to write tool descriptions like this:
{
"name": "search_papers",
"description": "Searches papers"
}
Useless. Now I write:
{
"name": "search_papers",
"description": "Search my personal knowledge base for papers and notes by semantic similarity. Use this when you need to find information I've previously saved that's relevant to the current conversation. Returns the most relevant papers with their content. Parameters: query (string) - the search query based on the current context."
}
That's much longer, but it tells the LLM what the tool searches, when to use it, what it returns, and which parameter to pass.
Since I improved the descriptions, I've seen about 30% fewer cases where the AI picks a real but wrong tool for the job. It just knows better now.
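MCP tool definitions also carry an `inputSchema` (a JSON Schema object), so parameter documentation doesn't have to live only inside the description string. Here's a rough sketch of the same tool with its parameter described in the schema, built from plain Java maps. The class name is made up, and how your server serializes this map is an assumption.

```java
import java.util.List;
import java.util.Map;

final class SearchPapersDefinition {

    // Builds the tool definition as nested maps, ready to be serialized to JSON.
    static Map<String, Object> definition() {
        return Map.of(
                "name", "search_papers",
                "description", "Search my personal knowledge base for papers and notes by semantic similarity. "
                        + "Use this when you need to find information I've previously saved that's relevant "
                        + "to the current conversation. Returns the most relevant papers with their content.",
                "inputSchema", Map.of(
                        "type", "object",
                        "properties", Map.of(
                                "query", Map.of(
                                        "type", "string",
                                        "description", "The search query based on the current context.")),
                        // Marking query as required gives the client a chance to catch a missing parameter.
                        "required", List.of("query")));
    }

    public static void main(String[] args) {
        System.out.println(definition());
    }
}
```

Keeping the parameter name and its description in the schema puts them where the client expects them, and the free-text description can focus on when to use the tool.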
It's not just tool names—parameters get hallucinated too. I added parameter aliasing:
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ParameterAliasResolver {

    private final Map<String, String> aliases = new HashMap<>();

    public ParameterAliasResolver() {
        // Common aliases for common parameter names.
        // addAlias() lowercases the key, because resolve() lowercases the incoming name.
        addAlias("question", "query");
        addAlias("search", "query");
        addAlias("search_term", "query");
        addAlias("id", "paper_id");
        addAlias("paperId", "paper_id");
        addAlias("key", "api_key");
        addAlias("token", "api_key");
    }

    public void addAlias(String alias, String correctName) {
        aliases.put(alias.toLowerCase(), correctName);
    }

    public String resolve(String paramName) {
        return aliases.getOrDefault(paramName.toLowerCase(), paramName);
    }

    // Renames aliased keys in place; params must be a mutable map.
    public void resolveAll(Map<String, Object> params) {
        List<String> keysToRemove = new ArrayList<>();
        Map<String, Object> toAdd = new HashMap<>();
        for (Map.Entry<String, Object> entry : params.entrySet()) {
            String resolved = resolve(entry.getKey());
            if (!resolved.equals(entry.getKey())) {
                keysToRemove.add(entry.getKey());
                toAdd.put(resolved, entry.getValue());
            }
        }
        keysToRemove.forEach(params::remove);
        params.putAll(toAdd);
    }
}
Then in your tool call handler:
public CallToolResult callTool(String name, Map<String, Object> args) {
    parameterAliasResolver.resolveAll(args);
    // proceed with correct parameter names
}
This fixes another 10-15% of errors. Common parameter name variations just work automatically.
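One edge case to watch: if the LLM sends both an alias and the real parameter (say `search_term` and `query`), an in-place rename lets the alias overwrite the real value. Here's a sketch of a variant that returns a new map and lets the canonical name win. `SafeAliasResolver` is a hypothetical name, and "canonical wins" is a design choice of this sketch, not something the protocol dictates.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class SafeAliasResolver {

    private final Map<String, String> aliases = new HashMap<>();

    void addAlias(String alias, String canonicalName) {
        aliases.put(alias.toLowerCase(), canonicalName);
    }

    // Returns a new map and never mutates the caller's arguments,
    // so it also works when args is an immutable map.
    Map<String, Object> resolveAll(Map<String, Object> args) {
        Map<String, Object> resolved = new LinkedHashMap<>();

        // First pass: copy every key that is not an alias.
        for (Map.Entry<String, Object> entry : args.entrySet()) {
            if (!aliases.containsKey(entry.getKey().toLowerCase())) {
                resolved.put(entry.getKey(), entry.getValue());
            }
        }

        // Second pass: map aliases to canonical names, but only fill gaps.
        for (Map.Entry<String, Object> entry : args.entrySet()) {
            String canonicalName = aliases.get(entry.getKey().toLowerCase());
            if (canonicalName != null) {
                resolved.putIfAbsent(canonicalName, entry.getValue());
            }
        }
        return resolved;
    }
}
```

With `question` and `search_term` registered as aliases for `query`, a call carrying only `question` comes out with `query` set, while a call carrying both `search_term` and `query` keeps the explicit `query` value.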
I ended up building a proper tools registry that holds all the metadata in one place:
@Configuration
public class ToolRegistryConfiguration {

    @Bean
    public ToolRegistry toolRegistry(List<McpTool> tools) {
        ToolRegistry registry = new ToolRegistry();
        tools.forEach(registry::register);
        return registry;
    }
}

public interface McpTool {
    ToolDefinition getDefinition();
    Object call(Map<String, Object> args) throws ToolException;
}

public class ToolRegistry {

    private final Map<String, McpTool> tools = new HashMap<>();
    // Tool-name aliases live in their own map so they can't collide with parameter aliases.
    private final Map<String, String> toolAliases = new HashMap<>();

    public void register(McpTool tool) {
        ToolDefinition def = tool.getDefinition();
        tools.put(def.getName(), tool);
        // Register aliases from the definition if provided
        if (def.getAliases() != null) {
            for (String alias : def.getAliases()) {
                toolAliases.put(alias.toLowerCase(), def.getName());
            }
        }
    }
    // ... fuzzy matching logic here
}
This keeps everything organized. Each tool just implements the interface, gets automatically registered, and discovery just works.
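The lookup order that falls out of all this is: exact name first, then explicit alias, then fuzzy match above a threshold. Here's a self-contained sketch of that order. It's not the registry code above: `ToolNameResolver` is a hypothetical helper that works on plain tool names, and it matches names case-insensitively, which is an assumption of the sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

final class ToolNameResolver {

    private final Map<String, String> namesByLowerCase = new LinkedHashMap<>();
    private final Map<String, String> aliases = new LinkedHashMap<>();
    private final double threshold;

    ToolNameResolver(double threshold) {
        this.threshold = threshold;
    }

    void register(String name, String... toolAliases) {
        namesByLowerCase.put(name.toLowerCase(), name);
        for (String alias : toolAliases) {
            aliases.put(alias.toLowerCase(), name);
        }
    }

    Optional<String> resolve(String requested) {
        String key = requested.toLowerCase();
        // 1. Exact match (ignoring case).
        if (namesByLowerCase.containsKey(key)) {
            return Optional.of(namesByLowerCase.get(key));
        }
        // 2. Explicit alias.
        if (aliases.containsKey(key)) {
            return Optional.of(aliases.get(key));
        }
        // 3. Fuzzy match: the best score wins, but only at or above the threshold.
        String best = null;
        double bestScore = 0.0;
        for (Map.Entry<String, String> entry : namesByLowerCase.entrySet()) {
            double score = score(key, entry.getKey());
            if (score > bestScore) {
                bestScore = score;
                best = entry.getValue();
            }
        }
        return bestScore >= threshold ? Optional.ofNullable(best) : Optional.empty();
    }

    static double score(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        return longest == 0 ? 1.0 : 1.0 - (double) distance(a, b) / longest;
    }

    // Levenshtein distance: insert, delete, and substitute each cost 1.
    static int distance(String a, String b) {
        int[][] dp = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) dp[i][0] = i;
        for (int j = 0; j <= b.length(); j++) dp[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                dp[i][j] = Math.min(Math.min(dp[i - 1][j] + 1, dp[i][j - 1] + 1),
                                    dp[i - 1][j - 1] + cost);
            }
        }
        return dp[a.length()][b.length()];
    }
}
```

Putting aliases ahead of fuzzy matching matters for guesses like `search_knowledgebase`, which are too far from `search_papers` for edit distance to catch but trivial to handle once registered as an alias.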
Let's be honest—this adds complexity. Is it worth it?
Here's the thing—MCP is a new protocol. Everybody's figuring it out as we go. The spec covers the basics, but it doesn't cover all the edge cases that happen in production.
I used to think: "The LLM should know better. It should read the tool list correctly."
But that's not how LLMs work. They predict the next token; they don't "read" like humans. They guess based on patterns. And "search paper" → `search_paper` is a totally reasonable guess from their perspective.
Instead of fighting the LLM, work with it. Add a little fuzzy discovery, and most of your problems just go away.
Another thing I learned: graceful degradation is everything. It's better to fix the mistake automatically than to fail and make the user fix it. The user didn't do anything wrong—the LLM hallucinated. Why should the user pay the price?
Let me leave you with the actual numbers from my server after two weeks:
| Metric | Before Fuzzy Discovery | After Fuzzy Discovery |
|---|---|---|
| Tool Not Found Errors/day | 14 | 2 |
| Successful Tool Calls/day | 287 | 301 |
| Success Rate | 95.1% | 99.3% |
| User Complaints about "it doesn't work" | 2-3/week | 0 |
That's the difference between "this is flaky" and "this just works".
Have you built an MCP server? Have you noticed the LLM hallucinating tool names or parameter names? I'd love to hear—am I the only one who's run into this? Do you have a different solution?
Drop a comment below and let me know what your experience has been with MCP discovery. What other hidden production gotchas have you found?