{"slug": "your-llm-is-wrong-your-codebase-is-why", "title": "Your LLM Is Wrong. Your Codebase Is Why.", "summary": "A developer discovered that their AI coding assistant was generating incorrect information about their own codebase, not due to model failure but because of \"comprehension debt\" in the code. The assistant hallucinated function names, misidentified parameter types, imported unused packages, referenced deprecated patterns, and missed business rules—all because the source code lacked clear intent, documentation, and consistent naming conventions. The developer argues that these AI errors serve as a free audit, revealing gaps in code clarity that affect both human and machine readers.", "body_md": "It happened on a Tuesday. I asked my AI coding assistant to explain a function I'd written three months earlier. It described a function that doesn't exist.\n\nNot a total hallucination. The function *did* exist. Just not by that name, not with those parameters, not doing what the model confidently told me it was doing. The model had assembled a plausible story from vague signals and filled the gaps with fiction.\n\nMy first instinct was to blame the model. My second instinct, the one that actually helped, was to look at the code itself.\n\nThe model wasn't broken. My codebase was.\n\nTechnical debt is code that's hard to change. Comprehension debt is code that's hard to *understand*. Not just by future developers. By anything that has to read it cold: a new hire, a rubber duck, and increasingly, an AI assistant.\n\nYou've probably heard \"write code as if the next maintainer is a serial killer who knows where you live.\" The LLM version is more forgiving. But not by much.\n\nComprehension debt shows up when the intent of your code isn't captured *in* your code. The logic works. The tests pass. But nothing in the source tells a reader *why* a function does what it does, what its constraints are, or what it absolutely should not do. 
That knowledge lives in someone's head, in a Slack thread from two months ago, or nowhere at all.\n\nLLMs don't have access to the Slack thread. They only have your source.\n\nWhen your AI assistant gets your own codebase wrong, it's not random. The errors cluster around specific failure modes, and each one points to a real gap.\n\n**1. It invents function names.**\n\nThe model calls functions that don't exist, or calls existing functions by the wrong name. This usually means your naming is inconsistent or your barrel exports are incomplete. The model is pattern-matching across conventions that don't agree with each other.\n\n**2. It gets parameter types wrong.**\n\nIt passes a string where you want a typed enum, or a plain object where you've defined a specific interface. This almost always means missing or implicit type annotations in your function signatures. The model is guessing.\n\n**3. It imports packages you don't use.**\n\nIt reaches for `lodash` or `axios` when you already have utility wrappers around them. Your actual internal abstractions aren't legible to the model because they aren't documented anywhere the model can find them. The model falls back to what it knows from training.\n\n**4. It uses patterns you've deprecated.**\n\nIt calls the old version of your API, the one you stopped using eight months ago. Your codebase still *contains* those old patterns (maybe for backward compatibility, maybe just because cleanup hasn't happened yet) and the model doesn't know which version is current. Deprecation comments cost thirty seconds to write. Their absence costs you five minutes of confusion per assistant interaction.\n\n**5. It doesn't know the business rule.**\n\nIt gives you the technically valid version of a function, not the version that accounts for the actual constraint. \"This user lookup should always check the soft delete flag first\" is written in no file. It was decided in a call. 
The model can't know what was never written down.\n\nEach of these errors is a free audit item. You didn't have to run a tool to find it. The model found it for you.\n\nYou don't need a formal process for this. You just need to treat your LLM's confusion as a signal instead of noise.\n\nPick a module. Any module that's been around for more than a few months. Feed it to your AI assistant and ask these questions:\n\nDon't correct the model when it gets something wrong. Write down what it got wrong. That list is your comprehension debt register.\n\nFor a healthy module, the model will get most of this right. For a module with comprehension debt, you'll see the five signals show up fast.\n\nI ran this on an internal TypeScript service last quarter. Twelve exported functions. The model hallucinated the names of three of them, got the return type wrong on two others, and had no idea what the rate limit parameter was for. That's five of twelve functions wrong, roughly a 42% wrong answer rate, on a module I thought was well maintained. It wasn't. It just worked.\n\nWorking and legible are not the same thing.\n\nThe instinct is to reach for RAG (chunk your codebase, embed it, retrieve relevant context before each LLM call). That helps. I cover the full approach in [my production RAG guide](https://mudassirkhan.me/blog/production-rag-guide-2026) if you want the implementation details.\n\nBut RAG retrieves your documentation. If your documentation is the code itself and the code is opaque, RAG gives the model better access to opaque code. The underlying problem doesn't change.\n\nThe actual fix is cheaper than you think:\n\n**Write the intent, not the implementation.** A JSDoc comment that says \"Validates and normalizes a user object. Always call this before persisting to the database. Does NOT check permissions.\" gives the model something to retrieve. A comment that says \"validates user\" does not.\n\n**Mark your deprecations inline.** `@deprecated Use getUserV2 instead` takes five seconds. 
It means the model stops confidently recommending the old API.\n\n**Put your business rules in the file that enforces them.** Not in the ticket. Not in Confluence. In the file. A comment above the rate limit parameter that says \"this is hardcoded per the billing agreement with enterprise customers, do not make it configurable\" is documentation that actually travels with the code.\n\nThe goal isn't to write documentation for humans. It's to write documentation that your LLM assistant can parse so it can help you correctly. The secondary effect is that it also helps the next human on your team. That's free.\n\nFor teams working on larger AI agent systems, the memory and context patterns that help here are the same ones I break down in [my post on AI agent memory management](https://mudassirkhan.me/blog/ai-agent-memory-management). Comprehension debt in your codebase and context gaps in your agents come from the same root cause: undocumented intent.\n\nYou can also get a quick read on your current exposure with [this LLM hallucination risk estimator](https://mudassirkhan.me/tools/llm-hallucination-risk-estimator). It won't diagnose specific debt, but it gives you a calibrated starting point for where to focus.\n\nYour LLM assistant is, right now, the most honest reader your codebase has. It doesn't know the context you carry in your head. It doesn't remember the decision you made in 2024. It reads what's there and tries to make sense of it.\n\nWhen it gets something wrong, that's signal. The model isn't failing. It's showing you exactly what a reader without your context has to work with.\n\nThat's a gift. 
Most code never gets that kind of external read until the next engineer joins and asks the same confused questions.\n\nUse it.\n\n*If you want this kind of thinking applied to your actual codebase or AI systems architecture, that is exactly the kind of work I take on.*\n\n*If you want a deeper look at production AI systems, I cover it on mudassirkhan.me.*\n\n*How wrong does your LLM get your own codebase? Drop a number in the comments. Curious what percentage of wrong answers people are seeing in production.*", "url": "https://wpnews.pro/news/your-llm-is-wrong-your-codebase-is-why", "canonical_source": "https://dev.to/mudassirworks/your-llm-is-wrong-your-codebase-is-why-1jjp", "published_at": "2026-05-26 21:53:18+00:00", "updated_at": "2026-05-26 22:02:57.563784+00:00", "lang": "en", "topics": ["large-language-models", "ai-tools", "ai-products", "ai-research", "ai-ethics"], "entities": [], "alternates": {"html": "https://wpnews.pro/news/your-llm-is-wrong-your-codebase-is-why", "markdown": "https://wpnews.pro/news/your-llm-is-wrong-your-codebase-is-why.md", "text": "https://wpnews.pro/news/your-llm-is-wrong-your-codebase-is-why.txt", "jsonld": "https://wpnews.pro/news/your-llm-is-wrong-your-codebase-is-why.jsonld"}}