{"slug": "i-built-a-docs-q-a-engine-that-returns-null-instead-of-hallucinating", "title": "I built a docs Q&A engine that returns null instead of hallucinating", "summary": "An engineer built a documentation Q&A engine that returns `null` instead of hallucinating answers when a query has no match in the corpus. The Knowledge Base API uses BM25 retrieval with POS-aware lemmatization and WordNet synonym expansion, requiring no language models, API keys, or external data transfers. The system also handles identifier-heavy queries by splitting on underscores, hyphens, and CamelCase boundaries, and includes a BK-tree for typo-tolerant matching.", "body_md": "Every \"docs chatbot\" today routes user questions through OpenAI. For open-source maintainers, privacy-conscious teams, and air-gapped environments, that's either too expensive or unacceptable. So I built one that doesn't.\n\n[Knowledge Base API](https://github.com/teamerisingstars/KB-API) is a small FastAPI service that answers questions over a folder of markdown files using **BM25 + POS-aware lemmatization + WordNet synonym expansion**. No models. No API keys. No data leaving the box.\n\n[Live demo against FastAPI + Pydantic + Starlette docs](https://kb-api-q30f.onrender.com) (2,869 sections, 265 files).\n\nThe single hardest behaviour to enforce was making the API return `null` instead of inventing an answer when nothing in the corpus is a real fit.\n\n```\ncurl -X POST https://kb-api-q30f.onrender.com/ask \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"question\":\"what is quantum chromodynamics\"}'\n{\n  \"answer\": null,\n  \"section\": null,\n  \"source\": null,\n  \"confidence\": 0.0,\n  \"message\": \"I don't have enough information to answer that.\"\n}\n```\n\nMost retrieval systems silently return the least-bad section.
The trade-off, sometimes refusing to answer, is the whole point.\n\nThe default NLTK tokenizer keeps `response_model`, `OAuth2PasswordBearer`, and `Cross-Origin` as single opaque tokens. That means a query for \"what is response_model\" never matches because the document body has `response_model` underscored and the lemmatized query doesn't.\n\nSolution: split on `_`, `-`, and CamelCase boundaries before lemmatization, and keep BOTH the full identifier and its pieces in the indexed token stream.\n\n``` python\nsplit_identifier(\"OAuth2PasswordBearer\")\n# -> [\"OAuth2PasswordBearer\", \"OAuth2\", \"Password\", \"Bearer\"]\n\nsplit_identifier(\"Cross-Origin\")\n# -> [\"Cross-Origin\", \"Cross\", \"Origin\"]\n```\n\nGoing from 50% to 90% accuracy on identifier-heavy queries was almost entirely this fix.\n\nIf you expand `CORS` to `cross origin resource sharing` at index time, every BM25 IDF calculation breaks: terms appear artificially often, document lengths inflate, scoring degrades.\n\nThe right move is **query-side only**:\n\n```\n_ACRONYMS = {\n    \"cors\": \"cross origin resource sharing\",\n    \"jwt\":  \"json web token\",\n    \"api\":  \"application programming interface\",\n    \"csrf\": \"cross site request forgery\",\n    \"xss\":  \"cross site scripting\",\n    \"orm\":  \"object relational mapping\",\n    # ...\n}\n```\n\nWhen the query contains an acronym, append the expansion tokens to the query.
The index stays pure.\n\nPure BM25 over docs returns weird results because:\n\n`reference/foo.md` are canonical definitions; tutorials are examples\n\nSo the score gets four passes:\n\nraw_bm25_score(query)\n\n× HEADING_BOOST_FACTOR if heading-query overlap ≥ 50%\n\n1.0 if heading EXACTLY matches query subject\n\n× FILENAME_BOOST_FACTOR if filename overlaps query\n\n× REFERENCE_PATH_BOOST if path is under reference/\n\nAnd below a hard threshold, the result is rejected entirely:\n\n```\nif not scores.size or scores.max() < CONFIDENCE_THRESHOLD:\n    return _no_match()\n```\n\nThat last line is the difference between \"honestly returns null\" and \"silently returns the least-bad section.\"\n\nA few hours after launching on Reddit, a commenter asked: \"what about searching 'cross origin' for CORS, or what about typos like 'rsponse_model'?\"\n\nThe first case worked fine: BM25 finds the CORS docs because the body contains \"Cross-Origin Resource Sharing\" verbatim. But typos? Total miss.
\"rsponse_model\" returned a wrong answer at 0.34 confidence: confidently wrong, above the threshold, no warning to the user.\n\nThat's the worst possible failure mode for an \"honest null\" product: the no-fabrication promise breaks for typo'd in-corpus queries, which are arguably more common than out-of-corpus queries.\n\nFix shipped same day: a BK-tree (Burkhard-Keller tree) built over the indexed vocabulary at index time, with query-time nearest-neighbour lookup using length-tuned edit distance:\n\n``` python\ndef fuzzy_candidates(tree, token):\n    if len(token) <= 8:\n        max_dist = 1   # short words: ambiguous beyond one edit\n    else:\n        max_dist = 2   # OAuth2PasswordBearer can tolerate more slop\n    return [w for w, d in tree.search(token, max_dist) if d > 0]\n```\n\nWhen fuzzy correction fires, the confidence is capped at 0.6 and the response includes a \"verify the source\" message so the caller knows the answer came from a corrected query, not an exact match.\n\nPlus a guard against fuzzy-correcting nonsense queries: if 3+ user tokens are unrecognized, return null.
\"Quantum chromodynamics neutrino flux\" against FastAPI docs correctly stays null even though fuzzy lookup could find nearest-neighbour matches for each individual word.\n\n| Query | Result | Notes |\n|---|---|---|\n| `what is response_model` | `response_model Priority` | 1.0 confidence |\n| `how do I add CORS` | `CORS (Cross-Origin Resource Sharing)` | 1.0 confidence |\n| `what is OAuth2PasswordBearer` | `FastAPI's OAuth2PasswordBearer` | 1.0 confidence |\n| `what is APIRouter` | `APIRouter class` (in reference/apirouter.md) | 1.0 confidence |\n| `what is rsponse_model` (typo) | `response_model Priority` | 0.6 confidence + warning |\n| `how do I add corss` (typo) | `CORS preflight requests` | 0.46 confidence + warning |\n| `what is quantum chromodynamics` | `null` | honest refusal |\n\nThe `answer` field is the matching section's body verbatim, not a paraphrase. If you want a summary, use a different tool.\n\n`null`. That's the feature.\n\n| Layer | Choice | Why |\n|---|---|---|\n| Web | FastAPI + Uvicorn | Async, typed, batteries-included |\n| Ranking | rank-bm25 | Reference Okapi BM25 implementation |\n| NLP | NLTK | WordNet, Penn Treebank tagger, stopwords; boring and reliable |\n| Fuzzy | Custom BK-tree | ~150 lines, no dependency |\n| Parser | markdown-it-py | Handles fenced code blocks correctly |\n| File watch | watchdog | Cross-platform file events |\n\nTotal app code: ~700 lines. Image size: ~250 MB. RAM at runtime: ~40 MB. Indexes 1,800 markdown sections in well under a second.\n\n[github.com/teamerisingstars/KB-API](https://github.com/teamerisingstars/KB-API)\n\nLive demo: [kb-api-q30f.onrender.com](https://kb-api-q30f.onrender.com)\n\nIf you've built something similar or have thoughts on the BM25 tuning, the fuzzy correction, or the boost stack, I'd genuinely like to hear what would change.
Drop a comment or open an issue.", "url": "https://wpnews.pro/news/i-built-a-docs-q-a-engine-that-returns-null-instead-of-hallucinating", "canonical_source": "https://dev.to/sujithkrishnanpk_c9f931/i-built-a-docs-qa-engine-that-returns-null-instead-of-hallucinating-58p6", "published_at": "2026-05-29 09:36:01+00:00", "updated_at": "2026-05-29 09:41:38.979730+00:00", "lang": "en", "topics": ["natural-language-processing", "ai-tools", "ai-products", "ai-infrastructure", "ai-ethics"], "entities": ["Knowledge Base API", "FastAPI", "NLTK", "OpenAI", "WordNet", "BM25", "Pydantic", "Starlette"], "alternates": {"html": "https://wpnews.pro/news/i-built-a-docs-q-a-engine-that-returns-null-instead-of-hallucinating", "markdown": "https://wpnews.pro/news/i-built-a-docs-q-a-engine-that-returns-null-instead-of-hallucinating.md", "text": "https://wpnews.pro/news/i-built-a-docs-q-a-engine-that-returns-null-instead-of-hallucinating.txt", "jsonld": "https://wpnews.pro/news/i-built-a-docs-q-a-engine-that-returns-null-instead-of-hallucinating.jsonld"}}