[ARTICLE · art-18367] src=dev.to pub= topic=ai-tools verified=true sentiment=↑ positive

🧠 NeuroDoc: From Broken Prototype to Production-Ready Async AI Documentation Engine

NeuroDoc, an AI-powered documentation engine, was rebuilt from a fragile CLI prototype into a production-ready full-stack web dashboard with RAG capabilities. The original tool suffered from blocking synchronous loops, an in-memory task queue that lost all jobs on crash, and brittle core resolvers, but a complete architectural rewrite using asyncio and aiohttp eliminated these flaws. The new version features a persistent database-backed task queue and a RAG pipeline that embeds queries, retrieves relevant documentation chunks, and generates grounded summaries.

7 min read · Published May 30, 2026

This is a submission for the GitHub Finish-Up-A-Thon Challenge

I abandoned this project. Then I resurrected it. Here's how a fragile CLI script became a full-stack async web dashboard with RAG capabilities.

NeuroDoc started as an ambitious idea: a single tool to fetch, scrape, process, and summarize documentation across Python, scikit-learn, PyTorch, and TensorFlow, powered by NLP and multi-core processing.

But it hit a wall fast.

while True:
    query = input("Enter query: ")  # 🚫 BLOCKS the main thread
    result = fetch_docs(query)      # 🚫 BLOCKS background workers
    print(result)

The original prototype had three fatal flaws:

| Problem | Impact |
| --- | --- |
| input() loop on main thread | Blocked all background scraping workers |
| In-memory task queue | All pending jobs vanished on crash |
| Brittle core resolver | Failed silently on dynamic imports |

Long-running doc crawls would stall. A single crash wiped the entire task queue. It was a house of cards: impressive from a distance, terrifying up close.

So I shelved it.

Months later, I came back with a clear head and a plan. The rewrite wasn't incremental; it was architectural. Three shifts made everything click:

asyncio + aiohttp

Out went the blocking loop. In came a proper async event loop that lets scraping, processing, and serving happen concurrently without stepping on each other.

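import asyncio

import aiohttp

async def fetch_documentation(url: str, session: aiohttp.ClientSession) -> DocResult:
    # The 30-second total timeout keeps one slow host from holding up the batch.
    async with session.get(url, timeout=aiohttp.ClientTimeout(total=30)) as response:
        content = await response.text()
        return await process_content(content)

async def run_pipeline(queries: list[str]) -> list[DocResult | BaseException]:
    # Each entry in queries is passed to fetch_documentation as a URL.
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_documentation(q, session) for q in queries]
        # return_exceptions=True returns failures as exception objects in the result list.
        return await asyncio.gather(*tasks, return_exceptions=True)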

No more frozen terminals. No more stalled workers.
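
Because run_pipeline passes return_exceptions=True, failed fetches come back as exception objects mixed in with the successful results, so the caller has to separate them. Here is a minimal sketch of that step. It uses a hypothetical fake_fetch coroutine in place of the real network call so that it runs offline, and split_results is an illustrative helper, not part of NeuroDoc:

```python
import asyncio


async def fake_fetch(url: str) -> str:
    """Hypothetical stand-in for fetch_documentation; fails for one URL."""
    await asyncio.sleep(0)  # yield to the event loop, as a real request would
    if "broken" in url:
        raise ValueError(f"could not fetch {url}")
    return f"docs for {url}"


def split_results(urls, results):
    """Pair each URL with its outcome and separate successes from failures."""
    ok, failed = {}, {}
    for url, result in zip(urls, results):
        if isinstance(result, BaseException):
            failed[url] = result
        else:
            ok[url] = result
    return ok, failed


async def demo():
    urls = ["a.example/os", "a.example/broken", "a.example/json"]
    # gather preserves input order, so results line up with urls
    results = await asyncio.gather(
        *(fake_fetch(u) for u in urls), return_exceptions=True
    )
    return split_results(urls, results)


ok, failed = asyncio.run(demo())
```

Keeping the failures in their own mapping makes it straightforward to log or retry them without discarding the documents that did come back.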

The in-memory queue was replaced with a persistent, database-backed task queue. Now if the server crashes at 3 AM while crawling PyTorch docs, no work is lost. Tasks resume exactly where they left off.

import uuid
from datetime import datetime, timezone

class TaskQueue:
    async def enqueue(self, task: DocumentationTask) -> str:
        task_id = str(uuid.uuid4())
        await self.db.execute(
            "INSERT INTO tasks (id, status, payload, created_at) VALUES (?, ?, ?, ?)",
            # Timezone-aware timestamp; datetime.utcnow() is deprecated since Python 3.12.
            (task_id, TaskStatus.PENDING, task.to_json(), datetime.now(timezone.utc))
        )
        return task_id

    async def get_next(self) -> DocumentationTask | None:
        # Oldest pending task first (FIFO by creation time).
        row = await self.db.fetchone(
            "SELECT * FROM tasks WHERE status = 'pending' ORDER BY created_at LIMIT 1"
        )
        return DocumentationTask.from_row(row) if row else None
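
The get_next query above only reads the oldest pending row. For a task to survive a crash without being picked up twice, a worker also has to mark the row as claimed, and a restart has to put interrupted rows back in the queue. Here is a minimal sketch using the standard-library sqlite3 module. The simplified tasks table, the 'running' status, and the function names are assumptions for illustration, and a recovered task restarts from the beginning rather than mid-crawl:

```python
import sqlite3


def open_queue(path=":memory:"):
    # isolation_level=None puts the connection in autocommit mode,
    # so transactions are controlled explicitly below.
    db = sqlite3.connect(path, isolation_level=None)
    db.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " status TEXT NOT NULL DEFAULT 'pending',"
        " payload TEXT NOT NULL)"
    )
    return db


def enqueue(db, payload):
    cur = db.execute("INSERT INTO tasks (payload) VALUES (?)", (payload,))
    return cur.lastrowid


def claim_next(db):
    """Atomically move the oldest pending task to 'running' and return it."""
    db.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = db.execute(
            "SELECT id, payload FROM tasks"
            " WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            db.execute("UPDATE tasks SET status = 'running' WHERE id = ?", (row[0],))
        db.execute("COMMIT")
        return row
    except Exception:
        db.execute("ROLLBACK")
        raise


def recover_interrupted(db):
    """On startup, return tasks left 'running' by a crash to the pending pool."""
    cur = db.execute("UPDATE tasks SET status = 'pending' WHERE status = 'running'")
    return cur.rowcount


db = open_queue()
first = enqueue(db, "crawl pytorch")
enqueue(db, "crawl sklearn")
claimed = claim_next(db)            # a worker takes the first task, then "crashes"
recovered = recover_interrupted(db)  # the restart re-queues it
```

Doing the SELECT and the UPDATE inside one BEGIN IMMEDIATE transaction is what keeps two workers from claiming the same row.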

This is where NeuroDoc levels up from "scraper" to "intelligent documentation assistant."

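Instead of returning raw docs, it embeds the query, retrieves the most relevant documentation chunks, and generates a summary grounded in them: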

class RAGPipeline:
    async def query(self, user_query: str) -> RAGResponse:
        query_embedding = await self.embedder.embed(user_query)

        relevant_chunks = await self.vector_store.similarity_search(
            query_embedding, top_k=5
        )

        context = "\n\n".join(chunk.text for chunk in relevant_chunks)
        summary = await self.llm.generate(
            prompt=f"Answer based on this documentation:\n{context}\n\nQuery: {user_query}"
        )

        return RAGResponse(summary=summary, sources=relevant_chunks)
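
The retrieval step in the middle of that pipeline ranks stored chunks by how close their embeddings are to the query embedding. Below is a minimal, dependency-free sketch of a top-k cosine-similarity search. The tiny three-dimensional vectors and the top_k_chunks helper are illustrative assumptions, not NeuroDoc's actual vector store:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_embedding, chunks, top_k=5):
    """chunks is a list of (text, embedding); returns the top_k texts, best first."""
    scored = [(cosine_similarity(query_embedding, emb), text) for text, emb in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]


chunks = [
    ("os.path.join docs", [1.0, 0.0, 0.0]),
    ("torch.nn.Linear docs", [0.0, 1.0, 0.0]),
    ("os.walk docs", [0.9, 0.1, 0.0]),
]
best = top_k_chunks([1.0, 0.0, 0.0], chunks, top_k=2)
```

A production vector store would use an index instead of scoring every chunk, but the ranking rule is the same.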

┌──────────────────────────────────────────────────────┐
│                   Web Dashboard (FastAPI)            │
│              ┌──────────┬──────────────┐             │
│              │  Submit  │   Results    │             │
│              │  Query   │   Viewer     │             │
│              └────┬─────┴──────┬───────┘             │
└───────────────────┼────────────┼─────────────────────┘
                    │            │
          ┌─────────▼────────────▼───────────┐
          │       Async Task Dispatcher      │
          │    (asyncio + DB task queue)     │
          └──────┬──────────────────┬────────┘
                 │                  │
    ┌────────────▼────┐    ┌────────▼─────────────┐
    │  Multi-core     │    │   RAG Pipeline       │
    │  Doc Scraper    │    │  (Embed → Retrieve   │
    │  (aiohttp)      │    │   → Generate)        │
    └────────┬────────┘    └────────┬─────────────┘
             │                      │
    ┌────────▼──────────────────────▼──────────────┐
    │           SQLite / PostgreSQL DB             │
    │   (tasks · chunks · embeddings · results)    │
    └──────────────────────────────────────────────┘

| Library | Sections Scraped | NLP Processing |
| --- | --- | --- |
| 🐍 Python | stdlib, builtins, language ref | Code extraction, summaries |
| 🤖 scikit-learn | API reference, user guide | Table parsing, param docs |
| 🔥 PyTorch | Tensor ops, nn, autograd | Code snippets, examples |
| 🌊 TensorFlow | Keras, tf.data, layers | API signatures, guides |

git clone https://github.com/kaushikcoderpy1/neurodoc
cd neurodoc

pip install -r requirements.txt

python -m neurodoc.db init

uvicorn neurodoc.app:app --reload --port 8000

Then open http://localhost:8000 and start querying.

Why asyncio over threading?

asyncio handles thousands of concurrent requests on a single thread. There are no GIL fights, and because tasks only switch at await points, most of the race conditions that come with threads go away.

Why SQLite for the task queue instead of Redis?

Zero infrastructure. NeuroDoc is a dev tool, and adding a Redis dependency just to persist a queue adds friction. SQLite in WAL mode lets readers keep working while a single writer commits, which handles concurrent reads/writes cleanly for this use case.
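
Turning on WAL mode is a single pragma. Here is a small sketch using the standard-library sqlite3 module. The file name, the helper name, and the busy_timeout value are illustrative assumptions:

```python
import os
import sqlite3
import tempfile


def open_wal_database(path):
    """Open a SQLite file and switch its journal to write-ahead logging."""
    conn = sqlite3.connect(path)
    # The pragma returns the journal mode that is actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    # Wait up to 5 seconds for a competing writer instead of failing immediately.
    conn.execute("PRAGMA busy_timeout = 5000")
    return conn, mode


tmp_dir = tempfile.mkdtemp()
conn, mode = open_wal_database(os.path.join(tmp_dir, "neurodoc_tasks.db"))
conn.close()
```

WAL needs a file-backed database; an in-memory database reports its journal mode as "memory" instead.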

Why RAG over fine-tuning?

Documentation changes constantly. RAG retrieves from live-scraped content. A fine-tuned model would be stale in weeks.

This section is the heart of the comeback story. NeuroDoc didn't just get rewritten; it got debugged at a deep architectural level, with Copilot as a true pair programmer. Here are four real, production-blocking bugs it helped resolve.

The failure: Under high-concurrency loads via asyncio.gather, edge-case exceptions inside sub-coroutines bypassed connection release hooks, leaving asyncpg pool sockets exhausted and the app hanging silently.

Standard try/finally cleanup blocks failed because they referenced stale async contexts. The pool hit max capacity and froze.

How Copilot helped:

Copilot introduced a strict connection acquisition pattern bound directly to local transaction lifecycles, with absolute timeout guards:

async with pool.acquire() as connection:
    async with connection.transaction():
        result = await asyncio.wait_for(
            connection.fetch(query, *args),
            timeout=5.0  # Hard boundary: no silent hangs
        )

It also added global exception wrappers that translate raw driver errors into clean structured responses, guaranteeing connection cleanup even if the downstream scraping pipeline crashed.
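
The exception-wrapper idea can be sketched as a decorator that converts timeouts and driver errors into a plain dictionary. The response shape, the safe_query name, and the stand-in DriverError class are assumptions for illustration. A real version would catch asyncpg's own exception types instead:

```python
import asyncio
import functools


class DriverError(Exception):
    """Hypothetical stand-in for a database driver exception."""


def safe_query(timeout):
    """Wrap a coroutine so it always returns a structured result within `timeout` seconds."""
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            try:
                data = await asyncio.wait_for(func(*args, **kwargs), timeout=timeout)
                return {"ok": True, "data": data}
            except asyncio.TimeoutError:
                return {"ok": False, "error": "timeout"}
            except DriverError as exc:
                return {"ok": False, "error": f"database error: {exc}"}
        return wrapper
    return decorator


@safe_query(timeout=0.05)
async def slow_query():
    await asyncio.sleep(1)  # longer than the timeout, so wait_for cancels it
    return ["row"]


@safe_query(timeout=0.05)
async def failing_query():
    raise DriverError("connection reset")


@safe_query(timeout=0.05)
async def good_query():
    return ["row"]


async def demo():
    return await asyncio.gather(slow_query(), failing_query(), good_query())


timed_out, failed, succeeded = asyncio.run(demo())
```

Since every wrapped call returns a dictionary instead of raising, one bad query cannot take down a whole asyncio.gather batch.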

SpecifierSet .contains() AttributeError Across Packaging Versions

The failure: formatter.py runs dependency diagnostics via DependencyAnalyzer. On environments with older packaging library versions, calling .contains() on a SpecifierSet threw:

AttributeError: 'SpecifierSet' object has no attribute 'contains'

This crashed the entire diagnostic panel before it could render, silently breaking environment validation for a large chunk of users.

How Copilot helped:

Copilot identified that .contains() is version-specific, but the native in operator is universally backward-compatible across all historical releases of packaging:

# Before
elif not raw_spec.contains(local):

# After
elif local not in raw_spec:

One operator swap. Zero crashes across all environments.

The failure: In neurodoc.py, CLI input like neurodoc fetch os passed the core ID "1" as a raw string into isinstance(core, Core1PythonBasics) checks. Since "1" is a string, every check silently fell through with:

Unknown core type for str

Worse, the topic "os" was passed into the batch resolver without list wrapping, so it iterated over the characters 'o' and 's' separately instead of treating "os" as a unified module name.

How Copilot helped:

Copilot introduced dynamic string dereferencing that maps string IDs back to their live handler instances, plus list-wrapping for topic encapsulation:

if isinstance(core, str):
    core = self.command_handler.available_cores.get(core)

return await self.call_backend("core1", topics=[topic_f], flags=flags)
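
Both halves of this bug come from Python accepting a str where a different kind of value was expected. Here is a short sketch of the two normalizations. The registry contents, the Core1PythonBasics stub, and the helper names are placeholders, not NeuroDoc's real classes:

```python
class Core1PythonBasics:
    """Placeholder for the real handler class."""


# Hypothetical registry mapping CLI core IDs to live handler instances.
AVAILABLE_CORES = {"1": Core1PythonBasics()}


def resolve_core(core):
    """Map a string ID such as "1" to its handler; pass handler instances through."""
    if isinstance(core, str):
        resolved = AVAILABLE_CORES.get(core)
        if resolved is None:
            raise KeyError(f"unknown core id: {core!r}")
        return resolved
    return core


def normalize_topics(topics):
    """Wrap a lone string in a list so it is not iterated character by character."""
    if isinstance(topics, str):
        return [topics]
    return list(topics)


chars = [t for t in "os"]         # what the unwrapped string iterates as
wrapped = normalize_topics("os")  # what the resolver should receive
handler = resolve_core("1")
```

Raising on an unknown ID, rather than returning None, turns the original silent fall-through into a visible error.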

The failure: nlp_with_cos.py calculates semantic similarity across documentation topics using PyTorch/TensorFlow models. Queries of varying lengths produced tensors with mismatched dimensions, throwing:

RuntimeError: Tensors must be of the same shape

This crashed deep multi-core fetches completely, and those are the most expensive operation in the entire pipeline.

How Copilot helped:

Copilot suggested a preprocessing step that zero-pads or truncates every input to a fixed length, aligning all input vectors before the cosine similarity matrix calculation:

inputs = tokenizer(
    text,
    padding="max_length",
    truncation=True,
    max_length=512,
    return_tensors="pt"
)

All tensors now enter the similarity layer at identical dimensions: no shape mismatches, no crashes.
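
What padding="max_length" and truncation=True do can be shown without a tokenizer: every sequence is cut or filled to the same length, and a mask marks which positions hold real tokens. Here is a plain-Python sketch. The pad_or_truncate helper, the token IDs, and the pad ID of 0 are illustrative assumptions, since real tokenizers define their own pad token:

```python
def pad_or_truncate(token_ids, max_length, pad_id=0):
    """Return (ids, attention_mask), both exactly max_length long."""
    clipped = list(token_ids[:max_length])      # truncation
    attention_mask = [1] * len(clipped)         # 1 marks a real token
    padding = max_length - len(clipped)
    return clipped + [pad_id] * padding, attention_mask + [0] * padding


short_ids, short_mask = pad_or_truncate([101, 7, 102], max_length=5)
long_ids, long_mask = pad_or_truncate([101, 1, 2, 3, 4, 5, 102], max_length=5)
```

The mask matters downstream: if padded positions are averaged into an embedding, short queries drift toward each other regardless of their content.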

These weren't simple autocomplete suggestions. Copilot reasoned about async lifecycle boundaries, cross-version API compatibility, type system edge cases, and linear algebra constraints: the kind of bugs that take hours of debugging to even locate, let alone fix.

The biggest unlock: it didn't just fix the symptom. For each bug, it explained why the original approach was fragile and offered a pattern that would hold up under production conditions.

That's the difference between a tool and a collaborator.

Built for the DEV.to hackathon. Powered by stubbornness, async Python, and too much coffee.
