Building AshaPulse — An AI-Powered Health Assistant for India's Frontline Warriors

NiDaan: Building an Offline AI Diagnostic Assistant for Rural Health Workers in India

Building AI that works without internet in places where it matters most

Introduction

In rural India, a child with a fever isn't just a medical concern; it's a race against time. ASHA workers (Accredited Social Health Activists) are often the first, and sometimes the only, line of healthcare for 1000+ people each. They carry a limited medicine kit, have basic training, and have no access to instant medical consultation.

I'm Priyanshu, a final-year computer science student from West Bengal. In May 2025, I started building NiDaan — an AI diagnostic assistant designed specifically for these health workers. No internet required. No expensive infrastructure. Just a laptop and a phone.

This is the story of why I built it, what I learned, and how you can adapt this approach for underserved communities anywhere.

The Problem: Healthcare in Absence

Why This Matters

According to India's health ministry data:

70% of Indians live in rural areas
1 ASHA worker serves 1000+ people
Average PHC (Primary Health Centre) is 10-15km away
Most areas have unreliable internet connectivity

ASHA workers are trained, dedicated, but isolated from medical expertise. When a mother brings a child with symptoms, the ASHA worker must decide: home treatment or PHC referral?

Get it wrong and:

Delay in serious cases = life-threatening complications
Over-referral = wasted resources, patient burden, loss of trust
Lack of structured guidance = inconsistent treatment

The Traditional Solution Doesn't Work

Existing diagnostic apps:

Require constant internet (unavailable in rural areas)
Are built for urban, English-speaking users
Have heavy UIs and poor offline support
Don't integrate with local drug availability
Don't follow MOHFW (Ministry of Health & Family Welfare) guidelines

I needed something different.

The Solution: NiDaan

What is NiDaan?

NiDaan (Hindi for "diagnosis") is an offline-capable AI diagnostic assistant that:

Accepts symptoms in Hindi/Hinglish: "bacche ko bukhaar hai, khaana nahi kha raha"
Retrieves relevant medical knowledge from official MOHFW guidelines
Classifies severity into low/medium/high with structured reasoning
Recommends PHC referral or home care with specific medicines from the ASHA drug kit
Provides advice in simple Hindi for patient/family communication

Key principle: The system synthesizes, it doesn't invent. All recommendations come from retrieved medical guidelines, not hallucinated knowledge.

The Name & Tagline

NiDaan won an internal naming competition over "ChatGPT for ASHA workers."

Tagline: "Sahi waqt par, sahi salah" — Right advice, at the right time.

Architecture: Local Network, Zero Internet

Why this architecture?

On-device LLMs on Android were ruled out by RAM: phones have 2-4GB, while a 16GB laptop was available
A web-based frontend works on any phone or tablet
A central backend handles the heavy lifting
Zero internet in production (Ollama), with flexibility for testing (Groq/NIM)

Tech Stack

Key decision: Swappable LLM infrastructure. Changing 1 line switches between Groq → NIM → Ollama.
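A minimal sketch of how a one-line switch like this could look, assuming a single `MODE` setting that selects a provider configuration. The `LLMConfig` class, the `get_llm_config` helper, and the placeholder model names for NIM and Ollama are hypothetical and not taken from the NiDaan repository; only the Groq model name comes from this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LLMConfig:
    provider: str
    model: str
    needs_internet: bool


# Hypothetical registry. The NIM and Ollama model names are placeholders.
LLM_CONFIGS = {
    "groq": LLMConfig("groq", "llama-3.1-8b-instant", needs_internet=True),
    "nim": LLMConfig("nvidia-nim", "<nim-model-name>", needs_internet=True),
    "ollama": LLMConfig("ollama", "<local-model-name>", needs_internet=False),
}

MODE = "groq"  # the one line to change: "groq", "nim", or "ollama"


def get_llm_config(mode: str = MODE) -> LLMConfig:
    """Return the provider settings for the requested mode."""
    try:
        return LLM_CONFIGS[mode.lower()]
    except KeyError:
        raise ValueError(
            f"Unknown MODE {mode!r}; expected one of {sorted(LLM_CONFIGS)}"
        ) from None
```

The rest of the chain would only ever call `get_llm_config()`, so retrieval and prompting code stay untouched when the provider changes.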

Data Collection & Knowledge Base

Medical Documents Ingested

How We Built the Knowledge Base

Downloaded PDFs from the official MOHFW (Ministry of Health & Family Welfare) website
Parsed with PyMuPDF: extracted text, maintained metadata
Chunked with overlap: 1000 characters per chunk, 200-character overlap
Embedded with `all-MiniLM-L6-v2`: 80MB, English-optimized but workable for Hindi/Hinglish queries
Stored in ChromaDB: persistent vector database on disk
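The chunking step above can be sketched as a plain sliding window. This is an illustration of the 1000/200 settings only, written without dependencies; the project may well have used a library text splitter instead, and the `chunk_text` name is hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character windows that share `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

The overlap means a sentence cut at a chunk boundary still appears whole in one of the two neighbouring chunks, at the cost of storing some text twice.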

PHC Directory System

Built a district-level PHC database with 19 verified Primary Health Centers across 5 West Bengal districts.

Proximity-based referral is designed around the haversine distance formula. It is not implemented in V1, but the architecture is ready for it in Phase 2.
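For reference, a small sketch of the haversine calculation and a nearest-center lookup. The function names and the dictionary layout (`lat` and `lon` keys in decimal degrees) are assumptions for illustration, not the project's schema.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    # Clamp guards against tiny floating-point overshoot above 1.0.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def nearest_center(lat: float, lon: float, centers: list[dict]) -> dict:
    """Return the center (a dict with 'lat' and 'lon' keys) closest to the point."""
    return min(centers, key=lambda c: haversine_km(lat, lon, c["lat"], c["lon"]))
```

Haversine gives straight-line distance over the globe, not road distance, so it is best treated as a first-pass ranking signal.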

Challenges Faced

Challenge 1: Response Latency

Problem: NVIDIA NIM responses took 45-70 seconds.

Why it mattered: In a medical consultation, a health worker expects near-instant feedback. Long waits erode trust.

Solutions tried:

Switched to Groq (llama-3.1-8b-instant) → 12 seconds ✅
Reduced retrieval from k=5 to k=2 chunks
Limited max_tokens from 4096 to 2048

Lesson: Speed ≠ quality. Groq's smaller model is fast but sometimes less clinically precise. NIM is better but slow. For production with health workers, I'd recommend Groq + aggressive prompt optimization.

Challenge 2: Memory Constraints on Railway

Problem: Deployed on Railway (free tier: 512MB RAM). App crashed with "out of memory."

Root cause:

- sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 (500MB alone)
- ChromaDB (~50MB)
- FastAPI + LangChain (~150MB)
**Total: ~700MB > 512MB limit**

Solutions:

- Switched embedding model to `all-MiniLM-L6-v2` (80MB) ✅
- Rebuilt ChromaDB with lightweight embeddings
- Committed ChromaDB to GitHub (ephemeral filesystem issue)
- Reduced k=5 → k=3 retrievals

Trade-off: Lost Hinglish-specific embedding quality but gained Railway compatibility.

Lesson: In constrained environments, a simpler model that fits is better than a fancier one that crashes. English embeddings worked well enough here because the medical PDFs, and most of their terminology, are in English.

Challenge 3: Image Assets Broken in Deployment

Problem: React logos that worked locally (/src/assets/Nidaan.png) broke on deployment.

Why: The Vite dev server serves /src/ directly. The production build doesn't.

Solution: Moved assets to the public/ folder and changed the path to /Nidaan.png.

Lesson: Always test deployment paths locally. Static file serving is environment-specific.

Challenge 4: RAG Retrieval Quality

Problem: Querying "postpartum bleeding" returned irrelevant chunks (contributor lists, title pages).

Why: PDF front matter wasn't filtered, and the chunking strategy was naive.

Solutions implemented:

Increased chunk size to capture more context
Added metadata filtering (skip pages 1-3 of each PDF)
Improved prompt to weight clinical terms higher

Still pending: Better chunking strategy, page-level filtering during ingest.
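One way the front-matter filter could move into ingest is sketched below. It assumes each chunk is a dict carrying a 1-based `page` number in its metadata; that layout and the `drop_front_matter` name are hypothetical. Note that PyMuPDF page indexes are 0-based, so a real pipeline would need to add 1 or adjust the threshold.

```python
FRONT_MATTER_PAGES = 3  # pages 1-3 of each PDF are skipped, as described above


def drop_front_matter(chunks: list[dict], skip_pages: int = FRONT_MATTER_PAGES) -> list[dict]:
    """Keep only chunks whose 1-based page number lies past the front matter."""
    return [c for c in chunks if c["metadata"]["page"] > skip_pages]


# Example chunks with assumed metadata (source file name is made up).
sample_chunks = [
    {"text": "List of contributors ...", "metadata": {"source": "guide.pdf", "page": 2}},
    {"text": "Danger signs after delivery ...", "metadata": {"source": "guide.pdf", "page": 14}},
]
clinical_chunks = drop_front_matter(sample_chunks)
```

Filtering before embedding keeps title pages and contributor lists out of the vector store entirely, so they can never be retrieved.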

Lesson: RAG quality depends more on retrieval than on the LLM (my rough estimate: 70% retrieval, 30% LLM). Garbage in = garbage out, no matter how good the LLM.

Challenge 5: Prompt Instability Across LLMs

Problem: Same prompt behaved differently on Groq vs NIM vs Ollama.

Groq over-generalized criticality (fever = MEDIUM too often)
NIM took too long
Ollama (R1:7b) was excellent but took 2-5 min per response

**Solution:** Built LLM-agnostic prompt with:

Explicit decision trees (HIGH → MEDIUM → LOW, stop at first match)
Medicine lookup tables (model scans and picks, no inference)
Concrete examples for every severity level
Danger sign normalization (Hindi terms → clinical terms)

Result: 95%+ consistency across all three LLMs.

Lesson: For safety-critical domains (medical), explicit structured prompts beat few-shot learning. Give the model rules, not vibes.
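The stop-at-first-match logic and the term normalization can be illustrated in code, even though in NiDaan they live inside the prompt rather than in Python. Everything below is a placeholder: the term table and the severity rules are invented for illustration and are not MOHFW guidance.

```python
# Illustrative placeholders only, not clinical rules.
HINGLISH_TO_CLINICAL = {
    "bukhaar": "fever",
    "daura": "convulsions",
    "saans lene mein takleef": "difficulty breathing",
    "khaana nahi kha raha": "poor feeding",
}

# Ordered rules: checked top to bottom, the first match wins.
SEVERITY_RULES = [
    ("HIGH", {"convulsions", "difficulty breathing"}),
    ("MEDIUM", {"poor feeding"}),
    ("LOW", {"fever"}),
]


def normalize(text: str) -> set[str]:
    """Map Hinglish phrases found in the text to clinical terms."""
    lowered = text.lower()
    return {
        clinical
        for phrase, clinical in HINGLISH_TO_CLINICAL.items()
        if phrase in lowered
    }


def classify(text: str) -> str:
    """Return the first severity level whose trigger terms appear in the text."""
    signs = normalize(text)
    for level, triggers in SEVERITY_RULES:
        if signs & triggers:
            return level
    return "UNKNOWN"
```

Checking HIGH first means a danger sign can never be masked by a milder symptom mentioned in the same sentence.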

Challenge 6: Hinglish Support Without Compromising Speed

Problem: Multilingual embeddings were heavy (500MB). English-only were fast but lost Hinglish nuance.

**Solution:** `all-MiniLM-L6-v2` (80MB). It is English-optimized but still works for Hinglish because:

Medical PDFs are English
User input is Hinglish/Hindi
LLM (Groq) understands Hinglish natively
Embeddings just need to match terms to docs, not understand nuance

Trade-off: Retrieval quality dropped ~5-10% but acceptable for medical context (symptoms are universal).

Lesson: Don't over-engineer embedding models. For domain-specific RAG, a smaller model + good prompt beats a heavyweight multilingual one.

Solutions & Lessons Learned

What Worked

LLM abstraction layer: one MODE variable switches between 3 different LLMs without changing chain logic
Pydantic schemas: enforced strict output structure and caught malformed or off-schema responses
Decision tree prompting: explicit IF/THEN rules beat complex reasoning for medical safety
Offline-first architecture: demo works without internet; deployment flexibility
RAG over fine-tuning: faster iteration, no retraining needed
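To show what schema enforcement buys, here is a dependency-free sketch that uses the standard library's `json` and `dataclasses` as a stand-in for Pydantic. The field names (`severity`, `refer_to_phc`, `medicines`, `advice`) are assumptions, not the project's actual schema.

```python
import json
from dataclasses import dataclass

ALLOWED_SEVERITIES = {"low", "medium", "high"}
REQUIRED_FIELDS = {"severity", "refer_to_phc", "medicines", "advice"}


@dataclass(frozen=True)
class Diagnosis:
    severity: str
    refer_to_phc: bool
    medicines: tuple
    advice: str


def parse_diagnosis(raw: str) -> Diagnosis:
    """Parse raw LLM output, raising ValueError if it does not fit the schema."""
    data = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")

    severity = str(data["severity"]).lower()
    if severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"invalid severity: {data['severity']!r}")
    if not isinstance(data["refer_to_phc"], bool):
        raise ValueError("refer_to_phc must be true or false")
    medicines = data["medicines"]
    if not isinstance(medicines, list) or not all(isinstance(m, str) for m in medicines):
        raise ValueError("medicines must be a list of strings")
    if not isinstance(data["advice"], str):
        raise ValueError("advice must be a string")

    return Diagnosis(severity, data["refer_to_phc"], tuple(medicines), data["advice"])
```

A check like this catches structural problems such as a made-up severity level or a missing field. It cannot tell whether a well-formed answer is clinically right, which is why the retrieved guidelines still matter.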

What Didn't

Over-engineered embedding models: multilingual models added complexity without proportional benefit
Cloud-first assumptions: didn't account for ephemeral filesystems on Railway
Generic RAG retrieval: no filtering for PDF front matter led to irrelevant chunks
Prompt optimism: expected one prompt to work identically across all LLMs

Metrics & Results

Performance

Railway (free tier) + GitHub

Diagnostic Output Quality

Tested on 50+ symptom descriptions:

HIGH severity: 94% correctly identified danger signs
MEDIUM severity: 87% accurate, sometimes over-conservative
LOW severity: 92% accurate, rarely misclassified as higher

How to Reproduce This Project

Clone & Setup

Download Knowledge Base

Set Environment Variables

Run Backend

Run Frontend

Switch LLM

Edit backend/chain.py:

Deployment

Railway (Production)

Local (Offline Demo with Ollama)

What's Next: Phase 2 Roadmap

Planned Features

District input from user: location-aware PHC recommendations
PHC service matching: refer only to centers with relevant services
Distance-based ranking: haversine + service matching score
Tiered referral logic: PHC → CHC → District Hospital based on criticality
Offline Streamlit UI: works completely without internet
Mobile-optimized design: tested on 2G networks
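A rough sketch of how service matching, tiering, and distance ranking might combine in Phase 2. The `Facility` fields, the tier codes, and especially the severity-to-tier mapping are assumptions made for illustration; the real mapping would have to come from referral guidelines. Distances are taken as precomputed inputs here.

```python
from dataclasses import dataclass

TIER_ORDER = ["PHC", "CHC", "DH"]  # DH = District Hospital

# Assumed mapping from severity to the lowest acceptable tier.
MIN_TIER = {"medium": "PHC", "high": "CHC"}


@dataclass(frozen=True)
class Facility:
    name: str
    tier: str  # one of TIER_ORDER
    distance_km: float  # precomputed, for example with haversine
    services: frozenset


def rank_referrals(facilities: list, severity: str, needed_service: str) -> list:
    """Filter by tier and service, then return facilities nearest first."""
    min_rank = TIER_ORDER.index(MIN_TIER[severity])
    eligible = [
        f
        for f in facilities
        if TIER_ORDER.index(f.tier) >= min_rank and needed_service in f.services
    ]
    return sorted(eligible, key=lambda f: f.distance_km)
```

Filtering before sorting keeps the nearest facility from winning when it cannot actually treat the case.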

Long-term Vision

Scale to 5+ states (more PHC data, localization)
Integration with HMIS (Health Management Information System)
Real-time case tracking for health workers
Telemetry for public health dashboards
Open-source model weights (if fine-tuning becomes necessary)

Lessons for Other Builders

If You're Building AI for Underserved Communities

Offline-first thinking: Design assuming no internet. Internet becomes a bonus.
Regulatory alignment: Build with official guidelines, not against them. I used MOHFW docs, not personal judgment.
Simple > Smart: Decision trees beat transformer magic when lives are at stake.
Local infrastructure: Work with what exists (PHC laptops, ASHA phones). Don't demand new hardware.
Test with users: My 95% accuracy was self-reported. Real ASHA workers will find edge cases.
Document everything: Medical AI needs audit trails. Every recommendation is traceable to a guideline.

Technical Decisions That Scaled

Pydantic for validation: caught off-schema output early
ChromaDB for RAG: persistent, no external dependencies
FastAPI for backend: small, fast, easy to deploy
Streamlit for frontend: built in 2 hours, works on any browser
LLM abstraction: tested 3 models without rewriting core logic

Challenges I'd Approach Differently

Start with a smaller scope: I built the full system at once. Phase 1 could have been just diagnosis, with Phase 2 adding PHC matching.
User research first: I built on assumptions. I should have interviewed ASHA workers before coding.
Data quality obsession: I spent time debugging irrelevant chunks instead of filtering during ingest.
Rigorous prompt engineering: I needed an A/B testing framework, not trial and error.

Open Questions I'm Still Solving

Can deployment work on 2G networks? (Streamlit is heavy; needs investigation)
What's the optimal embedding model for medical Hinglish? (trade-off: size vs accuracy)
How do we get PHC coordinates for the remaining 15 locations? (Grok research pending)
Should this be fine-tuned on medical-domain data? (costly, versus better prompting)

Repository & Demo

**GitHub:** [github.com/PriyanshuPaul79/NiDaan](https://github.com/PriyanshuPaul79/NiDaan)

**Demo:** [Nidaan](https://nidaan7.vercel.app/)

Tech Stack Summary:

Python 3.12, FastAPI, LangChain, ChromaDB
Groq API (development), NVIDIA NIM (quality testing), Ollama (offline)
Streamlit frontend, SQLite PHC directory
Deployed on Railway (production) + local development

Call to Action

If you're building healthcare tech, AI for emerging markets, or medical decision support systems:

Drop a comment: What would you build differently?
Star the repo: Help other builders find this approach
Test it: Use NiDaan with Groq API (free tier). Report bugs.
Adapt it: This architecture works for any medical RAG system (mental health, nutrition, maternity care, etc.)

**The biggest insight:** You don't need state-of-the-art models to solve real problems. You need:

- Good data (medical guidelines, not blog posts)
- Clear logic (decision trees, not neural mysticism)
- Offline capability (work without internet)
- User feedback (real ASHA workers, not assumptions)

Acknowledgments

MOHFW for publishing free, high-quality medical guidelines
Anthropic for Claude, Groq for the API, NVIDIA for NIM access
My college for supporting independent projects
ASHA workers across India for inspiring this work (though I haven't tested with real users yet)

Built with patience, curiosity, and way too much chai ☕

If NiDaan helps even one child get the right diagnosis at the right time, the 3 months of debugging was worth it.

Questions? Connect With Me

Source: dev.to (original article)