Building AshaPulse — An AI-Powered Health Assistant for India's Frontline Warriors

A final-year computer science student from West Bengal, Priyanshu, built NiDaan, an offline AI diagnostic assistant for ASHA health workers in rural India. The system operates without internet, accepts symptoms in Hindi or Hinglish, and retrieves medical knowledge from official MOHFW guidelines to classify severity and recommend home care or PHC referral. NiDaan ingests 773 pages of medical documents and runs on a local network using a laptop and phone, with a swappable LLM infrastructure.

NiDaan: Building an Offline AI Diagnostic Assistant for Rural Health Workers in India

Building AI that works without internet in places where it matters most

Introduction

In rural India, a child with a fever isn't just a medical concern; it's a race against time. ASHA workers (Accredited Social Health Activists) are often the first, and sometimes the only, line of healthcare for 1000+ patients each. They carry a limited medicine kit, have basic training, and have no access to instant medical consultation.

I'm Priyanshu, a final-year computer science student from West Bengal. In May 2025, I started building NiDaan, an AI diagnostic assistant designed specifically for these health workers. No internet required. No expensive infrastructure. Just a laptop and a phone.

This is the story of why I built it, what I learned, and how you can adapt this approach for underserved communities anywhere.

The Problem: Healthcare in Absence

Why This Matters

According to India's health ministry data:

- 70% of Indians live in rural areas
- 1 ASHA worker serves 1000+ people
- The average PHC (Primary Health Centre) is 10-15km away
- Most areas have unreliable internet connectivity

ASHA workers are trained and dedicated, but isolated from medical expertise. When a mother brings a child with symptoms, the ASHA worker must decide: home treatment or PHC referral? Get it wrong and:

- Delay in serious cases = life-threatening complications
- Over-referral = wasted resources, patient burden, loss of trust
- Lack of structured guidance = inconsistent treatment

The Traditional Solution Doesn't Work

Existing diagnostic apps:

- Require constant internet (unavailable in rural areas)
- Are built for urban/English-speaking users
- Have a heavy UI and poor offline support
- Have no integration with local drug availability
- Don't follow MOHFW (Ministry of Health & Family Welfare) guidelines

I needed something different.

The Solution: NiDaan

What is NiDaan?
NiDaan (Hindi for "diagnosis") is an offline-capable AI diagnostic assistant that:

- Accepts symptoms in Hindi/Hinglish: "bacche ko bukhaar hai, khaana nahi kha raha" ("the child has a fever and isn't eating")
- Retrieves relevant medical knowledge from official MOHFW guidelines
- Classifies severity into low/medium/high with structured reasoning
- Recommends PHC referral or home care with specific medicines from the ASHA drug kit
- Provides advice in simple Hindi for patient/family communication

Key principle: The system synthesizes, it doesn't invent. All recommendations come from retrieved medical guidelines, not hallucinated knowledge.

The Name & Tagline

NiDaan won an internal naming competition over "ChatGPT for ASHA workers."

Tagline: "Sahi waqt par, sahi salah" (Right advice, at the right time).

Architecture: Local Network, Zero Internet

Why this architecture?

- Android on-device LLMs were RAM-constrained (a 16GB laptop was available; phones have 2-4GB)
- Web-based frontend works on any phone/tablet
- Central backend handles the heavy lifting
- Zero internet in production (uses Ollama), flexible for testing (Groq/NIM)

Tech Stack

Key decision: Swappable LLM infrastructure. Changing 1 line switches between Groq → NIM → Ollama.
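To make the "change 1 line" idea concrete, here is a minimal sketch of a mode-based backend registry. It is not the code from the repository: the `LLMBackend` class, the `BACKENDS` table, the `NIDAAN_LLM_MODE` environment variable, and the two placeholder model names are all assumptions made for illustration. Only the Groq model name (`llama-3.1-8b-instant`) comes from this write-up.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class LLMBackend:
    provider: str         # which service answers the request
    model: str            # model identifier passed to that service
    needs_internet: bool  # False means it can run on the local network only


# Hypothetical registry. Only the Groq model name is taken from the article;
# the NIM and Ollama model names are placeholders.
BACKENDS = {
    "groq": LLMBackend("groq", "llama-3.1-8b-instant", True),
    "nim": LLMBackend("nvidia-nim", "placeholder-nim-model", True),
    "ollama": LLMBackend("ollama", "placeholder-local-model", False),
}

# The single line to change (the environment variable name is an assumption).
MODE = os.environ.get("NIDAAN_LLM_MODE", "ollama")


def get_backend(mode: str = MODE) -> LLMBackend:
    """Return the backend for `mode`, failing loudly on a typo."""
    try:
        return BACKENDS[mode.lower()]
    except KeyError:
        raise ValueError(
            f"Unknown MODE {mode!r}; expected one of {sorted(BACKENDS)}"
        ) from None
```

The rest of the chain would only ever call `get_backend()`, so the retrieval and prompt logic never needs to know which provider is active.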
Data Collection & Knowledge Base

Medical Documents Ingested

| Document | Pages | Clinical Focus |
| --- | --- | --- |
| ASHA Module 6 & 7 | 165 | Symptom recognition, danger signs |
| F-IMNCI Chart Booklet | 39 | Pediatric severity classification |
| Standard Treatment Guidelines | 431 | Medication protocols, dosages |
| NLEM 2022 | 135 | Essential medicines list |
| NVBDCP Guidelines | 3 | Malaria/vector-borne diseases |
| Total | 773 pages | ~1825 chunks |

How We Built the Knowledge Base

- Downloaded PDFs from the official MOHFW (Ministry of Health & Family Welfare) website
- Parsed with PyMuPDF: extracted text, maintained metadata
- Chunked intelligently: 1000 chars per chunk, 200 char overlap
- Embedded with all-MiniLM-L6-v2: 80MB, handles English + Hindi/Hinglish
- Stored in ChromaDB: persistent vector database on disk

PHC Directory System

Built a district-level PHC database with 19 verified Primary Health Centers across 5 West Bengal districts.

The design uses the haversine distance formula for proximity-based referral (not implemented in V1, but the architecture is ready for Phase 2).

Challenges Faced

Challenge 1: Response Latency

Problem: NVIDIA NIM responses took 45-70 seconds.

Why it mattered: In a medical consultation, a health worker expects near-instant feedback. Long waits erode trust.

Solutions tried:

- Switched to Groq (llama-3.1-8b-instant) → 12 seconds ✅
- Reduced retrieval from k=5 to k=2 chunks
- Limited max tokens from 4096 to 2048

Lesson: Speed ≠ quality. Groq's smaller model is fast but sometimes less clinically precise. NIM is better but slow. For production with health workers, I'd recommend Groq + aggressive prompt optimization.

Challenge 2: Memory Constraints on Railway

Problem: Deployed on the Railway free tier (512MB RAM). The app crashed with "out of memory."
Root cause:

- sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2: 500MB alone
- ChromaDB: ~50MB
- FastAPI + LangChain: ~150MB

Total: ~700MB > 512MB limit

Solutions:

- Switched embedding model to all-MiniLM-L6-v2 (80MB) ✅
- Rebuilt ChromaDB with lightweight embeddings
- Committed ChromaDB to GitHub (ephemeral filesystem issue)
- Reduced k=5 → k=3 retrievals

Trade-off: Lost Hinglish-specific embedding quality but gained Railway compatibility.

Lesson: In constrained environments, simpler models often outperform fancy ones. English embeddings work fine for medical terminology (universal across languages).

Challenge 3: Image Assets Broken in Deployment

Problem: React logos that worked locally (/src/assets/Nidaan.png) broke on deployment.

Why: The Vite dev server serves /src/ directly. Production doesn't.

Solution: Moved assets to the public/ folder and changed the path to /Nidaan.png.

Lesson: Always test deployment paths locally. Static file serving is environment-specific.

Challenge 4: RAG Retrieval Quality

Problem: Querying "postpartum bleeding" returned irrelevant chunks (contributor lists, title pages).

Why: PDF front matter wasn't filtered; the chunking strategy was naïve.

Solutions implemented:

- Increased chunk size to capture more context
- Added metadata filtering (skip pages 1-3 of each PDF)
- Improved the prompt to weight clinical terms higher

Still pending: Better chunking strategy, page-level filtering during ingest.

Lesson: RAG quality depends 70% on retrieval, 30% on the LLM. Garbage in = garbage out, no matter how good the LLM.

Challenge 5: Prompt Instability Across LLMs

Problem: The same prompt behaved differently on Groq vs NIM vs Ollama.
- Groq over-generalized criticality (fever = MEDIUM too often)
- NIM took too long
- Ollama (R1:7b) was excellent but took 2-5 min per response

Solution: Built an LLM-agnostic prompt with:

- Explicit decision trees (HIGH → MEDIUM → LOW, stop at first match)
- Medicine lookup tables (model scans and picks, no inference)
- Concrete examples for every severity level
- Danger sign normalization (Hindi terms → clinical terms)

Result: 95%+ consistency across all three LLMs.

Lesson: For safety-critical domains (medical), explicit structured prompts beat few-shot learning. Give the model rules, not vibes.

Challenge 6: Hinglish Support Without Compromising Speed

Problem: Multilingual embeddings were heavy (500MB). English-only embeddings were fast but lost Hinglish nuance.

Solution: all-MiniLM-L6-v2 (80MB, English-optimized), which still works for Hinglish because:

- Medical PDFs are in English
- User input is Hinglish/Hindi
- The LLM (Groq) understands Hinglish natively
- Embeddings just need to match terms to docs, not understand nuance

Trade-off: Retrieval quality dropped ~5-10%, but that is acceptable for the medical context (symptoms are universal).

Lesson: Don't over-engineer embedding models. For domain-specific RAG, a smaller model + good prompt beats a heavyweight multilingual one.
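The "HIGH → MEDIUM → LOW, stop at first match" rule from Challenge 5 is easier to see in code than in prose. The sketch below expresses the same control flow in plain Python; in NiDaan this logic lives inside the prompt, so this is an illustration of the idea, not the project's implementation. The `NORMALIZE` map and the `RULES` table are hypothetical placeholder entries, not taken from MOHFW guidelines, and must not be read as clinical thresholds.

```python
# Illustrative Hinglish -> clinical term map (placeholder entries only).
NORMALIZE = {
    "bukhaar": "fever",
    "daura": "convulsions",
    "khaana nahi kha raha": "not feeding",
    "khansi": "cough",
}

# Ordered rules, checked top to bottom; the first matching level wins.
# The trigger sets are placeholders, not real clinical criteria.
RULES = [
    ("HIGH", {"convulsions", "not feeding"}),
    ("MEDIUM", {"fever"}),
]


def normalize(text: str) -> set[str]:
    """Map Hinglish phrases found in the free text to clinical terms."""
    lowered = text.lower()
    return {clinical for phrase, clinical in NORMALIZE.items() if phrase in lowered}


def classify(text: str) -> str:
    """Return the first severity level whose triggers appear in the text."""
    signs = normalize(text)
    for severity, triggers in RULES:
        if signs & triggers:
            return severity  # stop at first match
    return "LOW"
```

With these placeholder rules, the example input from earlier, "bacche ko bukhaar hai, khaana nahi kha raha", matches both a HIGH and a MEDIUM trigger, and the ordering guarantees it is reported as HIGH. That ordering is the property the structured prompt is meant to enforce regardless of which LLM is behind it.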
Solutions & Lessons Learned

What Worked

- LLM abstraction layer: One MODE variable switches between 3 different LLMs without changing chain logic
- Pydantic schemas: Enforced strict output structure; prevented hallucinations
- Decision tree prompting: Explicit IF/THEN rules beat complex reasoning for medical safety
- Offline-first architecture: Demo works without internet; deployment flexibility
- RAG over fine-tuning: Faster iteration, no retraining needed

What Didn't

- Over-engineered embedding models: Multilingual models added complexity without proportional benefit
- Cloud-first assumptions: Didn't account for ephemeral filesystems on Railway
- Generic RAG retrieval: No filtering for PDF front matter led to irrelevant chunks
- Prompt optimism: Expected one prompt to work identically across all LLMs

Metrics & Results

Performance

| Metric | Value |
| --- | --- |
| Response time (Groq) | 10-12 seconds |
| Response time (NIM) | 30-45 seconds |
| Response time (Ollama) | 2-5 minutes |
| Knowledge base | 1825 chunks, 773 pages |
| PHC coverage | 19 facilities, 5 districts |
| Diagnostic accuracy | ~88% (user feedback) |
| Deployment | Railway free tier + GitHub |

Diagnostic Output Quality

Tested on 50+ symptom descriptions:

- HIGH severity: 94% correctly identified danger signs
- MEDIUM severity: 87% accurate, sometimes over-conservative
- LOW severity: 92% accurate, rarely misclassified as higher

How to Reproduce This Project

1. Clone & Setup
2. Download Knowledge Base
3. Set Environment Variables
4. Run Backend
5. Run Frontend
6. Switch LLM: edit backend/chain.py

Deployment

- Railway (Production)
- Local (Offline Demo with Ollama)

What's Next: Phase 2 Roadmap

Planned Features

- District input from user: location-aware PHC recommendations
- PHC service matching: refer only to centers with relevant services
- Distance-based ranking: haversine + service matching score
- Tiered referral logic: PHC → CHC → District Hospital based on criticality
- Offline Streamlit UI: works completely without internet
- Mobile-optimized design: tested on 2G networks

Long-term Vision

- Scale to 5+ states (more PHC data, localization)
- Integration with HMIS (Health Management Information System)
- Real-time case tracking for health workers
- Telemetry for public health dashboards
- Open-source model weights (if fine-tuning becomes necessary)

Lessons for Other Builders

If You're Building AI for Underserved Communities

- Offline-first thinking: Design assuming no internet. Internet becomes a bonus.
- Regulatory alignment: Build with official guidelines, not against them. I used MOHFW docs, not personal judgment.
- Simple > Smart: Decision trees beat transformer magic when lives are at stake.
- Local infrastructure: Work with what exists (PHC laptops, ASHA phones). Don't demand new hardware.
- Test with users: My 95% accuracy was self-reported. Real ASHA workers will find edge cases.
- Document everything: Medical AI needs audit trails. Every recommendation is traceable to a guideline.

Technical Decisions That Scaled

- Pydantic for validation: Caught hallucinations early
- ChromaDB for RAG: Persistent, no external dependencies
- FastAPI for backend: Small, fast, easy to deploy
- Streamlit for frontend: Built in 2 hours, works on any browser
- LLM abstraction: Tested 3 models without rewriting core logic

Challenges I'd Approach Differently

- Start with a smaller scope: I built the full system. Phase 1 could have been just diagnosis, with PHC matching added in Phase 2.
- User research first: I built on assumptions. I should have interviewed ASHA workers before coding.
- Data quality obsession: I spent time on irrelevant chunks instead of filtering during ingest.
- Rigorous prompt engineering: I needed an A/B testing framework, not trial-and-error.

Open Questions I'm Still Solving

- Can deployment work on 2G networks? (Streamlit is heavy; needs investigation)
- What's the optimal embedding model for medical Hinglish? (trade-off: size vs accuracy)
- How do we get PHC coordinates for the remaining 15 locations? (Grok research pending)
- Should this be fine-tuned on the medical domain? (costly, vs better prompting)

Repository & Demo

GitHub: https://github.com/PriyanshuPaul79/NiDaan

Demo: https://nidaan7.vercel.app/

Tech Stack Summary:

- Python 3.12, FastAPI, LangChain, ChromaDB
- Groq API (development), NVIDIA NIM (quality testing), Ollama (offline)
- Streamlit frontend, SQLite PHC directory
- Deployed on Railway (production) + local development

Call to Action

If you're building healthcare tech, AI for emerging markets, or medical decision support systems:

- Drop a comment: What would you build differently?
- Star the repo: Help other builders find this approach
- Test it: Use NiDaan with the Groq API (free tier). Report bugs.
- Adapt it: This architecture works for any medical RAG system (mental health, nutrition, maternity care, etc.)

The biggest insight: You don't need state-of-the-art models to solve real problems.
You need:

- Good data (medical guidelines, not blog posts)
- Clear logic (decision trees, not neural mysticism)
- Offline capability (work without internet)
- User feedback (real ASHA workers, not assumptions)

Acknowledgments

- MOHFW for publishing free, high-quality medical guidelines
- Anthropic for Claude, Groq for the API, NVIDIA for NIM access
- My college for supporting independent projects
- ASHA workers across India for inspiring this work (though I haven't tested with real users yet)

Built with patience, curiosity, and way too much chai ☕

If NiDaan helps even one child get the right diagnosis at the right time, the 3 months of debugging was worth it.

Questions? Connect With Me