{"slug": "the-txt-file-as-the-soul-of-a-personal-ai-filerag-memory-architecture", "title": "The .txt File as the Soul of a Personal AI — FileRAG Memory Architecture", "summary": "Developer Dharanidharan J (JD) has built FileRAG, a memory architecture for personal AI that stores a distilled understanding of each user in a single plain-text `.txt` file. The system uses hybrid BM25 and semantic retrieval to inject relevant context from the file into conversations, and employs topic-drift detection and deduplication to maintain clean, accurate summaries. In benchmarks, FileRAG achieved 100% retrieval accuracy across 500 and 1000 conversation turns, while traditional dict and Redis-based systems failed completely at those scales.", "body_md": "**By Dharanidharan J (JD)**\n\n*Full Stack & AI Engineer | Building Jarvix*\n\nEvery chatbot tutorial teaches you the same thing:\n\n```\nhistory = []\nhistory.append({\"role\": \"user\", \"content\": message})\n```\n\nAnd that works — until it doesn't.\n\nAfter 500 turns, your dict has forgotten who the user is. After 1000 turns, you're hitting token limits. After a restart, everything is gone. Redis helps with persistence but still buries early facts under noise. Vector DBs help with retrieval but bloat storage and need infrastructure.\n\n**What if the memory itself was just a file?**\n\nEvery conversation a user has gets distilled into a plain `.txt` file. That file is the brain. On every new query, a hybrid BM25 + semantic RAG retrieves the most relevant chunks from it and injects them as context.\n\n```\nusers/\n└── jd.txt        ← the soul file\n```\n\nThe soul file looks like this:\n\n```\n[Turns 1-5]\n- User's name is JD, software engineer\n- Building FileRAG, a novel memory architecture\n- Uses Pop!_OS with Fish shell and NVIDIA GPU\n\n[Turns 6-10]\n- Has a cat named Pixel who distracts during coding\n- Paused TaskNest due to burnout\n- Now focused on AgenticMesh\n```\n\nHuman readable. Editable. 
Yours.\n\nMost memory systems store **messages**. FileRAG stores a **relationship**.\n\n| System | What it stores |\n|---|---|\n| Dict / Redis | Raw message objects |\n| Vector DB | Embeddings of messages |\n| FileRAG | Distilled understanding of the user |\n\nThe longer you use it, the more the AI understands you — not because it has more messages, but because it has a better summary of who you are.\n\n```\nUser message\n     ↓\nTopic drift check (cosine similarity)\n     ├── Drift detected → distill current buffer immediately\n     └── No drift → continue\n     ↓\nHybrid retrieval (BM25 + ChromaDB) from soul file\n     ↓\nInject context → LLM responds\n     ↓\nAppend to turn buffer\n     ↓\nEvery 5 turns → distill → append to soul file → update ChromaDB\n     ↓\nEmergency distillation on exit (SIGINT/SIGTERM)\n```\n\n**1. Topic-Drift Distillation**\n\nInstead of blindly waiting for a fixed number of turns, the system measures the semantic similarity between the current buffer and the new message. If the similarity drops below 0.25, it immediately distills the buffer and starts a fresh block. This keeps topic chunks clean and isolated.\n\n**2. Deduplication**\n\nBefore writing a new chunk, the system checks its cosine similarity against every existing chunk. If the new chunk is more than 92% similar to an existing one, it is skipped. This prevents filler conversations from polluting the soul file.\n\n**3. Emergency Exit Handler**\n\n`SIGINT` and `SIGTERM` are intercepted. On Ctrl+C, the current buffer is immediately distilled before the process exits. Nothing is lost.\n\n**4. Hybrid Retrieval**\n\nBM25 catches exact keywords (project names, usernames). Semantic search catches meaning (preferences, personality). 
Together they outperform either alone.\n\n*Tested on Pop!_OS, NVIDIA GPU, sentence-transformers all-MiniLM-L6-v2 embeddings*\n\n| Metric | Dict | Redis | Vector DB | FileRAG |\n|---|---|---|---|---|\n| Write Speed (ms) | 0.0004 | 0.30 | 33.38 | 20.33 |\n| Read Speed (ms) | 0.002 | 0.26 | 6.64 | 9.26 |\n| Storage (KB) | 1.42 | 1.38 | 396.16 | 356.66 |\n| Accuracy | 100% | 100% | 67% | 100% |\n| Persistent | ❌ | ✅ | ✅ | ✅ |\n\nAt small scale, Dict and Redis win on speed. FileRAG matches on accuracy. Fair.\n\n| Metric | Dict | Redis | Vector DB | FileRAG |\n|---|---|---|---|---|\n| Write Speed (ms) | 0.0002 | 0.08 | 22.23 | 18.93 |\n| Read Speed (ms) | 0.002 | 0.24 | 8.51 | 7.64 |\n| Storage (KB) | 34.75 | 33.77 | 1604.16 | 653.47 |\n| Accuracy | 0% ❌ | 0% ❌ | 67% | 100% |\n\nThis is where it gets interesting. Dict and Redis completely fail — core facts buried under 490 turns of noise. FileRAG still retrieves perfectly.\n\n| Metric | Dict | Redis | Vector DB | FileRAG |\n|---|---|---|---|---|\n| Storage (KB) | 69.47 | 67.51 | 4338.36 | 938.74 |\n| Soul file only (KB) | — | — | — | 18.58 |\n| Accuracy | 0% ❌ | 0% ❌ | 67% | 100% |\n\nFileRAG's total storage includes ChromaDB index overhead. The soul file itself — the actual human-readable memory — is just **18 KB** for 1000 turns.\n\n| Metric | Dict | Redis | Vector DB | FileRAG |\n|---|---|---|---|---|\n| Storage (KB) | 3,478 | 3,381 | 76,159 | **29,812** |\n| Soul file (KB) | — | — | — | **~1,865** |\n| Accuracy | 0% | 0% | ~67% | **~100%** |\n\nAt 100k turns, Vector DB would consume ~74 MB just for index storage. 
FileRAG's soul file stays under 2 MB — human-readable, portable, private.\n\n| Category | Dict | Redis | Vector DB | FileRAG |\n|---|---|---|---|---|\n| Fastest write | ✅ | ✅ | ❌ | Medium |\n| Best accuracy at scale | ❌ | ❌ | Medium | ✅ |\n| Smallest storage at scale | ❌ | ❌ | ❌ | ✅ |\n| Persistent | ❌ | ✅ | ✅ | ✅ |\n| No infrastructure | ✅ | ❌ | ❌ | ✅ |\n| Local / offline | ✅ | ❌ | ⚠️ | ✅ |\n| Privacy (on device) | ✅ | ❌ | ⚠️ | ✅ |\n| Grows naturally with user | ❌ | ❌ | Medium | ✅ |\n\nFileRAG is not the fastest. It is not the simplest. But it is the **only architecture that gets more accurate as the conversation grows**, without growing infrastructure requirements.\n\nYour brain doesn't record every conversation verbatim. It compresses experiences into memory — feelings, facts, patterns. The hippocampus distills, the cortex stores.\n\nFileRAG does the same thing:\n\n```\nConversation → Distillation → Soul file → Retrieval → Natural response\nExperience  → Hippocampus  → Cortex    → Recall    → Natural behaviour\n```\n\nThe soul file is not a database. It is a diary the AI reads before speaking to you.\n\n```\nLLM          → Groq (llama3-70b-8192)\nDistillation → Groq (llama3-70b-8192) every 5 turns or on topic drift\nEmbeddings   → sentence-transformers/all-MiniLM-L6-v2\nVector store → ChromaDB (persistent)\nRetrieval    → Hybrid BM25 + Cosine Semantic\nMemory       → {user_id}.txt — the soul file\n```\n\nThe full implementation — `main.py`, `benchmark.py`, and the architecture — is available on GitHub:\n\n**→ github.com/dharanidh75/filerag-memory**\n\nThis is also the memory layer being built into **Jarvix** — a local-first voice AI assistant for Pop!_OS.\n\n*If you're building a local AI, a personal assistant, or just tired of your chatbot forgetting who you are after every restart — give FileRAG a try.*\n\n*The soul file is 18 KB. 
Your AI deserves better than a dict.*", "url": "https://wpnews.pro/news/the-txt-file-as-the-soul-of-a-personal-ai-filerag-memory-architecture", "canonical_source": "https://dev.to/dharanidh75/the-txt-file-as-the-soul-of-a-personal-ai-filerag-memory-architecture-k71", "published_at": "2026-05-30 06:45:32+00:00", "updated_at": "2026-05-30 07:11:22.815498+00:00", "lang": "en", "topics": ["artificial-intelligence", "ai-agents", "ai-tools", "ai-infrastructure", "large-language-models"], "entities": ["Dharanidharan J", "Jarvix", "FileRAG", "Redis", "Vector DB", "BM25", "Pop!_OS", "NVIDIA"], "alternates": {"html": "https://wpnews.pro/news/the-txt-file-as-the-soul-of-a-personal-ai-filerag-memory-architecture", "markdown": "https://wpnews.pro/news/the-txt-file-as-the-soul-of-a-personal-ai-filerag-memory-architecture.md", "text": "https://wpnews.pro/news/the-txt-file-as-the-soul-of-a-personal-ai-filerag-memory-architecture.txt", "jsonld": "https://wpnews.pro/news/the-txt-file-as-the-soul-of-a-personal-ai-filerag-memory-architecture.jsonld"}}