# The .txt File as the Soul of a Personal AI — FileRAG Memory Architecture

> Source: <https://dev.to/dharanidh75/the-txt-file-as-the-soul-of-a-personal-ai-filerag-memory-architecture-k71>
> Published: 2026-05-30 06:45:32+00:00

**By Dharanidharan J (JD)**

*Full Stack & AI Engineer | Building Jarvix*

Every chatbot tutorial teaches you the same thing:

```
history = []
message = "Hi, I'm JD."  # example user input
history.append({"role": "user", "content": message})
```

And that works — until it doesn't.

After 500 turns, your dict has forgotten who the user is. After 1000 turns, you're hitting token limits. After a restart, everything is gone. Redis helps with persistence but still buries early facts under noise. Vector DBs help with retrieval but bloat storage and need infrastructure.

**What if the memory itself was just a file?**

Every conversation a user has gets distilled into a plain `.txt` file. That file is the brain. On every new query, a hybrid BM25 + semantic RAG retrieves the most relevant chunks from it and injects them as context.

```
users/
└── jd.txt        ← the soul file
```

The soul file looks like this:

```
[Turns 1-5]
- User's name is JD, software engineer
- Building FileRAG, a novel memory architecture
- Uses Pop!_OS with Fish shell and NVIDIA GPU

[Turns 6-10]
- Has a cat named Pixel who distracts during coding
- Paused TaskNest due to burnout
- Now focused on AgenticMesh
```

Human readable. Editable. Yours.
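Because the format is plain text, reading it back takes only a few lines. The following is a minimal sketch, assuming every block starts with a `[Turns a-b]` header and every fact is a `- ` bullet, as in the example above. `parse_soul_file` is a hypothetical helper written for illustration, not a function from the FileRAG repository.

```python
import re

# Header format assumed from the example above, e.g. "[Turns 1-5]".
HEADER = re.compile(r"^\[Turns (\d+)-(\d+)\]$")

def parse_soul_file(text):
    """Split soul-file text into blocks of {'start', 'end', 'facts'}."""
    blocks = []
    for line in text.splitlines():
        line = line.strip()
        match = HEADER.match(line)
        if match:
            blocks.append({"start": int(match.group(1)),
                           "end": int(match.group(2)),
                           "facts": []})
        elif line.startswith("- ") and blocks:
            # A bullet belongs to the most recent header.
            blocks[-1]["facts"].append(line[2:])
    return blocks

sample = """[Turns 1-5]
- User's name is JD, software engineer
- Uses Pop!_OS with Fish shell and NVIDIA GPU

[Turns 6-10]
- Has a cat named Pixel who distracts during coding
"""

blocks = parse_soul_file(sample)
print(len(blocks), blocks[1]["facts"])
```

Each parsed block is a natural retrieval chunk: one topic, a handful of facts, and the turn range it came from.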

Most memory systems store **messages**. FileRAG stores a **relationship**.

| System | What it stores |
|---|---|
| Dict / Redis | Raw message objects |
| Vector DB | Embeddings of messages |
| FileRAG | Distilled understanding of the user |

The longer you use it, the more the AI understands you — not because it has more messages, but because it has a better summary of who you are.

```
User message
     ↓
Topic drift check (cosine similarity)
     ├── Drift detected → distill current buffer immediately
     └── No drift → continue
     ↓
Hybrid retrieval (BM25 + ChromaDB) from soul file
     ↓
Inject context → LLM responds
     ↓
Append to turn buffer
     ↓
Every 5 turns → distill → append to soul file → update ChromaDB
     ↓
Emergency distillation on exit (SIGINT/SIGTERM)
```
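The buffering part of this flow can be sketched in a few lines. This is an illustration under stated assumptions, not the code from `main.py`: `TurnBuffer` is a hypothetical class, and `distill` is a placeholder standing in for the LLM summarisation call. Only the 5-turn interval and the `[Turns a-b]` block format come from the article.

```python
from pathlib import Path
import tempfile

DISTILL_EVERY = 5  # from the article: distill every 5 turns

def distill(turns):
    """Placeholder for the LLM summarisation call (hypothetical)."""
    return [f"- {t}" for t in turns]

class TurnBuffer:
    def __init__(self, soul_path):
        self.soul_path = Path(soul_path)
        self.buffer = []
        self.first_turn = 1  # turn number of the first buffered turn

    def add(self, turn):
        self.buffer.append(turn)
        if len(self.buffer) >= DISTILL_EVERY:
            self.flush()

    def flush(self):
        """Distill the buffer and append it to the soul file as one block."""
        if not self.buffer:
            return
        last_turn = self.first_turn + len(self.buffer) - 1
        lines = [f"[Turns {self.first_turn}-{last_turn}]",
                 *distill(self.buffer), ""]
        with self.soul_path.open("a", encoding="utf-8") as f:
            f.write("\n".join(lines) + "\n")
        self.first_turn = last_turn + 1
        self.buffer = []

soul = Path(tempfile.mkdtemp()) / "jd.txt"
buf = TurnBuffer(soul)
for i in range(1, 8):
    buf.add(f"turn {i}")
buf.flush()  # what a drift or exit event would trigger for the leftovers
print(soul.read_text(encoding="utf-8"))
```

Seven turns produce two blocks, `[Turns 1-5]` and `[Turns 6-7]`. The second one exists only because `flush()` was called early, which is the same entry point a drift check or exit handler would use.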

**1. Topic-Drift Distillation**

Instead of waiting every N turns blindly, the system measures semantic similarity between the current buffer and the new message. If similarity drops below 0.25, it immediately distills and starts a fresh block. This keeps topic chunks clean and isolated.
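A rough sketch of the drift check follows. The 0.25 threshold is from the article; representing the buffer as the mean of its turn embeddings is an assumption, as are the function names and the toy 3-dimensional vectors that stand in for real sentence embeddings.

```python
import math

DRIFT_THRESHOLD = 0.25  # from the article

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def mean_vector(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def has_drifted(buffer_vectors, new_vector, threshold=DRIFT_THRESHOLD):
    """True when the new message is semantically far from the buffer."""
    if not buffer_vectors:
        return False  # nothing to drift away from yet
    return cosine(mean_vector(buffer_vectors), new_vector) < threshold

# Toy vectors standing in for real sentence embeddings.
coding_turns = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.0]]
print(has_drifted(coding_turns, [0.85, 0.15, 0.0]))  # False: same topic
print(has_drifted(coding_turns, [0.0, 0.1, 0.9]))    # True: new topic
```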

**2. Deduplication**

Before writing any new chunk, cosine similarity is checked against all existing chunks. If >92% similar, the chunk is skipped. This prevents filler conversations from polluting the soul file.
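The deduplication gate might look something like this. The 0.92 threshold is the article's; `is_duplicate` and the toy vectors are hypothetical, and a real system would compare embeddings from the sentence-transformers model instead.

```python
import math

DEDUP_THRESHOLD = 0.92  # from the article: skip chunks that are >92% similar

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def is_duplicate(new_vec, existing_vecs, threshold=DEDUP_THRESHOLD):
    """True if the new chunk is too close to any chunk already stored."""
    return any(cosine(new_vec, v) > threshold for v in existing_vecs)

# Toy vectors standing in for chunk embeddings.
existing = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
near_copy = [0.99, 0.05, 0.0]
novel = [0.0, 0.0, 1.0]

print(is_duplicate(near_copy, existing))  # True: skip this chunk
print(is_duplicate(novel, existing))      # False: write it
```

This linear scan compares against every stored chunk, which matches the description above but grows with the soul file; a nearest-neighbour query against the vector store would be one way to keep it cheap.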

**3. Emergency Exit Handler**

`SIGINT` and `SIGTERM` are intercepted. On Ctrl+C, the current buffer is immediately distilled before the process exits. Nothing is lost.
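With Python's standard `signal` module, the exit handler can be sketched as below. `turn_buffer` and `flush_buffer` are hypothetical stand-ins; in the real system the flush would distill the buffer and append it to the soul file.

```python
import signal
import sys

turn_buffer = []  # hypothetical in-memory buffer of undistilled turns

def flush_buffer():
    """Placeholder: distill `turn_buffer` and append it to the soul file."""
    flushed = list(turn_buffer)
    turn_buffer.clear()
    return flushed

def handle_exit(signum, frame):
    # Save pending turns first, then exit cleanly.
    flush_buffer()
    sys.exit(0)

# Ctrl+C sends SIGINT; process managers typically send SIGTERM.
signal.signal(signal.SIGINT, handle_exit)
signal.signal(signal.SIGTERM, handle_exit)
```

One caveat: `SIGKILL` and power loss cannot be intercepted, so turns still in the buffer at that moment would not reach the file.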

**4. Hybrid Retrieval**

BM25 catches exact keywords (project names, usernames). Semantic search catches meaning (preferences, personality). Together they outperform either alone.
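To make the combination concrete, here is a small self-contained sketch. The article does not say how the two scores are fused, so the min-max normalisation and the `alpha` weight are assumptions. The BM25 function is a plain Okapi BM25 written out by hand, and the semantic scores are made-up numbers standing in for what the vector store would return.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each doc for the query (whitespace tokens)."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n
    df = Counter(term for t in tokenized for term in set(t))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

def normalise(scores):
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) if hi > lo else 0.0 for s in scores]

def hybrid_rank(query, docs, semantic, alpha=0.5):
    """Rank doc indices by alpha * BM25 + (1 - alpha) * semantic score."""
    lexical = normalise(bm25_scores(query, docs))
    dense = normalise(semantic)
    fused = [alpha * l + (1 - alpha) * d for l, d in zip(lexical, dense)]
    return sorted(range(len(docs)), key=lambda i: fused[i], reverse=True)

chunks = [
    "Paused TaskNest due to burnout",
    "Has a cat named Pixel who distracts during coding",
    "Uses Pop!_OS with Fish shell and NVIDIA GPU",
]
# Made-up semantic similarities standing in for vector-store results.
semantic = [0.30, 0.10, 0.20]
print(hybrid_rank("what happened to TaskNest", chunks, semantic))  # [0, 2, 1]
```

The exact project name `TaskNest` gives the first chunk a lexical score the others cannot match, which is the keyword behaviour described above. The semantic scores then decide the order of the remaining chunks.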

*Tested on Pop!_OS, NVIDIA GPU, sentence-transformers all-MiniLM-L6-v2 embeddings*

| Metric | Dict | Redis | Vector DB | FileRAG |
|---|---|---|---|---|
| Write Speed (ms) | 0.0004 | 0.30 | 33.38 | 20.33 |
| Read Speed (ms) | 0.002 | 0.26 | 6.64 | 9.26 |
| Storage (KB) | 1.42 | 1.38 | 396.16 | 356.66 |
| Accuracy | 100% | 100% | 67% | 100% |
| Persistent | ❌ | ✅ | ✅ | ✅ |

At small scale, Dict and Redis win on speed. FileRAG matches on accuracy. Fair.

| Metric | Dict | Redis | Vector DB | FileRAG |
|---|---|---|---|---|
| Write Speed (ms) | 0.0002 | 0.08 | 22.23 | 18.93 |
| Read Speed (ms) | 0.002 | 0.24 | 8.51 | 7.64 |
| Storage (KB) | 34.75 | 33.77 | 1604.16 | 653.47 |
| Accuracy | 0% ❌ | 0% ❌ | 67% | 100% |

This is where it gets interesting. Dict and Redis completely fail — core facts buried under 490 turns of noise. FileRAG still retrieves perfectly.

| Metric | Dict | Redis | Vector DB | FileRAG |
|---|---|---|---|---|
| Storage (KB) | 69.47 | 67.51 | 4338.36 | 938.74 |
| Soul file only (KB) | — | — | — | 18.58 |
| Accuracy | 0% ❌ | 0% ❌ | 67% | 100% |

FileRAG's total storage includes ChromaDB index overhead. The soul file itself — the actual human-readable memory — is just **18 KB** for 1000 turns.

| Metric | Dict | Redis | Vector DB | FileRAG |
|---|---|---|---|---|
| Storage (KB) | 3,478 | 3,381 | 76,159 | **29,812** |
| Soul file (KB) | — | — | — | **~1,865** |
| Accuracy | 0% | 0% | ~67% | **~100%** |

At 100k turns, Vector DB would consume ~74 MB just for index storage. FileRAG's soul file stays under 2 MB — human-readable, portable, private.
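The megabyte figures follow directly from the table. A quick check, assuming 1 MB = 1024 KB (the convention under which 76,159 KB comes out at about 74 MB):

```python
KB_PER_MB = 1024  # assumption: binary megabytes

# Storage figures (KB) copied from the 100k-turn table above.
storage_kb = {"Dict": 3478, "Redis": 3381, "Vector DB": 76159, "FileRAG": 29812}
soul_file_kb = 1865

for name, kb in storage_kb.items():
    print(f"{name:10s} {kb / KB_PER_MB:7.2f} MB")
print(f"{'Soul file':10s} {soul_file_kb / KB_PER_MB:7.2f} MB")
```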

| Category | Dict | Redis | Vector DB | FileRAG |
|---|---|---|---|---|
| Fastest write | ✅ | ✅ | ❌ | Medium |
| Best accuracy at scale | ❌ | ❌ | Medium | ✅ |
| Smallest storage at scale | ❌ | ❌ | ❌ | ✅ |
| Persistent | ❌ | ✅ | ✅ | ✅ |
| No infrastructure | ✅ | ❌ | ❌ | ✅ |
| Local / offline | ✅ | ❌ | ⚠️ | ✅ |
| Privacy (on device) | ✅ | ❌ | ⚠️ | ✅ |
| Grows naturally with user | ❌ | ❌ | Medium | ✅ |

FileRAG is not the fastest. It is not the simplest. But of the four systems benchmarked here, it is the **only one that held 100% accuracy as the conversation grew**, without growing infrastructure requirements.

Your brain doesn't record every conversation verbatim. It compresses experiences into memory — feelings, facts, patterns. The hippocampus distills, the cortex stores.

FileRAG does the same thing:

```
Conversation → Distillation → Soul file → Retrieval → Natural response
Experience  → Hippocampus  → Cortex    → Recall    → Natural behaviour
```

The soul file is not a database. It is a diary the AI reads before speaking to you.

```
LLM          → Groq (llama3-70b-8192)
Distillation → Groq (llama3-70b-8192) every 5 turns or on topic drift
Embeddings   → sentence-transformers/all-MiniLM-L6-v2
Vector store → ChromaDB (persistent)
Retrieval    → Hybrid BM25 + Cosine Semantic
Memory       → {user_id}.txt — the soul file
```

The full implementation (`main.py`, `benchmark.py`, and the architecture) is available on GitHub:

**→ github.com/dharanidh75/filerag-memory**

This is also the memory layer being built into **Jarvix** — a local-first voice AI assistant for Pop!_OS.

*If you're building a local AI, a personal assistant, or just tired of your chatbot forgetting who you are after every restart — give FileRAG a try.*

*The soul file is 18 KB. Your AI deserves better than a dict.*
