{"slug": "how-to-add-persistent-memory-to-claude-code-storage-injection-and-recall", "title": "How to Add Persistent Memory to Claude Code: Storage, Injection, and Recall", "summary": "Claude Code lacks persistent memory across sessions, losing architectural decisions, bug fixes, and project context. Tools like Memarch, Hermes, and GBrain offer solutions for storage, context injection, and semantic recall to address this gap.", "body_md": "# How to Add Persistent Memory to Claude Code: Storage, Injection, and Recall\n\nClaude Code's built-in memory captures almost nothing. Here's how to add Memarch, Hermes, or GBrain for real storage, injection, and semantic recall.\n\n## Claude Code’s Memory Problem Is Worse Than You Think\n\nClaude Code is an impressive coding agent out of the box. But there’s a gap most people hit within the first few sessions: it remembers nothing.\n\nEvery time you start a new Claude Code session, you’re starting from scratch. No record of the architecture decisions you made last Tuesday. No awareness that you renamed the authentication module. No memory of the debugging rabbit hole that took three hours and ended with a one-line fix. The persistent memory problem is one of the biggest practical friction points when using Claude Code for real, ongoing projects.\n\nThis guide covers how to fix that through proper storage, context injection, and semantic recall. Tools like Memarch, Hermes, and GBrain represent three distinct approaches to solving this, and understanding the difference between them will help you build a memory layer that actually holds up in production.\n\n## What Claude Code Actually Remembers (And What It Doesn’t)\n\nBefore adding any memory tooling, it helps to understand what Claude Code does and doesn’t retain natively.\n\n### CLAUDE.md: The Only Built-In Persistence\n\nClaude Code reads a `CLAUDE.md` file at the root of your project on session start. This is the only real persistent memory it has out of the box. 
You can put anything in it (project context, tech stack, naming conventions, known issues) and Claude will pick it up.\n\nBut `CLAUDE.md` is a flat text file. It doesn’t update automatically. It doesn’t grow or reorganize based on what you do during sessions. It doesn’t retrieve relevant context based on what you’re currently working on. It’s a sticky note, not a memory system.\n\n### Context Window Limits\n\nClaude’s context window is large, but it’s session-scoped. Once you close a session, everything that happened in that conversation is gone. You can paste in prior context manually, but that doesn’t scale.\n\n### What Gets Lost\n\nHere’s what actually disappears between sessions without a memory layer:\n\n- Architectural decisions and the reasoning behind them\n- Bug fixes and what caused the bugs\n- File structure changes and refactors\n- Custom patterns, conventions, or shortcuts you’ve established\n- Task history and what was completed, in-progress, or abandoned\n- External research you surfaced during a session\n\nThat’s not a minor gap. For any project running longer than a few days, this is a real productivity drain.\n\n## The Three Pillars of Persistent Memory\n\nAny serious memory system for Claude Code needs to handle three things independently:\n\n**Storage**: Where memories live and how they’re structured. This could be a flat file, a database, a vector store, or a knowledge graph.\n\n**Injection**: How the right memories get loaded into Claude’s context at the start of a session or during one. Not all memories are relevant all the time, so you need a mechanism to decide what to include.\n\n**Recall**: How memories are retrieved when Claude needs them mid-session. This includes both exact lookups and semantic similarity search.\n\nDifferent tools focus on different layers of this stack. 
Understanding which layer a tool operates on helps you decide what to reach for and how to combine tools effectively.\n\n## Memarch: Structuring What Gets Stored\n\nMemarch is built around the storage problem. The core insight it’s designed around is that unstructured memory is nearly useless at scale. If you dump session transcripts into a file and try to search it later, you’ll hit noise, contradictions, and retrieval failures.\n\n### Memory as Typed Records\n\nMemarch organizes memories into typed records rather than freeform text. A decision memory looks different from a bug memory. A convention memory gets structured differently from a dependency note. This typing is important because it determines how memories get summarized, deduplicated, and retrieved later.\n\nAt a practical level, this means when you record a memory through Memarch, you’re tagging it with a category and a set of metadata fields — not just appending text to a log. That structure makes retrieval far more reliable.\n\n### Deduplication and Merging\n\nOne problem with naive memory systems is accumulation. If you note the same architectural pattern five times across five sessions, you get five copies of that memory. Memarch handles deduplication at write time, merging related records and updating existing ones rather than creating new copies.\n\nThis matters for injection quality. If your context gets loaded with ten nearly-identical memories, you’re wasting token budget and introducing noise.\n\n### When to Use Memarch\n\nMemarch works best when:\n\n- Your project has a high volume of decisions, conventions, and evolving structure\n- You need memory to be queryable by type (e.g., “show me all architecture decisions”)\n- You want storage that grows cleanly over time without manual curation\n\n## Hermes: Getting Memory Into Context\n\nStorage is only useful if the right memories actually reach Claude. 
That’s the injection problem, and it’s where Hermes focuses.\n\n### Context Injection Patterns\n\nThere are a few ways to inject memory into a Claude Code session:\n\n- **Prepend to CLAUDE.md**: Write relevant memories into `CLAUDE.md` before the session starts, based on what task is active.\n- **System prompt injection**: Load memories into the system prompt dynamically.\n- **In-session injection**: Surface memories mid-session when Claude detects it’s entering a domain where prior context exists.\n\nHermes operates primarily as an injection layer, sitting between your memory store and the Claude Code session. When you start a session, Hermes scores stored memories against the current task or project context and injects the highest-relevance subset.\n\n### Relevance Scoring\n\nRaw injection (loading everything into context) defeats the purpose. Token budgets are real, and flooding context with tangentially related memories is worse than loading nothing.\n\nHermes applies relevance scoring using a combination of:\n\n- **Recency weighting**: More recent memories score higher by default\n- **Tag matching**: Memories tagged with the current task type rank up\n- **Semantic similarity**: Memory embeddings are compared against the current task description\n\nThe output is a ranked, trimmed set of memories sized to fit a target context budget. You can configure how aggressive the trimming is.\n\n### Hooking Into Claude Code’s Startup\n\nThe practical integration path with Hermes is via a startup hook. Claude Code supports hooks that run a script when a session starts; you can use one to call Hermes, build the injection payload, and write it to `CLAUDE.md` or a context file that Claude reads at initialization.\n\nThis keeps the injection transparent. Claude doesn’t need to know it’s reading injected memories vs. static documentation. It just gets context.\n\n## GBrain: Semantic Recall During Sessions\n\nStorage and injection handle what Claude knows at the start of a session. 
But what about mid-session recall? That’s where GBrain enters.\n\n### Graph-Based Memory\n\nGBrain uses a knowledge graph structure rather than a flat vector store. Memories are nodes. Relationships between them are edges. When you recall a memory, GBrain can traverse the graph, not just finding the closest match but also surfacing related context that a pure vector search would miss.\n\nFor example: if you query GBrain about the authentication module, it doesn’t just return memories explicitly tagged with “auth.” It also returns memories about the session token implementation, the middleware that depends on auth, and the bug you fixed in the login flow three weeks ago, because those nodes are connected in the graph.\n\nThis relational recall is the key differentiator. For complex codebases where concepts are deeply interconnected, graph traversal produces more useful recall than pure embedding similarity.\n\n### MCP Server Integration\n\nGBrain exposes itself as an MCP (Model Context Protocol) server, which means Claude Code can call it directly during a session using tool use. The workflow looks like this:\n\n- Claude determines it needs prior context on a topic\n- It calls the GBrain MCP tool with a query\n- GBrain runs a graph search and returns relevant memories\n- Claude incorporates those memories into its response\n\nThis is real-time recall, not just upfront injection. Claude can query memory as needed throughout a session rather than relying on what was loaded at startup.\n\n### Setting Up GBrain as an MCP Server\n\nClaude Code reads project-scoped MCP server definitions from a `.mcp.json` file at the project root; servers can also be registered with the `claude mcp add` command. Adding GBrain as a server is a matter of registering the endpoint and providing auth credentials. 
Once registered, GBrain appears as a callable tool in Claude’s toolset, and Claude can invoke it autonomously when it determines it needs historical context.\n\n## Building a Complete Memory Stack\n\nYou don’t have to choose between these tools. They address different layers and can work together.\n\n### A Practical Architecture\n\nHere’s a stack that covers all three layers:\n\n| Layer | Tool | When It Runs |\n|---|---|---|\n| Storage | Memarch | During or after each session, on memory write |\n| Injection | Hermes | At session start |\n| Recall | GBrain | Mid-session, on demand |\n\n**Workflow:**\n\n- During a session, Claude records significant decisions, bugs, and context changes to Memarch via a write hook or manual command.\n- When the next session starts, Hermes queries Memarch, scores memories against the current task, and injects relevant context into `CLAUDE.md`.\n- During the session, Claude can call GBrain’s MCP endpoint for deeper recall on specific topics.\n\nThis layered approach gives you coverage at every point in the session lifecycle: before, during, and after.\n\n### A Simpler Starting Point\n\nIf the full stack feels like overkill, start with Hermes alone:\n\n- Manually write key memories to a structured JSON or YAML store after each session\n- Use Hermes to score and inject them at session start\n- Add Memarch or GBrain later when your memory store grows and retrieval quality degrades\n\nDon’t over-engineer the memory layer before you have real volume. A few hundred well-structured memories will perform better than a sophisticated system with poorly formatted content.\n\n## Common Mistakes When Adding Memory to Claude Code\n\nA few patterns that consistently cause problems:\n\n**Storing too much.** Memory dumps from full session transcripts are low signal. Store decisions, not dialogue. Store the outcome of a debugging session, not the debugging process.\n\n**No memory hygiene.** Memories go stale. 
A decision you made in month one might be wrong by month three. Build a review cycle for old memories — either manual or automated — to mark outdated records.\n\n**Context overloading.** Injecting 50 memories because you can fit them in context doesn’t mean you should. Token budget spent on marginally relevant memories is budget not available for the actual task. Be aggressive about trimming injection payloads.\n\n**Skipping the write step.** The best memory system is useless if you never populate it. Build the habit of writing key context to memory at the end of every session. Some teams automate this with a session-end hook that prompts Claude to summarize what should be recorded.\n\n**Treating CLAUDE.md as the only option.** CLAUDE.md is fine for static project context. It’s not a replacement for a real memory system on any project with more than a few weeks of history.\n\n## How MindStudio Handles Persistent Memory for AI Agents\n\nIf you’re thinking about memory for Claude Code specifically, that’s one use case. But the broader problem — AI agents that lose context between sessions — shows up everywhere: customer support bots that forget users, research agents that repeat work, workflow agents that can’t build on past outputs.\n\n[MindStudio](https://mindstudio.ai) handles this at the platform level. When you build an agent in MindStudio, you get built-in data storage that persists across sessions, natively scoped to users, conversations, or global state. You don’t need to wire up a separate memory layer — it’s part of the agent builder.\n\nFor developers who want to extend this to external agents (including Claude Code), MindStudio’s [Agent Skills Plugin](https://mindstudio.ai/blog/introducing-the-mindstudio-agent-skills-plugin) exposes storage and retrieval as simple method calls. 
Your Claude Code agent can call `agent.storeMemory()` and `agent.recallMemory()` without managing the storage infrastructure yourself.\n\nThe platform also supports [building fully autonomous background agents](https://mindstudio.ai/blog/how-to-build-autonomous-ai-agents) that can run on a schedule, which opens up patterns like nightly memory consolidation jobs that summarize, deduplicate, and restructure what an agent learned during the day.\n\nYou can try MindStudio free at [mindstudio.ai](https://mindstudio.ai).\n\n## Frequently Asked Questions\n\n### Does Claude Code have any built-in persistent memory?\n\nThe only native persistence is `CLAUDE.md`, a static file Claude reads at session start. It doesn’t update automatically, doesn’t grow based on session activity, and doesn’t support semantic retrieval. For any project with meaningful history, you need an external memory layer.\n\n### What is MCP and how does it enable memory for Claude Code?\n\nMCP (Model Context Protocol) is Anthropic’s standard for giving Claude access to external tools and data sources. Memory systems like GBrain can expose themselves as MCP servers, allowing Claude to call them directly during a session. This enables real-time recall: Claude can query memory mid-session rather than relying only on what was injected at startup. You can define project-scoped MCP servers in a `.mcp.json` file at the project root.\n\n### What’s the difference between storage, injection, and recall?\n\nThese are three distinct operations in a memory system. **Storage** is writing memories to a persistent location after they’re created. **Injection** is loading relevant memories into context at the start of a session. **Recall** is retrieving memories on demand during a session, typically using semantic or graph search. 
A complete memory layer needs all three, and many tools only address one or two.\n\n### How do I decide what to store in Claude Code’s memory?\n\nFocus on things that would take time to re-establish if Claude forgot them: architectural decisions, naming conventions, recurring bugs and their fixes, module relationships, and active task state. Avoid storing raw conversation transcripts or process details, which add noise without adding signal. A good rule of thumb: if you’d have to explain it again from scratch next session, it should be in memory.\n\n### Can I use vector search for Claude Code memory?\n\nYes. Vector (embedding) search is a solid foundation for recall: you store memories as embeddings and retrieve by semantic similarity. Tools like Hermes use this for injection scoring. The limitation is that vector search returns individual similar records but misses relational context. Graph-based systems like GBrain can surface connected memories that a pure vector search would miss. For most projects, vector search is a good starting point.\n\n### How much memory context should I inject per session?\n\nThere’s no universal answer, but a practical heuristic is to stay under 20–30% of your effective context window with injected memory. That leaves enough headroom for the actual task, codebase context, and Claude’s outputs. Most projects hit a good balance with 10–20 high-relevance memories per session rather than trying to inject everything.\n\n## Key Takeaways\n\n- Claude Code’s built-in memory is limited to a static `CLAUDE.md` file; there’s no automatic persistence between sessions.\n- A complete persistent memory layer needs three things: **storage** (where memories live), **injection** (loading relevant memories into context at session start), and **recall** (querying memories mid-session).\n
- Memarch handles structured storage with typed records and deduplication. Hermes handles context injection with relevance scoring. GBrain handles graph-based semantic recall via MCP.\n- These tools address different layers and can be combined — start with the layer that creates the most friction and add others as your memory store grows.\n- Common failure modes: storing too much noise, skipping the write step, and over-injecting into context. Keep memory lean and high-signal.\n\nIf you want persistent memory handled at the platform level — without managing your own storage infrastructure — [MindStudio](https://mindstudio.ai) builds it into the agent builder, with integrations that extend to external agents including Claude Code.", "url": "https://wpnews.pro/news/how-to-add-persistent-memory-to-claude-code-storage-injection-and-recall", "canonical_source": "https://www.mindstudio.ai/blog/how-to-add-persistent-memory-claude-code/", "published_at": "2026-06-24 00:00:00+00:00", "updated_at": "2026-06-24 14:13:49.591066+00:00", "lang": "en", "topics": ["large-language-models", "ai-tools", "developer-tools"], "entities": ["Claude Code", "Memarch", "Hermes", "GBrain", "Anthropic"], "alternates": {"html": "https://wpnews.pro/news/how-to-add-persistent-memory-to-claude-code-storage-injection-and-recall", "markdown": "https://wpnews.pro/news/how-to-add-persistent-memory-to-claude-code-storage-injection-and-recall.md", "text": "https://wpnews.pro/news/how-to-add-persistent-memory-to-claude-code-storage-injection-and-recall.txt", "jsonld": "https://wpnews.pro/news/how-to-add-persistent-memory-to-claude-code-storage-injection-and-recall.jsonld"}}