[ARTICLE · art-25164] src=dev.to pub= topic=ai-agents verified=true sentiment=↑ positive

How I Gave My AI Agent Persistent Memory Without Modifying Its Code

A developer built Memory Sidecar, an open-source tool that gives AI agents persistent memory across sessions without modifying their code. The system runs alongside agents like Claude Code and Cursor, archiving conversations and injecting relevant context before each new session. It uses a three-tier retrieval system to select only the most pertinent information rather than dumping everything into the prompt.

3 min read · published Jun 12, 2026

If you've ever worked with AI agents in production, you know the frustration: every new session starts from scratch. The agent has no memory of previous conversations, no context about ongoing projects, and you have to repeat yourself constantly. It's like Groundhog Day for your AI. I ran into this with a code assistant I was using for a multi-week refactoring project. It was great for one-off questions, but it couldn't remember what we discussed yesterday. I'd ask it about the architecture decisions we made last week, and it would stare at me blankly. I needed something that could carry context across sessions without forcing me to patch the agent's internals.

I looked at the usual suspects: vector databases for RAG, ad-hoc session dumping, even fine-tuning. Each had a cost. RAG setups are powerful but often require custom tooling and tight integration. Session logs without structure are just noise. Fine-tuning is expensive and slow to iterate on. What I wanted was a self-contained system that worked with any agent, required no code changes to the agent, and actually understood what to keep and what to forget.

That's when I found Memory Sidecar. It's an open-source project designed to run alongside any AI agent—Hermes, Claude Code, Cursor, Codex, or your own custom setup—as a separate process. It watches your agent's output, archives important conversations, builds a long-term knowledge base, and injects relevant context back before each new session. No patches, no invasive changes.
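To make the "separate process watching output" idea concrete, here is a minimal sketch of reading only the newly appended part of a session file. This is not the project's code: the function name `read_new_content`, the file name `session.log`, and the byte-offset bookkeeping are all assumptions chosen for illustration.

```python
import tempfile
from pathlib import Path


def read_new_content(path: Path, offsets: dict[str, int]) -> str:
    """Return text appended to `path` since the last call and record the new offset."""
    start = offsets.get(str(path), 0)
    with path.open("rb") as fh:
        fh.seek(start)          # skip everything already processed
        data = fh.read()
    offsets[str(path)] = start + len(data)
    return data.decode("utf-8", errors="replace")


with tempfile.TemporaryDirectory() as tmp:
    session = Path(tmp) / "session.log"  # hypothetical session file name
    offsets: dict[str, int] = {}

    session.write_text("user: plan the refactor\n", encoding="utf-8")
    first = read_new_content(session, offsets)

    # The agent appends more output later; only the new part is returned.
    with session.open("a", encoding="utf-8") as fh:
        fh.write("agent: start with auth\n")
    second = read_new_content(session, offsets)
```

Because the watcher only reads files the agent already writes, the agent itself never needs to know the sidecar exists.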

The architecture is simple on the surface but layered underneath. Agents write sessions to `state.db` and session files. The sidecar reads these, processes new content, and feeds it through a three-tier retrieval system.

These layers are queried during context injection. The system doesn't dump everything into the prompt; it selects what's most relevant from each tier. Recent context goes straight in, while older knowledge is surfaced only when the agent's current task relates to it.
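The selection rule can be sketched in a few lines. This is only an illustration under stated assumptions: the tier labels `"recent"` and `"older"`, the word-overlap `relevance` score, and the `threshold` value are hypothetical stand-ins, and the sketch collapses the tiers into two for brevity. The project's real scoring is not described here.

```python
from dataclasses import dataclass


@dataclass
class MemoryItem:
    text: str
    tier: str  # hypothetical labels: "recent" or "older"


def relevance(task: str, text: str) -> float:
    """Fraction of the task's words that also appear in the memory text."""
    task_words = set(task.lower().split())
    text_words = set(text.lower().split())
    if not task_words:
        return 0.0
    return len(task_words & text_words) / len(task_words)


def select_context(task: str, items: list[MemoryItem], threshold: float = 0.3) -> list[str]:
    """Recent items always pass; older items pass only if they relate to the task."""
    selected = []
    for item in items:
        if item.tier == "recent" or relevance(task, item.text) >= threshold:
            selected.append(item.text)
    return selected


items = [
    MemoryItem("Yesterday we renamed the session module", "recent"),
    MemoryItem("Decided to use token rotation for the authentication refactor", "older"),
    MemoryItem("Added a composite index for database queries", "older"),
]
context = select_context("continue the authentication refactor", items)
```

Here the database note is left out because it shares no words with the task, which is the behavior that keeps the prompt small.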

I'm running it with a local LLM agent for code review. The sidecar monitors session files, builds dossiers on topics like "authentication refactor" and "database indexing", and tracks discussions across multiple conversations. When I start a new session, it injects a concise summary of last week's work—no need to rehash decisions.
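A dossier can be pictured as notes from different sessions grouped under one topic. The sketch below assumes plain substring matching and a hypothetical `build_dossiers` helper; how the sidecar actually detects topics is not covered in this article.

```python
from collections import defaultdict


def build_dossiers(notes, topics):
    """Group (session_id, text) notes under every topic phrase they mention."""
    dossiers = defaultdict(list)
    for session_id, text in notes:
        lowered = text.lower()
        for topic in topics:
            if topic in lowered:
                dossiers[topic].append((session_id, text))
    return dict(dossiers)


notes = [
    ("s1", "Started the authentication refactor with token rotation"),
    ("s2", "Database indexing: added an index on created_at"),
    ("s3", "Authentication refactor now covers the admin routes"),
]
dossiers = build_dossiers(notes, ["authentication refactor", "database indexing"])
```

The authentication dossier ends up with entries from sessions `s1` and `s3`, which is the cross-conversation tracking described above.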

The project also includes practical tools: `memory_watermark.py` for automatic archival when memory grows too large, and `memory_snapshot_backup.py` for periodic snapshots. For multi-agent setups, `session_to_gbrain.py` syncs sessions into the knowledge graph, and `hindsight-service.py` runs the warm layer as an independent daemon. Full details are in the `HERMES_ONBOARDING.md` guide, which walks through connecting other agents.
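Watermark-based archival is a general pattern, and a minimal version looks like the sketch below. It is not taken from `memory_watermark.py`: the function name, the use of character counts as "size", and the two thresholds are assumptions made for the example.

```python
def archive_over_watermark(entries, high_watermark, low_watermark):
    """Return (kept, archived). Nothing moves until total size exceeds the high
    watermark; then the oldest entries are archived until the total is at or
    below the low watermark."""
    total = sum(len(e) for e in entries)
    if total <= high_watermark:
        return list(entries), []

    kept = list(entries)  # assumed to be ordered oldest first
    archived = []
    while kept and total > low_watermark:
        oldest = kept.pop(0)
        archived.append(oldest)
        total -= len(oldest)
    return kept, archived


entries = ["a" * 40, "b" * 40, "c" * 40]  # 120 characters in total
kept, archived = archive_over_watermark(entries, high_watermark=100, low_watermark=50)
```

Using two thresholds instead of one avoids archiving a single entry on every run once memory hovers near the limit.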

Memory Sidecar shines when your agent supports system prompt injection or tool-based context. Most modern coding agents do. If yours doesn't, you'll need a small bridge to pipe the context in. The project's MCP bridge (`hindsight_mcp_bridge.py`) is a good starting point.
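For agents that accept a system prompt, the simplest possible bridge just prepends a bounded memory section. The sketch below is a hypothetical illustration, not the API of `hindsight_mcp_bridge.py` and not MCP itself; `inject_memory`, the section heading, and the `max_chars` budget are all assumed names and values.

```python
def inject_memory(system_prompt: str, memories: list[str], max_chars: int = 500) -> str:
    """Prepend a size-limited list of remembered facts to a system prompt."""
    lines = []
    used = 0
    for memory in memories:
        line = f"- {memory}"
        if used + len(line) > max_chars:
            break  # stop before the memory section exceeds its budget
        lines.append(line)
        used += len(line)

    if not lines:
        return system_prompt  # nothing to inject, leave the prompt untouched

    block = "Context from previous sessions:\n" + "\n".join(lines)
    return f"{block}\n\n{system_prompt}"


prompt = inject_memory(
    "You are a code review assistant.",
    ["Auth refactor uses token rotation", "Index added on orders.created_at"],
)
```

The character budget matters more than it looks: without it, a long-running project would slowly crowd out the agent's actual instructions.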

It's not a vector store replacement for large-scale corpus search. It's purpose-built for session-level memory—keeping what an agent experienced across conversations. That's a narrower scope, but one that many of us working with interactive agents actually need.

The project is at v3.1.1 (MIT license) and the repo includes clear docs for setup and architecture. Clone it, point the sidecar to your agent's data directory, and run the service. There's a quickstart in the README and a more detailed ARCHITECTURE.md if you want to understand the internals.

Check it out on GitHub: [https://github.com/mage0535/hermes-memory-installer](https://github.com/mage0535/hermes-memory-installer)

If you've been fighting with context retention, give it a try and see if it solves the same pain it fixed for me.