# We Tried 6 Memory Providers for Hermes Agent — Here's What We Learned

> Source: <https://dev.to/mariatanbobo/we-tried-6-memory-providers-for-hermes-agent-heres-what-we-learned-5ehm>
> Published: 2026-05-27 00:05:09+00:00

Giving an AI agent persistent memory sounds simple. Store facts. Recall them later. How hard can it be?

Three weeks and six providers later, I have opinions.

This is the story of what broke, what we discarded, and the one thing that finally worked — and why.

I run [Hermes Agent](https://github.com/nousresearch/hermes-agent) on a headless VPS with 4GB RAM. Nothing exotic. The goal was straightforward: the agent should remember things across sessions — my preferences, environment details, lessons learned — without me repeating myself every conversation.

Hermes ships with several bundled memory providers and supports third-party ones via plugins. Should be plug-and-play, right?

AgentMemory was the first provider we had: a Node.js runtime, a Docker container for the iii-engine, and 860 memories at peak. It *seemed* fine.

Then we switched to a different provider to try it out. AgentMemory's ingestion died instantly — but nothing told us. Tools responded normally. No errors in logs. Just… nothing was being stored anymore.

**Root cause:** Hermes supports exactly one active memory provider. The switch disabled AgentMemory's `sync_turn()` without a warning. The deadliest failure mode: total silence.

Tried as a replacement. Same silent failure. MCP tools responded "OK" but ingestion was completely dead. We never stored a single memory. Uninstalled alongside AgentMemory in the same cleanup session.

**Lesson #1:** A memory provider that fails silently is worse than no provider at all. False confidence corrupts everything.
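One way to catch this kind of silent failure is a write-then-read canary: store a unique marker, then check that it comes back. The sketch below is a minimal illustration, not part of Hermes or any provider. The `MemoryStore` interface, `ingestion_is_alive`, and both stand-in stores are hypothetical names invented for the example.

```python
from __future__ import annotations

import uuid
from typing import Protocol


class MemoryStore(Protocol):
    """Hypothetical interface; real providers expose different calls."""

    def store(self, text: str) -> None: ...
    def search(self, query: str) -> list[str]: ...


def ingestion_is_alive(memory: MemoryStore) -> bool:
    """Write a unique canary fact, then confirm it can be read back."""
    canary = f"canary-{uuid.uuid4().hex}"
    memory.store(canary)
    return any(canary in hit for hit in memory.search(canary))


class WorkingStore:
    """In-memory stand-in that really keeps what it is given."""

    def __init__(self) -> None:
        self._items: list[str] = []

    def store(self, text: str) -> None:
        self._items.append(text)

    def search(self, query: str) -> list[str]:
        return [item for item in self._items if query in item]


class SilentlyDeadStore(WorkingStore):
    """Mimics the failure above: accepts writes, persists nothing."""

    def store(self, text: str) -> None:
        pass  # no error, no log line, no data


if __name__ == "__main__":
    for candidate in (WorkingStore(), SilentlyDeadStore()):
        status = "alive" if ingestion_is_alive(candidate) else "DEAD"
        print(f"{type(candidate).__name__}: {status}")
```

Running a check like this after every provider switch would have surfaced the dead ingestion path immediately instead of days later.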

Hindsight looked promising on paper. Bundled with Hermes. 91.4% on the LongMemEval benchmark. Knowledge graphs, reflect synthesis: the "power pick."

Reality:

- […] (`hindsight-all` vs `hindsight-client`)
- […] (`pg0`) tried to download itself and hung for 177 seconds

Breaking the cycle required stopping the gateway, hunting processes with `pkill -9`, and restarting. A hard kill. For a memory plugin.

**Lesson #2:** If uninstallation requires killing processes by force, the architecture is wrong. A memory provider's lifecycle should not require a process manager.

At this point we had criteria. Real criteria, earned through pain:

We surveyed what was available:

| Provider | Verdict | Killer Flaw |
|---|---|---|
| Holographic (bundled) | Too simple | `sync_turn()` is a no-op, so no auto-ingestion |
| Supermemory (bundled) | Cloud-only | All cloud. Best benchmarks, but contradicts local-first |
| Mem0 | Double token burn | LLM-Embedded: the agent calls an LLM, Mem0 calls its OWN LLM for fact extraction. Pay twice. |
| MemPalace | Wrong platform | 96.6% LongMemEval, but built for Claude Code, not Hermes |

Mnemosyne, by [AxDSan](https://github.com/AxDSan), was posted directly to r/hermesagent by its author. The README literally says: *"The Zero-Dependency, Sub-Millisecond AI Memory System for Hermes Agents."*

What makes it different:

**In-process Python + SQLite.** No separate service. No Docker. No daemon. If the gateway process runs, memory works. There is nothing to fall out of sync *with*.
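To make the "nothing to fall out of sync with" point concrete, here is a minimal sketch of an in-process store built only on Python's standard `sqlite3` module. It is not Mnemosyne's code: the class name `SqliteMemory`, the `recall()` method, and the table layout are assumptions for illustration, and a plain keyword match stands in for semantic search.

```python
from __future__ import annotations

import sqlite3
import time


class SqliteMemory:
    """Tiny in-process store: one SQLite database, no daemon, no container."""

    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS memories ("
            "id INTEGER PRIMARY KEY, created REAL NOT NULL, text TEXT NOT NULL)"
        )

    def remember(self, text: str) -> int:
        """Persist one fact and return its row id."""
        with self._db:  # commits on success, rolls back on error
            cur = self._db.execute(
                "INSERT INTO memories (created, text) VALUES (?, ?)",
                (time.time(), text),
            )
        return cur.lastrowid

    def recall(self, keyword: str, limit: int = 5) -> list[str]:
        """Return the most recent facts containing the keyword."""
        rows = self._db.execute(
            "SELECT text FROM memories WHERE text LIKE ? ORDER BY id DESC LIMIT ?",
            (f"%{keyword}%", limit),
        )
        return [row[0] for row in rows]

    def close(self) -> None:
        self._db.close()


if __name__ == "__main__":
    memory = SqliteMemory()
    memory.remember("User prefers dark mode")
    memory.remember("VPS has 4GB RAM")
    print(memory.recall("4GB"))
    memory.close()
```

Because the database handle lives inside the same process as the caller, there is no second lifecycle to start, stop, or monitor. If the process is up, the store is up.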

**Sub-millisecond reads.** 0.076ms. 500x faster than the previous-generation providers. You don't feel it.
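The 0.076ms figure is the one reported for Mnemosyne. The sketch below does not reproduce that benchmark. It only shows, under simple assumptions, how you might measure in-process read latency yourself: a primary-key lookup against an in-memory SQLite table, timed with `time.perf_counter`. The function name `median_read_ms` and the table layout are hypothetical.

```python
import sqlite3
import statistics
import time


def median_read_ms(rows: int = 1000, samples: int = 200) -> float:
    """Median latency, in milliseconds, of a primary-key lookup in SQLite."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, text TEXT)")
    db.executemany(
        "INSERT INTO memories (id, text) VALUES (?, ?)",
        ((i, f"fact {i}") for i in range(rows)),
    )
    timings = []
    for i in range(samples):
        start = time.perf_counter()
        db.execute("SELECT text FROM memories WHERE id = ?", (i % rows,)).fetchone()
        timings.append((time.perf_counter() - start) * 1000)
    db.close()
    return statistics.median(timings)


if __name__ == "__main__":
    print(f"median read: {median_read_ms():.3f} ms")
```

Results depend heavily on hardware, table size, and query shape, so treat any number this prints as a local data point rather than a comparison against another provider.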

**Three code paths, all verified working:**

- […] `remember()` when asked
- […] `sync_turn` captures every conversation turn automatically

**Installation took three commands:**

```bash
# Quote the extra so shells like zsh do not treat [embeddings] as a glob pattern
pip install "mnemosyne-memory[embeddings]"
python -m mnemosyne.install
hermes memory setup  # interactive picker → select "mnemosyne"
```

No `[all]`, which pulls ctransformers and downloads 1–4GB of GGUF models. On a 4GB machine, that's OOM territory. The `[embeddings]` extra adds fastembed (133MB ONNX model) for semantic search, and LLM consolidation routes through your existing API key.

**After three weeks of operation:**

Every failed provider shared one architectural decision: **an external runtime with its own lifecycle.**

AgentMemory's Node.js Docker. Hindsight's pg0 Postgres + daemon. When the runtime and the gateway fell out of sync — silent failure, ghost processes, respawn loops.

Mnemosyne's in-process Python + SQLite avoids this entirely. It's the simplest thing that could possibly work — and that turns out to be the hardest thing to get right, because every other provider ships complexity as a feature.

`[all]` is a trap.

*I'm @MariaTanBoBo on X. This article was written with Hermes Agent and published via the DEV.to API. Yes, an AI agent can publish articles now. The future is weird.*