Hermes Memory Providers: A Complete Breakdown for New Users

wpnews.pro

Hermes has a lot of memory options. If you're new, the choices can be overwhelming — built-in memory, 8 external providers, different costs, different architectures. This guide breaks it all down so you can make the right call for your setup.

Before we talk providers, understand that built-in memory is always on. It doesn't cost anything, requires no setup, and works out of the box.

Two files in ~/.hermes/memories/:

File	Purpose	Char Limit
MEMORY.md	Agent's notes: environment facts, project conventions, lessons learned	2,200 chars (~800 tokens)
USER.md	User profile: your name, preferences, communication style	1,375 chars (~500 tokens)

Both are injected into the system prompt at the start of every session. The agent manages them automatically — it saves preferences you correct, environment facts it discovers, and conventions it learns.

Key details: entries are separated by § delimiters, and usage is reported in a header such as MEMORY [67% — 1,474/2,200 chars].
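To see how close a file is to its cap, the usage header can be reproduced with a few lines of Python. This is a minimal sketch, not Hermes code: the function names are hypothetical, the limits come from the table above, and rounding the percentage down is an assumption.

```python
from pathlib import Path

# Character limits quoted in the table above.
LIMITS = {"MEMORY": 2200, "USER": 1375}


def usage_header(name: str, used: int) -> str:
    """Format a usage line in the style shown in the article."""
    limit = LIMITS[name]
    percent = used * 100 // limit  # integer percent, rounded down (assumption)
    return f"{name} [{percent}% — {used:,}/{limit:,} chars]"


def report(memory_dir: Path) -> list[str]:
    """Return one usage line per built-in file that exists in memory_dir."""
    lines = []
    for name in LIMITS:
        path = memory_dir / f"{name}.md"
        if path.exists():
            used = len(path.read_text(encoding="utf-8"))
            lines.append(usage_header(name, used))
    return lines


print(usage_header("MEMORY", 1474))
```

Calling report(Path.home() / ".hermes" / "memories") would list both files if they exist, which is a quick way to tell when built-in memory is running out of room.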

For most new users, built-in memory is enough. It handles preferences, project facts, and daily workflow notes. You don't need an external provider for a personal assistant setup.

But you'll want an external provider once you hit those limits.

All external providers are installed via:

hermes memory setup      # interactive picker
hermes memory status     # check what's active
hermes memory off        # disable

Or set it manually in ~/.hermes/config.yaml:

memory:
  provider: hindsight    # or any of the 8

Important: Only one external provider can be active at a time. All of them layer on top of built-in memory — they don't replace it.
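The layering rule can be sketched as a small check over a parsed config. This is an illustration only, not how Hermes validates its config: active_layers is a hypothetical helper, and of the provider names below only hindsight appears as a config value in this article; the other lowercase names are assumed to follow the same pattern.

```python
from typing import Optional

# Only "hindsight" is confirmed as a config value by the article;
# the remaining lowercase names are assumptions.
KNOWN_PROVIDERS = {
    "hindsight", "holographic", "openviking", "mem0",
    "honcho", "byterover", "retaindb", "supermemory",
}


def active_layers(config: dict) -> list[str]:
    """Return the memory layers in effect for a parsed config.yaml dict."""
    layers = ["built-in"]  # MEMORY.md + USER.md are always on
    provider: Optional[str] = (config.get("memory") or {}).get("provider")
    if provider is None:
        return layers
    if not isinstance(provider, str):
        raise ValueError("memory.provider must be a single name, not a list")
    if provider not in KNOWN_PROVIDERS:
        raise ValueError(f"unknown provider: {provider!r}")
    return layers + [provider]


print(active_layers({"memory": {"provider": "hindsight"}}))
```

Built-in memory is always the first layer, and at most one external provider is stacked on top of it.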

Provider	Storage	Cost	Unique Angle	Best For
Hindsight	Local/Cloud	Free (local)	Knowledge graph + reflect synthesis	Highest accuracy, privacy
Holographic	Local SQLite	Free	HRR algebra + trust scoring, zero deps	Air-gapped, zero-install
OpenViking	Self-hosted	Free (AGPL)	Tiered L0/L1/L2 context loading, 80-90% token savings	Self-hosted teams, cost optimization
Mem0	Cloud	Freemium	Server-side LLM extraction, dual memory scope	Fastest setup
Honcho	Cloud/Self	Paid (cloud) / Free (self-hosted)	Dialectic user modeling	Multi-agent, deep user understanding
ByteRover	Local/Cloud	Freemium	Knowledge tree in human-readable Markdown	Pre-compression knowledge capture
RetainDB	Cloud	Paid	Hybrid search: vector + BM25 + reranking	Production search quality
SuperMemory	Cloud	See supermemory.ai pricing	Web-focused memory with browser integration	Web research workflows

Only two providers have published LongMemEval scores:

Provider	Score	Model
Hindsight	91.4%	Gemini-3
Hindsight	89.0%	Open-source 120B
Mem0	67.6%	GPT-4o (LongMemEval-S variant)

Hindsight is the clear retrieval accuracy leader. Others haven't published comparable benchmarks.

Hindsight: the best all-around choice for most users who want local storage and accurate retrieval.

Stores structured knowledge (discrete facts, named entities, and relationships), not raw text chunks. Its unique hindsight_reflect tool periodically synthesizes higher-level insights across all memories. Think of it as the agent building a personal knowledge graph over time.

Setup:  hermes memory setup → select Hindsight
        Leave blank for local daemon, or set HINDSIGHT_API_KEY for cloud
Tools:  hindsight_recall, hindsight_retain, hindsight_reflect
Cost:   Free (local PostgreSQL daemon) / Cloud available for teams

Best if: You want the highest retrieval accuracy, need structured knowledge, or handle privacy-sensitive data.

Holographic: zero dependencies. Nothing leaves your machine. Literally two tools and done.

Uses Holographic Reduced Representations (HRR) — memories stored as superposed complex-valued vectors. Recall is algebraic, not similarity-based. A trust-scoring mechanism causes confirmed memories to gain weight and contradicted ones to decay over time.
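To make "superposed complex-valued vectors" concrete, here is a toy HRR in plain Python. It is a sketch of the general technique (the phasor variant, where binding is elementwise complex multiplication), not the Holographic provider's actual code; the dimension, function names, and sample facts are all made up for illustration. Unbinding is purely algebraic, but as in most HRR systems a final cleanup step compares the noisy result against known candidates.

```python
import cmath
import random

DIM = 512  # vector length; larger values give cleaner recall


def random_phasor(rng: random.Random) -> list[complex]:
    """A random vector of unit-magnitude complex numbers."""
    return [cmath.exp(1j * rng.uniform(-cmath.pi, cmath.pi)) for _ in range(DIM)]


def bind(a, b):
    """Bind key and value by elementwise complex multiplication."""
    return [x * y for x, y in zip(a, b)]


def unbind(trace, key):
    """Recover a noisy value by multiplying with the key's conjugate."""
    return [t * k.conjugate() for t, k in zip(trace, key)]


def superpose(*vectors):
    """Store several bound pairs in one vector by elementwise addition."""
    return [sum(parts) for parts in zip(*vectors)]


def similarity(a, b) -> float:
    """Real part of the normalised inner product (1.0 means identical)."""
    return sum((x * y.conjugate()).real for x, y in zip(a, b)) / DIM


rng = random.Random(0)
symbols = {name: random_phasor(rng)
           for name in ["editor", "shell", "vim", "zsh", "emacs"]}

# One trace holds two facts: editor=vim and shell=zsh.
trace = superpose(bind(symbols["editor"], symbols["vim"]),
                  bind(symbols["shell"], symbols["zsh"]))

# Query: what is bound to "editor"?
noisy = unbind(trace, symbols["editor"])
candidates = ["vim", "zsh", "emacs"]
best = max(candidates, key=lambda name: similarity(noisy, symbols[name]))
print(best)
```

The single trace vector stores both facts at once. Multiplying by the conjugate of the editor key cancels that key exactly, leaving the vim vector plus noise from the other pair, which the cleanup step filters out.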

Setup:  hermes memory setup → select Holographic. That's it. No API keys.
Tools:  2 tools (minimal by design)
Cost:   Free. Local SQLite. Period.

Best if: You're in an air-gapped environment, hate external dependencies, or want self-correcting memory that learns what's trustable.

OpenViking: the token-saver. A self-hosted context database from ByteDance.

Its filesystem-style hierarchy with tiered L0/L1/L2 context loading is the standout feature.

This means an 80-90% token cost reduction compared with sending full context every turn. It auto-extracts memories into 6 categories: profile, preferences, entities, events, cases, patterns.
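The arithmetic behind that kind of saving can be sketched with a toy three-tier store. Everything here is hypothetical: the Entry fields, the reading of L0/L1/L2 as abstract, overview, and full content, the word-count token estimate, and the sample data are assumptions for illustration, not OpenViking's API or real measurements.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    path: str
    l0: str  # shortest tier (assumed: one-line abstract)
    l1: str  # middle tier (assumed: overview)
    l2: str  # deepest tier (assumed: full content)


def rough_tokens(text: str) -> int:
    """Crude token estimate: one token per whitespace-separated word."""
    return len(text.split())


def build_context(entries: list[Entry], depth: dict[str, int]) -> str:
    """Load each entry at its requested tier, defaulting to the cheap L0."""
    parts = []
    for entry in entries:
        tiers = (entry.l0, entry.l1, entry.l2)
        parts.append(tiers[depth.get(entry.path, 0)])
    return "\n".join(parts)


entries = [
    Entry(f"notes/{i}", l0=f"note {i} summary", l1="word " * 20, l2="word " * 200)
    for i in range(10)
]

# Baseline: every entry sent in full on every turn.
full = rough_tokens("\n".join(e.l2 for e in entries))
# Tiered: one entry in full, one as an overview, the rest as abstracts.
tiered = rough_tokens(build_context(entries, {"notes/3": 2, "notes/4": 1}))
saving = 100 * (full - tiered) // full
print(full, tiered, saving)
```

With these made-up sizes the tiered context is a small fraction of the full one. The real saving depends on how often the agent needs to drill down to the deepest tier.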

Setup:  pip install openviking
        openviking-server
        hermes memory setup → select OpenViking
        Set OPENVIKING_ENDPOINT=http://localhost:1933
Tools:  viking_search, viking_read, viking_browse, viking_remember, viking_add_resource
Cost:   Free (AGPL-3.0, self-hosted)

Best if: You're running at scale, want self-hosted infrastructure, or need to minimize token costs.

Mem0: the "just make it work" option. 30 seconds to running.

Server-side LLM extraction means Mem0's infrastructure decides what to keep. Includes a circuit breaker so memory failures don't block agent responses. Dual memory scope (session + user) means it separates short-term context from long-term facts.
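The circuit-breaker idea is a general resilience pattern, and a minimal version looks like this. It is a sketch of the pattern rather than Mem0's or Hermes's implementation; the class, its thresholds, and the flaky_search function are hypothetical.

```python
import time


class CircuitBreaker:
    """Skip a flaky call for a cooldown period after repeated failures."""

    def __init__(self, max_failures: int = 3, cooldown_s: float = 30.0,
                 clock=time.monotonic):
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, fallback=None):
        """Run func unless the circuit is open; return fallback on failure."""
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.cooldown_s:
                return fallback  # still cooling down: do not touch the backend
            # Cooldown over: allow one trial call; one more failure reopens.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback
        self.failures = 0
        return result


def flaky_search(query):
    raise TimeoutError("memory backend unavailable")  # simulated outage


breaker = CircuitBreaker(max_failures=2, cooldown_s=60)
results = [breaker.call(flaky_search, "editor", fallback=[]) for _ in range(3)]
print(results, breaker.opened_at is not None)
```

After two failures the third call never reaches the backend; the agent gets an empty result immediately and can answer without memory instead of hanging.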

Setup:  hermes memory setup → select Mem0
        Set MEM0_API_KEY=your-key
Tools:  mem0_add, mem0_search, mem0_get_all
Cost:   Freemium (free tier available)

Best if: You want the fastest setup, don't want to self-host, and are okay with cloud storage. Good starting point — you can always migrate later.

Honcho: the philosopher. Builds a model of how you think, not just what you know.

Dialectic user modeling captures reasoning patterns, communication style, and decision-making tendencies over time. Two-layer context injection with configurable cadences for refreshes. Supports multi-agent setups with separate AI peers per Hermes profile.

Setup:  hermes memory setup → select Honcho
        Set HONCHO_API_KEY=your-key
Tools:  honcho_profile, honcho_search, honcho_context, honcho_reasoning, honcho_conclude
Cost:   Paid (cloud) / Free (self-hosted, AGPL-3.0)

⚠️ Licensing note: The open-source version is AGPL-3.0. If you modify it and let users interact with it over a network, the AGPL requires you to offer them the source of your modified version. Using the managed cloud avoids this.

Best if: You're building a personal assistant that should deepen its model of you over time, or running multi-agent systems with shared user context.

ByteRover: your knowledge, stored as readable Markdown. No black boxes.

A hierarchical knowledge tree is stored in .brv/context-tree/ as human-readable Markdown files. A unique pre-compression extraction hook fires before Hermes compresses long conversations, capturing knowledge before context gets summarized away.

Setup:  hermes memory setup → select ByteRover
Tools:  byterover_search, byterover_list, byterover_forget
Cost:   Freemium

Best if: You want full visibility into stored memory, or need to capture knowledge from long conversations before compression loses it.

RetainDB: the search nerd's pick. Hybrid vector + BM25 + reranking.

Combines multiple retrieval strategies for the highest-quality search results. Vector similarity catches semantic matches, BM25 catches exact keyword matches, and reranking puts the best results on top.
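The three stages can be sketched in plain Python. This is a toy illustration of hybrid retrieval in general, not RetainDB's pipeline: the BM25 formula is the standard Okapi variant, but the character-trigram "embedding" and the term-coverage reranker are simple stand-ins for a real embedding model and a real reranking model, and every function name is hypothetical.

```python
import math
from collections import Counter


def tokens(text: str) -> list[str]:
    return text.lower().split()


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Okapi BM25 score of each document for the query (exact keyword match)."""
    tokenised = [tokens(d) for d in docs]
    avg_len = sum(len(t) for t in tokenised) / len(docs)
    doc_freq = Counter(term for t in tokenised for term in set(t))
    scores = []
    for doc in tokenised:
        counts = Counter(doc)
        score = 0.0
        for term in tokens(query):
            tf = counts[term]
            if tf == 0:
                continue
            idf = math.log(1 + (len(docs) - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avg_len))
        scores.append(score)
    return scores


def trigram_vector(text: str) -> Counter:
    """Stand-in for an embedding: counts of character trigrams."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def hybrid_search(query: str, docs: list[str], top_k: int = 3, alpha: float = 0.5) -> list[str]:
    """Fuse normalised BM25 and vector scores, then rerank the top_k."""
    bm25 = bm25_scores(query, docs)
    vec = [cosine(trigram_vector(query), trigram_vector(d)) for d in docs]
    top_bm25 = max(bm25) or 1.0  # avoid dividing by zero when nothing matches
    fused = [alpha * s / top_bm25 + (1 - alpha) * v for s, v in zip(bm25, vec)]
    shortlist = sorted(range(len(docs)), key=lambda i: fused[i], reverse=True)[:top_k]
    # Rerank stand-in: prefer documents covering more distinct query terms.
    q_terms = set(tokens(query))

    def coverage(i: int) -> float:
        return len(q_terms & set(tokens(docs[i]))) / max(len(q_terms), 1)

    ranked = sorted(shortlist, key=lambda i: (coverage(i), fused[i]), reverse=True)
    return [docs[i] for i in ranked]


docs = [
    "user prefers vim keybindings",
    "deploy script lives in scripts/deploy.sh",
    "prefers dark themes in the editor",
    "project uses pytest for tests",
]
print(hybrid_search("vim preference", docs)[0])
```

In this example only BM25 sees the exact word "vim", while the fuzzy vector side picks up the overlap between "preference" and "prefers". Combining the two is what lets a hybrid system handle both kinds of query.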

Setup:  hermes memory setup → select RetainDB
Tools:  retaindb_search, retaindb_store
Cost:   Paid

Best if: Retrieval quality is your top priority and you're willing to pay for it.

SuperMemory: built for web research workflows. Browser-integrated memory.

Designed for memory that extends into the browser — captures and retrieves web content as part of your knowledge base.

Setup:  hermes memory setup → select SuperMemory
Cost:   See supermemory.ai pricing

Best if: Your workflow involves heavy web research and you want persistent memory of online content.

Tier	Providers	Notes
Free, local	Holographic, Hindsight (local), OpenViking	No API keys, no cloud. Holographic is the easiest pick.
Free tier / freemium	Mem0, ByteRover	Start free, pay for higher limits
Paid cloud	Honcho, RetainDB, SuperMemory	Production features, team support
Always free (built-in)	MEMORY.md + USER.md	No setup, always active, 2,200 + 1,375 char limits

Just getting started?

Stick with built-in memory. It covers 80% of use cases. Add an external provider only when you hit its limits.

Want the best free local experience?

Hindsight (local daemon). Best benchmarks, nothing leaves your machine, structured knowledge graph.

Want zero config?

Holographic. Pick it in hermes memory setup and you're done. No API keys, no servers.

Want the easiest cloud setup?

Mem0. 30 seconds, free tier, hands-off extraction.

Running multi-agent or want deep user modeling?

Honcho. The dialectic reasoning is genuinely different from every other provider.

Care about token costs at scale?

OpenViking's tiered context loading can save you 80-90% on tokens.

Switching is straightforward:

hermes memory setup      # pick new provider
hermes memory status     # confirm it's active

Your built-in memory (MEMORY.md, USER.md) stays intact regardless of which external provider you use. Note that external providers store data in their own backends — switching providers means starting fresh with the new one's knowledge base. There's no automated migration between providers yet.

Questions? Drop them in the comments. I'm happy to help you pick the right setup for your use case.

source & further reading

dev.to (original article)
