Show HN: PMB – local memory for coding agents that shows if it is used

wpnews.pro

Decisions, lessons and project facts live in one SQLite file you own. Fed back to Claude Code, Cursor, Codex and Zed through MCP. Offline, no API keys, no cloud, recall in ~35 ms.

pip install pmb-ai

100% on your machine · No API keys · No cloud, no telemetry

Apache-2.0, open source

Memory that doesn't wait to be asked #

Hooks inject the right memory before the model thinks and journal the agent's work afterward. There is no LLM call on the read path and no tool the agent has to remember to call.

1· Any agent records

2· Surfaced before it answers

Auto-recall on every prompt

Every message is classified in under a millisecond; the matching lessons, decisions and project overview are fetched for the agent before it reasons.

Sub-millisecond async writes

The MCP tool returns instantly. SQLite first; the embed and LanceDB vector insert run on a background thread, never blocking the turn.
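
The write path described above can be sketched roughly as follows. This is a minimal illustration, not PMB's code: the `AsyncMemoryWriter` class, the `events` table and the `index_fn` callback (standing in for the embed and LanceDB insert) are all hypothetical names.

```python
import queue
import sqlite3
import threading


class AsyncMemoryWriter:
    """Hypothetical writer: durable SQLite insert now, slow indexing later."""

    def __init__(self, db_path=":memory:", index_fn=None):
        # check_same_thread=False allows calls from several threads; a lock guards them.
        self._db = sqlite3.connect(db_path, check_same_thread=False)
        self._lock = threading.Lock()
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS events "
            "(id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
        )
        # index_fn stands in for the embedding + vector insert (assumption).
        self._index_fn = index_fn or (lambda event_id, body: None)
        self._pending = queue.Queue()
        threading.Thread(target=self._worker, daemon=True).start()

    def write(self, body):
        """Commit to SQLite, queue the slow work, return immediately."""
        with self._lock:
            cur = self._db.execute("INSERT INTO events (body) VALUES (?)", (body,))
            self._db.commit()
            event_id = cur.lastrowid
        self._pending.put((event_id, body))
        return event_id

    def _worker(self):
        # Runs on the background thread, so the caller's turn is never blocked.
        while True:
            event_id, body = self._pending.get()
            try:
                self._index_fn(event_id, body)
            except Exception:
                pass  # a real system would log and retry
            finally:
                self._pending.task_done()

    def flush(self):
        """Block until queued indexing work is done (useful in tests)."""
        self._pending.join()
```

The point of the split is that the caller only pays for the SQLite commit; anything slow happens after `write` has already returned.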

Hybrid recall, ranked

BM25 + dense vectors + entity graph + optional rerank, fused with Reciprocal-Rank-Fusion. One call returns the right thing, ranked.
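
For readers unfamiliar with Reciprocal-Rank-Fusion, here is a minimal sketch of the standard formula, where each item scores the sum of `1 / (k + rank)` over the lists it appears in. The constant `k=60` is the commonly used default, not a value confirmed for PMB, and the memory IDs are made up.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one list, best first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending; break ties by ID for a stable result.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical outputs of three retrievers for one query.
bm25 = ["m3", "m1", "m7"]
dense = ["m1", "m9", "m3"]
graph = ["m1", "m3"]

fused = reciprocal_rank_fusion([bm25, dense, graph])
print(fused)  # ['m1', 'm3', 'm9', 'm7']
```

RRF only needs ranks, not raw scores, which is why it can combine BM25, vector similarity and graph hits without normalizing them onto a shared scale.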

Lessons that earn their place

Every rule is scored by whether the agent actually follows it. Useful ones get starred; ignored ones are flagged dead, so you prune what doesn't help.

Your memory, as a graph you can explore #

Every fact, decision, lesson, file and entity becomes a node, color-coded by type, sized by importance. Hover one to dim the rest, light up its neighbors, and read the full memory chunk.

Every decision, lesson and commit, newest first #

One lane per project, nodes color-coded by event type, connected by soft curves. The same journal that ships in the dashboard, written automatically as you work.

This is the actual dashboard #

A local web app served from your machine. The Map and Timeline above are live recreations; here is the real thing, rendering one project's memory.

The Map · 65,005 connections across 149 clusters, color-coded by kind

What changes when your agent remembers #

Not features, outcomes. This is what persistent memory actually does to your day.

Stop re-explaining your project

Every session starts already knowing your decisions, conventions and the bug you hit last Tuesday. No more pasting the same context into a fresh chat.

Switch tools without losing context

Claude Code, Cursor, Codex and Zed all read the same memory. Your context follows you, not your editor, so changing agents costs nothing.

Memory you can actually trust

PMB scores whether each lesson gets followed and flags the dead ones. It tells you when a memory isn't helping, so your context stays honest, not bloated.

Seven commands, then just talk to your agent #

No account, no keys, nothing leaves your machine. Inspect everything from the terminal, or open the dashboard.

35 ms hybrid recall

One command wires your agent to MCP #

Everything runs over stdio: the server is a child process of your agent. No network, no port, no token.

Claude Code

Rules appended to your agent's config automatically
Point several agents at one shared workspace
Verify the wiring with pmb doctor
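
To make the stdio wiring above concrete, here is a toy sketch of a parent process talking to a child over stdin/stdout with newline-delimited JSON-RPC 2.0 messages, which is the shape MCP's stdio transport uses. The child here is a hypothetical echo script, not the PMB server, and no MCP handshake is performed.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for a stdio server: answers each request line with one response line.
CHILD = r'''
import json, sys
for line in sys.stdin:
    req = json.loads(line)
    resp = {"jsonrpc": "2.0", "id": req["id"], "result": {"echo": req["method"]}}
    sys.stdout.write(json.dumps(resp) + "\n")
    sys.stdout.flush()
'''


def call_child(method, params=None):
    """Spawn the child, send one request over its stdin, read one reply from its stdout."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        request = {"jsonrpc": "2.0", "id": 1, "method": method, "params": params or {}}
        proc.stdin.write(json.dumps(request) + "\n")
        proc.stdin.flush()
        return json.loads(proc.stdout.readline())
    finally:
        proc.stdin.close()  # EOF lets the child exit its read loop
        proc.wait(timeout=5)
```

Because the only channel is a pair of pipes owned by the parent, there is no listening socket for anything else on the machine to connect to.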

Bring your own model, or run it offline #

PMB never calls an LLM on the read path. The optional summarize and graph-extract passes run on whatever you point them at, including a fully local Ollama. Your memory stays yours.

Running in 60 seconds #

Three commands, no account, no config. Then just work the way you already do.

Install

One pip install. Pure Python, runs on macOS, Linux and Windows.

pip install pmb-ai

Connect your agent

Wires PMB into your agent over MCP. Swap in cursor, codex, zed, and more.

pmb connect claude-code

Just talk to it

Work as usual; PMB records and recalls automatically. Open the dashboard any time to explore.

pmb dashboard

Files on your disk, all the way down #

Every event lives in SQLite; vectors live in LanceDB next to it. Copy them anywhere with cp. No server to trust.
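
Since the store is an ordinary SQLite file, Python's standard library is enough to look inside it or take a copy. The sketch below assumes nothing about PMB's schema or file location: the paths are placeholders you supply, and `snapshot` uses SQLite's online backup API as a cautious alternative to `cp` for the case where another process may be writing to the file.

```python
import sqlite3


def list_tables(path):
    """Return the table names in any SQLite file, sorted by name."""
    con = sqlite3.connect(path)
    try:
        rows = con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        con.close()
    return [name for (name,) in rows]


def snapshot(src_path, dst_path):
    """Copy a SQLite database to dst_path using the online backup API."""
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        src.backup(dst)  # yields a consistent copy even during writes
    finally:
        dst.close()
        src.close()
```

The LanceDB directory that sits next to the SQLite file is not covered by this sketch and would need to be copied separately.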

Fast, local, and honest about it #

Every number here is measured on PMB's own engine and reproducible from the repo. No cloud, no LLM in the read path, no per-query cost.

Retrieval quality (recall@k)

MRR 0.774 · nDCG@10 0.816

LoCoMo-10 · 997 questions · no LLM grader · cache off
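
For reference, MRR and nDCG@10 are usually computed along these lines. This sketch assumes binary relevance (a result is either a correct answer source or not); the source does not say exactly how PMB's benchmark harness scores results, so treat it as the textbook definitions rather than the repo's code.

```python
import math


def reciprocal_rank(ranked, relevant):
    """1 / position of the first relevant result, or 0.0 if none appears."""
    for position, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / position
    return 0.0


def ndcg_at_k(ranked, relevant, k=10):
    """Binary-relevance nDCG: discounted gain divided by the ideal ordering's gain."""
    dcg = sum(
        1.0 / math.log2(position + 1)
        for position, doc_id in enumerate(ranked[:k], start=1)
        if doc_id in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(position + 1) for position in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


def mean_over_queries(metric, runs):
    """Average a per-query metric over (ranked, relevant) pairs; MRR is this with reciprocal_rank."""
    values = [metric(ranked, relevant) for ranked, relevant in runs]
    return sum(values) / len(values) if values else 0.0
```

Both metrics reward putting the right memory near the top, which matters more than raw recall when only the first few results are injected into a prompt.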

Recall latency vs memory size (p50 / p95)

Warm daemon, cache off, local CPU. Real ~100-memory workspace: p50 24 ms. Cached: ~0.15 ms.
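
If you want to check p50 / p95 figures like these against your own workspace, a small timing harness is enough. The sketch below is generic: `fn` is whatever recall call you want to measure (a placeholder here), and it is not the benchmark script from the repo.

```python
import statistics
import time


def time_calls(fn, runs=200):
    """Call fn repeatedly and return per-call wall-clock latencies in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def p50_p95(samples_ms):
    """Return the (p50, p95) latency from a list of samples."""
    cuts = statistics.quantiles(samples_ms, n=100)  # 99 cut points
    return cuts[49], cuts[94]


if __name__ == "__main__":
    # Placeholder workload; swap in a real recall call to measure it.
    p50, p95 = p50_p95(time_calls(lambda: sorted(range(10_000))))
    print(f"p50 {p50:.3f} ms, p95 {p95:.3f} ms")
```

Discard the first few calls or run a warm-up pass if you want numbers comparable to a warm daemon.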

It tells you when a memory isn't helping #

Every lesson carries a surface_id. PMB tracks whether the agent actually followed it, either confirmed explicitly or auto-detected from activity. Rules that get ignored are flagged dead. The ones that earn their place are starred. No vanity metrics.
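
One way to picture this scoring is a per-lesson counter of how often it was surfaced versus followed. The sketch below is illustrative only: `LessonStats`, the minimum sample size and the star/dead thresholds are assumptions, not PMB's actual rules.

```python
from dataclasses import dataclass


@dataclass
class LessonStats:
    """Hypothetical follow-rate tracker for one lesson, keyed by its surface_id."""

    surface_id: str
    surfaced: int = 0
    followed: int = 0

    def record(self, was_followed: bool) -> None:
        """Log one occasion on which the lesson was shown to the agent."""
        self.surfaced += 1
        if was_followed:
            self.followed += 1

    def status(self, min_surfaced=5, star_at=0.8, dead_at=0.2) -> str:
        """Classify the lesson; all thresholds are illustrative assumptions."""
        if self.surfaced < min_surfaced:
            return "unrated"  # not enough evidence yet
        rate = self.followed / self.surfaced
        if rate >= star_at:
            return "starred"
        if rate <= dead_at:
            return "dead"
        return "active"


lesson = LessonStats(surface_id="lesson-001")
for outcome in (True, True, True, True, False):
    lesson.record(outcome)
print(lesson.status())  # starred (4 of 5 followed)
```

Requiring a minimum number of observations before rating avoids flagging a rule as dead after a single ignored suggestion.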

Built on boring, durable pieces #

No exotic infrastructure. Local files and well-worn libraries, the kind you can still open in five years.

It tends itself #

A year in, recall is still sharp. Memory decays, archives, and dedupes on its own, and never deletes anything behind your back.
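
A decay-then-archive pass of this kind can be sketched with an exponential half-life. Everything specific here is an assumption for illustration: the 30-day half-life, the archive threshold and the memory dictionary fields are not taken from PMB, and deduplication is left out.

```python
def decayed_score(importance, age_days, half_life_days=30.0):
    """Halve a memory's weight every half_life_days (assumed decay curve)."""
    return importance * 0.5 ** (age_days / half_life_days)


def tend(memories, now_day, archive_below=0.1):
    """Split memories into (active, archived); nothing is ever dropped."""
    active, archived = [], []
    for memory in memories:
        age = now_day - memory["created_day"]
        score = decayed_score(memory["importance"], age)
        (active if score >= archive_below else archived).append(memory)
    return active, archived


# Hypothetical memories, with creation time expressed as a day number.
memories = [
    {"id": "old-decision", "importance": 1.0, "created_day": 0},
    {"id": "recent-lesson", "importance": 1.0, "created_day": 170},
]
active, archived = tend(memories, now_day=180)
print([m["id"] for m in active])    # ['recent-lesson']
print([m["id"] for m in archived])  # ['old-decision']
```

Archived items stay in the returned list rather than being deleted, mirroring the claim that nothing disappears behind your back.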

Lifecycle diagram (labels: Write, Active, Read, Decay, Compact, Archived, You, Daemon). Memory flows left to right and tends itself.

Straight answers #

Does my code or data ever leave my machine? #

No. Everything lives in a local SQLite file with vectors in LanceDB right next to it. There are no network calls on the read path, no account and no telemetry, ever. Unplug the internet and it still works.

How is this different from RAG or a vector database? #

Two ways. Recall is hybrid: BM25 plus dense vectors plus an entity graph, fused and ranked. And it's automatic: the right memory is injected before the model thinks. You don't build a pipeline or hope the agent remembers to call a tool.

Will it slow my agent down? #

No. Recall lands in about 35 ms and writes return in under a millisecond. The embedding and vector insert happen on a background thread, so the turn is never blocked.

Which agents and operating systems are supported? #

Any MCP-aware agent: Claude Code, Cursor, Codex, Zed, Windsurf and more, wired in with one command. PMB is pure Python and tested on macOS, Linux and Windows.

What if a memory is wrong or unhelpful? #

PMB scores whether each lesson actually gets followed and flags the dead ones so you can prune them. It's the rare tool that tells you when its own memory isn't earning its place.

Is it really free? #

Yes. Apache-2.0, open source, free forever. No paid tier, no seats, no telemetry. You own the file and the code.

source & further reading

pmbai.dev — original article