{"slug": "show-hn-fernme-agent-memory-that-updates-with-zero-llm-calls", "title": "Show HN: FERNme – agent memory that updates with ~zero LLM calls", "summary": "FERNme, a new user-owned memory layer for AI agents, updates memories with zero LLM calls using a Hebbian co-occurrence rule, keeping token costs flat and enabling users to see, edit, and own their data. The system achieves a 16% conversion lift in simulated storefronts and is designed to work across websites, desktop, and mobile, with privacy features including tamper-evident logs and provable right-to-be-forgotten.", "body_md": "**A user-owned, near-zero-LLM memory layer for AI agents. It learns each person from their behavior — including how they talk and feel — stays token-flat forever, and lets people see, edit, and own what's remembered. The engine is substrate-agnostic: it remembers wherever an agent acts — websites today (shopping, support, booking, healthcare, tutoring, gov), desktop and mobile next.**\n\n*Cheap to write · flat to read · interpretable by design · owned by the user*\n\nMost agent memory is **written by an LLM on every turn** (expensive, hallucination-prone), **evaluated on question-answering** (not actions), and **assumes a single user**. FERNme is built for the opposite world — agents that *act* for *many* people, in any domain (a sale, a booking, a resolved ticket, a completed lesson, a kept appointment — \"outcome\" is whatever the goal is). It starts where agents already act today — websites — and the same user-owned memory is designed to extend to desktop and mobile. Each user is a sparse, fuzzily-weighted node in a per-site graph; edges update by a **Hebbian co-occurrence rule with zero LLM calls**, retrieval is **spreading activation**, and the prompt-facing \"card\" stores only **deviations from a population prior**. 
The result: per-turn cost stays flat as a profile grows for years, the user can read and correct their own memory, and the same engine assembles — only with the user's consent — into a cross-site **supernode** they fully control.\n\n| Feature | What it means |\n|---|---|\n| 🪶 Zero-LLM writes | Memory updates are arithmetic on a graph — 0 LLM calls per interaction vs. ~2 for extraction-based memory. No write-time cost, no write-time hallucination. |\n| 📉 Flat token cost forever | The prompt card holds ~25 tokens whether it's a visitor's first day or fifth year. A full-history baseline is 77× larger by 120 interactions. |\n| 🧠 Strong in every regime | Ties a frequency counter on static recall, beats it on drift (0.72 vs. 0.13), and wins on context (0.62 vs. 0.51). Decay + spreading activation unify stability and adaptivity. |\n| 🪟 Glass-box & user-owned | Every preference is visible and editable. People fix what's wrong, delete everything, or export it. Privacy becomes a feature, not a liability. |\n| 🏬 Built for outcomes | Evaluated by conversion, not QA. A simulated storefront shows +16% conversion lift vs. non-personalized recommendations. |\n| 🧩 User-owned supernode | Sign in across sites → your memories assemble like Lego into one profile you control, default-deny, sensitive data walled off. Not surveillance — the mirror image of it. |\n| 🎚 Cost/quality dial | One engine, a `memory_mode` switch: free key-less `pure` by default, opt-in `gated`/`offline` LLM enrichment when you need Mem0-grade nuance — pay only for the compute you use. |\n| 🔐 Verifiable & unlearnable | Every action is logged in a tamper-evident HMAC chain the user can replay to detect any alteration; `forget_everywhere` wipes the profile and unlearns the person from the population prior — provable right-to-be-forgotten. |\n| 🛡 Injection-proof by design | Writes are arithmetic, not LLM extraction, so page/user text can't be \"talked into\" becoming a belief — tests confirm that injected instructions never enter memory. 
|\n| 🧠 Private collective intelligence | New users benefit from crowd patterns on turn one (cold-start from a population prior), with k-anonymity + differential privacy so no individual leaks. A network-effect moat single-user memories can't have. |\n| 🗣 Style & mood memory | Learns how each person communicates (terse/verbose, formal/casual, energy) and tracks their mood with trend detection, so the agent can match tone and notice when someone's frustration is rising — in any domain. |\n| 🎯 Outcome-learning, any goal | Memory is reinforced by results — not just recall. `record_outcome(success)` strengthens what worked and weakens what backfired, where \"success\" is any goal (purchase, booking, resolved ticket, completed lesson…). |\n| 🔍 Explainable | Ask `why(user, attr)` — get the evidence (observations + good/bad outcomes + dates). No black box. |\n| 🔌 Deployable plumbing (research preview; harden per SECURITY.md) | SQLite or Postgres (tested on real PG 16), REST + MCP servers, consent gating, injection-safe writes, proactive triggers — all tested. |\n\nHonest scope: the numbers below are on synthetic or LLM-authored data, not real users. They validate the mechanism and surface failures; a real-human pilot is the pending next step. The Mem0 (LLM) head-to-head needs an API key and is not yet run.\n\nA sample of 16 of 92 third-person profiles (ChatGPT-authored), read as **prose only** and\nremembered agentically, then scored against hidden answer keys:\n\n| metric | result |\n|---|---|\n| preference coverage vs. hidden key | 75% |\n| communication style — formality | 100% |\n| mood sign / mood arc | 94% / 100% |\n| preference drift detected | 94% |\n| injection attempts ignored | 100% |\n| note → card compression | 7.3× |\n\n*(The \"agent\" here is an LLM reading prose, so these reflect agent + engine together — the\nengine is solid; the extraction quality is the agent's.)*\n\nReproduce:\n\n`python -m fernme.eval.cost_variance`\n\n·`... quality`\n\n·`... drift`\n\n·`... 
context`\n\n·`... ablation`\n\n·`... pilot`\n\n**Cost** — per-turn memory tokens vs. profile size (5 seeds):\n\n| metric | FERNme | baseline |\n|---|---|---|\n| card size | 24.9 ± 0.5 tokens (flat) | full history grows linearly |\n| at 120 interactions | 1× | 77× ± 1.3 larger |\n| LLM calls per write | 0 | ~2 (extraction memory) |\n\n**Recall quality** — precision@5 vs. ground-truth preferences (5 seeds × 40 users):\n\n| regime | 🌿 FERNme | frequency | recency |\n|---|---|---|---|\n| static recall | 0.74 | 0.74 | 0.47 |\n| drift (taste shifts) | 0.72 ✅ | 0.13 ❌ | 0.59 |\n| context (precision@3) | 0.62 ✅ | 0.51 (blind) | — |\n\nThe headline: FERNme is the only method strong everywhere. Frequency can't forget (fails drift); recency is noisy (fails static). FERNme's decay + spreading activation get both.\n\n**Cold-start ablation** — population prior gives **+0.06 precision@5 at turns 1–3**, washing out by turn 10 (a real but modest, cold-start-only benefit).\n\n**Cost / quality Pareto** (`python -m fernme.eval.pareto`) — measured FERNme recall & tokens, modeled LLM nuance & price (assumptions in-file). 
Per 1,000 interactions:\n\n| strategy | quality | $/1k | vs Mem0 |\n|---|---|---|---|\n| FERNme-pure | 0.52 | $0.008 | 122× cheaper |\n| FERNme+gated | 0.66 | $0.023 | 42× cheaper |\n| FERNme+offline | 0.73 | $0.104 | 9× cheaper |\n| full-history@120 | 0.82 | $0.59 (grows) | — |\n| Mem0-style | 0.82 | $0.95 | 1× |\n\nFERNme+gated/offline sit on the efficient knee: **~80–90% of the LLM-ceiling quality\nat 1–2 orders of magnitude less cost.** (Modeled assumptions; shape is the point.)\n\n**Simulated outcome pilot** — fake storefront, learn-from-behavior shoppers: **+16% relative conversion lift** over a popularity baseline; tied at visit 1 (cold start), pulling ahead as it learns, recovering through a mid-pilot taste drift.\n\nFERNme ships **one core** with a deployment-level switch — `FernService(memory_mode=...)`.\nThe default is free, key-less, and tested; LLM modes are opt-in and pluggable.\n\n| mode | LLM use | cost | status |\n|---|---|---|---|\n| `pure` (default) | none | cheapest, flat | ✅ tested, key-less |\n| `gated` | one small call only on novel free-text | ~tiny | 🧪 experimental — needs a model |\n| `offline` | batched `consolidate()` enrichment, off the hot path | ~tiny, amortized | 🧪 experimental — needs a model |\n\n- A **pluggable tagger** (`tagging.py`) does the LLM work; you pass `llm_fn`, optionally constrained to a **controlled vocabulary** (the real consistency lever across models).\n- The hot write path stays **LLM-free in every mode**; gated spends a call only when the deterministic mapping finds nothing, and `svc.llm_calls` counts every invocation for cost transparency.\n
- See the cost/quality Pareto above for where each mode lands.\n\n*Honest note:* the gated/offline quality is **modeled** until run against a real model — the wiring is tested here with a mock LLM, not validated for quality.\n\nFERNme's edge isn't the mechanism (that's now a crowded 2026 category) — it's competing\non dimensions single-user, vendor-owned, recall-optimized systems **structurally can't**.\n\n| # | Dimension | Status |\n|---|---|---|\n| 9 | Communication-style & mood memory | ✅ built + tested |\n| 2 | Outcome-learning for any goal (reinforce on results) | ✅ built + tested |\n| 8 | Explainable provenance (`why`) | ✅ built + tested |\n| 1 | Private collective priors (network-effect cold-start; k-anonymity + bounded-mean DP) | ✅ built + tested |\n| 4 | Verifiable, cryptographic data ownership (tamper-evident HMAC chain, cascading unlearning) | ✅ built + tested |\n| 7 | Multi-timescale memory (fast context vs. slow identity) | ✅ built + tested |\n| 6 | Self-tuning forgetting (learn decay from outcomes; adapts to drift) | ✅ built + tested |\n| 5 | Injection-resistant by construction (deterministic writes can't be talked into beliefs) | ✅ built + tested |\n| 3 | Open user-owned memory protocol (portable across any agent, with consent) | ◑ spec stage |\n\nThese are deliberately the things HippoGraph et al. can't follow: they're single-user (no collective priors), vendor-owned (no user-owned protocol), and recall-optimized (no outcome loop). 
Built in honest, tested slices — research-dependent ones are marked.\n\n```mermaid\nflowchart TD\n    V[Visitor on a website] -->|prompt + action| API[FERNme Service]\n    API --> CONSENT{consent?}\n    CONSENT -->|no| STOP[blocked]\n    CONSENT -->|yes| ENGINE\n    subgraph ENGINE[Engine - no LLM in the write path]\n      W[Hebbian write + decay] --> G[(Per-site preference graph<br/>fuzzy 0-9 edges)]\n      G --> R[Spreading-activation retrieval]\n      R --> CARD[Token-minimal card ~25 tok]\n      PRIOR[Population prior<br/>differential encoding] --> R\n    end\n    CARD --> AGENT[Agent: recommend / act]\n    G --> CAB[(Cabinet: raw event log)]\n    API --> STORE[(SQLite or Postgres<br/>multi-tenant)]\n    API --> GLASS[🪟 Glass-box editor]\n    API -.user signs in.-> SUPER[User-owned Supernode<br/>cross-site, default-deny]\n```\n\n*Why FERNme — adaptive local memory instead of expensive RAG/vector retrieval in the loop.*\n\n*What makes it different — near-zero-LLM, deterministic-first, Hebbian, fuzzy, memory cards, action-aware, user-owned.*\n\n*How memory grows — new event → connect → strengthen → decay → update the card (Hebbian learning).*\n\n*The fuzzy Hebbian graph — sparse, weighted (0–9) edges; nodes for users, preferences, topics, goals.*\n\n*The LLM gate — an exception, not the default. 
Most events are handled deterministically; the LLM is a rare fallback when uncertain.*\n\n*The memory card — a bounded, interpretable, token-minimal summary of what matters.*\n\n*Action-aware learning — good outcomes strengthen connections, bad outcomes weaken them.*\n\n*The road ahead — today's local memory; tomorrow's recursive organization and user-owned supernode (roadmap, not yet built).*\n\n*Full architecture: ingestion bridge → namespaced vocabulary → fuzzy Hebbian graph → memory card → agent, with the LLM gate only when uncertain.*\n\n```\npip install -e \".[dev,api]\"\n\npython run_demo.py                      # cold-start → learning → glass-box edit\npython supernode_demo.py                # one person, three sites, one owned profile\npytest -q                               # 88 tests (engine, store, supernode, safety, auth…)\n\n# experiments\npython -m fernme.eval.drift               # FERNme beats a frequency counter when tastes change\npython -m fernme.eval.pilot               # +16% simulated conversion lift\n\n# run it live\nFERNME_API_KEY=secret uvicorn fernme.api.rest:app --port 8077   # REST API (docs at /docs)\nopen http://localhost:8077/ui                               # glass-box memory editor\nopen http://localhost:8077/graph                            # your memory as a graph — focus by site / PC / phone\npython -m fernme.api.mcp_server                               # MCP server for agents/Claude\n```\n\n🗄 Storage: defaults to `~/.fernme/fernme.db` (SQLite). For production, use `PostgresStore` — same interface, tested against a real Postgres 16. 
Keep SQLite off cloud-synced folders.\n\n- **Engine** — saturating Hebbian write (no LLM), ACT-R decay, spreading activation, token-minimal card.\n- **Population prior** — IDF cold-start; differential (deviation-only) storage is enforced by an explicit `prune_to_prior` pass (redundant edges read through to the prior).\n- **Stores** — `SQLiteStore` (zero-setup) and `PostgresStore` (tested vs real PG 16), one interface.\n- **Ingestion bridge** — a per-site **catalog** (item_id->tags) plus a **controlled, namespaced vocabulary** (`vocabulary.py`) that canonicalizes every tag (catalog, free text, or LLM) to one form (`pref:`, `topic:`, `goal:`, `context:`) so the same concept never drifts across months. Deterministic by default; gated-LLM only for novel free text. *This is the product-critical layer — and the foundation a future recursive/region organization would group on.*\n- **The Cabinet** — append-only event log with `recall()` for specific facts.\n- **Supernode** (`supernode.py` + `auth.py`) — user-owned cross-site profile, built by **sign-in** (verified token → opaque person id), default-deny scoped views, sensitive categories walled off.\n- **Proactive triggers** — due-to-reorder + fading-favorite nudges.\n- **Safety** — event tags treated as untrusted data: injection-pattern dropping, size/value caps.\n- **Interfaces** — REST (`/observe /card /recall /edit /export /delete /triggers …`) + MCP tools + a **glass-box web UI** (editor at `/ui`, cross-surface memory graph at `/graph` — one memory, focusable by site / PC / phone).\n- **Governance** — consent-gated everywhere; export & right-to-be-forgotten built in.\n\nFERNme is a **different category** from conversational memories — it's a per-user *preference* graph evaluated by *actions*, not a QA memory. 
Don't benchmark it on LoCoMo; that's the wrong axis.\n\n| | 🌿 FERNme | Mem0 | Zep/Graphiti | Letta | MemOS |\n|---|---|---|---|---|---|\n| Write | no LLM | LLM | LLM → KG | LLM-paged | LLM |\n| Retrieval | spreading activation | vector | graph+time | OS paging | hybrid |\n| Eval axis | outcomes | QA | temporal QA | long-horizon | QA |\n| User-owned + glass-box | ✅ | – | – | – | – |\n| Multi-tenant per-site | ✅ | passport | – | – | – |\n\n**Leads on:** write cost, interpretability, per-site user-ownership/consent. **Honestly behind on:** nuanced/causal preferences (LLM extraction wins), benchmark credibility, ecosystem & distribution.\n\n✅ **Done & tested (88 tests):** engine, SQLite + real-Postgres stores, supernode + sign-in, triggers, safety, REST/MCP, glass-box UI + memory-graph view, and the full results suite above.\n\n🚧 **Still open (genuinely needs the outside world):**\n\n- A **real-human per-site pilot** — only live users close the loop a simulator can't.\n- The **Mem0 (LLM) head-to-head** — harness wired; run locally with `OPENAI_API_KEY`.\n- **Embeddings** for context→attribute matching; offline LLM catalog enrichment for messy inputs.\n- **Desktop & mobile surfaces** — the engine is substrate-agnostic; web ingestion ships today, desktop/mobile adapters are on the roadmap. The user-owned **supernode** is the bridge that assembles them, with consent, into one cross-surface profile.\n\nEvery claim above is backed by a test or a reproducible experiment. 
Where a result is simulated, it says so — a simulator proves the mechanism, not real-world behavior.\n\n```\nfernme/\n  core/      graph types · fuzzy 0–9 edges · event record\n  write/     event→attr mapping (no LLM) · Hebbian update · decay\n  retrieve/  base-level + spreading activation · token-minimal card\n  prior/     population prior · differential encoding · IDF cold-start\n  store/     sqlite_store · postgres_store (one interface)\n  supernode.py · auth.py · triggers.py · safety.py · service.py\n  api/       rest.py (FastAPI) · mcp_server.py · web/glassbox.html · web/graph.html\n  eval/      simulator · cost · quality · drift · context · ablation · pilot\ntests/       88 tests   ·   *_demo.py walkthroughs\n```\n\nApache-2.0, © 2026 Acquilab Inc. — see [LICENSE](/mirkofr/FERNme/blob/main/LICENSE) and [NOTICE](/mirkofr/FERNme/blob/main/NOTICE). Security notes in [SECURITY.md](/mirkofr/FERNme/blob/main/SECURITY.md); the name is a working codename (see [NAMING.md](/mirkofr/FERNme/blob/main/NAMING.md)).\nIf you use FERNme in research, please cite it via [CITATION.cff](/mirkofr/FERNme/blob/main/CITATION.cff).", "url": "https://wpnews.pro/news/show-hn-fernme-agent-memory-that-updates-with-zero-llm-calls", "canonical_source": "https://github.com/mirkofr/FERNme", "published_at": "2026-06-20 23:34:03+00:00", "updated_at": "2026-06-21 00:07:32.212239+00:00", "lang": "en", "topics": ["ai-agents", "ai-infrastructure", "ai-ethics", "ai-products", "ai-research"], "entities": ["FERNme", "Mem0"], "alternates": {"html": "https://wpnews.pro/news/show-hn-fernme-agent-memory-that-updates-with-zero-llm-calls", "markdown": "https://wpnews.pro/news/show-hn-fernme-agent-memory-that-updates-with-zero-llm-calls.md", "text": "https://wpnews.pro/news/show-hn-fernme-agent-memory-that-updates-with-zero-llm-calls.txt", "jsonld": "https://wpnews.pro/news/show-hn-fernme-agent-memory-that-updates-with-zero-llm-calls.jsonld"}}