Source: dev.to · Topic: developer-tools

Neonmem 0.9.7 is out.

Neonmem 0.9.7 introduces a two-level importer that separates folders and files into a searchable knowledge pool and agent chats into typed memories, using IBM Granite-30M embeddings via ONNX Runtime for offline, grounded recall. The update also adds persistent tags, deduplication, and optional AES-256-GCM encryption, all running locally without cloud dependencies.

Published Jun 24, 2026 · 3 min read

  1. A two-level importer: two kinds of "stuff," treated differently

The big change. Your project doesn't come in one shape, so the importer no longer flattens it into one pile:

Folders & files → a searchable knowledge pool. Your docs, code and notes are vectorised into a lossless, deduplicated facts pool: the same fact stated three ways becomes one fact, with every source kept. Nothing is summarised away.

Agent chats → typed memories. Point Neonmem at a Claude (or other agent) transcript and it pulls out only what's worth keeping (the decisions, dead-ends and rules) as clean, typed memories. A decision is stored as a decision; a dead-end stays a warning. The process-narration ("I read the file…", "please check…") is dropped.

Links become knowledge. If a chat references a file on disk, that file is pulled into the pool automatically, with a memory that points back to it.

The result is labelled honestly in the UI: Facts loaded (the pool) and Memories created (the kept decisions).
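To make the "three ways becomes one fact, every source kept" idea concrete, here is a minimal sketch. It is not Neonmem's code: the `FactsPool` class and the file names are hypothetical, and plain text normalisation stands in for the embedding comparison a real importer would presumably use, so it only merges surface variants of the same sentence.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List


def normalise(text: str) -> str:
    """Crude stand-in for semantic matching: lowercase, drop punctuation, collapse spaces."""
    return " ".join(re.sub(r"[^a-z0-9\s]", " ", text.lower()).split())


@dataclass
class Fact:
    text: str                                         # first wording seen
    sources: List[str] = field(default_factory=list)  # every file that stated it


class FactsPool:
    """Hypothetical pool: one entry per distinct fact, all sources retained."""

    def __init__(self) -> None:
        self._facts: Dict[str, Fact] = {}

    def add(self, text: str, source: str) -> Fact:
        fact = self._facts.setdefault(normalise(text), Fact(text=text))
        if source not in fact.sources:
            fact.sources.append(source)
        return fact

    def __len__(self) -> int:
        return len(self._facts)


pool = FactsPool()
pool.add("ARC is our provisioning platform.", "docs/overview.md")
pool.add("arc is our provisioning platform", "notes/onboarding.txt")
arc = pool.add("ARC is our provisioning platform!", "README.md")
pool.add("Deploys run nightly.", "docs/ops.md")
```

After these four calls the pool holds two facts, and `arc.sources` lists all three files that stated the first one.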

  2. Grounded, offline recall

0.9.7 replaces the old embedder with IBM Granite-30M, run as a fused fp16 ONNX graph through ONNX Runtime:

  • Database-class retrieval quality on any CPU: no GPU, no PyTorch, no API key, no cloud.
  • Every prompt walks memory in order (reflexes → short-term → long-term → facts pool) and answers from what you actually imported, or honestly says it doesn't know.

This is the headline behaviour: ask "what is ARC?" and you get your definition from your docs, not the textbook expansion the model would otherwise guess. A memory that's occasionally wrong is worse than no memory at all, so the rule is: answer from the user's sources, or abstain. Never invent.
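The "walk memory in order, answer or abstain" rule could look roughly like the sketch below. Everything here is an assumption for illustration: the `recall` function, the 0.8 similarity threshold, stopping at the first confident hit, and the toy three-dimensional vectors, which stand in for real embeddings.

```python
import math
from typing import List, Optional, Tuple

Vector = List[float]
Tier = List[Tuple[Vector, str]]  # (embedding, stored text) pairs


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(query: Vector, tiers: List[Tuple[str, Tier]],
           threshold: float = 0.8) -> Optional[Tuple[str, str]]:
    """Walk tiers in order; return (tier name, text) for the first confident hit."""
    for name, entries in tiers:
        best = max(entries, key=lambda e: cosine(query, e[0]), default=None)
        if best is not None and cosine(query, best[0]) >= threshold:
            return name, best[1]
    return None  # nothing grounded in the user's sources: abstain


tiers = [
    ("reflexes", []),
    ("short-term", [([0.0, 1.0, 0.0], "We chose SQLite for the cache.")]),
    ("long-term", []),
    ("facts pool", [([1.0, 0.0, 0.0], "ARC is our provisioning platform.")]),
]

hit = recall([0.9, 0.1, 0.0], tiers)   # close to the ARC fact
miss = recall([0.0, 0.0, 1.0], tiers)  # unrelated to anything stored
```

Here `hit` comes back from the facts pool, while `miss` is `None`, which the caller would surface as "I don't know" rather than a guess.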

  3. Tags that stick

Tag an import with a topic (e.g. Specific API) and Neonmem mints one clean, canonical memory for it, linked back to the source, even when your docs never write the term verbatim, as long as they clearly describe it. If the corpus genuinely has nothing on a tag, it's left out rather than faked.

  4. Clean by construction

Memories follow one golden rule: a single concise statement (ARC: your provisioning platform) linked to the full source, not a messy pile of raw chunks. Chat capture deduplicates through the same facts layer, so re-importing a conversation never doubles up.
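One way to picture a typed memory that is "a single concise statement linked to the full source" is the record below. This is a hypothetical shape, not Neonmem's schema: the `Kind` values mirror the decisions, dead-ends and rules named earlier, while the `MemoryStore` class, the sample statements and the transcript path are made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Tuple


class Kind(Enum):
    DECISION = "decision"
    DEAD_END = "dead-end"
    RULE = "rule"


@dataclass(frozen=True)
class Memory:
    kind: Kind
    statement: str  # one concise sentence
    source: str     # pointer back to the full transcript or file


class MemoryStore:
    """Hypothetical store where capturing the same memory twice is a no-op."""

    def __init__(self) -> None:
        self._by_key: Dict[Tuple[Kind, str], Memory] = {}

    def capture(self, memory: Memory) -> bool:
        """Store the memory; return False if an equivalent one already exists."""
        key = (memory.kind, memory.statement.strip().lower())
        if key in self._by_key:
            return False
        self._by_key[key] = memory
        return True

    def __len__(self) -> int:
        return len(self._by_key)


store = MemoryStore()
chat = [
    Memory(Kind.DECISION, "Use SQLite for the local cache.", "chats/session-01.jsonl"),
    Memory(Kind.DEAD_END, "The memory-mapped index approach was abandoned.", "chats/session-01.jsonl"),
]
first = [store.capture(m) for m in chat]
second = [store.capture(m) for m in chat]  # re-import the same conversation
```

The second pass stores nothing new, so the store still holds two memories after the re-import.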

  5. One durable cartridge

  • The importer keeps the full source corpus inside the cartridge (content-addressed + compressed): one file replaces the scattered docs and transcripts, and the facts are always rebuildable from ground truth.
  • Opt-in AES-256-GCM encryption at rest: your whole corpus as a private vault.
  • Imported knowledge is long-term and survives reopening the project.
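"Content-addressed + compressed" generally means each blob is stored under the hash of its own bytes, so identical content is kept once and can be verified on read. The sketch below shows that general technique with the Python standard library. The `Cartridge` class, the choice of SHA-256 and zlib, and the in-memory dict are assumptions, not the actual cartridge format, and encryption is left out.

```python
import hashlib
import zlib
from typing import Dict


class Cartridge:
    """Toy in-memory store: blobs keyed by the SHA-256 of their uncompressed bytes."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # Identical content hashes to the same key, so it is stored only once.
        self._blobs.setdefault(digest, zlib.compress(data))
        return digest

    def get(self, digest: str) -> bytes:
        data = zlib.decompress(self._blobs[digest])
        # Re-hash on read so silent corruption is detected.
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("content does not match its address")
        return data

    def __len__(self) -> int:
        return len(self._blobs)


cart = Cartridge()
doc = b"ARC is our provisioning platform.\n" * 50
key_a = cart.put(doc)
key_b = cart.put(doc)  # same bytes imported again
restored = cart.get(key_a)
```

Both `put` calls return the same key and only one blob is kept, which is what makes re-imports cheap and the stored corpus a reliable ground truth to rebuild facts from.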


Built on (all open, permissively licensed)

Embeddings: IBM Granite-30M (Apache-2.0) via ONNX Runtime (MIT). Vector search: FAISS (MIT). Agent integration: the Model Context Protocol. Full attributions ship with every download. No third-party LLM, nothing phones home.


Get it

Windows (signed installer + portable) and Linux (AppImage); macOS on the way. Local, private, and free for personal use.

→ neonmem.com

Import a project, then ask it the one thing your assistant always gets confidently wrong about your codebase. That question is the whole test.
