# A two-level importer: two kinds of "stuff," treated differently
The big change. Your project doesn't come in one shape, so the importer no longer flattens
it into one pile:
- Folders & files: a searchable knowledge pool. Your docs, code and notes are vectorised into a lossless, deduplicated facts pool: the same fact stated three ways becomes one fact, every source kept. Nothing is summarised away.
- Agent chats: typed memories. Point Neonmem at a Claude (or other agent) transcript and it pulls out only what's worth keeping (the decisions, dead-ends and rules) as clean, typed memories. A decision is stored as a decision; a dead-end stays a warning. The process-narration ("I read the file…", "please check…") is dropped.
- Links become knowledge. If a chat references a file on disk, that file is pulled into the pool automatically, with a memory that points back to it.
The result is labelled honestly in the UI: Facts loaded (the pool) and
Memories created (the kept decisions).
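As a rough illustration of "one fact, every source kept", here is a minimal sketch of a deduplicating pool. It is not Neonmem's implementation: the class names, the file paths, and the text-normalisation key are all hypothetical, and the key is a deliberately simple exact-match stand-in for the vector-similarity comparison the importer is described as using.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Fact:
    text: str                                    # first wording seen for this fact
    sources: list = field(default_factory=list)  # every source that stated it


class FactsPool:
    """Toy facts pool: one entry per distinct fact, all sources retained."""

    def __init__(self):
        self._facts = {}  # normalised key -> Fact

    @staticmethod
    def _key(text):
        # Assumption: lowercase and strip punctuation/whitespace so trivially
        # different wordings collide. A real pool would compare embeddings.
        return " ".join(re.findall(r"[a-z0-9]+", text.lower()))

    def add(self, text, source):
        fact = self._facts.setdefault(self._key(text), Fact(text=text))
        if source not in fact.sources:
            fact.sources.append(source)  # keep provenance, never a second copy
        return fact

    def __len__(self):
        return len(self._facts)


# Hypothetical sources stating the same fact three slightly different ways.
pool = FactsPool()
pool.add("ARC is the provisioning platform.", "docs/overview.md")
pool.add("arc is the provisioning platform", "notes/onboarding.txt")
pool.add("ARC is the   provisioning platform!", "src/arc/README")
```

After the three calls the pool holds a single fact whose `sources` list names all three files, which is the property the importer description relies on.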
# Grounded, offline recall
0.9.7 replaces the old embedder with IBM Granite-30M, run as a fused fp16 ONNX graph
through ONNX Runtime:
- Database-class retrieval quality on any CPU: no GPU, no PyTorch, no API key, no cloud.
- Every prompt walks memory in order (reflexes → short-term → long-term → facts pool) and answers from what you actually imported, or honestly says it doesn't know.
This is the headline behaviour: ask "what is ARC?" and you get your definition from
your docs, not the textbook expansion the model would otherwise guess. A memory that's
occasionally wrong is worse than no memory at all, so the rule is: answer from the user's
sources, or abstain. Never invent.
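The "answer from the user's sources, or abstain" rule can be sketched as an ordered walk over memory tiers. This is an illustrative sketch only: the tier names come from the text above, but `make_tier`, `recall`, the dict-backed lookups and the abstention message are hypothetical, and a real tier would run a vector search, not an exact key match.

```python
from typing import Callable, Optional

# A tier is a lookup function: query -> answer text, or None on a miss.
Tier = Callable[[str], Optional[str]]

ABSTAIN = "I don't know: nothing in your imported sources covers that."


def make_tier(store: dict) -> Tier:
    """Wrap a dict as a tier (assumption: exact match stands in for search)."""
    def lookup(query: str) -> Optional[str]:
        return store.get(query.strip().lower())
    return lookup


def recall(query: str, tiers: list) -> tuple:
    """Walk the tiers in order; return (tier_name, answer) or abstain."""
    for name, lookup in tiers:
        answer = lookup(query)
        if answer is not None:
            return name, answer      # first tier with a grounded answer wins
    return "none", ABSTAIN           # never fall back to a guess


# Hypothetical contents; only the ordering mirrors the release notes.
tiers = [
    ("reflexes", make_tier({})),
    ("short-term", make_tier({})),
    ("long-term", make_tier({"what is arc?": "ARC is your provisioning platform."})),
    ("facts pool", make_tier({"what is arc?": "ARC: see docs/overview.md"})),
]
```

With these tiers, `recall("What is ARC?", tiers)` is answered from the long-term tier before the facts pool is consulted, while a question no tier covers returns the abstention message.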
# Tags that stick
Tag an import with a topic (e.g. Specific API) and Neonmem mints one clean, canonical
memory for it, linked back to the source, even when your docs never write the term
verbatim, as long as they clearly describe it. If the corpus genuinely has nothing on a
tag, it's left out rather than faked.
# Clean by construction
Memories follow one golden rule: a single concise statement (ARC: your provisioning
platform) linked to the full source, not a messy pile of raw chunks. Chat capture
deduplicates through the same facts layer, so re-importing a conversation never doubles up.
# One durable cartridge
- The importer keeps the full source corpus inside the cartridge (content-addressed + compressed): one file replaces the scattered docs and transcripts, and the facts are always rebuildable from ground truth.
- Opt-in AES-256-GCM encryption at rest: your whole corpus as a private vault.
- Imported knowledge is long-term and survives reopening the project.
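To make "content-addressed + compressed" concrete, here is a minimal in-memory sketch. The cartridge's real on-disk format is not described here, so everything below is an assumption: the `ContentStore` class is hypothetical, SHA-256 is chosen as the address and zlib as the compressor purely for illustration, and the opt-in encryption layer is left out.

```python
import hashlib
import zlib


class ContentStore:
    """Toy content-addressed store: the key is a hash of the bytes stored."""

    def __init__(self):
        self._blobs = {}  # hex digest -> compressed bytes

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # Identical content hashes to the same address, so it is stored once.
        self._blobs.setdefault(digest, zlib.compress(data))
        return digest

    def get(self, digest: str) -> bytes:
        data = zlib.decompress(self._blobs[digest])
        # The address doubles as an integrity check on the way out.
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("stored blob does not match its address")
        return data


store = ContentStore()
doc = b"ARC is the provisioning platform.\n" * 50  # hypothetical source file
addr = store.put(doc)
again = store.put(doc)  # re-importing the same bytes adds nothing new
```

Because the address is derived from the content, importing the same file twice yields the same address and a single stored blob, and the original bytes can always be recovered and verified from the store.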
# Built on (all open, permissively licensed)
Embeddings: IBM Granite-30M (Apache-2.0) via ONNX Runtime (MIT). Vector search: FAISS (MIT). Agent integration: the Model Context Protocol. Full attributions ship
with every download. No third-party LLM, nothing phones home.
# Get it
Windows (signed installer + portable) and Linux (AppImage); macOS on the way. Local, private, and free for personal use.
→ neonmem.com
Import a project, then ask it the one thing your assistant always gets confidently wrong
about your codebase. That question is the whole test.