{"slug": "neonmem-0-9-7-is-out", "title": "Neonmem 0.9.7 is out.", "summary": "Neonmem 0.9.7 introduces a two-level importer that separates folders and files into a searchable knowledge pool and agent chats into typed memories, using IBM Granite-30M embeddings via ONNX Runtime for offline, grounded recall. The update also adds persistent tags, deduplication, and optional AES-256-GCM encryption, all running locally without cloud dependencies.", "body_md": "## 1. A two-level importer — two kinds of \"stuff,\" treated differently\n\nThe big change. Your project doesn't come in one shape, so the importer no longer flattens\nit into one pile:\n\n- **Folders & files → a searchable knowledge pool.** Your docs, code and notes are\nvectorised into a lossless, deduplicated facts pool — the same fact stated three ways\nbecomes one fact, every source kept. Nothing is summarised away.\n- **Agent chats → typed memories.** Point Neonmem at a Claude (or other agent) transcript\nand it pulls out only what's worth keeping — the **decisions, dead-ends and rules** — as\nclean, typed memories. A decision is stored as a decision; a dead-end stays a warning.\nThe process-narration (\"I read the file…\", \"please check…\") is dropped.\n- **Links become knowledge.** If a chat references a file on disk, that file is pulled into\nthe pool automatically, with a memory that points back to it.\n\nThe result is labelled honestly in the UI: **Facts loaded** (the pool) and\n**Memories created** (the kept decisions).\n\n## 2. 
Grounded, offline recall\n\n0.9.7 replaces the old embedder with **IBM Granite-30M**, run as a fused **fp16 ONNX** graph\nthrough **ONNX Runtime**:\n\n- Database-class retrieval quality on **any CPU** — no GPU, no PyTorch, no API key, no cloud.\n- Every prompt walks memory in order — **reflexes → short-term → long-term → facts pool** —\nand answers from what you actually imported, or **honestly says it doesn't know**.\n\nThis is the headline behaviour: ask *\"what is ARC?\"* and you get **your** definition from\n**your** docs — not the textbook expansion the model would otherwise guess. A memory that's\noccasionally wrong is worse than no memory at all, so the rule is: answer from the user's\nsources, or abstain. Never invent.\n\n## 3. Tags that stick\n\nTag an import with a topic (e.g. `Specific API`) and Neonmem mints **one clean, canonical**\nmemory for it, linked back to the source — *even when your docs never write the term*\nverbatim, as long as they clearly describe it. If the corpus genuinely has nothing on a\ntag, it's left out rather than faked.\n\n## 4. Clean by construction\n\nMemories follow one **golden rule**: a single concise statement (`ARC — your provisioning platform`)\nlinked to the full source, not a messy pile of raw chunks. Chat capture\ndeduplicates through the same facts layer, so re-importing a conversation never doubles up.\n\n## 5. One durable cartridge\n\n- The importer keeps the **full source corpus** inside the cartridge (content-addressed +\ncompressed) — one file replaces the scattered docs and transcripts, and the facts are\nalways rebuildable from ground truth.\n- **Opt-in AES-256-GCM encryption at rest** — your whole corpus as a private vault.\n- Imported knowledge is long-term and **survives reopening** the project.\n\n## Built on (all open, permissively licensed)\n\nEmbeddings: **IBM Granite-30M** (Apache-2.0) via **ONNX Runtime** (MIT). Vector search:\n**FAISS** (MIT). 
Agent integration: the **Model Context Protocol**. Full attributions ship\nwith every download. No third-party LLM, nothing phones home.\n\n## Get it\n\nWindows (signed installer + portable) and Linux (AppImage); macOS on the way. Local,\nprivate, and free for personal use.\n\n→ [neonmem.com](https://neonmem.com)\n\n*Import a project, then ask it the one thing your assistant always gets confidently wrong*\nabout your codebase. That question is the whole test.", "url": "https://wpnews.pro/news/neonmem-0-9-7-is-out", "canonical_source": "https://dev.to/neonmem_dev/neonmem-097-is-out-59cp", "published_at": "2026-06-24 16:34:22+00:00", "updated_at": "2026-06-24 17:09:42.103797+00:00", "lang": "en", "topics": ["developer-tools", "machine-learning", "natural-language-processing", "ai-infrastructure"], "entities": ["Neonmem", "IBM Granite-30M", "ONNX Runtime", "FAISS", "Model Context Protocol", "Claude"], "alternates": {"html": "https://wpnews.pro/news/neonmem-0-9-7-is-out", "markdown": "https://wpnews.pro/news/neonmem-0-9-7-is-out.md", "text": "https://wpnews.pro/news/neonmem-0-9-7-is-out.txt", "jsonld": "https://wpnews.pro/news/neonmem-0-9-7-is-out.jsonld"}}