Astrum Verum – A Vector Symbolic cognitive memory that beats RAG

Astrum Verum, an open-source research project, has released a vector symbolic memory architecture called CognitiveMemory that outperforms standard Retrieval-Augmented Generation (RAG) on structured fact recall. The system uses Vector Symbolic Architectures (VSA) to bind roles to fillers, which lets it distinguish role-swapped facts such as "Alice trusts Bob" and "Bob trusts Alice". On role-ambiguous triple pairs it scores 1.000, while a cosine-based RAG baseline scores 0.600. The project's Phase 2 engine provides error-correcting, associative retrieval that avoids hallucinated answers on directional queries: after storing "Alice killed Bob", asking who killed Alice correctly returns "None".
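
To illustrate why a similarity measure that only sees which words occur cannot separate role-swapped sentences, here is a minimal sketch. It assumes a toy bag-of-words encoding; the `bag_of_words` and `cosine` helpers are illustrative and not part of Astrum Verum. Real embedding models are not pure bags of words, so they would typically give a high similarity for such pairs rather than exactly 1.

```python
import math
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count word occurrences, discarding word order (toy encoding)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


fact = bag_of_words("Alice trusts Bob")
swapped = bag_of_words("Bob trusts Alice")

# Both sentences contain the same words, so the count vectors are identical
# and the similarity is 1 (up to floating-point rounding).
role_swap_similarity = cosine(fact, swapped)
print(round(role_swap_similarity, 6))
```

Because word order never enters the encoding, no threshold on this score can tell the two facts apart; the distinction has to come from an encoding that records who fills which role.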

Composition-episodic cognitive memory for AI agents, and an honest record of how it got here.

Astrum Verum is a research project containing two distinct phases of memory architecture development. It started as an attempt to organize memory on perfect geometric lattices (Phase 1), but when that proved insufficient for structural recall, it pivoted to Vector Symbolic Architectures (Phase 2). Both phases ship in this repository.

- Phase 1 is kept as a documented historical mockup.
- Phase 2 is the working, validated engine that powers the actual AI agent.

Read the new mathematical paper: Why VSA Works for Large-Scale Memory (Solving Capacity Collapse & Decoding Hallucinations).

Full story, math and results: `docs/astrum_verum_design.md`.

`CognitiveMemory` is the name of the active VSA memory layer: a Vector-Symbolic Associative Memory (VSAM), composition-episodic. Retrieval is associative (by meaning, from a partial or noisy cue) and structural (by role: "who did what to whom"), not by an exact key. Zero persona-prompt: pure memory, not a personality. It returns what is stored.

- associative → unlike a key-value store (no exact key needed);
- compositional → unlike a plain vector DB: on role-swapped facts ("A loves B" vs "B loves A") cosine sits at 0.5 (chance), CognitiveMemory at 1.0.

Names: project Astrum Verum · memory engine `CognitiveMemory`.

Flat vector search (embed → cosine/HNSW) is excellent at similarity but blind to structure: "Alice trusts Bob" and "Bob trusts Alice" have the same word bag, so cosine cannot tell them apart. Astrum Verum's VSA layer binds roles to fillers, so it can, and it recovers facts from corrupted or partial cues like an attractor.

The "Alice and Bob" RAG failure: if a standard RAG memory stores "Alice killed Bob" and you ask "Who killed Alice?", it often hallucinates "Bob" because it just retrieves tokens by proximity. Astrum Verum parses the fact as `kill(Alice, Bob)`. When asked `kill(?, Alice)`, it yields `None`. Zero hallucination.

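
The role-to-filler binding described above can be sketched in a few lines. This is a toy illustration, not the project's implementation: it assumes a MAP-style (multiply-add-permute) encoding with random bipolar vectors, and the names `store_triple` and `query`, the dimension `D`, and the match threshold are all hypothetical choices made for this example.

```python
import random

D = 4096                    # vector dimension (illustrative choice)
rng = random.Random(0)      # fixed seed so the run is repeatable


def random_vector():
    """A random bipolar (+1/-1) hypervector."""
    return [rng.choice((-1, 1)) for _ in range(D)]


def bind(a, b):
    """Bind by element-wise multiplication; binding is its own inverse."""
    return [x * y for x, y in zip(a, b)]


def bundle(*vectors):
    """Superpose vectors by element-wise addition."""
    return [sum(column) for column in zip(*vectors)]


def similarity(a, b):
    """Dot product normalised by the dimension."""
    return sum(x * y for x, y in zip(a, b)) / D


ROLES = {name: random_vector() for name in ("subject", "relation", "object")}
atoms = {}                  # symbol -> hypervector (the cleanup vocabulary)
facts = []                  # one bundled hypervector per stored triple


def atom(symbol):
    """Return the hypervector for a symbol, creating it on first use."""
    if symbol not in atoms:
        atoms[symbol] = random_vector()
    return atoms[symbol]


def store_triple(s, r, o):
    """Encode a fact as subject*s + relation*r + object*o."""
    facts.append(bundle(bind(ROLES["subject"], atom(s)),
                        bind(ROLES["relation"], atom(r)),
                        bind(ROLES["object"], atom(o))))


def query(missing, **known):
    """Fill the missing role, or return None if no fact matches every known role."""
    cue = bundle(*(bind(ROLES[role], atom(sym)) for role, sym in known.items()))
    best = max(facts, key=lambda f: similarity(f, cue), default=None)
    # Each matching role adds about 1.0 to the similarity, so a fact that
    # matches only some of the known roles falls below this threshold.
    if best is None or similarity(best, cue) < len(known) - 0.5:
        return None
    noisy = bind(ROLES[missing], best)   # unbinding gives filler plus noise
    # Cleanup: snap the noisy vector to the nearest known atom.
    return max(atoms, key=lambda name: similarity(atoms[name], noisy))


store_triple("Alice", "killed", "Bob")
store_triple("Alice", "trusts", "Bob")

print(query("subject", relation="killed", object="Bob"))    # Alice
print(query("subject", relation="killed", object="Alice"))  # None
print(query("object", subject="Alice", relation="trusts"))  # Bob
```

The second query matches the stored fact on the relation but not on the object role, so it scores about 1.0 instead of 2.0 and is rejected rather than answered with the nearest name. The final `max` over `atoms` plays the part of the cleanup step: the unbound vector is the true filler plus cross-term noise, and the nearest known atom recovers it.
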
Headline result (reproducible): on triples an LLM extracted from real text, with genuine role ambiguity, the VSA layer scores 1.000 where a cosine-RAG baseline scores 0.600 (chance on the ambiguous pairs).

```bash
pip install -e ".[dev]"        # core + tests
pip install -e ".[dev,api]"    # + FastAPI service for the Phase 1 layer
```

Python ≥ 3.11. Phase 2 extraction needs an LLM key (`DEEPSEEK_API_KEY`, or XAI/GROQ) in the environment or a local `.env`.

```python
from astrum_verum import CognitiveMemory

mem = CognitiveMemory()
mem.remember("Maya founded Helix. Iris mentored Maya. Maya mentors the juniors.")

mem.recall_object("Maya", "founded")      # → "Helix"
mem.recall_subject("mentored", "Maya")    # → "Iris" (direction matters)
mem.recall_object("Maya", "mentors")      # → "the juniors"

# Episodes: order is first-class
eid = mem.remember_conversation([
    "greeted the user",
    "reviewed the results",
    "scheduled a follow-up call",
])
mem.whats_next(eid, "reviewed the results")    # → "scheduled a follow-up call"

mem.save("~/.astrum_verum/memory_state")       # persists across sessions
mem2 = CognitiveMemory.load("~/.astrum_verum/memory_state")
```

You can also add facts directly (no LLM) via `mem.remember_triple(s, r, o)`.

- Role-sensitive recall: distinguishes `X r Y` from `Y r X`.
- Error-correcting cleanup: recovers the canonical fact from a noisy cue.
- Episodic order: "what happened, and in what sequence".
- One-shot writes & persistence: no reindexing; survives restarts.

Phase 1 lattice engine (legacy):

```python
from astrum_verum import AstrumEngine
from astrum_verum.lattice import E8Plugin

engine = AstrumEngine(lattice=E8Plugin())    # D₄ by default
engine.add("Photosynthesis converts sunlight into chemical energy")
engine.search("plant biology")
```

This works and the geometry is correct, but see the design doc §1 for why its retrieval-quality thesis is unproven (the bottleneck is the 384→d projection, not the lattice). A REST API is available via `uvicorn astrum_verum.api:app`.

```bash
pytest tests/test_vsa_memory.py -q                            # VSA layer (no network)
PYTHONPATH=. python experiments/vsa_sdm/phase0_algebra.py     # algebra on clean atoms
PYTHONPATH=. python experiments/vsa_sdm/phase1_grounding.py   # grounding survives real embeddings
PYTHONPATH=. python experiments/vsa_sdm/phase2_pipeline.py    # vs cosine-RAG on extracted triples (needs LLM key)
PYTHONPATH=. python experiments/vsa_sdm/phase3_full.py        # full CognitiveMemory end-to-end (needs LLM key)
```

| Phase | Claim tested | Result |
|---|---|---|
| 0 | binding capacity + attractor cleanup | 100+ pairs @ D=10k; exact recovery at ≤40% noise |
| 1 | grounding doesn't break binding | corr 0.988, grounding drop 0.000 |
| 2 | beats cosine on real extracted data | VSA 1.000 vs RAG 0.600 (role-ambiguous) |
| 3 | facts+episodes+normalize+persist | pytest 6/6, demo PASS |

```
astrum_verum/
  vsa/           PHASE 2: VSA core (MAP + VSAMemory)  ← the validated layer
  extract/       PHASE 2: LLM triple extractor (DeepSeek→xAI→Groq)
  cognitive.py   PHASE 2: CognitiveMemory facade
  lattice/       PHASE 1: D₄ / E₈ plugins (legacy mockup)
  engine.py …    PHASE 1: lattice pipeline, store, scorer, rotation, API
experiments/vsa_sdm/           the Phase 2 validation arc (phases 0–3)
docs/astrum_verum_design.md    full design & honest research notes
```

Research library, not yet wired into a production agent. VSA adds structural recall; it does not replace nearest-neighbour search. Extraction/normalization on messy real dialogue is the next open problem. See design doc §5.

MIT. See [LICENSE](/vitaliyfedotovpro-art/astrum-verum/blob/main/LICENSE).