How I Found Out 52% of My Knowledge Graph Was Duplicates (and What I Did About It)

A developer building ANIMUS, an autonomous Rust system for persistent LLM memory, discovered that 52% of its knowledge graph nodes were duplicates. An audit revealed that an overly aggressive filter had trapped the system in a loop, re-exploring the same topics and inflating the node count without adding new knowledge. The fix involved correcting a search function and migrating the inference engine to a local quantized Gemma 4 E2B model.

I've spent the last several months building ANIMUS (https://github.com/ernestoariasdiaz/animus-ai), an autonomous system in Rust that gives a local LLM persistent memory. The idea is simple: a knowledge graph that grows on its own, cycle after cycle, as the system reads documents, detects gaps in its knowledge, and fills them in.

For months, the metric I watched most closely was the node count of the graph. It kept climbing. I felt good about that. Until I ran a full audit and found out that 52% of those nodes were undetected duplicates. Of 1,892 reported nodes, only 911 were actually unique.

ANIMUS's autonomous loop actively looks for "gaps": holes in its knowledge that the system decides to fill on its own. The problem: an overly aggressive filter was excluding certain categories from the gap pool, which trapped the system in a loop of re-exploring the same ~40 topics for thousands of cycles. Each pass generated content that was similar but not identical to the last, different enough to avoid triggering any exact-duplicate check but substantially the same information rephrased. The node count kept climbing. Actual knowledge, not so much.

The fix wasn't magic, it was audit work. `Brain::search` walked the graph from node 0 with `.take(2)`, which meant it almost always returned stale content from earlier versions of the system.
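
To make the ordering problem concrete, here is a minimal sketch in Rust. It is not ANIMUS's actual code: the `Node` and `Brain` types, the insertion-ordered `Vec`, the substring matching, and the function names are all assumptions made for illustration. Only the walk from node 0 with `.take(2)` and the `.rev()` call come from the post itself.

```rust
/// Hypothetical node type: `id` follows insertion order, so higher means newer.
#[derive(Debug, Clone, PartialEq)]
pub struct Node {
    pub id: usize,
    pub content: String,
}

/// Hypothetical stand-in for the graph: nodes kept in insertion order.
pub struct Brain {
    pub nodes: Vec<Node>,
}

impl Brain {
    /// Oldest-first search: walks from node 0 and stops after two matches,
    /// so the earliest (possibly stale) nodes always win.
    pub fn search_oldest_first(&self, query: &str) -> Vec<&Node> {
        let q = query.to_lowercase();
        self.nodes
            .iter()
            .filter(|n| n.content.to_lowercase().contains(&q))
            .take(2)
            .collect()
    }

    /// Newest-first search: identical except for `.rev()`,
    /// so the most recently inserted matches are returned first.
    pub fn search_newest_first(&self, query: &str) -> Vec<&Node> {
        let q = query.to_lowercase();
        self.nodes
            .iter()
            .rev()
            .filter(|n| n.content.to_lowercase().contains(&q))
            .take(2)
            .collect()
    }
}

/// Small demo graph (invented data): four notes on one topic written at
/// different times, plus one unrelated note.
pub fn demo_brain() -> Brain {
    let contents = [
        "rust ownership (first draft)",
        "rust ownership (second draft)",
        "gemma quantization notes",
        "rust ownership (revised)",
        "rust ownership (current)",
    ];
    Brain {
        nodes: contents
            .iter()
            .enumerate()
            .map(|(id, c)| Node { id, content: c.to_string() })
            .collect(),
    }
}
```

On the demo graph, searching for "rust" oldest-first yields nodes 0 and 1 (the two drafts), while the newest-first variant yields nodes 4 and 3. With a small `take`, iteration order decides which version of a topic the rest of the system ever sees.
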
A simple `.rev()` fixed it.

Along the way, I also migrated the inference engine: from a Python wrapper to a `llama-server.exe` process launched directly from Rust, and from the original model to a quantized Gemma 4 E2B, running at ~77 tokens/second on a consumer GPU (RTX 3050, 4GB). None of this required the cloud or paid APIs; everything runs locally.

The most valuable part of this whole episode wasn't fixing the bug. It was realizing that a metric that only goes up never warns you that something is wrong. Node count was a proxy for "the system is learning," but optimizing that one proxy, with nothing to balance it, ended up producing the opposite: inflated content, not new knowledge.

ANIMUS now runs on several cross-checked signals (verified uniqueness, recency-weighted relevance, source validation) instead of one vanity metric. If two signals start to diverge, the system stops and re-audits instead of continuing to generate.

If you're curious about the full picture (architecture, benchmarks, comparison against a simple vector RAG baseline), the technical paper is open access with a DOI: 10.5281/zenodo.20674981 (https://doi.org/10.5281/zenodo.20674981). Code is on GitHub (https://github.com/ernestoariasdiaz/animus-ai).

ANIMUS is an independent project, developed in Santo Domingo, Dominican Republic.
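
For readers who want something concrete, the idea of checking node count against verified uniqueness can be sketched in a few lines of Rust. This is an illustration under stated assumptions, not how ANIMUS does it: token-set Jaccard similarity is a technique swapped in here to catch rephrased content that an exact-duplicate check misses, and the function names and threshold parameters are hypothetical.

```rust
use std::collections::HashSet;

/// Lowercased alphanumeric word set of a text (punctuation and case ignored).
fn token_set(text: &str) -> HashSet<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect()
}

/// Jaccard similarity |A ∩ B| / |A ∪ B|; two empty sets count as identical.
fn jaccard(a: &HashSet<String>, b: &HashSet<String>) -> f64 {
    let union = a.union(b).count();
    if union == 0 {
        return 1.0;
    }
    a.intersection(b).count() as f64 / union as f64
}

/// Greedy uniqueness count: a node is kept only if its similarity to every
/// previously kept node is below `threshold` (an assumed tuning parameter).
/// Pairwise comparison is O(n^2), which is fine for a sketch.
pub fn count_unique(contents: &[&str], threshold: f64) -> usize {
    let mut kept: Vec<HashSet<String>> = Vec::new();
    for text in contents {
        let tokens = token_set(text);
        if !kept.iter().any(|k| jaccard(k, &tokens) >= threshold) {
            kept.push(tokens);
        }
    }
    kept.len()
}

/// Share of reported nodes that turned out to be duplicates.
pub fn duplicate_ratio(reported: usize, unique: usize) -> f64 {
    if reported == 0 {
        return 0.0;
    }
    reported.saturating_sub(unique) as f64 / reported as f64
}

/// Divergence check: true when node count and verified uniqueness have
/// drifted apart by more than `max_duplicate_ratio` (assumed threshold).
pub fn should_reaudit(reported: usize, unique: usize, max_duplicate_ratio: f64) -> bool {
    duplicate_ratio(reported, unique) > max_duplicate_ratio
}
```

Plugging in the audit's numbers, `duplicate_ratio(1892, 911)` is 981 / 1892, or about 0.52. A check like `should_reaudit` with any modest threshold would have flagged the divergence long before it reached that level, which is the point of pairing a count that only goes up with a second signal that can disagree with it.
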