Source: dev.to

How I Found Out 52% of My Knowledge Graph Was Duplicates (and What I Did About It)

A developer building ANIMUS, an autonomous Rust system for persistent LLM memory, discovered that 52% of its knowledge graph nodes were duplicates. An audit revealed an overly aggressive filter trapped the system in a loop re-exploring the same topics, inflating node count without adding new knowledge. The fix involved correcting a search function and migrating the inference engine to a local quantized Gemma 4 E2B model.

Published Jun 25, 2026 · 2 min read

I've spent the last several months building ANIMUS, an autonomous system in Rust that gives a local LLM persistent memory. The idea is simple: a knowledge graph that grows on its own, cycle after cycle, as the system reads documents, detects gaps in its knowledge, and fills them in.

For months, the metric I watched most closely was the node count of the graph. It kept climbing. I felt good about that. Until I ran a full audit and found out that 52% of those nodes were undetected duplicates. Of 1,892 reported nodes, only 911 were actually unique.
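The arithmetic behind the headline figure can be checked in a few lines. This is only a sketch: `duplicate_share` is a hypothetical helper name, not something taken from the ANIMUS codebase.

```rust
/// Share of reported nodes that are redundant, as a fraction in [0.0, 1.0].
/// Returns None when there are no reported nodes or the counts are inconsistent.
/// (Hypothetical helper for illustration; not part of ANIMUS.)
fn duplicate_share(reported: u64, unique: u64) -> Option<f64> {
    if reported == 0 || unique > reported {
        return None;
    }
    Some((reported - unique) as f64 / reported as f64)
}
```

With the figures above, `duplicate_share(1_892, 911)` is 981 / 1,892, roughly 0.5185, which rounds to the 52% in the title.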

ANIMUS's autonomous loop actively looks for "gaps" — holes in its knowledge that the system decides to fill on its own. The problem: an overly aggressive filter was excluding certain categories from the gap pool, which trapped the system in a loop of re-exploring the same ~40 topics for thousands of cycles. Each pass generated content that was similar but not identical to the last — different enough to avoid triggering any exact-duplicate check, but substantially the same information rephrased.
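To make the "similar but not identical" failure mode concrete, here is a minimal sketch of a near-duplicate check based on Jaccard similarity over word sets. It is an assumption for illustration only: the article does not say how the audit compared nodes, and the function names and the threshold are hypothetical.

```rust
use std::collections::HashSet;

/// Lowercased set of words in a text, with punctuation ignored.
fn word_set(text: &str) -> HashSet<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect()
}

/// Jaccard similarity |A ∩ B| / |A ∪ B| of the two word sets, in [0.0, 1.0].
fn jaccard(a: &str, b: &str) -> f64 {
    let (sa, sb) = (word_set(a), word_set(b));
    let union = sa.union(&sb).count();
    if union == 0 {
        return 1.0; // two empty texts are treated as identical
    }
    sa.intersection(&sb).count() as f64 / union as f64
}

/// Hypothetical check: the threshold is an illustrative knob, not a tuned value.
fn is_near_duplicate(a: &str, b: &str, threshold: f64) -> bool {
    jaccard(a, b) >= threshold
}
```

Two rephrasings that reuse the same vocabulary score close to 1.0 here even though a byte-for-byte comparison treats them as different, which is the gap an exact-duplicate check leaves open. A real system would more likely compare embeddings or shingles, since word sets ignore order and synonyms.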

The node count kept climbing. Actual knowledge, not so much.

The fix wasn't magic, it was audit work:

The search function (`Brain::search`): it walked the graph from node 0 with `.take(2)`, which meant it almost always returned stale content from earlier versions of the system. A simple `.rev()` fixed it.

Along the way, I also migrated the inference engine: from a Python wrapper to a `llama-server.exe` launched directly from Rust, and from the original model to a quantized Gemma 4 E2B, running at ~77 tokens/second on a consumer GPU (RTX 3050, 4GB). None of this required the cloud or paid APIs; everything runs locally.
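The `Brain::search` fix described above can be sketched as follows. The `Node` and `Brain` types here are stand-ins invented for illustration, and the sketch assumes nodes are stored in insertion order and matched by substring; the real ANIMUS structures and matching logic are not shown in the article.

```rust
/// Hypothetical node type; the real ANIMUS node is not shown in the article.
#[derive(Debug, Clone, PartialEq)]
struct Node {
    id: usize,
    text: String,
}

/// Stand-in for the graph. Assumption: nodes are kept oldest first.
struct Brain {
    nodes: Vec<Node>,
}

impl Brain {
    /// Pre-fix behavior: scanning from node 0 means the two oldest matches win.
    fn search_oldest_first(&self, query: &str) -> Vec<&Node> {
        self.nodes
            .iter()
            .filter(|n| n.text.contains(query))
            .take(2)
            .collect()
    }

    /// Post-fix behavior: `.rev()` scans from the newest node backwards,
    /// so the two most recent matches win.
    fn search_newest_first(&self, query: &str) -> Vec<&Node> {
        self.nodes
            .iter()
            .rev()
            .filter(|n| n.text.contains(query))
            .take(2)
            .collect()
    }
}
```

With four matching nodes numbered 0 to 3, the first variant returns nodes 0 and 1 and the second returns nodes 3 and 2. The only difference between the two is where the scan starts, which is why a one-call change was enough.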

The most valuable part of this whole episode wasn't fixing the bug. It was realizing that a metric that only goes up never warns you that something is wrong. Node count was a proxy for "the system is learning," but optimizing that one proxy, with nothing to balance it, ended up producing the opposite: inflated content, not new knowledge.

ANIMUS now runs on several cross-checked signals (verified uniqueness, recency-weighted relevance, source validation) instead of one vanity metric. If two signals start to diverge, the system stops and re-audits instead of continuing to generate.
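One way to read "stop and re-audit when signals diverge" is as a simple spread check. The sketch below is an interpretation, not the ANIMUS implementation: it assumes each signal is normalized to a score between 0.0 and 1.0, and the type names, `max_divergence`, and the tolerance are all hypothetical.

```rust
/// Hypothetical per-cycle signals, each assumed to be normalized to [0.0, 1.0].
#[derive(Debug, Clone, Copy)]
struct Signals {
    verified_uniqueness: f64,
    recency_weighted_relevance: f64,
    source_validation: f64,
}

#[derive(Debug, PartialEq)]
enum Action {
    Continue,
    StopAndAudit,
}

/// Largest gap between any two of the three signals.
fn max_divergence(s: &Signals) -> f64 {
    let values = [
        s.verified_uniqueness,
        s.recency_weighted_relevance,
        s.source_validation,
    ];
    let max = values.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let min = values.iter().copied().fold(f64::INFINITY, f64::min);
    max - min
}

/// Halt generation when the signals disagree by more than `tolerance`.
fn next_action(s: &Signals, tolerance: f64) -> Action {
    if max_divergence(s) > tolerance {
        Action::StopAndAudit
    } else {
        Action::Continue
    }
}
```

The point of the design is that no single number can rise unchecked: a uniqueness score near 0.48 next to high relevance and validation scores would trip the check, whereas a raw node count would have kept climbing.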

If you're curious about the full picture (architecture, benchmarks, comparison against a simple vector RAG baseline), the technical paper is open access with a DOI: 10.5281/zenodo.20674981. Code is on GitHub. ANIMUS is an independent project, developed in Santo Domingo, Dominican Republic.
