{"slug": "show-hn-graphenium-persistent-repo-memory-for-ai-coding-assistants", "title": "Show HN: Graphenium, persistent repo memory for AI coding assistants", "summary": "Graphenium, a new tool for AI coding assistants, provides persistent structural memory by turning a repository into a queryable graph, enabling fast navigation of large codebases without reading files. It addresses issues like cold starts and context window pressure in multi-module projects, and is compatible with MCP-based assistants like Claude and Cursor.", "body_md": "**Persistent structural memory for AI coding agents.**\n\nGraphenium turns your repository into a queryable graph so Claude, Cursor,\nand other MCP-compatible assistants can answer the questions below in ~20 ms, without\nreading a single file. **Especially valuable in large, multi-module, or\nunfamiliar codebases** where grep-and-trace navigation breaks down:\n\n- What calls this function?\n- What depends on this module?\n- What are the architectural hubs?\n- What is the shortest path between these components?\n- Which files belong to the same community?\n\nIt replaces grep-and-trace navigation, not source-code understanding.\n\nAI coding assistants are good at reading code, but they navigate repositories\nlike a human using `grep`: search for a symbol, open the file, follow\nimports, open more files, infer relationships. Then do it all again in the\nnext session.\n\nIn a 50-file project, grep works. In a 5,000-file monorepo with deep import chains, it doesn't. 
That workflow has five persistent problems:\n\n- **Repeated cold starts.** Every new session begins without a durable model of the repository.\n- **Context window pressure.** Raw source files are large; navigation consumes tokens that could be used for reasoning.\n- **No structural memory.** After reading files, the assistant has no persisted graph of how modules, functions, and concepts relate.\n- **Missed cross-file relationships.** Grep surfaces text matches, not architectural topology, hubs, communities, or paths.\n- **Scale multiplies the pain.** Every new file and dependency makes the grep-and-trace loop slower and more expensive. The graph stays fast regardless of repo size.\n\nGraphenium runs analysis once, persists the result as a graph, and exposes it\nto assistants via an [MCP](https://modelcontextprotocol.io) server. The graph\nbecomes the assistant's long-term memory for your repository.\n\n**What changes:**\n\n- **Orientation in seconds, not minutes.** `architecture_summary` gives a 30-second map of the codebase before the assistant reads a single file.\n- **Context stays focused.** Instead of filling the context window with raw source during navigation, the assistant reasons over compact graph output and reads only the files that matter.\n- **Memory survives sessions.** The graph persists. 
A new AI session starts with the same structural knowledge the last one had.\n\n**Good at**\n\n- **Navigating large codebases.** In 50+ file repos, monorepos, or unfamiliar projects, grep-and-trace wastes context; the graph replaces it.\n- AI-assisted code navigation: answer structural questions without repeatedly reading files.\n- Impact analysis: identify connected nodes before changing a function, class, or module.\n- Onboarding: get a high-level architectural map of an unfamiliar repo fast.\n- Refactoring planning: find god nodes, low-cohesion communities, and surprising cross-boundary edges.\n- Code review: inspect symbols, degrees, and hotspots before reviewing a changed file.\n- Keeping the graph current with watch mode during active development.\n\n**Not a replacement for**\n\n- **Reading source code.** The graph captures structure and relationships, not implementation logic. An assistant still needs to read actual code.\n- **A full language server.** It does not perform complete type checking or language-specific semantic analysis at LSP depth.\n- **Runtime tracing.** It is static analysis plus optional LLM extraction; it does not execute the program.\n- **Semantic search / embeddings.** Graphenium uses keyword scoring and graph traversal, not vector similarity.\n- **Security scanning.** Relationship graphs are not a substitute for dedicated SAST tools.\n\n```\nWithout Graphenium:\ngrep → read file → trace imports → read more files → infer architecture\n\nWith Graphenium:\nquery_graph → get_neighbors → shortest_path → read only the right files\n```\n\n```\n# Build a graph for your project (no API key needed)\ngm run . 
--no-semantic --no-viz\n\n# Ask structural questions\ngm query \"what calls build_from_extraction?\"\n\n# Or connect an AI assistant via MCP and ask directly\n```\n\n```\ncurl -fsSL https://raw.githubusercontent.com/lambda-alpha-labs/Graphenium/main/install.sh | sh\n```\n\nRequires Rust 1.75+ ([rustup](https://rustup.rs)).\n\n```\ngit clone https://github.com/lambda-alpha-labs/Graphenium\ncd Graphenium\ncargo install --path .\n```\n\nThe binary is installed as `gm`.\n\n```\n# Build a graph (no API key needed)\ngm run . --no-semantic --no-viz\n\n# Query it\ngm query \"authentication login session\" --budget 1000\n\n# Check your installation\ngm doctor\n```\n\nGraphenium has two extraction modes. Both are useful; they serve different purposes.\n\n| Mode | What you get | Best for | API key |\n|---|---|---|---|\n| AST-only | Imports, containment, methods, symbols, structural communities | Architecture map, blast radius, orientation | No |\n| Semantic | Uses, conceptual dependencies, rationale, inferred cross-file relationships | Behavioural tracing, richer agent reasoning | Yes |\n\nAST-only mode gives the assistant a map. Semantic mode adds the traffic overlay.\n\n```\n# AST-only, local, no key needed\ngm run . --no-semantic --no-viz\n\n# Semantic: adds LLM-inferred relationships\nexport ANTHROPIC_API_KEY=sk-ant-...\ngm run . --provider anthropic    # also: openai, deepseek, openrouter\n```\n\nThe `graph_stats` tool always reports the edge confidence breakdown, so the\nassistant knows what it's working with.\n\nAdd Graphenium to your AI assistant's MCP config. The server uses the\nstandard MCP stdio transport. 
Or run `gm setup <target>` to print the config\nfor your assistant.\n\n**Claude Desktop** (`claude_desktop_config.json`):\n\n```\n{\n  \"mcpServers\": {\n    \"graphenium\": {\n      \"command\": \"gm\",\n      \"args\": [\"serve\", \"--graph\", \"/absolute/path/to/graphenium-out/graph.json\"]\n    }\n  }\n}\n```\n\n**Cursor** (`~/.cursor/mcp.json`):\n\n```\n{\n  \"mcpServers\": {\n    \"graphenium\": {\n      \"command\": \"gm\",\n      \"args\": [\"serve\", \"--graph\", \"/absolute/path/to/graphenium-out/graph.json\"]\n    }\n  }\n}\n```\n\n**CodeWhale** (`~/.codewhale/mcp.json`):\n\n```\n{\n  \"servers\": {\n    \"graphenium\": {\n      \"command\": \"/absolute/path/to/gm\",\n      \"args\": [\"serve\", \"--graph\", \"/absolute/path/to/graphenium-out/graph.json\"],\n      \"env\": {}\n    }\n  }\n}\n```\n\n**After updating config, quit and relaunch the AI tool completely** (Cmd+Q on\nmacOS, not just close the window). MCP servers are only loaded at startup.\n\nThe repo ships an AI Skill at `skills/graphenium/SKILL.md` that teaches\nassistants which tool to reach for, how to interpret confidence levels, and\nhow to fall back to `gm query` when MCP is unavailable.\n\nOnce connected, the assistant has access to 13 graph tools.\n\n**Read tools:**\n\n| Tool | Purpose |\n|---|---|\n| `graph_stats` | Node/edge counts, file types, confidence breakdown |\n| `architecture_summary` | Communities, focus paths, god nodes, confidence summary |\n| `query_graph` | Keyword-scored BFS/DFS traversal within a token budget |\n| `get_node` | Full node details by ID or label |\n| `get_neighbors` | Direct neighbours with edge types and confidence |\n| `get_community` | All nodes in a community cluster |\n| `god_nodes` | Top N most-connected hub nodes |\n| `shortest_path` | Path between any two components |\n| `summarize_file` | Every symbol extracted from a source file |\n| `reload_graph` | Hot-swap the graph without restarting |\n\n**Write tools:**\n\n| Tool | Purpose 
|\n|---|---|\n| `add_node` | Register concepts the AST can't capture |\n| `add_edge` | Record relationships confirmed through inspection |\n| `remove_edge` | Correct false positives or stale relationships |\n\nAll writes persist to disk immediately.\n\nGraphenium models a codebase as three things.\n\nNodes represent meaningful entities: functions, methods, classes, modules, structs, traits, documents, images, and architectural concepts. Each node carries metadata: label, qualified label, file type, source file, source location, and community ID.\n\nEdges are typed, directed relationships.\n\n| Relation | Meaning | Source |\n|---|---|---|\n| `imports` | Module-level import/include | AST |\n| `contains` | Module/class contains a symbol | AST |\n| `method` | Method belongs to a class/type | AST |\n| `calls` | Function calls another function | AST / semantic |\n| `uses` | Cross-file usage dependency | AST / semantic |\n| `inherits` | OOP inheritance | AST / semantic |\n| `implements` | Interface/trait implementation | AST / semantic |\n| `depends_on` | Conceptual dependency | Semantic |\n| `rationale_for` | Document/comment explains code | Semantic |\n\nGraphenium analyzes the graph to surface communities, hub nodes, shortest paths, surprising cross-community connections, and architectural focus paths. The assistant can orient itself structurally before reading implementation details.\n\nEvery edge carries a confidence level.\n\n| Level | Source | How to treat it |\n|---|---|---|\n| `EXTRACTED` | Deterministic static extraction | Ground truth, directly present in source |\n| `INFERRED` | LLM or heuristic reasoning | Strong hint, useful for navigation; verify before risky changes |\n| `AMBIGUOUS` | LLM-flagged uncertainty | Question to investigate, not a fact |\n\nA good assistant workflow:\n\n- Trust `EXTRACTED` edges as fact.\n- Use `INFERRED` edges as strong hints.\n- Treat `AMBIGUOUS` edges as leads to inspect.\n
- Read source code before making implementation changes.\n\n`graph_stats` reports the confidence breakdown so you know what kind of graph\nyou're working with.\n\nGraphenium uses [tree-sitter](https://tree-sitter.github.io/) for AST\nextraction across 9 languages.\n\n| Language | Extensions | Extracted features |\n|---|---|---|\n| Python | `.py` | Classes, functions, imports, call graph |\n| JavaScript | `.js`, `.mjs`, `.cjs` | Classes, functions, arrow functions, imports |\n| TypeScript | `.ts`, `.tsx` | JavaScript features + type declarations |\n| Rust | `.rs` | Structs, enums, traits, impl blocks, functions, `use` |\n| Go | `.go` | Functions, methods with receivers, import blocks |\n| Java | `.java` | Classes, methods, package imports |\n| C | `.c`, `.h` | Functions, include directives |\n| C++ | `.cpp`, `.cc`, `.cxx`, `.hpp` | Classes, functions, include directives |\n| C# | `.cs` | Classes, methods, using directives, namespaces |\n\nSemantic extraction also processes documents (`.md`, `.rst`, `.txt`), PDFs,\nand images.\n\nBuild with only the languages you need:\n\n```\ncargo build --release --no-default-features --features lang-python,lang-rust\n```\n\nFeatures: `lang-python`, `lang-js`, `lang-ts`, `lang-rust`, `lang-go`,\n`lang-java`, `lang-c`, `lang-cpp`, `lang-csharp`.\n\nRun the full analysis pipeline on a directory.\n\n```\ngm run [PATH] [OPTIONS]\n```\n\n| Option | Description |\n|---|---|\n| `PATH` | Directory to analyse (default: `.`) |\n| `--no-semantic` | Skip LLM extraction; use AST-only results |\n| `--no-viz` | Skip HTML generation |\n| `--provider NAME` | AI provider: `anthropic` (default), `openai`, `openrouter`, `deepseek`, `openai-compatible` |\n| `--model NAME` | Model to use (defaults to provider-specific default) |\n| `--api-key KEY` | API key (overrides provider-specific env var) |\n| `--api-base URL` | API base URL for `openai-compatible` provider |\n| `--mode deep` 
| Aggressive LLM inference |\n| `--update` | Incremental: only re-extract changed files |\n\n```\ngm run . --no-semantic --no-viz      # Fast AST-only scan\ngm run . --provider openai           # With LLM semantic extraction\ngm run . --update                    # Incremental after editing files\n```\n\nQuery an existing graph with keywords.\n\n```\ngm query \"<keywords>\" [OPTIONS]\n```\n\n| Option | Default | Description |\n|---|---|---|\n| `--graph PATH` | `graphenium-out/graph.json` | Path to graph file |\n| `--budget N` | `2000` | Output token budget |\n| `--dfs` | off | Use depth-first search |\n\n```\ngm query \"authentication login session\"\ngm query \"parser ast walker\" --dfs --budget 4000\n```\n\nStart an MCP server exposing the graph over stdio.\n\n```\ngm serve [OPTIONS]\n```\n\n| Option | Default | Description |\n|---|---|---|\n| `--graph PATH` | `graphenium-out/graph.json` | Path to graph file |\n\nWatch a directory and auto-rebuild the graph on changes.\n\n```\ngm watch [PATH] [OPTIONS]\n```\n\n| Option | Default | Description |\n|---|---|---|\n| `PATH` | `.` | Directory to watch |\n| `--debounce SECS` | `3.0` | Wait after last event before rebuild |\n| `--incremental` | `true` | Patch changed files; `false` for full rebuild |\n\n```\ngm watch . 
--debounce 2.0\n```\n\nRun diagnostic checks on your Graphenium installation: binary location, graph file health, tree-sitter languages, API keys, and graph quality.\n\n```\ngm doctor [--graph PATH]\n```\n\nPrint ready-to-paste MCP config for an AI assistant.\n\n```\ngm setup <claude|cursor|codewhale> [--graph PATH]\ngm setup claude\ngm setup cursor\ngm setup codewhale\n```\n\nGraphenium writes outputs to `graphenium-out/` inside the analysed directory.\n\n| File | Purpose |\n|---|---|\n| `graph.json` | Machine-readable graph for `gm serve` and `gm query` |\n| `GRAPH_REPORT.md` | Markdown architecture report |\n| `graph.html` | Self-contained visual graph inspection page |\n| `manifest.json` | mtime index for incremental updates |\n| `cache/` | Per-file semantic extraction cache (SHA256 keyed) |\n\n```\nsrc/\n  extract/     tree-sitter extraction for 9 languages\n  model/       graph, node, edge, hyperedge types\n  build/       graph construction from extraction results\n  cluster/     Louvain community detection, cohesion, split/focus\n  detect/      file classification, sensitive-file skipping, corpus warnings\n  analyze/     god nodes, surprising connections, architectural questions\n  serve/       MCP server (rmcp), tool handlers, graph traversal\n  semantic/    LLM client, prompt builder, response parser\n  export/      JSON export, HTML visualisation\n  cache/       mtime manifest, semantic extraction cache\n  watch/       file-system watcher with incremental patching\n```\n\n- **AST-only graphs are structural, not behavioural.** Without semantic extraction, edges are mostly imports, containment, and method declarations. Control-flow relationships (`calls`, `uses`, `implements`) come from the semantic pass.\n- **Label collisions.** Common names like `new`, `mod`, `run` appear across modules. 
Qualified labels help disambiguate when available. `graph_stats` reports collision counts so you know when results may be fuzzy.\n- **Large corpora.** Projects with many vendored dependencies should use `.grapheniumignore` to exclude `target/`, `node_modules/`, `.rust-toolchain/`, and similar directories.\n\nContributions are welcome, especially language extractors, MCP integrations,\nand fixtures. See [CONTRIBUTING.md](/lambda-alpha-labs/Graphenium/blob/main/CONTRIBUTING.md).", "url": "https://wpnews.pro/news/show-hn-graphenium-persistent-repo-memory-for-ai-coding-assistants", "canonical_source": "https://github.com/lambda-alpha-labs/Graphenium", "published_at": "2026-06-24 11:27:02+00:00", "updated_at": "2026-06-24 11:41:02.299893+00:00", "lang": "en", "topics": ["developer-tools", "ai-tools", "ai-agents", "large-language-models", "ai-infrastructure"], "entities": ["Graphenium", "Claude", "Cursor", "MCP"], "alternates": {"html": "https://wpnews.pro/news/show-hn-graphenium-persistent-repo-memory-for-ai-coding-assistants", "markdown": "https://wpnews.pro/news/show-hn-graphenium-persistent-repo-memory-for-ai-coding-assistants.md", "text": "https://wpnews.pro/news/show-hn-graphenium-persistent-repo-memory-for-ai-coding-assistants.txt", "jsonld": "https://wpnews.pro/news/show-hn-graphenium-persistent-repo-memory-for-ai-coding-assistants.jsonld"}}