{"slug": "graphify-open-source-knowledge-graph-skill-for-ai-coding-assistants", "title": "Graphify – Open-Source Knowledge Graph Skill for AI Coding Assistants", "summary": "Safi Shamsi released Graphify, an open-source skill that builds queryable knowledge graphs from multi-modal codebases for AI coding assistants like Claude Code and OpenAI Codex. The tool combines Tree-sitter static analysis with LLM-driven semantic extraction to help assistants understand code structure and design rationale, achieving 3.7k GitHub stars and a 71.5x token reduction.", "body_md": "# Graphify — Knowledge Graphs for AI Coding Assistants\n\nGraphify is an open-source skill that helps AI coding assistants understand multi-modal codebases by building a queryable knowledge graph from code, docs, papers and diagrams.\n\n`pip install graphifyy`\n\n## What is Graphify?\n\nGraphify is a multi-modal knowledge graph builder created for AI coding assistants such as Claude Code, OpenAI Codex and OpenCode. By combining Tree-sitter static analysis with LLM-driven semantic extraction, Graphify turns an entire repository — including source code, documentation, research papers and diagrams — into an interactive graph that explains both *what* the code does and *why* it was designed that way. The project is maintained by Safi Shamsi, released under the permissive MIT license, and built on widely-trusted libraries including NetworkX and Tree-sitter.\n\n**3.7k+** GitHub Stars\n\n**MIT** License\n\n**71.5×** Token Reduction\n\n**Python 3.10+** Runtime\n\n## Core Capabilities\n\nGraphify unifies static analysis, semantic extraction and graph clustering into a single skill that any AI coding assistant can invoke.\n\n### Multi-Modal Extraction\n\nParses code (.py, .js, .go, .java, …), Markdown, PDFs and images. 
Tree-sitter extracts ASTs, call graphs and docstrings; LLMs extract concepts from prose; vision models read diagrams.\n\n### Knowledge Graph Build\n\nMerges all extracted nodes and edges into a NetworkX graph and applies the Leiden algorithm for semantic community detection — no vector embeddings required.\n\n### God Nodes & Surprises\n\nIdentifies the highest-degree \"god nodes\" at the heart of the system and flags unexpected cross-file or cross-domain connections worth investigating.\n\n### Interactive Outputs\n\nExports an interactive `graph.html`, a queryable `graph.json`, and a human-readable `GRAPH_REPORT.md` audit report.\n\n### Assistant Integration\n\nShips with `/graphify`, `/graphify query`, `/graphify path` and `/graphify explain` commands for Claude Code, Codex, OpenCode and more.\n\n### Secure by Design\n\nStrict input validation: only http/https URLs, size and timeout limits, path containment, HTML-escaped node labels — defending against SSRF, injection and XSS.\n\n## Architecture & Pipeline\n\nGraphify is a multi-stage pipeline. Each stage is an isolated module so contributors can extend any step independently.\n\n**detect**: collect files\n\n**extract**: AST + LLM nodes/edges\n\n**build**: NetworkX graph\n\n**cluster**: Leiden communities\n\n**analyze**: god nodes & surprises\n\n**report**: GRAPH_REPORT.md\n\n**export**: HTML / JSON / Obsidian\n\nSupporting modules include `ingest.py` for URL fetching, `cache.py` for semantic caching, `security.py` for input validation, `watch.py` for live updates and `serve.py` for MCP-protocol service.\n\n## Install & Run\n\nGraphify is distributed on PyPI. 
The package name is `graphifyy`; the CLI command remains `graphify`.\n\n```\n# Requires Python 3.10+\npip install graphifyy && graphify install\n\n# Build a knowledge graph for any project folder\n/graphify ./raw\n\n# Outputs land in graphify-out/\ngraphify-out/\n├── graph.html        # interactive visualization\n├── GRAPH_REPORT.md   # core nodes, surprises, suggested questions\n├── graph.json        # persistent, queryable graph\n└── cache/            # incremental cache\n```\n\nGraphify does not bundle an LLM. It uses the model API key already configured by your AI coding assistant (Claude, Codex, etc.) and only sends semantic content — never raw source code — to the upstream model.\n\n## Worked Examples\n\nThe repository ships with reproducible corpora demonstrating Graphify on both small libraries and large mixed code-and-paper collections.\n\n### httpx (small)\n\n6 Python files modeling an HTTP transport layer. Result: **144 nodes, 330 edges, 6 communities**. God nodes: `Client`, `AsyncClient`, `Response`, `Request`. Surprise edge: `DigestAuth → Response`.\n\n### Karpathy mixed corpus\n\n3 GPT framework repos + 5 attention papers + 4 diagrams (~52 files, ~92k words). Result: **285 nodes, 340 edges, 53 communities**. Average query cost ~1.7k tokens vs ~123k naive — a **71.5×** reduction.\n\n## Comparison\n\nHow Graphify relates to adjacent open-source projects in the code-intelligence space.\n\n| Project | Focus | Strength | Limitation vs Graphify |\n|---|---|---|---|\n| Sourcegraph | Cross-repo code search | Enterprise-grade navigation | Not a knowledge graph; limited design semantics |\n| Code2Vec | Function-level embeddings | Vector retrieval & classification | No graph structure, no multi-modal input |\n| Neo4j | General graph database | Powerful Cypher queries | Does not generate graphs from code itself |\n\n## Security, Licensing & Trust\n\nGraphify is released under the **MIT License**. 
Its core dependencies, NetworkX (BSD) and Tree-sitter (MIT), carry permissive open-source licenses that do not conflict with MIT. The project collects no telemetry. The only outbound network call is the semantic-extraction step, which uses your own configured AI model API key; only semantic descriptions of documents are transmitted, never raw source code. URLs are restricted to http/https, downloads are size- and time-bounded, output paths are containment-checked, and node labels are HTML-escaped to prevent SSRF, Cypher injection and XSS.\n\n## Learn more about Graphify\n\nDeeper guides on how Graphify builds, clusters and serves knowledge graphs to AI coding assistants.\n\n### Knowledge Graphs for AI Coding Assistants\n\nWhy structural graphs beat vector RAG for code understanding.\n\n### Tree-sitter AST Extraction\n\nHow Graphify parses 19 languages locally, with no LLM calls on source.\n\n### Leiden Community Detection\n\nClustering on graph topology alone — no embeddings, no vector store.\n\n### Claude Code Integration\n\nCLAUDE.md directives and the PreToolUse hook, step by step.\n\n### CLI Command Reference\n\nEvery `/graphify` and `graphify` command in one place.\n\n### Graphify vs Alternatives\n\nHonest comparison against Sourcegraph, Code2Vec and Neo4j.\n\n## Frequently Asked Questions\n\n### Does Graphify send my code to a third-party model?\n\nNo. Graphify only sends semantic descriptions of documents and diagrams to the AI model you have already configured in your assistant — never raw source files.\n\n### Which AI coding assistants are supported?\n\nClaude Code, OpenAI Codex and OpenCode are supported out of the box via dedicated `skill-*.md` manifests. Any assistant that can call shell commands can invoke `graphify`.\n\n### How large a codebase can Graphify handle?\n\nTree-sitter parsing and NetworkX construction scale linearly with code size. 
On a ~500k-word corpus, BFS subgraph queries stay at about 2k tokens versus ~670k naive, preserving compression at scale.\n\n### Is Graphify free for commercial use?\n\nYes. Graphify is MIT-licensed and free for both personal and commercial use.", "url": "https://wpnews.pro/news/graphify-open-source-knowledge-graph-skill-for-ai-coding-assistants", "canonical_source": "https://graphify.net/index.html#features", "published_at": "2026-06-27 22:26:37+00:00", "updated_at": "2026-06-27 23:05:22.802940+00:00", "lang": "en", "topics": ["developer-tools", "ai-tools", "large-language-models", "generative-ai", "machine-learning"], "entities": ["Graphify", "Safi Shamsi", "Claude Code", "OpenAI Codex", "OpenCode", "NetworkX", "Tree-sitter", "Leiden algorithm"], "alternates": {"html": "https://wpnews.pro/news/graphify-open-source-knowledge-graph-skill-for-ai-coding-assistants", "markdown": "https://wpnews.pro/news/graphify-open-source-knowledge-graph-skill-for-ai-coding-assistants.md", "text": "https://wpnews.pro/news/graphify-open-source-knowledge-graph-skill-for-ai-coding-assistants.txt", "jsonld": "https://wpnews.pro/news/graphify-open-source-knowledge-graph-skill-for-ai-coding-assistants.jsonld"}}