
Code Intelligence MCP Server

DeusData released Code Intelligence MCP Server, a code intelligence engine that indexes repositories in milliseconds and the Linux kernel in 3 minutes, providing structural queries in under 1ms via a single static binary for macOS, Linux, and Windows. The tool uses tree-sitter AST analysis and hybrid LSP resolution across 158 languages, offering 14 MCP tools and achieving 83% answer quality with 10x fewer tokens in benchmarks.

Published Jun 18, 2026 · 20 min read

The fastest and most efficient code intelligence engine for AI coding agents. Full-indexes an average repository in milliseconds, the Linux kernel (28M LOC, 75K files) in 3 minutes. Answers structural queries in under 1ms. Ships as a single static binary for macOS, Linux, and Windows: download, run install, done.

High-quality parsing through tree-sitter AST analysis across all 158 languages, enhanced with Hybrid LSP semantic type resolution for Python, TypeScript / JavaScript / JSX / TSX, PHP, C#, Go, C, C++, Java, Kotlin, and Rust, producing a persistent knowledge graph of functions, classes, call chains, HTTP routes, and cross-service links. 14 MCP tools. Zero dependencies. Plug and play across 11 coding agents.

Research: The design and benchmarks behind this project are described in the preprint "Codebase-Memory: Tree-Sitter-Based Knowledge Graphs for LLM Code Exploration via MCP" (arXiv:2603.27277). Evaluated across 31 real-world repositories: 83% answer quality, 10× fewer tokens, 2.1× fewer tool calls vs. file-by-file exploration.

Security & Trust: This tool reads your codebase and writes to your agent configuration files. That is what it is designed to do. If you prefer to audit before running, the full source is available in the repository. Every release binary is signed, checksummed, and scanned by 70+ antivirus engines. All processing happens 100% locally; your code never leaves your machine. Found a security issue? We want to know; see SECURITY.md. Security is Priority #1 for us.

Built-in 3D graph visualization (UI variant): explore your knowledge graph at localhost:9749

- Extreme indexing speed: Linux kernel (28M LOC, 75K files) in 3 minutes. RAM-first pipeline: LZ4 compression, in-memory SQLite, fused Aho-Corasick pattern matching. Memory released after indexing.
- Plug and play: single static binary for macOS (arm64/amd64), Linux (arm64/amd64), and Windows (amd64). No Docker, no runtime dependencies, no API keys. Download → install → restart agent → done.
- 158 languages: vendored tree-sitter grammars compiled into the binary. Nothing to install, nothing that breaks.
- 120x fewer tokens: 5 structural queries take ~3,400 tokens vs ~412,000 via file-by-file search. One graph query replaces dozens of grep/read cycles.
- 11 agents, one command: install auto-detects Claude Code, Codex CLI, Gemini CLI, Zed, OpenCode, Antigravity, Aider, KiloCode, VS Code, OpenClaw, and Kiro, then configures MCP entries, instruction files, and pre-tool hooks for each.
- Built-in graph visualization: 3D interactive UI at localhost:9749 (optional UI binary variant).
- Infrastructure-as-code indexing: Dockerfiles, Kubernetes manifests, and Kustomize overlays indexed as graph nodes with cross-references. Resource nodes for K8s kinds, Module nodes for Kustomize overlays with IMPORTS edges to referenced resources.
- 14 MCP tools: search, trace, architecture, impact analysis, Cypher queries, dead code detection, cross-service HTTP linking, ADR management, and more.

One-line install (macOS / Linux):

curl -fsSL https://raw.githubusercontent.com/DeusData/codebase-memory-mcp/main/install.sh | bash

With graph visualization UI:

curl -fsSL https://raw.githubusercontent.com/DeusData/codebase-memory-mcp/main/install.sh | bash -s -- --ui

Windows (PowerShell):

Invoke-WebRequest -Uri https://raw.githubusercontent.com/DeusData/codebase-memory-mcp/main/install.ps1 -OutFile install.ps1

notepad install.ps1

.\install.ps1

Options: --ui (graph visualization), --skip-config (binary only, no agent setup), --dir=<path> (custom location).

Restart your coding agent. Say "Index this project". Done.

Manual install

Download the archive for your platform from the latest release:

- codebase-memory-mcp-<os>-<arch>.tar.gz (macOS/Linux) or .zip (Windows): standard
- codebase-memory-mcp-ui-<os>-<arch>.tar.gz / .zip: with graph visualization

Extract and install (each archive includes install.sh or install.ps1):

macOS / Linux:

tar xzf codebase-memory-mcp-*.tar.gz
./install.sh

Windows (PowerShell):

Expand-Archive codebase-memory-mcp-windows-amd64.zip -DestinationPath .
.\install.ps1

Restart your coding agent.

The install command automatically strips macOS quarantine attributes and ad-hoc signs the binary, so no manual xattr / codesign is needed.

The install command auto-detects all installed coding agents and configures MCP server entries, instruction files, skills, and pre-tool hooks for each.

If you downloaded the ui variant:

codebase-memory-mcp --ui=true --port=9749

Open http://localhost:9749 in your browser. The UI runs as a background thread alongside the MCP server, so it is available whenever your agent is connected.

Enable automatic indexing on MCP session start:

codebase-memory-mcp config set auto_index true

When enabled, new projects are indexed automatically on first connection. Previously-indexed projects are registered with the background watcher for ongoing git-based change detection. The file limit is configurable: config set auto_index_limit 50000.

codebase-memory-mcp update

The MCP server also checks for updates on startup and notifies on the first tool call if a newer release is available.

codebase-memory-mcp uninstall

Removes all agent configs, skills, hooks, and instructions. Does not remove the binary or SQLite databases.

- Architecture overview: get_architecture returns languages, packages, entry points, routes, hotspots, boundaries, layers, and clusters in a single call
- Architecture Decision Records: manage_adr persists architectural decisions across sessions
- Louvain community detection: discovers functional modules by clustering call edges
- Git diff impact mapping: detect_changes maps uncommitted changes to affected symbols with risk classification
- Call graph: resolves function calls across files and packages (import-aware, type-inferred)
- Dead code detection: finds functions with zero callers, excluding entry points
- Cypher-like queries: MATCH (f:Function)-[:CALLS]->(g) WHERE f.name = 'main' RETURN g.name
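The dead-code rule in the list above (zero callers, entry points excluded) can be illustrated with a minimal Python sketch over a toy edge list. The function name and the sample graph are hypothetical, and this is not a description of how codebase-memory-mcp implements the check internally.

```python
def find_dead_functions(functions, call_edges, entry_points):
    """Return functions with no inbound CALLS edge that are not entry points."""
    called = {callee for _caller, callee in call_edges}
    return sorted(f for f in functions if f not in called and f not in entry_points)

# Toy graph (hypothetical names): main -> load -> parse; legacy_export is never called.
functions = {"main", "load", "parse", "legacy_export"}
call_edges = [("main", "load"), ("load", "parse")]
dead = find_dead_functions(functions, call_edges, entry_points={"main"})
print(dead)
```

Without the entry-point exclusion, main would be reported as dead as well, since nothing in the graph calls it.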

- Semantic search (semantic_query): vector search across the entire graph, powered by bundled Nomic nomic-embed-code embeddings (40K tokens, 768d int8) compiled into the binary. No API key, no Ollama, no Docker. 11-signal combined scoring (TF-IDF, RRI, API/Type/Decorator signatures, AST profiles, data flow, Halstead-lite, MinHash, module proximity, graph diffusion).
- BM25 full-text search via SQLite FTS5 with the cbm_camel_split tokenizer (camelCase / snake_case aware)
- Structural search (search_graph): regex name patterns, label filters, min/max degree, file scoping
- Code search (search_code): graph-augmented grep over indexed files only
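To show why a camelCase / snake_case aware tokenizer matters for full-text search over identifiers, here is a small Python sketch of identifier splitting. The regex and the split_identifier name are assumptions made for illustration; the actual cbm_camel_split tokenizer is a C component whose exact rules are not described here.

```python
import re

# Hypothetical splitting rule: acronym runs, capitalised words, lowercase words, digits.
_TOKEN = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+")

def split_identifier(name):
    """Break an identifier into lowercase word tokens for indexing."""
    return [part.lower() for part in _TOKEN.findall(name)]

for ident in ("getUserName", "parse_http_request", "HTTPServer"):
    print(ident, "->", split_identifier(ident))
```

With tokens like these in the index, a query for "user name" can match getUserName even though the identifier contains no spaces.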

- HTTP route ↔ call-site matching with confidence scoring
- gRPC, GraphQL, tRPC service detection with protobuf Route extraction
- Channel detection (EMITS / LISTENS_ON) for Socket.IO, EventEmitter, and generic pub-sub patterns across 8 languages with constant resolution

- CROSS_* edges link nodes across multiple repos indexed under the same store
- Multi-galaxy 3D UI layout for cross-repo architecture visualization
- Cross-repo architecture summary combining services, routes, and dependencies across the indexed fleet

- CALLS, IMPORTS, DEFINES, IMPLEMENTS, INHERITS
- HTTP_CALLS, ASYNC_CALLS (cross-service)
- EMITS, LISTENS_ON (channels)
- DATA_FLOWS with arg-to-param mapping + field access chains
- SIMILAR_TO (MinHash + LSH near-clone detection, Jaccard scored)
- SEMANTICALLY_RELATED (vocabulary-mismatch, same-language, score ≥ 0.80)
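The SIMILAR_TO entry mentions MinHash with Jaccard scoring. As a rough sketch of that general technique (not the project's implementation), the Python below estimates the Jaccard similarity of two token-shingle sets from MinHash signatures. The shingle size, the number of hashes, and the use of blake2b as the hash family are all assumptions, and the LSH bucketing step is left out.

```python
import hashlib

def shingles(tokens, k=3):
    """Set of k-token windows taken from a token sequence."""
    return {tuple(tokens[i:i + k]) for i in range(max(1, len(tokens) - k + 1))}

def jaccard(a, b):
    """Exact Jaccard similarity: |A & B| / |A | B|."""
    return len(a & b) / len(a | b) if (a or b) else 1.0

def _hash(item, seed):
    # One member of a seeded hash family; blake2b is used here only for determinism.
    data = repr(item).encode()
    salt = seed.to_bytes(8, "little")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8, salt=salt).digest(), "big")

def minhash_signature(items, num_hashes=128):
    """Keep the minimum hash value per seed; items must be non-empty."""
    return [min(_hash(item, seed) for item in items) for seed in range(num_hashes)]

def estimate_jaccard(sig_a, sig_b):
    """The fraction of agreeing signature positions approximates the Jaccard score."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)

a = shingles("def add ( a , b ) : return a + b".split())
b = shingles("def add ( a , b ) : return b + a".split())
print(jaccard(a, b), estimate_jaccard(minhash_signature(a), minhash_signature(b)))
```

The point of the signature is that two fixed-length lists can be compared instead of two full shingle sets, which is what makes near-clone detection feasible across a large graph.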

- 158 vendored tree-sitter grammars compiled into the binary
- Generic package / module resolution: bare specifiers like @myorg/pkg, github.com/foo/bar, use my_crate::foo resolved via manifest scanning (package.json, go.mod, Cargo.toml, pyproject.toml, composer.json, pubspec.yaml, pom.xml, build.gradle, mix.exs, *.gemspec)
- Infrastructure-as-code indexing: Dockerfiles, Kubernetes manifests, Kustomize overlays as graph nodes
- Hybrid LSP semantic type resolution for Python, TypeScript / JavaScript / JSX / TSX, PHP, C#, Go, C, C++, Java, Kotlin, and Rust: a lightweight C implementation of language type-resolution algorithms, structurally inspired by and compatible with major language servers including tsserver / typescript-go, pyright, gopls, Roslyn, Eclipse JDT, and rust-analyzer (parameter binding, return-type inference, generic substitution, JSX component dispatch, JSDoc inference for plain JS files, namespace + trait + late-static-binding resolution for PHP, file-scoped namespaces + records + LINQ method syntax for C#, class-hierarchy + overload + lambda resolution for Java, extension-function + scope-function resolution for Kotlin, trait-method + UFCS resolution for Rust)
- RAM-first pipeline: LZ4 compression, in-memory SQLite, single dump at end. Memory released after.

- Single static binary, zero infrastructure: SQLite-backed, persists to ~/.cache/codebase-memory-mcp/
- Auto-sync: background watcher detects file changes and re-indexes automatically
- Route nodes: REST endpoints are first-class graph entities
- CLI mode: codebase-memory-mcp cli search_graph '{"name_pattern": ".*Handler.*"}'
- Available on: npm, PyPI, Homebrew, Scoop, Winget, Chocolatey, AUR, go install

Commit a single compressed file to your repo and your teammates skip the reindex.

.codebase-memory/graph.db.zst is a zstd-compressed snapshot of the knowledge graph that lives next to your source. When you index, the artifact is written or refreshed; when a teammate clones the repo and runs codebase-memory-mcp for the first time, the artifact is decompressed and incremental indexing fills in their local diff.

- Format: SQLite database, indexes stripped, VACUUM INTO compacted, then zstd 1.5.7 compressed (8–13:1 ratio typical)
- Two tiers:
  - Best (zstd -9 + index strip + VACUUM INTO): written on explicit index_repository
  - Fast (zstd -3): written by the watcher for low-latency incremental updates
- Bootstrap: when no local DB exists but the artifact is present, index_repository imports the artifact first, then runs incremental indexing, avoiding the full reindex cost
- No merge pain: a .gitattributes line with merge=ours is auto-created on first export, so concurrent edits don't produce conflicts on the binary artifact
- Optional: never committed unless you want it. Add .codebase-memory/ to .gitignore if you prefer everyone to reindex from scratch.

The result is similar in spirit to graphify's graphify-out/ directory, but as a single compressed file with explicit two-tier export, integrity-checked import, and zero merge friction.
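The bootstrap behaviour described for the artifact amounts to a small decision rule. The Python sketch below spells it out under stated assumptions: plan_indexing and the step names are hypothetical labels, and only the artifact path comes from the text.

```python
from pathlib import Path

ARTIFACT = Path(".codebase-memory") / "graph.db.zst"   # artifact path named in the text

def plan_indexing(repo_root, local_db_exists):
    """Return the ordered steps an indexer could take for this repo (illustrative only)."""
    artifact_present = (Path(repo_root) / ARTIFACT).is_file()
    if local_db_exists:
        return ["incremental_index"]                      # local graph already there
    if artifact_present:
        return ["import_artifact", "incremental_index"]   # bootstrap from the snapshot
    return ["full_index"]                                 # nothing to reuse

print(plan_indexing(".", local_db_exists=False))
```

The middle branch is the one that saves a teammate the full reindex: the committed snapshot supplies most of the graph and incremental indexing only covers their local diff.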

codebase-memory-mcp is a structural analysis backend: it builds and queries the knowledge graph. It does not include an LLM. Instead, it relies on your MCP client (Claude Code, or any MCP-compatible agent) to be the intelligence layer.

You: "what calls ProcessOrder?"

Agent calls: trace_path(function_name="ProcessOrder", direction="inbound")

codebase-memory-mcp: executes graph query, returns structured results

Agent: presents the call chain in plain English
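The graph query behind an inbound trace like the one in this exchange can be pictured as a depth-limited breadth-first search over reversed CALLS edges. The Python below is a toy sketch of that idea with a hypothetical edge list and function name; it is not the server's query engine.

```python
from collections import deque

def trace_inbound(call_edges, target, max_depth=5):
    """Breadth-first walk over reversed CALLS edges: who (transitively) calls target?"""
    callers_of = {}
    for caller, callee in call_edges:
        callers_of.setdefault(callee, []).append(caller)

    depth_of = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        if depth_of[node] == max_depth:
            continue  # do not expand beyond the depth limit
        for caller in callers_of.get(node, []):
            if caller not in depth_of:
                depth_of[caller] = depth_of[node] + 1
                queue.append(caller)
    del depth_of[target]
    return depth_of  # caller name -> distance from target

# Hypothetical call edges as (caller, callee) pairs.
edges = [("HandleCheckout", "ProcessOrder"), ("RetryWorker", "ProcessOrder"),
         ("Router", "HandleCheckout"), ("ProcessOrder", "ChargeCard")]
print(trace_inbound(edges, "ProcessOrder"))
```

Swapping the edge direction in callers_of gives the outbound variant (what the function calls) with the same loop.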

Why no built-in LLM? Other code graph tools embed an LLM for natural language → graph query translation. This means extra API keys, extra cost, and another model to configure. With MCP, the agent you're already talking to is the query translator.

Benchmarked on Apple M3 Pro:

| Operation | Time | Notes |
| --- | --- | --- |
| Linux kernel full index | 3 min | 28M LOC, 75K files → 4.81M nodes, 7.72M edges |
| Linux kernel fast index | 1m 12s | 1.88M nodes |
| Django full index | ~6s | 49K nodes, 196K edges |
| Cypher query | <1ms | Relationship traversal |
| Name search (regex) | <10ms | SQL LIKE pre-filtering |
| Dead code detection | ~150ms | Full graph scan with degree filtering |
| Trace call path (depth=5) | <10ms | BFS traversal |

RAM-first pipeline: All indexing runs in memory (LZ4 HC compressed read, in-memory SQLite, single dump at end). Memory is released back to the OS after indexing completes.

Token efficiency: Five structural queries consumed ~3,400 tokens via codebase-memory-mcp versus ~412,000 tokens via file-by-file grep exploration, a 99.2% reduction.
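The percentage and the "120x" figure quoted elsewhere in this article both follow from the same two token counts. A short Python check of the arithmetic, using only the numbers given above:

```python
graph_tokens = 3_400      # five structural queries via the graph (figure quoted above)
grep_tokens = 412_000     # same questions via file-by-file exploration (figure quoted above)

reduction = 1 - graph_tokens / grep_tokens   # share of tokens saved
ratio = grep_tokens / graph_tokens           # how many times fewer tokens

print(f"{reduction:.1%} fewer tokens, about {ratio:.0f}x")
```

The ratio comes out slightly above 120, so "120x" is a rounded-down version of the same measurement as the 99.2% figure.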

| Platform | Standard | With Graph UI |
| --- | --- | --- |
| macOS (Apple Silicon) | codebase-memory-mcp-darwin-arm64.tar.gz | codebase-memory-mcp-ui-darwin-arm64.tar.gz |
| macOS (Intel) | codebase-memory-mcp-darwin-amd64.tar.gz | codebase-memory-mcp-ui-darwin-amd64.tar.gz |
| Linux (x86_64) | codebase-memory-mcp-linux-amd64.tar.gz | codebase-memory-mcp-ui-linux-amd64.tar.gz |
| Linux (ARM64) | codebase-memory-mcp-linux-arm64.tar.gz | codebase-memory-mcp-ui-linux-arm64.tar.gz |
| Windows (x86_64) | codebase-memory-mcp-windows-amd64.zip | codebase-memory-mcp-ui-windows-amd64.zip |

Every release includes checksums.txt with SHA-256 hashes. All binaries are statically linked, with no shared library dependencies.

Windows note: SmartScreen may show a warning for unsigned software. Click "More info" → "Run anyway". Verify integrity with checksums.txt.
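Checking a downloaded archive against checksums.txt can be done with a few lines of Python. This sketch assumes the file uses the common sha256sum layout (hex digest, whitespace, file name per line); that layout is an assumption, not something stated here, and the helper names are hypothetical.

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Stream the file through SHA-256 so large archives are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def parse_checksums(text):
    """Parse lines of the assumed form '<hex digest>  <filename>'."""
    table = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            table[parts[1].lstrip("*")] = parts[0].lower()
    return table

def verify(archive_path, archive_name, checksums_text):
    """True only if the archive is listed and its digest matches."""
    expected = parse_checksums(checksums_text).get(archive_name)
    return expected is not None and expected == sha256_of(archive_path)
```

A typical call would pass the local archive path, its release file name, and the contents of checksums.txt; an archive that is missing from the list is treated as a failure rather than a pass.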

Automated download + install

macOS / Linux:

curl -fsSL https://raw.githubusercontent.com/DeusData/codebase-memory-mcp/main/scripts/setup.sh | bash

Windows (PowerShell):

irm https://raw.githubusercontent.com/DeusData/codebase-memory-mcp/main/scripts/setup-windows.ps1 | iex

Arch Linux (AUR):

yay -S codebase-memory-mcp-bin
paru -S codebase-memory-mcp-bin

The codebase-memory-mcp-bin package is available at: https://aur.archlinux.org/packages/codebase-memory-mcp-bin

You: "Install this MCP server: https://github.com/DeusData/codebase-memory-mcp"

Prerequisites: C compiler + zlib

| Requirement | Check | Install |
| --- | --- | --- |
| C compiler (gcc or clang) | gcc --version or clang --version | macOS: xcode-select --install, Linux: apt install build-essential |
| C++ compiler | g++ --version or clang++ --version | Same as above |
| zlib | - | macOS: included, Linux: apt install zlib1g-dev |
| Git | git --version | Pre-installed on most systems |

git clone https://github.com/DeusData/codebase-memory-mcp.git
cd codebase-memory-mcp
scripts/build.sh                    # standard binary
scripts/build.sh --with-ui          # with graph visualization

If you prefer not to use the install command

Add to ~/.claude/.mcp.json (global) or project .mcp.json:

{
  "mcpServers": {
    "codebase-memory-mcp": {
      "command": "/path/to/codebase-memory-mcp",
      "args": []
    }
  }
}

Restart your agent. Verify with /mcp: you should see codebase-memory-mcp with 14 tools.
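If the config file already lists other MCP servers, the new entry has to be merged in rather than pasted over them. The Python sketch below does that merge for a JSON file shaped like the example above; add_mcp_server is a hypothetical helper, not part of the tool.

```python
import json
from pathlib import Path

def add_mcp_server(config_path, binary_path, name="codebase-memory-mcp"):
    """Insert or update one mcpServers entry, keeping any other servers in the file."""
    path = Path(config_path)
    config = json.loads(path.read_text()) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": str(binary_path), "args": []}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For example, add_mcp_server(".mcp.json", "/path/to/codebase-memory-mcp") would write the project-level entry shown above. Pass an absolute binary path, since the troubleshooting notes in this article call that out as a common cause of a missing server.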

install auto-detects and configures all installed agents:

| Agent | MCP Config | Instructions | Hooks |
| --- | --- | --- | --- |
| Claude Code | .claude/.mcp.json | 4 Skills | PreToolUse (Grep/Glob graph augment, non-blocking) |
| Codex CLI | .codex/config.toml | .codex/AGENTS.md | SessionStart reminder |
| Gemini CLI | .gemini/settings.json | .gemini/GEMINI.md | BeforeTool (grep reminder) + SessionStart reminder |
| Zed | settings.json (JSONC) | - | - |
| OpenCode | opencode.json | AGENTS.md | - |
| Antigravity | .gemini/config/mcp_config.json (shared) | antigravity-cli/AGENTS.md | SessionStart reminder |
| Aider | - | CONVENTIONS.md | - |
| KiloCode | mcp_settings.json | ~/.kilocode/rules/ | - |
| VS Code | Code/User/mcp.json | - | - |
| OpenClaw | openclaw.json | - | - |
| Kiro | .kiro/settings/mcp.json | - | - |

Hooks are structurally non-blocking (exit code 0 on every failure path). For Claude Code, the PreToolUse hook intercepts Grep / Glob (never Read, because gating Read breaks the read-before-edit invariant) and, when the search token matches indexed symbols, injects them as additionalContext via search_graph so the agent gets structured context alongside its normal search results. For Codex, Gemini CLI, and Antigravity, a SessionStart hook injects a one-line code-discovery reminder as session context (Gemini CLI also keeps its BeforeTool reminder). The installed Claude shim file is named cbm-code-discovery-gate for backward compatibility with existing installs; despite the legacy name it never gates and never blocks.

Every MCP tool can be invoked from the command line:

codebase-memory-mcp cli index_repository '{"repo_path": "/path/to/repo"}'
codebase-memory-mcp cli search_graph '{"name_pattern": ".*Handler.*", "label": "Function"}'
codebase-memory-mcp cli trace_path '{"function_name": "Search", "direction": "both"}'
codebase-memory-mcp cli query_graph '{"query": "MATCH (f:Function) RETURN f.name LIMIT 5"}'
codebase-memory-mcp cli list_projects
codebase-memory-mcp cli --raw search_graph '{"label": "Function"}' | jq '.results[].name'
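For scripting, the CLI form shown above can be wrapped in a few lines of Python. This is a sketch under assumptions: build_cli_command and call_tool are hypothetical helper names, and it assumes --raw prints plain JSON on stdout, which the jq example above suggests.

```python
import json
import subprocess

def build_cli_command(tool, params=None, binary="codebase-memory-mcp", raw=True):
    """Build the argv for: codebase-memory-mcp cli [--raw] <tool> '<json>'."""
    argv = [binary, "cli"]
    if raw:
        argv.append("--raw")
    argv.append(tool)
    if params is not None:
        argv.append(json.dumps(params))
    return argv

def call_tool(tool, params=None, **kwargs):
    """Run the tool and decode stdout as JSON (assumes --raw prints plain JSON)."""
    done = subprocess.run(build_cli_command(tool, params, **kwargs),
                          capture_output=True, text=True, check=True)
    return json.loads(done.stdout)

print(build_cli_command("search_graph", {"label": "Function"}))
```

Passing the arguments as a list avoids shell quoting problems with the JSON payload, which is the usual source of errors when these commands are built by string concatenation.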

| Tool | Description |
| --- | --- |
| index_repository | Index a repository into the graph. Auto-sync keeps it fresh after that. |
| list_projects | List all indexed projects with node/edge counts. |
| delete_project | Remove a project and all its graph data. |
| index_status | Check indexing status of a project. |

| Tool | Description |
| --- | --- |
| search_graph | Structured search by label, name pattern, file pattern, degree filters. Pagination via limit/offset. |
| trace_path | BFS traversal: who calls a function and what it calls (alias: trace_call_path). Depth 1-5. |
| detect_changes | Map git diff to affected symbols + blast radius with risk classification. |
| query_graph | Execute Cypher-like graph queries (read-only). |
| get_graph_schema | Node/edge counts, relationship patterns, property definitions per label. Run this first. |
| get_code_snippet | Read source code for a function by qualified name. |
| get_architecture | Codebase overview: languages, packages, routes, hotspots, clusters, ADR. |
| search_code | Grep-like text search within indexed project files. |
| manage_adr | CRUD for Architecture Decision Records. |
| ingest_traces | Ingest runtime traces to validate HTTP_CALLS edges. |

Node labels: Project, Package, Folder, File, Module, Class, Function, Method, Interface, Enum, Type, Route, Resource

Edge types: CONTAINS_PACKAGE, CONTAINS_FOLDER, CONTAINS_FILE, DEFINES, DEFINES_METHOD, IMPORTS, CALLS, HTTP_CALLS, ASYNC_CALLS, IMPLEMENTS, HANDLES, USAGE, CONFIGURES, WRITES, MEMBER_OF, TESTS, USES_TYPE, FILE_CHANGES_WITH

get_code_snippet uses qualified names: <project>.<path_parts>.<name>. Use search_graph to discover them first.

query_graph is a read-only openCypher subset:

- Clauses: MATCH, OPTIONAL MATCH, multiple MATCH, WHERE, WITH (+ WITH … WHERE), RETURN, ORDER BY, SKIP, LIMIT, DISTINCT, UNWIND, UNION / UNION ALL, CASE.
- Patterns: labelled nodes, label alternation (n:A|B), relationship types/direction, variable-length paths [*1..3], inline property maps.
- WHERE: = <> < <= > >=, AND/OR/XOR/NOT, IN, CONTAINS, STARTS WITH, ENDS WITH, IS [NOT] NULL, regex =~, label test n:Label, and EXISTS { (n)-[:TYPE]->() } (single-hop existence, useful for dead-code checks, e.g. WHERE NOT EXISTS { (f)<-[:CALLS]-() }).
- Aggregates: count (+ DISTINCT), sum, avg, min, max, collect.
- Functions: labels, type, id, keys, properties; toLower/toUpper/toString/toInteger/toFloat/toBoolean; size, length, trim/ltrim/rtrim, reverse; coalesce, substring, replace, left, right.

Anything outside this subset (write / MERGE / CALL clauses, unsupported functions, list/map literals, comprehensions, path functions, parameters) fails with a clear unsupported … error rather than returning empty results.

Layered: hardcoded patterns (.git, node_modules, etc.) → .gitignore hierarchy → .cbmignore (project-specific, gitignore syntax). Symlinks are always skipped.
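The layering can be sketched in Python as a single predicate that checks each layer in order. This is a simplification under stated assumptions: it uses fnmatch-style globs instead of full gitignore semantics (no negation with "!", no directory-scoped rules), the hardcoded set contains only the two names mentioned above, and is_ignored is a hypothetical name.

```python
from fnmatch import fnmatch

HARDCODED = {".git", "node_modules"}   # the two names the text mentions; the real list is longer

def is_ignored(rel_path, is_symlink, gitignore=(), cbmignore=()):
    """Apply the layers in order; simplified glob matching, not full gitignore semantics."""
    if is_symlink:
        return True                                   # symlinks are always skipped
    parts = rel_path.split("/")
    if any(part in HARDCODED for part in parts):
        return True                                   # layer 1: hardcoded patterns
    for patterns in (gitignore, cbmignore):           # layer 2, then layer 3
        for pattern in patterns:
            pattern = pattern.rstrip("/")
            if fnmatch(rel_path, pattern) or any(fnmatch(part, pattern) for part in parts):
                return True
    return False

print(is_ignored("src/gen/api.pb.go", False, gitignore=["build/"], cbmignore=["*.pb.go"]))
```

In this sketch a .cbmignore rule can only add exclusions on top of .gitignore; whether the real tool lets it re-include files is not stated here.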

codebase-memory-mcp config list                          # show all settings
codebase-memory-mcp config set auto_index true           # auto-index on session start
codebase-memory-mcp config set auto_index_limit 50000    # max files for auto-index
codebase-memory-mcp config reset auto_index              # reset to default

| Variable | Default | Description |
| --- | --- | --- |
| CBM_CACHE_DIR | ~/.cache/codebase-memory-mcp | Override the database storage directory. All project indexes and config are stored here. |
| CBM_DIAGNOSTICS | false | Set to 1 or true to enable periodic diagnostics output to /tmp/cbm-diagnostics-<pid>.json. |
| CBM_DOWNLOAD_URL | (GitHub releases) | Override the download URL for updates. Used for testing or self-hosted deployments. |
| CBM_LOG_LEVEL | info | Set the minimum log level. Accepted values (case-insensitive): debug, info, warn, error, none, or their numeric equivalents 0–4 matching the internal enum. Logs go to stderr; stdout is reserved for MCP JSON-RPC. |
| CBM_WORKERS | (detected) | Override the parallel-indexing worker count returned by cbm_default_worker_count. Useful inside containers where sysconf(_SC_NPROCESSORS_ONLN) reports host CPUs rather than the cgroup's effective quota. Range 1–256; invalid values are ignored with a warning. |

export CBM_CACHE_DIR=~/my-projects/cbm-data
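The CBM_WORKERS rule (accept 1 to 256, otherwise warn and fall back to the detected count) is easy to get subtly wrong, so here is a Python sketch of the same validation. resolve_worker_count is a hypothetical name, os.cpu_count() stands in for the detection step, and the real logic lives in the C binary.

```python
import os
import sys

def resolve_worker_count(env=None, detected=None):
    """Return CBM_WORKERS if it is an integer in 1..256, else the detected count."""
    env = os.environ if env is None else env
    detected = detected or os.cpu_count() or 1
    raw = env.get("CBM_WORKERS")
    if raw is None:
        return detected
    try:
        value = int(raw)
    except ValueError:
        value = 0                      # non-numeric input is treated as invalid
    if 1 <= value <= 256:
        return value
    # Warn on stderr: stdout is reserved for MCP JSON-RPC.
    print(f"warning: ignoring invalid CBM_WORKERS={raw!r}", file=sys.stderr)
    return detected

print(resolve_worker_count({"CBM_WORKERS": "4"}, detected=8))
```

Inside a container with a CPU quota, setting CBM_WORKERS to the quota keeps the indexer from starting one worker per host core.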

Map additional file extensions to supported languages via JSON config files. Useful for framework-specific extensions like .blade.php (Laravel) or .mjs (ES modules).

Per-project (in your repo root):

// .codebase-memory.json
{"extra_extensions": {".blade.php": "php", ".mjs": "javascript"}}

Global (applies to all projects):

// ~/.config/codebase-memory-mcp/config.json  (or $XDG_CONFIG_HOME/...)
{"extra_extensions": {".twig": "html", ".phtml": "php"}}

Project config overrides global for conflicting extensions. Unknown language values are silently skipped. Missing config files are ignored.
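The precedence rules can be made concrete with a short Python sketch that merges the two config layers. The supported-language set is a hypothetical three-entry subset, merge_extra_extensions is an illustrative name, and dropping unknown languages before the merge (so a valid global mapping survives an invalid project one) is an assumption about behaviour the text does not spell out.

```python
SUPPORTED = {"php", "javascript", "html"}   # hypothetical subset of language names

def _valid(cfg, supported):
    """Keep only mappings whose language value is known; missing configs are just {}."""
    return {ext: lang for ext, lang in cfg.get("extra_extensions", {}).items()
            if lang in supported}

def merge_extra_extensions(global_cfg, project_cfg, supported=SUPPORTED):
    """Project entries win on conflict; unknown language values are dropped."""
    return {**_valid(global_cfg, supported), **_valid(project_cfg, supported)}

global_cfg = {"extra_extensions": {".twig": "html", ".phtml": "php", ".foo": "klingon"}}
project_cfg = {"extra_extensions": {".phtml": "html", ".mjs": "javascript"}}
merged = merge_extra_extensions(global_cfg, project_cfg)
print(merged)
```

Here .phtml ends up mapped to html because the project file overrides the global one, and .foo disappears because its language value is not recognised.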

SQLite databases are stored at ~/.cache/codebase-memory-mcp/. They persist across restarts (WAL mode, ACID-safe). To reset: rm -rf ~/.cache/codebase-memory-mcp/.

| Problem | Fix |
| --- | --- |
| /mcp doesn't show the server | Check .mcp.json path is absolute. Restart agent. Test: `echo '{}' \| /path/to/binary` should output JSON. |
| index_repository fails | Pass absolute path: index_repository(repo_path="/absolute/path") |
| trace_path returns 0 results | Use search_graph(name_pattern=".*PartialName.*") first to find the exact name. |
| Queries return wrong project results | Add project="name" parameter. Use list_projects to see names. |
| Binary not found after install | Add to PATH: export PATH="$HOME/.local/bin:$PATH" |
| UI not loading | Ensure you downloaded the ui variant and ran --ui=true. Check http://localhost:9749. |

Semantic type resolution beyond tree-sitter.

Tree-sitter alone gives a syntactic AST. That handles naming, structure, and call sites well, but it can't tell you that user.profile.display_name() resolves to Profile.display_name declared three modules away: tree-sitter doesn't track imports, generics, inheritance, or stdlib types.

codebase-memory-mcp ships a lightweight C implementation of language type-resolution algorithms, structurally inspired by and compatible with major language servers (tsserver / typescript-go, pyright, gopls, Roslyn, Eclipse JDT, rust-analyzer), embedded directly into the static binary. No language server process, no per-project setup, no API key. We call this layer Hybrid LSP: it runs alongside tree-sitter on every parse and refines CALLS, USAGE, and RESOLVED_CALLS edges with type information, so the resulting graph mirrors what an IDE "Go to Definition" would resolve.

Languages with full Hybrid LSP:

| Language | What it handles |
| --- | --- |
| Python (new in v0.7.0) | imports + dotted submodule walks, dataclasses, Self return types, generics, @property, match/case class patterns, SQLAlchemy 2.0 Mapped[T], Pydantic BaseModel, typing.Annotated / ClassVar / Final / InitVar, async/await, classmethod/staticmethod, narrowing (isinstance / is not None / walrus), typing.cast / assert_type, common stdlib (logging, pathlib, json, functools). Target ~95% resolution on idiomatic code. |
| TypeScript / JavaScript / JSX / TSX | generics, JSX component dispatch, JSDoc inference for plain JS, .d.ts declarations, module re-exports, method chaining via return-type propagation, per-file overlay chained to a shared cross-file registry |
| PHP (new in v0.7.0) | namespaces, traits, late-static-binding, PHPDoc inference, parameter binding, return-type inference |
| C# (new in v0.7.0) | global usings, file-scoped namespaces, records (incl. C# 12 primary constructors), LINQ method syntax, async Task<T> / ValueTask<T> unwrap, generic methods, this / base dispatch, var inference, common BCL stdlib |
| Go (sharpened in v0.7.0) | pre-built per-package cross-file registry, generics, embedded structs, interface satisfaction, package-aware import resolution |
| C / C++ (sharpened in v0.7.0) | pre-built per-language cross-file registry shared across C and C++; C side handles macros + typedef chains + header-vs-source linking; C++ side handles templates, namespaces, auto inference, and method resolution via class hierarchy |
| Java (new in v0.8.0) | imports (single-type, on-demand, static), class hierarchies with this / super dispatch, generics, annotations, overload matching by arity and parameter types, lambdas / method references bound to functional interfaces, field-type inference, common JDK stdlib |
| Kotlin (new in v0.8.0) | imports + same-package resolution, classes / objects / companion objects, extension functions, data classes, nullable-type unwrapping, scope functions (let / apply / run / also / with), infix calls, common stdlib |
| Rust (new in v0.8.0) | use declarations + module paths, impl blocks and trait methods, struct fields, generics with trait bounds, operator-trait desugaring, derive-macro method synthesis, UFCS static paths, common std prelude |

Two-layer architecture:

- Tree-sitter pass: fast, syntactic, runs for every one of the 158 languages. Extracts definitions, calls, imports.
- Hybrid LSP pass: type-aware, runs above the tree-sitter pass per-language. Refines call edges using the import graph plus a per-file or pre-built cross-file definition registry. Languages without a Hybrid LSP pass yet fall back to textual resolution, so you always get some answer.

The result is a knowledge graph accurate enough to drive trace_path across packages, inheritance hierarchies, and stdlib calls, without paying for a language server process per project.

158 languages, all parsed via vendored tree-sitter grammars compiled into the binary. Benchmarked against 64 real open-source repositories (78 to 49K nodes):

| Tier | Score | Languages |
| --- | --- | --- |
| Excellent | >= 90% | Lua, Kotlin, C++, Perl, Objective-C, Groovy, C, Bash, Zig, Swift, CSS, YAML, TOML, HTML, SCSS, HCL, Dockerfile |
| Good | 75-89% | Python, TypeScript, TSX, Go, Rust, Java, R, Dart, JavaScript, Erlang, Elixir, Scala, Ruby, PHP, C#, SQL |
| Functional | < 75% | OCaml, Haskell |

Also supported (not yet benchmarked): Ada, Agda, Apex, Assembly (NASM), Astro, AWK, Beancount, BibTeX, Bicep, Bitbake, Blade, Cairo, Cap'n Proto, Clojure, CMake, COBOL, Common Lisp, Crystal, CSV, CUDA, D, Devicetree, Diff, .env, Elm, Emacs Lisp, F#, Fennel, Fish, FORM, Fortran, FunC, GDScript, .gitattributes, .gitignore, Gleam, GLSL, GN, Go module, Go template, GraphQL, Hare, HLSL, Hyprlang, INI, ISPC, Janet, Jinja2, JSDoc, JSON, JSON5, Jsonnet, Julia, Just, Kconfig, KDL, Lean 4, Linker Script, Liquid, LLVM IR, Luau, Magma, Makefile, Markdown, MATLAB, Mermaid, Meson, Move, Nickel, Nim, Nix, Odin, Pascal, Pkl, PO (gettext), Pony, PowerShell, Prisma, .properties, Protobuf, Puppet, PureScript, Racket, Regex, requirements.txt, ReScript, RON, reStructuredText, Scheme, Slang, Smali, Smithy, Solidity, SOQL, SOSL, Squirrel, SSH config, Starlark, Svelte, Sway, SystemVerilog, TableGen, Tcl, Teal, Templ, Thrift, TLA+, Typst, Verilog, VHDL, Vim script, Vue, WGSL, WIT, Wolfram, XML, Zsh.

src/
  main.c              Entry point (MCP stdio server + CLI + install/update/config)
  mcp/                MCP server (14 tools, JSON-RPC 2.0, session detection, auto-index)
  cli/                Install/uninstall/update/config (11 agents, hooks, instructions)
  store/              SQLite graph storage (nodes, edges, traversal, search, Louvain)
  pipeline/           Multi-pass indexing (structure → definitions → calls → HTTP links → config → tests)
  cypher/             Cypher query lexer, parser, planner, executor
  discover/           File discovery (.gitignore, .cbmignore, symlink handling)
  watcher/            Background auto-sync (git polling, adaptive intervals)
  traces/             Runtime trace ingestion
  ui/                 Embedded HTTP server + 3D graph visualization
  foundation/         Platform abstractions (threads, filesystem, logging, memory)
internal/cbm/         Vendored tree-sitter grammars (158 languages) + AST extraction engine

Every release binary is verified through a multi-layer pipeline before publication:

- VirusTotal: all binaries scanned by 70+ antivirus engines (zero detections required to publish)
- SLSA Level 3: cryptographic build provenance generated by GitHub Actions; verify with gh attestation verify <file> --repo DeusData/codebase-memory-mcp
- Sigstore cosign: keyless signatures on all artifacts; bundles included in every release
- SHA-256 checksums: checksums.txt published with every release; verified by both install scripts before extraction
- CodeQL SAST: blocks release pipeline if any open alerts remain
- Zero runtime dependencies: no transitive supply chain; all libraries vendored at compile time

| Binary | SHA-256 | VirusTotal |
| --- | --- | --- |
| linux-amd64 | 8e12bb2d6ead7f20a6d3... | |
| linux-arm64 | 10f7136bfbf3950c6b2a... | 0/72 ✅ |
| darwin-arm64 | 7062a7408906344bf4f8... | 0/72 ✅ |
| darwin-amd64 | 28c6d640e1a0ac7bfcab... | 0/72 ✅ |
| windows-amd64 | 9c3ddcf78368fd4fa891... | 0/72 ✅ |

Scan links for every release are also included in the GitHub Release notes automatically.

MIT
