{"slug": "tokenzip-v2-prd-hld-lld", "title": "TokenZip v2 — PRD, HLD, LLD", "summary": "TokenZip v2 is a token compression engine that reduces LLM input token costs by up to 95% for coding copilots like Claude Code and Codex by transforming an entire codebase into a multi-level, queryable knowledge graph stored locally in `.tokenzip/db`. It auto-detects module boundaries, supports nested monorepo structures, and stores symbols with relationships (CALLS, IMPLEMENTS, INHERITS, etc.) using SurrealDB with RocksDB storage, enabling incremental parsing and fast queries under 100ms for repos up to 100K files. The system is exposed as an MCP server for AI copilots, kept fresh via git hooks, and includes structured markdown parsing with Mermaid block conversion and cross-reference resolution.", "body_md": "# TokenZip — PRD, HLD, LLD\n\n---\n\n# 📋 PRD — Product Requirements Document\n\n## 1. Executive Summary\n\n**TokenZip v2** turns Karpathy's LLM wiki concept into a gzip-like **token compression engine** that sits on top of an entire codebase and can reduce LLM input token cost by up to 95% when used with coding copilots such as Claude Code and Codex. Instead of generating a flat text summary, it builds a multi-level, queryable, chainable knowledge graph (repo → modules → files → symbols) that is stored locally in `.tokenzip/db`, exposed as an **MCP server** for any AI copilot, and kept fresh via **git hooks**.\n\n## 2. 
Problem Statement\n\n| Problem | Impact |\n|---|---|\n| AI copilots lack structural awareness of large codebases | They hallucinate imports, miss dependencies, suggest changes in wrong modules |\n| Text-based token references are flat and non-queryable | Cannot ask \"which functions depend on this interface?\" or \"what modules does this feature span?\" |\n| No persistent code intelligence layer | Every session re-parses from scratch, wasting tokens and time |\n| Documentation (PRD/HLD/LLD/README) is unstructured | AI can't extract workflows, sequence diagrams, or release plans from markdown |\n| Cross-language dependency tracking is manual | A SQL schema change affecting 3 TS files is invisible until runtime |\n| Cross-repository dependency tracking is manual | The current repository has no awareness of dependent or upstream repositories, including shared interfaces, API contracts, endpoint usage, schema dependencies, or cross-repo integrations, which makes impact analysis and coordinated changes error-prone |\n| Version-aware dependency conflicts are difficult to detect | AI copilots and developers lack visibility into incompatible interface versions, breaking API/schema changes, SDK mismatches, or transitive dependency drift across repositories, which causes silent integration failures and upgrade risks |\n\n## POC Results\n\n### Under 30 seconds indexing time for a codebase with ~1950 files\n<img width=\"1639\" height=\"855\" alt=\"image\" src=\"https://gist.github.com/user-attachments/assets/f19d00a0-19c2-490f-86a6-f67452b6452f\" />\n\n### Under 1 second lookup\n<img width=\"909\" height=\"637\" alt=\"image\" src=\"https://gist.github.com/user-attachments/assets/5d25d6b3-c34a-46f1-9cde-857c8e6a69ee\" />\n\n## 3. 
Target Users\n\n### Primary\n- **AI Copilot Users** (Claude Code, Codex, OpenCode, Kilo Code) — need structured context without token waste\n- **Full-stack Developers** working in monorepos with 50+ modules\n\n### Secondary\n- **Tech Leads** auditing codebase structure and dependency health\n- **Onboarding Engineers** needing rapid codebase mental model\n\n## 4. Product Vision\n\n> *\"Your codebase as a queryable graph — not a text dump. Ask structural questions, get precise answers, zero hallucination.\"*\n\n## 5. Feature Specification\n\n### 5.1 Multi-Level Code Graph\n\n```\nRepository\n  └── Module (auto-detected: package.json, pyproject.toml, go.mod, Cargo.toml, etc.)\n        └── File\n              └── Symbol (function, class, interface, variable, table, column, etc.)\n```\n\n**Acceptance Criteria:**\n- [ ] Auto-detect module boundaries by presence of manifest files\n- [ ] Support nested modules (monorepo: repo → apps/web → src/components)\n- [ ] Each node has a stable UUID that survives renames (content-hash + path-hash hybrid)\n\n### 5.2 Tree-Sitter Metadata Extraction\n\n| Language | Extracted Artifacts |\n|---|---|\n| `.js`, `.mjs` | Functions, classes, exports, imports, global vars, JSDoc |\n| `.ts`, `.tsx` | Above + interfaces, type aliases, generics, enums, decorators, namespace exports |\n| `.py` | Functions, classes, decorators, type hints, imports, async defs |\n| `.sql` | Tables, views, columns, constraints, indexes, foreign keys, stored procedures |\n| `.go` | Functions, structs, interfaces, methods, packages, imports |\n| `.rs` | Functions, structs, traits, impls, enums, mods, use statements |\n| `.java`, `.kt` | Classes, interfaces, methods, annotations, packages |\n| `.md` (special) | Headings, lists, code blocks, mermaid diagrams, tables, frontmatter |\n\n**Acceptance Criteria:**\n- [ ] Each symbol stored as a node with: name, kind, signature, line range, hash, docstring\n- [ ] Relationships: `CALLS`, `IMPLEMENTS`, `INHERITS`, `IMPORTS`, 
`EXPORTS`, `MODIFIES`, `READS`\n- [ ] Incremental parse: only re-parse files whose content hash changed\n- [ ] Parse errors stored as node metadata (not silently dropped)\n\n### 5.3 Documentation Intelligence\n\nFor structured markdown files (`.prd.md`, `.hld.md`, `.lld.md`, `README.md`, `CHANGELOG.md`, `ADR/*.md`):\n\n| Section Type | Extracted Structure |\n|---|---|\n| `## Workflow` / `## Flow` | Ordered step graph with actors and actions |\n| `## Sequence Diagram` | Parsed mermaid `sequenceDiagram` into actor→message→actor edges |\n| `## Flowchart` | Parsed mermaid `flowchart` into decision/action node graph |\n| `## Release Plan` | Timeline with milestones, versions, dates |\n| `## API` | Endpoint → method → params → response schema |\n| `## Architecture` / `## Components` | Component hierarchy with responsibility and tech stack |\n| `## Decision` (ADR) | Context → Decision → Consequences as structured tuple |\n| Standard lists | Typed list items (checkbox, numbered, bullet) with nesting |\n| Tables | Columnar data as records |\n\n**Acceptance Criteria:**\n- [ ] Mermaid blocks parsed into graph nodes, not stored as raw text\n- [ ] Section-level linking: a workflow step can reference a function symbol node\n- [ ] Cross-reference resolution: `[see ModuleX]` in PRD links to Module node in graph\n\n### 5.4 Chainable Query API\n\n```typescript\n// Level 1: Repository\nconst repo = tz.repo('.');\n\n// Level 2: Modules (filterable, chainable)\nconst feModules = repo.modules().filter(m => m.language === 'typescript');\n\n// Level 3: Files within modules\nconst tsFiles = feModules.files().filter(f => f.ext === '.tsx');\n\n// Level 4: Symbols within files\nconst exportedComponents = tsFiles.symbols()\n  .filter(s => s.kind === 'class' && s.isExported && s.extends('React.Component'));\n\n// Cross-cutting queries\nconst dependants = tz.repo('.').symbol('UserService.authenticate')\n  .dependants()                    // who calls this?\n  .withinModule('api-gateway')     // 
scope it\n  .withKind('function');           // filter\n\nconst impact = tz.repo('.').table('users')\n  .columns()                       // what columns\n  .referencedBy()                  // where are they referenced\n  .files();                        // which files\n\nconst workflow = tz.repo('.').doc('prd.md')\n  .section('Workflow: User Onboarding')\n  .steps()                         // ordered steps\n  .linkedSymbols();                 // what code implements each step\n```\n\n**Acceptance Criteria:**\n- [ ] Every level returns a query builder, not raw data (lazy evaluation)\n- [ ] `.toArray()`, `.toGraph()`, `.toMarkdown()`, `.toJSON()` terminal methods\n- [ ] Queries translate to SurrealDB graph traversal queries\n- [ ] Response < 100ms for repos up to 100K files\n\n### 5.5 Graph Database Storage\n\n- **Engine:** SurrealDB (embedded via RocksDB storage)\n- **Location:** `<project_root>/.tokenzip/db/`\n- **Schema:** Schemaful (strict types per node kind)\n- **Persistence:** WAL-enabled, crash-safe\n\n**Acceptance Criteria:**\n- [ ] `.tokenzip/` added to `.gitignore` automatically\n- [ ] DB size < 10% of source code size for typical repos\n- [ ] Cold start (first full parse) completes at > 500 files/second\n- [ ] Hot start (incremental) completes at > 2000 files/second\n\n### 5.6 Git Hook Integration\n\n```bash\n# Installed via: tokenzip init\n# Creates .git/hooks/pre-commit and .git/hooks/post-commit\n\npre-commit:\n  1. Detect staged files (git diff --cached --name-only)\n  2. Parse changed files with tree-sitter\n  3. Diff new AST against stored graph\n  4. Validate: no broken exports, no orphan imports\n  5. Update graph with new symbol nodes/edges\n  6. If validation fails: warn (configurable: warn/block)\n\npost-commit:\n  1. Store commit metadata (hash, message, author, timestamp)\n  2. Create COMMIT → MODIFIED → FILE edges\n  3. 
Update file-level git history nodes\n```\n\n**Acceptance Criteria:**\n- [ ] Hook installation is non-destructive (appends to existing hooks)\n- [ ] Hook execution adds < 500ms to commit time for typical changes (< 10 files)\n- [ ] `tokenzip init --no-hooks` flag for CI environments\n- [ ] `tokenzip status` shows graph health (stale files, broken references)\n\n### 5.7 MCP Server\n\n```jsonc\n// Exposed to any MCP-compatible client\n{\n  \"tools\": [\n    \"query_repo_structure\",\n    \"query_module\",\n    \"query_file\",\n    \"query_symbol\",\n    \"get_dependencies\",\n    \"get_dependants\",\n    \"search_symbols\",\n    \"get_git_history\",\n    \"get_workflow\",\n    \"get_impact_analysis\",\n    \"execute_workflow_template\"\n  ],\n  \"resources\": [\n    \"tokenzip://repo/structure\",\n    \"tokenzip://module/{name}/overview\",\n    \"tokenzip://file/{path}/symbols\",\n    \"tokenzip://symbol/{id}/detail\"\n  ]\n}\n```\n\n**Acceptance Criteria:**\n- [ ] MCP server starts in < 200ms\n- [ ] All tools return structured JSON (never raw text dumps)\n- [ ] Token budget aware: responses include `token_count` metadata\n- [ ] Works with Claude Code, Codex, OpenCode, Kilo Code without config changes\n- [ ] Concurrent tool calls supported (SurrealDB connection pooling)\n\n### 5.8 Workflow Templates\n\n| Workflow | Input | Output | Graph Operations |\n|---|---|---|---|\n| **Create Module** | module name, type, dependencies | Scaffolded structure + graph nodes | CREATE module, CREATE files, CREATE IMPORTS edges |\n| **Update Module** | module name, change description | Affected files + symbols list | READ dependencies, READ dependants, DIFF graph |\n| **Implement Feature** | feature description, target module | Files to create/modify, symbol gaps | SEARCH related symbols, PATH analysis, IMPACT query |\n| **Upgrade Feature** | feature name, upgrade description | Migration plan + affected modules | SUBGRAPH extraction, DEPENDENCY chain analysis |\n| **Bug Fix** | error 
message / stack trace | Root cause candidates + impact radius | TRACE call chain, FIND modified symbols in git blame range |\n\n**Acceptance Criteria:**\n- [ ] Each workflow is a deterministic graph query sequence, not LLM-generated\n- [ ] Workflows return structured data that an LLM can act on (not final answers)\n- [ ] Workflow results are cached and timestamped in the graph\n\n## 6. Non-Functional Requirements\n\n| Category | Requirement |\n|---|---|\n| **Performance** | Full index of 100K file repo < 3 minutes; incremental update < 2 seconds |\n| **Memory** | MCP server idle < 50MB; parsing peak < 500MB |\n| **Reliability** | Never corrupt the graph on crash; WAL recovery on restart |\n| **Compatibility** | Node.js 20+, macOS 12+, Ubuntu 22.04+, Windows WSL2 |\n| **Security** | No network calls; all data local; no code execution from graph |\n| **Extensibility** | New language support via plugin (tree-sitter grammar + extractor config) |\n\n## 7. Success Metrics\n\n| Metric | Target |\n|---|---|\n| Copilot context accuracy (relevant vs irrelevant tokens) | > 85% (vs ~40% with text dump) |\n| Time to first useful query after `tokenzip init` | < 5 minutes for 50K file repo |\n| Hook overhead per commit | < 500ms |\n| MCP tool call latency (p95) | < 200ms |\n| Graph size efficiency | < 10% of source size |\n\n## 8. Out of Scope (v2)\n\n- Remote graph synchronization (multi-developer shared graph)\n- LLM-powered code generation (this is a context layer, not a code writer)\n- Runtime analysis (only static analysis via tree-sitter)\n- Binary file parsing (images, compiled artifacts)\n- IDE plugin (VS Code extension is v3)\n\n## 9. 
Release Phases\n\n| Phase | Scope | Timeline |\n|---|---|---|\n| **Alpha** | Core graph + JS/TS parsing + MCP server + basic queries | Week 1-3 |\n| **Beta** | All languages + git hooks + documentation intelligence | Week 4-6 |\n| **RC** | Workflow templates + chainable API polish + perf tuning | Week 7-8 |\n| **GA** | Stability hardening + plugin system + docs | Week 9-10 |\n\n---\n\n# 🏗️ HLD — High-Level Design\n\n## 1. Architecture Overview\n\nTokenZip v2 is a **local-first, static-analysis graph engine** with four layers:\n\n```\n┌─────────────────────────────────────────────────────────────────┐\n│                    LAYER 4: INTEGRATION                         │\n│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌───────────────┐  │\n│  │ Claude   │  │ Codex    │  │ OpenCode │  │ Kilo Code     │  │\n│  │ Code     │  │          │  │          │  │               │  │\n│  └────┬─────┘  └────┬─────┘  └────┬─────┘  └──────┬────────┘  │\n│       │              │              │                │           │\n│       └──────────────┴──────┬───────┴────────────────┘           │\n│                             │ MCP Protocol (stdio/SSE)          │\n├─────────────────────────────┼───────────────────────────────────┤\n│                    LAYER 3: API & QUERY                         │\n│  ┌──────────────────────────┴──────────────────────────────┐   │\n│  │                    MCP Server                            │   │\n│  │  ┌─────────────────┐  ┌──────────────────────────────┐  │   │\n│  │  │  Tool Registry  │  │  Resource Registry            │  │   │\n│  │  └────────┬────────┘  └──────────────┬───────────────┘  │   │\n│  │           └──────────┬───────────────┘                  │   │\n│  │              ┌───────┴────────┐                         │   │\n│  │              │ Chainable Query│                         │   │\n│  │              │ Builder (CQB)  │                         │   │\n│  │              └───────┬────────┘                         │   │\n│  
└──────────────────────┼──────────────────────────────────┘   │\n├──────────────────────────┼──────────────────────────────────────┤\n│                    LAYER 2: ENGINE                              │\n│  ┌───────────────────────┼──────────────────────────────────┐  │\n│  │  ┌────────────┐  ┌────┴─────┐  ┌──────────┐  ┌───────┐  │  │\n│  │  │ Tree-Sitter│  │ Markdown │  │ Workflow │  │ Graph │  │  │\n│  │  │ Extractor  │  │ Parser   │  │ Engine   │  │ Query │  │  │\n│  │  │ (per lang) │  │ (struct) │  │ (tpl)    │  │ Planner│  │  │\n│  │  └─────┬──────┘  └────┬─────┘  └────┬─────┘  └───┬───┘  │  │\n│  │        └──────────────┼──────────────┼────────────┘      │  │\n│  │              ┌───────┴──────────────┴───────┐            │  │\n│  │              │     Graph Mutation Engine     │            │  │\n│  │              │  (diff, merge, validate)      │            │  │\n│  │              └───────────────┬───────────────┘            │  │\n│  └──────────────────────────────┼────────────────────────────┘  │\n├──────────────────────────────┼─────────────────────────────────┤\n│                    LAYER 1: STORAGE                            │\n│  ┌───────────────────────────┼─────────────────────────────┐  │\n│  │              ┌────────────┴────────────┐                 │  │\n│  │              │   Storage Abstraction   │                 │  │\n│  │              │   (IStore interface)    │                 │  │\n│  │              └────────────┬────────────┘                 │  │\n│  │        ┌──────────────────┼──────────────────┐           │  │\n│  │  ┌─────┴──────┐    ┌─────┴──────┐    ┌─────┴──────┐     │  │\n│  │  │ SurrealDB  │    │  SQLite    │    │  In-Memory │     │  │\n│  │  │ (primary)  │    │ (fallback) │    │  (tests)   │     │  │\n│  │  └────────────┘    └────────────┘    └────────────┘     │  │\n│  └──────────────────────────────────────────────────────────┘  
│\n└─────────────────────────────────────────────────────────────────┘\n\n┌─────────────────────────────────────────────────────────────────┐\n│                    SIDE CHANNELS                                │\n│  ┌──────────────┐  ┌───────────────┐  ┌────────────────────┐  │\n│  │ Git Hooks    │  │ File Watcher  │  │ CLI (tokenzip)     │  │\n│  │ pre-commit   │  │ (optional)    │  │ init, parse, query │  │\n│  │ post-commit  │  │ chokidar      │  │ status, serve      │  │\n│  └──────────────┘  └───────────────┘  └────────────────────┘  │\n└─────────────────────────────────────────────────────────────────┘\n```\n\n## 2. Component Design\n\n### 2.1 Tree-Sitter Extractor\n\n```\n                    ┌─────────────────────┐\n                    │  File Input Stream  │\n                    └──────────┬──────────┘\n                               │\n                    ┌──────────┴──────────┐\n                    │  Language Detector  │\n                    │  (extension + shebang│\n                    │   + .editorconfig)  │\n                    └──────────┬──────────┘\n                               │\n              ┌────────────────┼────────────────┐\n              │                │                │\n     ┌────────┴──────┐ ┌──────┴──────┐ ┌──────┴──────┐\n     │ Code Extractor│ │ SQL Extract.│ │ MD Extractor│\n     │ (JS/TS/Py/Go │ │ (Tables,    │ │ (Sections,  │\n     │  /Rs/Java/Kt) │ │  Columns,   │ │  Mermaid,   │\n     │               │ │  FKs, SPs)  │ │  Lists,     │\n     │               │ │             │ │  Tables)    │\n     └───────┬───────┘ └──────┬──────┘ └──────┬──────┘\n             │                │                │\n             └────────────────┼────────────────┘\n                              │\n                    ┌─────────┴──────────┐\n                    │  Symbol Graph      │\n                    │  (nodes + edges)   │\n                    └────────────────────┘\n```\n\n**Key Design Decision:** Extractors produce an **intermediate 
representation (IR)** — a flat list of `SymbolNode` and `SymbolEdge` objects — regardless of source language. This decouples parsing from storage.\n\n### 2.2 Chainable Query Builder (CQB)\n\n```\nQueryBuilder\n  ├── .repo(path)          → RepoScope\n  │     ├── .modules()     → ModuleScope\n  │     │     ├── .files() → FileScope\n  │     │     │     ├── .symbols() → SymbolScope\n  │     │     │     ├── .tables()  → TableScope\n  │     │     │     └── .sections()→ SectionScope\n  │     │     ├── .dependencies()  → ModuleScope (external deps)\n  │     │     └── .dependants()    → ModuleScope\n  │     ├── .files()       → FileScope (all files, no module filter)\n  │     ├── .symbols()     → SymbolScope (global search)\n  │     ├── .tables()      → TableScope\n  │     └── .docs()        → DocScope\n  ├── .symbol(name)        → SymbolScope (direct lookup)\n  ├── .table(name)         → TableScope\n  ├── .commit(hash)        → CommitScope\n  └── .workflow(name)      → WorkflowScope\n\nEvery Scope has:\n  ├── .filter(predicate)   → same Scope (adds WHERE clause)\n  ├── .sort(field, dir)    → same Scope\n  ├── .limit(n)            → same Scope\n  ├── .offset(n)           → same Scope\n  └── Terminal methods:\n        ├── .toArray()     → SymbolNode[]\n        ├── .toGraph()     → { nodes: [], edges: [] }\n        ├── .toMarkdown()  → string\n        ├── .toJSON()      → string\n        ├── .count()       → number\n        └── .exists()      → boolean\n```\n\n### 2.3 MCP Server Architecture\n\n```\n┌─────────────────────────────────────────────┐\n│              MCP Server                      │\n│                                              │\n│  ┌─────────────────────────────────────┐    │\n│  │         Transport Layer              │    │\n│  │  ┌──────────┐    ┌───────────────┐  │    │\n│  │  │  stdio   │    │  SSE/HTTP     │  │    │\n│  │  │ (default)│    │ (optional)    │  │    │\n│  │  └────┬─────┘    └──────┬────────┘  │    │\n│  
└───────┼──────────────────┼───────────┘    │\n│          └──────────┬───────┘                │\n│              ┌─────┴──────┐                  │\n│              │  Protocol  │                  │\n│              │  Handler   │                  │\n│              └─────┬──────┘                  │\n│                    │                         │\n│  ┌─────────────────┼─────────────────────┐  │\n│  │            Tool Dispatcher            │  │\n│  │  ┌──────────┐ ┌──────────┐ ┌────────┐ │  │\n│  │  │ Structure│ │ Search   │ │ Impact │ │  │\n│  │  │ Tools    │ │ Tools    │ │ Tools  │ │  │\n│  │  └────┬─────┘ └────┬─────┘ └───┬────┘ │  │\n│  │       └─────────────┼───────────┘      │  │\n│  │              ┌──────┴──────┐           │  │\n│  │              │    CQB      │           │  │\n│  │              │  (shared)   │           │  │\n│  │              └──────┬──────┘           │  │\n│  └─────────────────────┼──────────────────┘  │\n│                        │                     │\n│  ┌─────────────────────┼──────────────────┐  │\n│  │          Token Budget Manager          │  │\n│  │  - Estimates response token count      │  │\n│  │  - Truncates if over budget            │  │\n│  │  - Prioritizes: symbols > files > mods │  │\n│  └─────────────────────────────────────────┘  │\n└─────────────────────────────────────────────┘\n```\n\n### 2.4 Git Hook Pipeline\n\n```\npre-commit trigger\n       │\n       ▼\n┌──────────────────┐\n│ git diff --cached │\n│ --name-only       │\n└───────┬──────────┘\n        │ staged file paths\n        ▼\n┌──────────────────┐\n│ Content Hash     │  ← SHA256 of file content\n│ Check            │  ← Compare with stored hash\n└───────┬──────────┘\n        │ changed files only\n        ▼\n┌──────────────────┐\n│ Tree-Sitter      │  ← Parallel parse (worker threads)\n│ Batch Parse      │\n└───────┬──────────┘\n        │ new symbol IR\n        ▼\n┌──────────────────┐\n│ Graph Diff       │  ← Old symbols vs new symbols\n│ & Merge          │  ← Update 
nodes, edges, hashes\n└───────┬──────────┘\n        │\n        ▼\n┌──────────────────┐\n│ Validation       │  ← Check: broken exports, orphan imports,\n│ (optional)       │     missing type references\n└───────┬──────────┘\n        │\n   ┌────┴────┐\n   │         │\n   ▼         ▼\nPASS      FAIL\n   │         │\n   ▼         ▼\nContinue   Warn/Block\nCommit     (configurable)\n```\n\n## 3. Data Model (Graph Schema)\n\n### 3.1 Node Types\n\n```\n┌─────────────────────────────────────────────────────────────────┐\n│ NODE: repository                                                 │\n│   id:        string (record ID)                                  │\n│   name:      string                                              │\n│   root:      string (absolute path)                              │\n│   created_at: datetime                                           │\n│   updated_at: datetime                                           │\n│   stats:     { files: number, modules: number, symbols: number } │\n└─────────────────────────────────────────────────────────────────┘\n\n┌─────────────────────────────────────────────────────────────────┐\n│ NODE: module                                                     │\n│   id:            string                                          │\n│   name:          string                                          │\n│   path:          string (relative to repo root)                  │\n│   manifest_type: string (package.json | pyproject.toml | ...)    │\n│   language:      string (primary language)                       │\n│   is_root:       bool                                            │\n│   metadata:      { name, version, description, ... 
}             │\n└─────────────────────────────────────────────────────────────────┘\n\n┌─────────────────────────────────────────────────────────────────┐\n│ NODE: file                                                       │\n│   id:          string                                            │\n│   path:        string (relative to repo root)                    │\n│   module_id:   string (reference to module)                      │\n│   language:    string                                            │\n│   ext:         string                                            │\n│   size_bytes:  number                                            │\n│   content_hash: string (SHA256)                                  │\n│   line_count:  number                                            │\n│   parse_status: string (parsed | partial | failed | skipped)     │\n│   parse_error:  option<string>                                   │\n│   last_parsed: datetime                                          │\n│   git_last_modified: option<datetime>                            │\n│   git_blame_summary: option<{ author, date, commit_count }>      │\n└─────────────────────────────────────────────────────────────────┘\n\n┌─────────────────────────────────────────────────────────────────┐\n│ NODE: symbol (polymorphic by kind)                               │\n│   id:            string                                          │\n│   file_id:       string                                          │\n│   name:          string                                          │\n│   kind:          enum {                                          │\n│     function, method, constructor,                               │\n│     class, interface, type_alias, enum,                          │\n│     variable, constant, property,                                │\n│     parameter, generic_param,                                    │\n│     decorator, annotation,                                       │\n│     table, view, 
column, index, constraint,                      │\n│     foreign_key, stored_procedure,                               │\n│     import, export, re_export,                                   │\n│     namespace, module_decl,                                      │\n│     section, subsection,                                         │\n│     workflow_step, diagram_node,                                 │\n│     list_item, table_row                                         │\n│   }                                                             │\n│   signature:     option<string>  (full signature text)           │\n│   return_type:   option<string>                                  │\n│   start_line:    number                                          │\n│   end_line:      number                                          │\n│   start_col:     number                                          │\n│   end_col:       number                                          │\n│   docstring:     option<string>                                  │\n│   is_exported:   bool                                            │\n│   is_async:      option<bool>                                    │\n│   is_static:     option<bool>                                    │\n│   visibility:    option<enum { public, private, protected }>     │\n│   modifiers:     array<string>                                   │\n│   parent_symbol_id: option<string> (for nested symbols)          │\n│   metadata:      object (language-specific extras)               │\n│     // For tables: { schema, engine, columns: [...] }           │\n│     // For classes: { implements: [...], extends: ... }         │\n│     // For functions: { params: [...], generics: [...] 
}        │\n│     // For sections: { level, anchor_id }                       │\n└─────────────────────────────────────────────────────────────────┘\n\n┌─────────────────────────────────────────────────────────────────┐\n│ NODE: commit                                                     │\n│   id:        string                                             │\n│   hash:      string (full SHA)                                  │\n│   short_hash: string (7 char)                                   │\n│   message:   string                                             │\n│   author:    string                                             │\n│   email:     string                                             │\n│   date:      datetime                                           │\n│   branch:    string                                             │\n│   tags:      array<string>                                      │\n└─────────────────────────────────────────────────────────────────┘\n\n┌─────────────────────────────────────────────────────────────────┐\n│ NODE: dependency (external)                                      │\n│   id:          string                                            │\n│   module_id:   string (which module depends on it)               │\n│   name:        string (npm package name, pip package, etc.)      
│\n│   version:     string (resolved version)                         │\n│   dev:         bool                                              │\n│   source:      string (npm, pip, cargo, go modules, maven)       │\n└─────────────────────────────────────────────────────────────────┘\n```\n\n### 3.2 Edge Types\n\n```\nEDGE: contains\n  FROM: repository  → TO: module\n  FROM: module      → TO: file\n  FROM: file        → TO: symbol\n  FROM: symbol      → TO: symbol (nested: class → method)\n\nEDGE: imports\n  FROM: file    → TO: file       (file-level import)\n  FROM: module  → TO: module     (module-level dependency)\n  FROM: symbol  → TO: symbol     (symbol-level import)\n  METADATA: { is_type_only: bool, is_default: bool, alias: option<string> }\n\nEDGE: exports\n  FROM: file   → TO: symbol\n  FROM: symbol → TO: symbol       (re-export chain)\n  METADATA: { is_default: bool, is_reexport: bool, alias: option<string> }\n\nEDGE: calls\n  FROM: symbol (function/method) → TO: symbol (function/method)\n  METADATA: { line: number, is_async: bool, call_type: enum { direct, indirect, dynamic } }\n\nEDGE: implements\n  FROM: symbol (class) → TO: symbol (interface)\n  METADATA: { is_partial: bool }\n\nEDGE: inherits\n  FROM: symbol (class/interface) → TO: symbol (class/interface)\n  METADATA: { is_interface_inheritance: bool }\n\nEDGE: modifies\n  FROM: symbol (function) → TO: symbol (variable/table/column)\n\nEDGE: reads\n  FROM: symbol (function) → TO: symbol (variable/table/column)\n\nEDGE: references\n  FROM: symbol → TO: symbol (generic \"uses\" relationship)\n  METADATA: { context: string }\n\nEDGE: depends_on\n  FROM: module → TO: module (transitive closure of imports)\n  FROM: file   → TO: file\n  METADATA: { is_transitive: bool, depth: number }\n\nEDGE: depended_by  (computed reverse of depends_on)\n\nEDGE: modified_in\n  FROM: file   → TO: commit\n  METADATA: { change_type: enum { added, modified, deleted, renamed } }\n\nEDGE: authored_by\n  FROM: file/symbol → TO: 
commit (latest commit touching this artifact)\n\nEDGE: belongs_to_workflow\n  FROM: symbol → TO: symbol (workflow_step)\n\nEDGE: workflow_transition\n  FROM: symbol (workflow_step) → TO: symbol (workflow_step)\n  METADATA: { condition: option<string>, action: option<string> }\n\nEDGE: diagram_edge\n  FROM: symbol (diagram_node) → TO: symbol (diagram_node)\n  METADATA: { label: string, style: string, type: enum { solid, dashed, dotted, bold } }\n\nEDGE: foreign_key\n  FROM: symbol (column) → TO: symbol (table)\n  METADATA: { constraint_name: string, on_delete: string, on_update: string }\n\nEDGE: column_of\n  FROM: symbol (column/index/constraint) → TO: symbol (table)\n```\n\n### 3.3 Indexes\n\n```\nDEFINE INDEX idx_file_path      ON file   FIELDS path         UNIQUE\nDEFINE INDEX idx_file_hash      ON file   FIELDS content_hash\nDEFINE INDEX idx_file_module    ON file   FIELDS module_id\nDEFINE INDEX idx_symbol_name    ON symbol FIELDS name\nDEFINE INDEX idx_symbol_kind    ON symbol FIELDS kind\nDEFINE INDEX idx_symbol_file    ON symbol FIELDS file_id\nDEFINE INDEX idx_symbol_export  ON symbol FIELDS is_exported\nDEFINE INDEX idx_module_path    ON module FIELDS path          UNIQUE\nDEFINE INDEX idx_commit_hash    ON commit FIELDS hash          UNIQUE\nDEFINE INDEX idx_dep_name       ON dependency FIELDS name, module_id\n```\n\n## 4. 
Technology Stack\n\n| Component | Technology | Rationale |\n|---|---|---|\n| Runtime | Node.js 20+ (ESM) | Universal, tree-sitter bindings available, MCP SDK native |\n| Tree-Sitter | `tree-sitter` + language grammars | Industry standard, incremental parsing, multi-language |\n| Graph DB | SurrealDB v2 (embedded/RocksDB) | Native graph queries, schemaful, embedded mode, no server |\n| Fallback DB | better-sqlite3 | Zero-config fallback if SurrealDB unavailable |\n| MCP | `@modelcontextprotocol/sdk` | Official SDK, stdio + SSE transport |\n| CLI | `commander` | Battle-tested CLI framework |\n| Git | `simple-git` | Promise-based git operations |\n| File Watch | `chokidar` | Cross-platform, efficient |\n| Logging | `pino` | Structured, fast |\n| Testing | `vitest` + `memfs` | Fast, in-memory FS for unit tests |\n| Bundling | `tsup` | ESM + CJS dual output, tree-shaking |\n| Markdown | `unified` + `remark` + `rehype` | Pluggable markdown AST pipeline |\n| Mermaid | `mermaid` (headless) | Parse mermaid diagrams to structured data |\n\n## 5. Integration Architecture\n\n### 5.1 MCP Integration Points\n\n```\nClaude Code / Codex / OpenCode\n         │\n         │  MCP Protocol (JSON-RPC 2.0 over stdio)\n         │\n    ┌────┴─────┐\n    │ MCP      │\n    │ Server   │\n    └────┬─────┘\n         │\n    ┌────┴──────────────────────────────────┐\n    │              Tool Calls               │\n    │                                       │\n    │  1. query_repo_structure              │\n    │     → Returns module tree + stats     │\n    │                                       │\n    │  2. query_symbol { name, scope }      │\n    │     → Symbol node + edges             │\n    │                                       │\n    │  3. get_impact_analysis { symbol_id } │\n    │     → Dependents + transitive closure │\n    │                                       │\n    │  4. 
search_symbols { query, filters } │\n    │     → Fuzzy match on name/signature   │\n    │                                       │\n    │  5. get_workflow { doc, section }     │\n    │     → Structured workflow + links     │\n    │                                       │\n    │  6. get_git_history { path, limit }   │\n    │     → Commit chain for file/symbol    │\n    │                                       │\n    │  7. execute_workflow_template {       │\n    │       type, params }                  │\n    │     → Structured analysis result      │\n    │                                       │\n    │  8. get_dependencies { module_id }    │\n    │     → Internal + external deps        │\n    │                                       │\n    │  9. get_dependants { symbol_id }      │\n    │     → Reverse dependency chain        │\n    │                                       │\n    │  10. get_context_for_files {          │\n    │        paths, max_tokens }            │\n    │      → Token-budget-aware context     │\n    │                                       │\n    └───────────────────────────────────────┘\n```\n\n### 5.2 Claude Code MCP Config (auto-generated)\n\n```json\n{\n  \"mcpServers\": {\n    \"tokenzip\": {\n      \"command\": \"npx\",\n      \"args\": [\"tokenzip\", \"serve\", \"--cwd\", \"/path/to/project\"],\n      \"env\": {}\n    }\n  }\n}\n```\n\n## 6. Security Considerations\n\n- **No network**: All data stays local. If the HTTP transport is used, SurrealDB binds to `127.0.0.1` only.\n- **No code execution**: The graph stores metadata only. Stored data is never passed to `eval` or `require`.\n- **Path traversal protection**: All file paths are resolved and canonicalized before storage.\n- **Git hook safety**: Hooks are read-only from git's perspective (they never force-push and never amend).\n- **`.tokenzip/` in `.gitignore`**: The entry is appended automatically, so the directory is never committed.\n- **Token budget**: MCP responses are capped at a configurable token limit to prevent context overflow.\n\n## 7. 
Deployment Model\n\n```\nLocal Developer Machine\n│\n├── ~/.tokenzip/\n│   ├── config.json          # Global config\n│   ├── surrealdb/           # Shared SurrealDB binary (if not system-installed)\n│   └── cache/               # Cross-project cache\n│\n└── <project-root>/\n    ├── .tokenzip/\n    │   ├── db/              # SurrealDB data directory\n    │   │   ├── data.db      # RocksDB storage\n    │   │   └── lock         # Process lock\n    │   ├── config.json      # Project-specific config\n    │   │   ├── languages: [...]\n    │   │   ├── excluded: [...]\n    │   │   ├── hooks: { preCommit: \"warn\" | \"block\" | \"off\" }\n    │   │   └── mcp: { maxTokens: 8000, transport: \"stdio\" }\n    │   └── state.json       # Parse state, last commit, version\n    │\n    ├── .git/\n    │   └── hooks/\n    │       ├── pre-commit   # Appended tokenzip hook\n    │       └── post-commit  # Appended tokenzip hook\n    │\n    └── .gitignore           # Contains .tokenzip/\n```\n\n---\n\n# 🔧 LLD — Low-Level Design\n\n## 1. 
Module Structure\n\n```\ntokenzip/\n├── src/\n│   ├── index.ts                    # Public API entry point\n│   │\n│   ├── cli/                        # CLI layer\n│   │   ├── index.ts                # Commander setup\n│   │   ├── commands/\n│   │   │   ├── init.ts             # tokenzip init\n│   │   │   ├── parse.ts            # tokenzip parse [--full | --incremental]\n│   │   │   ├── query.ts            # tokenzip query <cqb-expression>\n│   │   │   ├── status.ts           # tokenzip status\n│   │   │   ├── serve.ts            # tokenzip serve [--transport stdio|sse] [--port 3000]\n│   │   │   ├── hooks.ts            # tokenzip hooks install|uninstall\n│   │   │   └── clean.ts            # tokenzip clean\n│   │   └── utils/\n│   │       └── spinner.ts\n│   │\n│   ├── mcp/                        # MCP server layer\n│   │   ├── server.ts               # MCP server creation & setup\n│   │   ├── transport/\n│   │   │   ├── stdio.ts\n│   │   │   └── sse.ts\n│   │   ├── tools/\n│   │   │   ├── registry.ts         # Tool registration\n│   │   │   ├── structure.ts        # query_repo_structure, query_module\n│   │   │   ├── symbol.ts           # query_symbol, search_symbols\n│   │   │   ├── dependency.ts       # get_dependencies, get_dependants\n│   │   │   ├── impact.ts           # get_impact_analysis\n│   │   │   ├── git.ts              # get_git_history\n│   │   │   ├── workflow.ts         # get_workflow, execute_workflow_template\n│   │   │   └── context.ts          # get_context_for_files\n│   │   ├── resources/\n│   │   │   ├── registry.ts\n│   │   │   ├── repo.ts\n│   │   │   ├── module.ts\n│   │   │   ├── file.ts\n│   │   │   └── symbol.ts\n│   │   └── token-budget.ts         # Token estimation & truncation\n│   │\n│   ├── query/                      # Chainable Query Builder\n│   │   ├── builder.ts              # Base QueryBuilder class\n│   │   ├── scopes/\n│   │   │   ├── repo-scope.ts\n│   │   │   ├── module-scope.ts\n│   │   │   ├── file-scope.ts\n│   │   │ 
  ├── symbol-scope.ts\n│   │   │   ├── table-scope.ts\n│   │   │   ├── commit-scope.ts\n│   │   │   ├── doc-scope.ts\n│   │   │   └── workflow-scope.ts\n│   │   ├── filters.ts              # Filter predicate parser\n│   │   ├── translators/\n│   │   │   ├── surrealql.ts        # CQB → SurrealQL translation\n│   │   │   └── sql.ts              # CQB → SQL translation (SQLite fallback)\n│   │   └── types.ts\n│   │\n│   ├── engine/                     # Core engine layer\n│   │   ├── indexer.ts              # Full & incremental indexing orchestrator\n│   │   ├── differ.ts               # Graph diff: old symbols vs new symbols\n│   │   ├── merger.ts               # Merge diff into graph\n│   │   ├── validator.ts            # Reference integrity validation\n│   │   ├── module-detector.ts      # Detect module boundaries\n│   │   └── language-detector.ts    # Detect language from extension + content\n│   │\n│   ├── extractor/                  # Tree-sitter extraction layer\n│   │   ├── base-extractor.ts       # Abstract extractor interface\n│   │   ├── registry.ts             # Language → extractor mapping\n│   │   ├── code/\n│   │   │   ├── javascript.ts       # JS/JSX extractor\n│   │   │   ├── typescript.ts       # TS/TSX extractor\n│   │   │   ├── python.ts\n│   │   │   ├── go.ts\n│   │   │   ├── rust.ts\n│   │   │   ├── java.ts\n│   │   │   └── kotlin.ts\n│   │   ├── sql/\n│   │   │   └── sql.ts              # SQL extractor (tables, columns, FKs)\n│   │   ├── markdown/\n│   │   │   ├── markdown.ts         # Markdown structure extractor\n│   │   │   ├── mermaid.ts          # Mermaid diagram parser\n│   │   │   └── sections.ts         # Section type classifier\n│   │   └── types.ts                # SymbolIR, EdgeIR types\n│   │\n│   ├── storage/                    # Storage abstraction layer\n│   │   ├── interface.ts            # IStore interface\n│   │   ├── surreal/\n│   │   │   ├── connection.ts       # Connection pool & lifecycle\n│   │   │   ├── migrations.ts      
 # Schema migration\n│   │   │   ├── queries/\n│   │   │   │   ├── nodes.ts\n│   │   │   │   ├── edges.ts\n│   │   │   │   ├── graph.ts\n│   │   │   │   └── search.ts\n│   │   │   └── store.ts            # SurrealStore implements IStore\n│   │   ├── sqlite/\n│   │   │   ├── schema.ts           # Table creation\n│   │   │   ├── queries/\n│   │   │   │   ├── nodes.ts\n│   │   │   │   ├── edges.ts\n│   │   │   │   └── graph.ts\n│   │   │   └── store.ts            # SQLiteStore implements IStore\n│   │   ├── memory/\n│   │   │   └── store.ts            # MemoryStore for testing\n│   │   └── factory.ts              # StoreFactory: config → IStore\n│   │\n│   ├── hooks/                      # Git hook layer\n│   │   ├── installer.ts            # Install hooks into .git/hooks/\n│   │   ├── pre-commit.ts           # Pre-commit logic\n│   │   ├── post-commit.ts          # Post-commit logic\n│   │   └── detector.ts             # Detect staged files\n│   │\n│   ├── workflows/                  # Workflow template engine\n│   │   ├── engine.ts               # Workflow executor\n│   │   ├── registry.ts             # Workflow template registry\n│   │   └── templates/\n│   │       ├── create-module.ts\n│   │       ├── update-module.ts\n│   │       ├── implement-feature.ts\n│   │       ├── upgrade-feature.ts\n│   │       └── bug-fix.ts\n│   │\n│   ├── utils/\n│   │   ├── logger.ts\n│   │   ├── hash.ts                 # Content hashing (SHA256)\n│   │   ├── path.ts                 # Path resolution & normalization\n│   │   ├── tokens.ts               # Token estimation (chars/4 for code)\n│   │   ├── workers.ts              # Worker thread pool for parsing\n│   │   └── version.ts\n│   │\n│   └── types/\n│       ├── graph.ts                # All node & edge types\n│       ├── extractor.ts            # Extractor IR types\n│       ├── query.ts                # Query builder types\n│       └── config.ts               # Configuration types\n│\n├── grammars/                       # 
Tree-sitter WASM grammars (bundled)\n│   ├── tree-sitter-javascript.wasm\n│   ├── tree-sitter-typescript.wasm\n│   ├── tree-sitter-python.wasm\n│   ├── tree-sitter-go.wasm\n│   ├── tree-sitter-rust.wasm\n│   ├── tree-sitter-java.wasm\n│   ├── tree-sitter-kotlin.wasm\n│   └── tree-sitter-sql.wasm\n│\n├── tests/\n│   ├── unit/\n│   │   ├── extractor/\n│   │   │   ├── javascript.test.ts\n│   │   │   ├── typescript.test.ts\n│   │   │   ├── python.test.ts\n│   │   │   ├── sql.test.ts\n│   │   │   └── markdown.test.ts\n│   │   ├── query/\n│   │   │   └── builder.test.ts\n│   │   ├── engine/\n│   │   │   ├── differ.test.ts\n│   │   │   ├── merger.test.ts\n│   │   │   └── module-detector.test.ts\n│   │   ├── storage/\n│   │   │   └── memory-store.test.ts\n│   │   └── hooks/\n│   │       └── detector.test.ts\n│   ├── integration/\n│   │   ├── full-parse.test.ts\n│   │   ├── incremental-parse.test.ts\n│   │   ├── mcp-server.test.ts\n│   │   └── git-hook.test.ts\n│   ├── fixtures/\n│   │   ├── js-project/\n│   │   ├── ts-monorepo/\n│   │   ├── python-project/\n│   │   ├── sql-project/\n│   │   └── mixed-project/\n│   └── e2e/\n│       └── claude-code.test.ts\n│\n├── package.json\n├── tsconfig.json\n├── tsup.config.ts\n└── vitest.config.ts\n```\n\n## 2. Detailed Component Design\n\n### 2.1 Storage Abstraction (IStore)\n\n```typescript\n// src/storage/interface.ts\n\nimport type { \n  RepositoryNode, ModuleNode, FileNode, SymbolNode, \n  CommitNode, DependencyNode,\n  ContainsEdge, ImportsEdge, ExportsEdge, CallsEdge,\n  ImplementsEdge, InheritsEdge, ModifiesEdge, ReadsEdge,\n  ReferencesEdge, DependsOnEdge, ModifiedInEdge,\n  ForeignKeyEdge, ColumnOfEdge,\n  // ... 
all edge types\n} from '../types/graph';\n\nexport interface GraphNode {\n  id: string;\n  type: 'repository' | 'module' | 'file' | 'symbol' | 'commit' | 'dependency';\n  [key: string]: unknown;\n}\n\nexport interface GraphEdge {\n  id: string;\n  type: string;\n  from: string;\n  to: string;\n  [key: string]: unknown;\n}\n\nexport interface GraphResult {\n  nodes: GraphNode[];\n  edges: GraphEdge[];\n}\n\nexport interface StoreStats {\n  nodeCount: Record<string, number>;\n  edgeCount: Record<string, number>;\n  dbSizeBytes: number;\n}\n\nexport interface IStore {\n  // Lifecycle\n  initialize(): Promise<void>;\n  close(): Promise<void>;\n  migrate(): Promise<void>;\n  clear(): Promise<void>;\n  stats(): Promise<StoreStats>;\n\n  // Node CRUD\n  createNode<T extends GraphNode>(node: T): Promise<T>;\n  createNodes<T extends GraphNode>(nodes: T[]): Promise<T[]>;\n  getNode<T extends GraphNode>(id: string): Promise<T | null>;\n  getNodes(ids: string[]): Promise<GraphNode[]>;\n  updateNode<T extends GraphNode>(id: string, patch: Partial<T>): Promise<T>;\n  deleteNode(id: string): Promise<void>;\n  deleteNodes(ids: string[]): Promise<void>;\n\n  // Edge CRUD\n  createEdge<T extends GraphEdge>(edge: T): Promise<T>;\n  createEdges<T extends GraphEdge>(edges: T[]): Promise<T[]>;\n  getEdges(from: string, type?: string): Promise<GraphEdge[]>;\n  getEdgesTo(to: string, type?: string): Promise<GraphEdge[]>;\n  deleteEdges(from: string, type?: string): Promise<void>;\n\n  // Graph Queries\n  query(surrealQL: string, vars?: Record<string, unknown>): Promise<unknown[]>;\n  graphTraversal(\n    startId: string,\n    edgeTypes: string[],\n    direction: 'outbound' | 'inbound' | 'both',\n    depth?: number,\n    filter?: string\n  ): Promise<GraphResult>;\n\n  // Bulk Operations\n  batchUpsert(nodes: GraphNode[], edges: GraphEdge[]): Promise<void>;\n  \n  // Search\n  searchNodes(\n    type: string, \n    field: string, \n    query: string, \n    limit?: number\n  ): 
Promise<GraphNode[]>;\n\n  // Transactions\n  transaction<T>(fn: (store: IStore) => Promise<T>): Promise<T>;\n}\n```\n\n### 2.2 Tree-Sitter Extractor Interface\n\n```typescript\n// src/extractor/base-extractor.ts\n\nimport { createHash } from 'node:crypto';\n// node-tree-sitter exposes Parser as its default export; Tree and SyntaxNode live in the Parser namespace\nimport Parser from 'tree-sitter';\nimport { SymbolIR, EdgeIR } from './types';\n\nexport interface ExtractionResult {\n  symbols: SymbolIR[];\n  edges: EdgeIR[];\n  parseErrors: ParseError[];\n}\n\nexport interface ParseError {\n  line: number;\n  column: number;\n  message: string;\n}\n\nexport interface ExtractorContext {\n  filePath: string;\n  relativePath: string;\n  content: string;\n  contentHash: string;\n  tree: Parser.Tree;\n  language: string;\n  moduleId: string;\n}\n\nexport abstract class BaseExtractor {\n  abstract readonly language: string;\n  abstract readonly extensions: string[];\n\n  /**\n   * Extract symbols and edges from a parsed tree.\n   * Called after tree-sitter has parsed the file.\n   */\n  abstract extract(ctx: ExtractorContext): ExtractionResult;\n\n  /**\n   * Post-process extraction results.\n   * Resolve internal references, compute derived edges.\n   * Default implementation does nothing; subclasses can override.\n   */\n  postProcess(\n    symbols: SymbolIR[], \n    edges: EdgeIR[], \n    ctx: ExtractorContext\n  ): { symbols: SymbolIR[]; edges: EdgeIR[] } {\n    return { symbols, edges };\n  }\n\n  /**\n   * Generate a stable ID for a symbol.\n   * Must be deterministic for the same symbol in the same file.\n   */\n  generateSymbolId(\n    filePath: string, \n    symbolName: string, \n    kind: string, \n    startLine: number\n  ): string {\n    // Format: sym:<filepath-hash>:<name>:<kind>:<line>\n    const pathHash = this.hashPath(filePath);\n    return `sym:${pathHash}:${symbolName}:${kind}:${startLine}`;\n  }\n\n  private hashPath(filePath: string): string {\n    // First 8 hex chars of the SHA-256 digest of the relative path\n    return createHash('sha256')\n      .update(filePath)\n      .digest('hex')\n      .slice(0, 8);\n  
}\n\n  /**\n   * Walk the tree-sitter AST with a visitor pattern.\n   * Utility method for subclasses.\n   */\n  protected walk(\n    node: Parser.SyntaxNode, \n    visitors: Record<string, (node: Parser.SyntaxNode) => void>\n  ): void {\n    const visitor = visitors[node.type];\n    if (visitor) {\n      visitor(node);\n    }\n    for (let i = 0; i < node.childCount; i++) {\n      this.walk(node.child(i)!, visitors);\n    }\n  }\n\n  /**\n   * Extract docstring/JSDoc/comment attached to a node.\n   */\n  protected extractDocstring(node: Parser.SyntaxNode, content: string): string | null {\n    // Look for preceding comment nodes\n    const prev = node.previousNamedSibling;\n    if (prev && (prev.type === 'comment' || prev.type === 'block_comment' \n        || prev.type === 'docstring' || prev.type === 'jsdoc')) {\n      return content.slice(prev.startIndex, prev.endIndex).trim();\n    }\n    return null;\n  }\n}\n```\n\n### 2.3 TypeScript Extractor (Detailed Example)\n\n```typescript\n// src/extractor/code/typescript.ts\n\nimport { BaseExtractor, ExtractorContext, ExtractionResult, ParseError } from '../base-extractor';\nimport { SymbolIR, EdgeIR } from '../types';\n\nexport class TypeScriptExtractor extends BaseExtractor {\n  readonly language = 'typescript';\n  readonly extensions = ['.ts', '.tsx', '.mts', '.cts'];\n\n  extract(ctx: ExtractorContext): ExtractionResult {\n    const symbols: SymbolIR[] = [];\n    const edges: EdgeIR[] = [];\n    const parseErrors: ParseError[] = [];\n\n    // Collect parse errors\n    this.collectErrors(ctx.tree.rootNode, parseErrors, ctx.content);\n\n    // Visit top-level and nested declarations\n    this.walk(ctx.tree.rootNode, {\n      // Functions\n      'function_declaration': (node) => {\n        const name = this.getName(node);\n        if (!name) return;\n        symbols.push({\n          id: this.generateSymbolId(ctx.relativePath, name, 'function', node.startPosition.row + 1),\n          fileId: `file:${ctx.relativePath}`,\n          name,\n          kind: 'function',\n         
 signature: this.getSignature(node, ctx.content),\n          returnType: this.getReturnType(node),\n          startLine: node.startPosition.row + 1,\n          endLine: node.endPosition.row + 1,\n          startCol: node.startPosition.column,\n          endCol: node.endPosition.column,\n          docstring: this.extractDocstring(node, ctx.content),\n          isExported: this.isExported(node),\n          isAsync: this.hasModifier(node, 'async'),\n          isStatic: false,\n          visibility: this.getVisibility(node),\n          modifiers: this.getModifiers(node),\n          metadata: {\n            params: this.extractParams(node, ctx.content),\n            generics: this.extractGenerics(node, ctx.content),\n            typeParams: this.extractTypeParams(node),\n          },\n        });\n      },\n\n      // Arrow functions assigned to variables.\n      // Note: in the tree-sitter JavaScript/TypeScript grammars only `var` parses as\n      // 'variable_declaration'; `const`/`let` parse as 'lexical_declaration', so this\n      // handler should be registered under that key as well.\n      'variable_declaration': (node) => {\n        // Declarators are plain named children of the declaration, not a 'declarator' field\n        const declarator = node.namedChildren.find((c) => c.type === 'variable_declarator');\n        if (!declarator) return;\n        const value = declarator.childForFieldName('value');\n        if (!value || (value.type !== 'arrow_function' && value.type !== 'function_expression')) return;\n        \n        const name = this.getName(declarator);\n        if (!name) return;\n\n        const funcKind = value.type === 'arrow_function' ? 
'function' : 'function';\n        symbols.push({\n          id: this.generateSymbolId(ctx.relativePath, name, funcKind, node.startPosition.row + 1),\n          fileId: `file:${ctx.relativePath}`,\n          name,\n          kind: funcKind,\n          signature: this.getSignature(value, ctx.content),\n          returnType: this.getReturnType(value),\n          startLine: node.startPosition.row + 1,\n          endLine: node.endPosition.row + 1,\n          startCol: node.startPosition.column,\n          endCol: node.endPosition.column,\n          docstring: this.extractDocstring(node, ctx.content),\n          isExported: this.isExported(node),\n          isAsync: this.hasModifier(value, 'async'),\n          isStatic: false,\n          visibility: this.getVisibility(node),\n          modifiers: this.getModifiers(node),\n          metadata: {\n            isArrow: value.type === 'arrow_function',\n            params: this.extractParams(value, ctx.content),\n            generics: this.extractGenerics(value, ctx.content),\n          },\n        });\n      },\n\n      // Classes\n      'class_declaration': (node) => {\n        const name = this.getName(node);\n        if (!name) return;\n        \n        const heritage = this.extractHeritage(node); // extends, implements\n        const symbolId = this.generateSymbolId(ctx.relativePath, name, 'class', node.startPosition.row + 1);\n\n        symbols.push({\n          id: symbolId,\n          fileId: `file:${ctx.relativePath}`,\n          name,\n          kind: 'class',\n          signature: this.getSignature(node, ctx.content),\n          startLine: node.startPosition.row + 1,\n          endLine: node.endPosition.row + 1,\n          startCol: node.startPosition.column,\n          endCol: node.endPosition.column,\n          docstring: this.extractDocstring(node, ctx.content),\n          isExported: this.isExported(node),\n          isStatic: false,\n          visibility: this.getVisibility(node),\n          modifiers: 
this.getModifiers(node),\n          metadata: {\n            extends: heritage.extends,\n            implements: heritage.implements,\n            generics: this.extractGenerics(node, ctx.content),\n          },\n        });\n\n        // Create inheritance edges\n        if (heritage.extends) {\n          edges.push({\n            type: 'inherits',\n            from: symbolId,\n            to: `sym:unknown:${heritage.extends}:class:0`, // resolved later\n            metadata: { is_interface_inheritance: false },\n            isResolved: false,\n          });\n        }\n        for (const impl of heritage.implements) {\n          edges.push({\n            type: 'implements',\n            from: symbolId,\n            to: `sym:unknown:${impl}:interface:0`,\n            metadata: { is_partial: false },\n            isResolved: false,\n          });\n        }\n      },\n\n      // Interfaces\n      'interface_declaration': (node) => {\n        const name = this.getName(node);\n        if (!name) return;\n\n        const extendsList = this.extractInterfaceExtends(node);\n        const symbolId = this.generateSymbolId(ctx.relativePath, name, 'interface', node.startPosition.row + 1);\n\n        symbols.push({\n          id: symbolId,\n          fileId: `file:${ctx.relativePath}`,\n          name,\n          kind: 'interface',\n          signature: this.getSignature(node, ctx.content),\n          startLine: node.startPosition.row + 1,\n          endLine: node.endPosition.row + 1,\n          startCol: node.startPosition.column,\n          endCol: node.endPosition.column,\n          docstring: this.extractDocstring(node, ctx.content),\n          isExported: this.isExported(node),\n          isStatic: false,\n          visibility: 'public',\n          modifiers: this.getModifiers(node),\n          metadata: {\n            extends: extendsList,\n            generics: this.extractGenerics(node, ctx.content),\n            members: this.extractInterfaceMembers(node, 
ctx.content, ctx.relativePath),\n          },\n        });\n\n        for (const ext of extendsList) {\n          edges.push({\n            type: 'inherits',\n            from: symbolId,\n            to: `sym:unknown:${ext}:interface:0`,\n            metadata: { is_interface_inheritance: true },\n            isResolved: false,\n          });\n        }\n      },\n\n      // Type aliases\n      'type_alias_declaration': (node) => {\n        const name = this.getName(node);\n        if (!name) return;\n        symbols.push({\n          id: this.generateSymbolId(ctx.relativePath, name, 'type_alias', node.startPosition.row + 1),\n          fileId: `file:${ctx.relativePath}`,\n          name,\n          kind: 'type_alias',\n          signature: this.getTypeAliasBody(node, ctx.content),\n          startLine: node.startPosition.row + 1,\n          endLine: node.endPosition.row + 1,\n          startCol: node.startPosition.column,\n          endCol: node.endPosition.column,\n          docstring: this.extractDocstring(node, ctx.content),\n          isExported: this.isExported(node),\n          isStatic: false,\n          visibility: 'public',\n          modifiers: [],\n          metadata: {\n            generics: this.extractGenerics(node, ctx.content),\n          },\n        });\n      },\n\n      // Enums\n      'enum_declaration': (node) => {\n        const name = this.getName(node);\n        if (!name) return;\n        const members = this.extractEnumMembers(node, ctx.content);\n        symbols.push({\n          id: this.generateSymbolId(ctx.relativePath, name, 'enum', node.startPosition.row + 1),\n          fileId: `file:${ctx.relativePath}`,\n          name,\n          kind: 'enum',\n          startLine: node.startPosition.row + 1,\n          endLine: node.endPosition.row + 1,\n          startCol: node.startPosition.column,\n          endCol: node.endPosition.column,\n          docstring: this.extractDocstring(node, ctx.content),\n          isExported: 
this.isExported(node),\n          isStatic: false,\n          visibility: 'public',\n          modifiers: this.getModifiers(node),\n          metadata: { members },\n        });\n      },\n\n      // Imports (file-level)\n      'import_statement': (node) => {\n        const importInfo = this.extractImport(node, ctx.content);\n        if (!importInfo) return;\n        \n        // Store as symbol for tracking\n        symbols.push({\n          id: this.generateSymbolId(ctx.relativePath, importInfo.source, 'import', node.startPosition.row + 1),\n          fileId: `file:${ctx.relativePath}`,\n          name: importInfo.source,\n          kind: 'import',\n          startLine: node.startPosition.row + 1,\n          endLine: node.endPosition.row + 1,\n          startCol: node.startPosition.column,\n          endCol: node.endPosition.column,\n          isExported: false,\n          modifiers: [],\n          metadata: {\n            source: importInfo.source,\n            specifiers: importInfo.specifiers,\n            isTypeOnly: importInfo.isTypeOnly,\n            isDefault: importInfo.isDefault,\n          },\n        });\n\n        // Create import edge\n        edges.push({\n          type: 'imports',\n          from: `file:${ctx.relativePath}`,\n          to: `file:${this.resolveImportPath(ctx.relativePath, importInfo.source)}`,\n          metadata: {\n            is_type_only: importInfo.isTypeOnly,\n            is_default: importInfo.isDefault,\n            specifiers: importInfo.specifiers,\n          },\n          isResolved: false,\n        });\n      },\n\n      // Export statements\n      'export_statement': (node) => {\n        // Handle: export { foo, bar } from './module'\n        const exportInfo = this.extractReExport(node, ctx.content);\n        if (exportInfo) {\n          for (const spec of exportInfo.specifiers) {\n            edges.push({\n              type: 'exports',\n              from: `file:${ctx.relativePath}`,\n              to: 
`file:${this.resolveImportPath(ctx.relativePath, exportInfo.source)}`,\n              metadata: {\n                is_reexport: true,\n                is_default: spec.isDefault,\n                alias: spec.alias,\n                name: spec.name,\n              },\n              isResolved: false,\n            });\n          }\n        }\n      },\n\n      // Method definitions inside classes\n      'method_definition': (node) => {\n        // This is handled inside class_declaration visitor\n        // We capture it there for parent_symbol_id linking\n      },\n\n      // Property definitions inside classes\n      'public_field_definition': (node) => {\n        // Handled inside class_declaration\n      },\n    });\n\n    // Post-process: resolve parent_symbol_id for nested symbols\n    // Post-process: mark exported symbols\n    const processed = this.postProcess(symbols, edges, ctx);\n\n    return {\n      symbols: processed.symbols,\n      edges: processed.edges,\n      parseErrors,\n    };\n  }\n\n  // ... 
helper methods (getName, getSignature, extractParams, etc.)\n  // Each is ~10-20 lines using tree-sitter child navigation\n}\n```\n\n### 2.4 SQL Extractor\n\n```typescript\n// src/extractor/sql/sql.ts\n\nimport { BaseExtractor, ExtractorContext, ExtractionResult, ParseError } from '../base-extractor';\nimport { SymbolIR, EdgeIR } from '../types';\n\nexport class SQLExtractor extends BaseExtractor {\n  readonly language = 'sql';\n  readonly extensions = ['.sql'];\n\n  extract(ctx: ExtractorContext): ExtractionResult {\n    const symbols: SymbolIR[] = [];\n    const edges: EdgeIR[] = [];\n    const parseErrors: ParseError[] = [];\n\n    this.walk(ctx.tree.rootNode, {\n      'create_table': (node) => {\n        const tableName = this.getTableName(node);\n        if (!tableName) return;\n        \n        const tableId = this.generateSymbolId(\n          ctx.relativePath, tableName, 'table', node.startPosition.row + 1\n        );\n\n        // Extract columns, constraints, and indexes\n        const columns = this.extractColumns(node, ctx.content, ctx.relativePath, tableId);\n        const constraints = this.extractConstraints(node, ctx.content, ctx.relativePath, tableId);\n        const indexes = this.extractIndexes(node, ctx.content, ctx.relativePath, tableId);\n\n        symbols.push({\n          id: tableId,\n          fileId: `file:${ctx.relativePath}`,\n          name: tableName,\n          kind: 'table',\n          signature: this.getTableSignature(node, ctx.content),\n          startLine: node.startPosition.row + 1,\n          endLine: node.endPosition.row + 1,\n          startCol: node.startPosition.column,\n          endCol: node.endPosition.column,\n          docstring: this.extractTableComment(node, ctx.content),\n          isExported: false,\n          modifiers: [],\n          metadata: {\n            schema: this.getSchemaName(node),\n            engine: this.getEngine(node),\n            columns: columns.map(c => c.name),\n            columnCount: columns.length,\n          },\n        });\n\n        symbols.push(...columns, ...constraints, ...indexes);\n\n        // Create column_of edges\n        for (const col of columns) {\n          
edges.push({ type: 'column_of', from: col.id, to: tableId });\n        }\n        for (const idx of indexes) {\n          edges.push({ type: 'column_of', from: idx.id, to: tableId });\n        }\n        for (const con of constraints) {\n          edges.push({ type: 'column_of', from: con.id, to: tableId });\n        }\n\n        // Extract foreign keys and create FK edges\n        const fks = this.extractForeignKeys(node, ctx.content);\n        for (const fk of fks) {\n          const fromColId = this.generateSymbolId(\n            ctx.relativePath, fk.column, 'column', 0 // approximate\n          );\n          const toTableId = `sym:unknown:${fk.refTable}:table:0`;\n          edges.push({\n            type: 'foreign_key',\n            from: fromColId,\n            to: toTableId,\n            metadata: {\n              constraint_name: fk.name,\n              on_delete: fk.onDelete,\n              on_update: fk.onUpdate,\n              ref_column: fk.refColumn,\n            },\n            isResolved: false,\n          });\n        }\n      },\n\n      'create_view': (node) => {\n        const viewName = this.getViewName(node);\n        if (!viewName) return;\n        symbols.push({\n          id: this.generateSymbolId(ctx.relativePath, viewName, 'view', node.startPosition.row + 1),\n          fileId: `file:${ctx.relativePath}`,\n          name: viewName,\n          kind: 'view',\n          signature: this.getViewQuery(node, ctx.content),\n          startLine: node.startPosition.row + 1,\n          endLine: node.endPosition.row + 1,\n          startCol: node.startPosition.column,\n          endCol: node.endPosition.column,\n          docstring: this.extractViewComment(node, ctx.content),\n          isExported: false,\n          modifiers: [],\n          metadata: { schema: this.getSchemaName(node) },\n        });\n      },\n\n      'create_procedure': (node) => {\n        // Stored procedures / functions\n      },\n    });\n\n    return { symbols, edges, 
parseErrors };\n  }\n}\n```\n\n### 2.5 Markdown Extractor\n\n```typescript\n// src/extractor/markdown/markdown.ts\n\nimport { unified } from 'unified';\nimport remarkParse from 'remark-parse';\nimport remarkGfm from 'remark-gfm';\nimport { visit } from 'unist-util-visit';\nimport { Root, Heading, Code, List, Table, ListItem } from 'mdast';\n\nexport class MarkdownExtractor extends BaseExtractor {\n  language = 'markdown';\n  extensions = ['.md', '.mdx', '.markdown'];\n\n  extract(ctx: ExtractorContext): ExtractionResult {\n    const symbols: SymbolIR[] = [];\n    const edges: EdgeIR[] = [];\n\n    const tree = unified()\n      .use(remarkParse)\n      .use(remarkGfm)\n      .parse(ctx.content) as Root;\n\n    let currentSection: string | null = null;\n    let sectionCounter = 0;\n    let workflowStepCounter = 0;\n    let diagramNodeCounter = 0;\n\n    visit(tree, (node) => {\n      // Headings → sections\n      if (node.type === 'heading') {\n        const heading = node as Heading;\n        const text = this.getTextContent(heading);\n        const level = heading.depth;\n        const sectionId = this.generateSymbolId(\n          ctx.relativePath, text, 'section', heading.position?.start.line || 0\n        );\n\n        const sectionSymbol: SymbolIR = {\n          id: sectionId,\n          fileId: `file:${ctx.relativePath}`,\n          name: text,\n          kind: 'section',\n          startLine: heading.position?.start.line || 0,\n          endLine: heading.position?.end.line || 0,\n          startCol: heading.position?.start.column || 0,\n          endCol: heading.position?.end.column || 0,\n          isExported: false,\n          modifiers: [],\n          metadata: {\n            level,\n            anchor_id: this.slugify(text),\n            section_type: this.classifySection(text),\n          },\n        };\n        symbols.push(sectionSymbol);\n\n        // Link to parent section\n        if (currentSection && level > 1) {\n          edges.push({\n           
 type: 'contains',\n            from: currentSection,\n            to: sectionId,\n          });\n        }\n        currentSection = sectionId;\n        sectionCounter++;\n      }\n\n      // Code blocks → check for mermaid\n      if (node.type === 'code') {\n        const code = node as Code;\n        if (code.lang === 'mermaid' && code.value) {\n          const diagramResult = this.parseMermaid(code.value, ctx);\n          symbols.push(...diagramResult.symbols);\n          edges.push(...diagramResult.edges);\n\n          // Link diagram to current section\n          if (currentSection) {\n            for (const sym of diagramResult.symbols) {\n              edges.push({ type: 'contains', from: currentSection, to: sym.id });\n            }\n          }\n        }\n      }\n\n      // Lists → structured list items\n      if (node.type === 'list') {\n        const list = node as List;\n        this.extractListItems(list, symbols, edges, ctx, currentSection);\n      }\n\n      // Tables → structured rows\n      if (node.type === 'table') {\n        const table = node as Table;\n        const tableResult = this.extractTable(table, ctx, currentSection);\n        symbols.push(...tableResult.symbols);\n        edges.push(...tableResult.edges);\n      }\n    });\n\n    return { symbols, edges, parseErrors: [] };\n  }\n\n  private classifySection(heading: string): string {\n    const lower = heading.toLowerCase();\n    if (/workflow|flow|process|pipeline/.test(lower)) return 'workflow';\n    if (/sequence\\s*diagram/.test(lower)) return 'sequence_diagram';\n    if (/flowchart/.test(lower)) return 'flowchart';\n    if (/release\\s*plan|roadmap|timeline/.test(lower)) return 'release_plan';\n    if (/api|endpoint/.test(lower)) return 'api';\n    if (/architecture|component|system\\s*design/.test(lower)) return 'architecture';\n    if (/decision|adr/.test(lower)) return 'decision';\n    if (/requirement|user\\s*story|acceptance/.test(lower)) return 'requirement';\n    return 
'general';\n  }\n\n  private parseMermaid(mermaidCode: string, ctx: ExtractorContext): \n    { symbols: SymbolIR[]; edges: EdgeIR[] } {\n    \n    const symbols: SymbolIR[] = [];\n    const edges: EdgeIR[] = [];\n\n    // Detect diagram type (leading indentation is tolerated)\n    const typeMatch = mermaidCode.match(/^[ \\t]*(sequenceDiagram|flowchart\\s+\\w+|stateDiagram|erDiagram|classDiagram|gantt)/m);\n    const diagramType = typeMatch?.[1] || 'unknown';\n\n    if (diagramType === 'sequenceDiagram') {\n      return this.parseSequenceDiagram(mermaidCode, ctx);\n    }\n    if (diagramType.startsWith('flowchart')) {\n      return this.parseFlowchart(mermaidCode, ctx);\n    }\n    if (diagramType === 'erDiagram') {\n      return this.parseERDiagram(mermaidCode, ctx);\n    }\n    if (diagramType === 'classDiagram') {\n      return this.parseClassDiagram(mermaidCode, ctx);\n    }\n\n    // Fallback: store as raw diagram node\n    symbols.push({\n      id: this.generateSymbolId(ctx.relativePath, `diagram-${Date.now()}`, 'section', 0),\n      fileId: `file:${ctx.relativePath}`,\n      name: `Mermaid ${diagramType}`,\n      kind: 'section',\n      startLine: 0,\n      endLine: 0,\n      startCol: 0,\n      endCol: 0,\n      isExported: false,\n      modifiers: [],\n      metadata: { diagram_type: diagramType, raw: mermaidCode },\n    });\n\n    return { symbols, edges };\n  }\n\n  private parseSequenceDiagram(code: string, ctx: ExtractorContext): \n    { symbols: SymbolIR[]; edges: EdgeIR[] } {\n    // Parse:\n    //   participant A as Actor A\n    //   A->>B: Message\n    //   B-->>A: Response\n    //\n    // Creates: diagram_node per participant\n    // Creates: diagram_edge per message (with label, style)\n    \n    const symbols: SymbolIR[] = [];\n    const edges: EdgeIR[] = [];\n    const participants = new Map<string, string>(); // alias → full name\n    const baseLine = 0; // Would need actual line from parent\n\n    // Allow leading indentation: Mermaid diagram bodies are usually indented\n    const participantRe = /^[ \\t]*participant\\s+(\\w+)(?:\\s+as\\s+(.+))?$/gm;\n    let match;\n  
  while ((match = participantRe.exec(code)) !== null) {\n      const alias = match[1];\n      const fullName = match[2] || alias;\n      participants.set(alias, fullName);\n      \n      const id = this.generateSymbolId(ctx.relativePath, alias, 'diagram_node', baseLine);\n      symbols.push({\n        id,\n        fileId: `file:${ctx.relativePath}`,\n        name: fullName,\n        kind: 'diagram_node',\n        startLine: baseLine,\n        endLine: baseLine,\n        startCol: 0,\n        endCol: 0,\n        isExported: false,\n        modifiers: [],\n        metadata: {\n          diagram_type: 'sequence_diagram',\n          role: 'participant',\n          alias,\n        },\n      });\n    }\n\n    // Parse messages: A->>B: text  or  A-->>B: text\n    // Leading indentation and spaces around the arrow are tolerated\n    const msgRe = /^[ \\t]*(\\w+)[ \\t]*(->>|-->>|->|-->)\\s*(\\w+)\\s*:\\s*(.+)$/gm;\n    let msgMatch;\n    let msgCounter = 0;\n    while ((msgMatch = msgRe.exec(code)) !== null) {\n      const fromAlias = msgMatch[1];\n      const arrowStyle = msgMatch[2];\n      const toAlias = msgMatch[3];\n      const message = msgMatch[4].trim();\n\n      const fromId = this.generateSymbolId(ctx.relativePath, fromAlias, 'diagram_node', baseLine);\n      const toId = this.generateSymbolId(ctx.relativePath, toAlias, 'diagram_node', baseLine);\n\n      // Register participants if not explicitly declared\n      if (!participants.has(fromAlias)) {\n        participants.set(fromAlias, fromAlias);\n        symbols.push({\n          id: fromId,\n          fileId: `file:${ctx.relativePath}`,\n          name: fromAlias,\n          kind: 'diagram_node',\n          startLine: baseLine, endLine: baseLine,\n          startCol: 0, endCol: 0,\n          isExported: false, modifiers: [],\n          metadata: { diagram_type: 'sequence_diagram', role: 'participant', alias: fromAlias },\n        });\n      }\n      if (!participants.has(toAlias)) {\n        participants.set(toAlias, toAlias);\n        symbols.push({\n          id: toId,\n          fileId: 
`file:${ctx.relativePath}`,\n          name: toAlias,\n          kind: 'diagram_node',\n          startLine: baseLine, endLine: baseLine,\n          startCol: 0, endCol: 0,\n          isExported: false, modifiers: [],\n          metadata: { diagram_type: 'sequence_diagram', role: 'participant', alias: toAlias },\n        });\n      }\n\n      edges.push({\n        type: 'diagram_edge',\n        from: fromId,\n        to: toId,\n        metadata: {\n          label: message,\n          style: arrowStyle === '->>' ? 'solid' : arrowStyle === '-->>' ? 'dashed' : 'dotted',\n          type: 'solid',\n          sequence: msgCounter++,\n          is_response: arrowStyle.includes('--'),\n        },\n      });\n    }\n\n    return { symbols, edges };\n  }\n\n  // ... parseFlowchart, parseERDiagram, parseClassDiagram, extractListItems, extractTable\n}\n```\n\n### 2.6 Chainable Query Builder — Core\n\n```typescript\n// src/query/builder.ts\n\nimport { IStore } from '../storage/interface';\nimport { GraphNode, GraphEdge, GraphResult } from '../types/graph';\nimport { RepoScope } from './scopes/repo-scope';\n\nexport type SortDirection = 'asc' | 'desc';\nexport type TerminalFormat = 'array' | 'graph' | 'markdown' | 'json';\n\nexport interface FilterPredicate {\n  field: string;\n  op: 'eq' | 'neq' | 'gt' | 'gte' | 'lt' | 'lte' | 'contains' | 'matches' | 'in' | 'exists';\n  value: unknown;\n}\n\nexport abstract class QueryScope<T extends QueryScope<T>> {\n  protected filters: FilterPredicate[] = [];\n  protected sortField: string | null = null;\n  protected sortDir: SortDirection = 'asc';\n  protected limitCount: number | null = null;\n  protected offsetCount: number = 0;\n\n  constructor(protected store: IStore, protected repoPath: string) {}\n\n  filter(predicate: FilterPredicate | ((item: GraphNode) => boolean)): T {\n    const clone = this.clone();\n    if (typeof predicate === 'function') {\n      // Function filters are applied post-hoc (for in-memory operations)\n      
clone.filters.push({ field: '_func', op: 'eq', value: predicate } as any);\n    } else {\n      clone.filters.push(predicate);\n    }\n    return clone as T;\n  }\n\n  // Shorthand filters\n  eq(field: string, value: unknown): T { return this.filter({ field, op: 'eq', value }); }\n  neq(field: string, value: unknown): T { return this.filter({ field, op: 'neq', value }); }\n  contains(field: string, value: string): T { return this.filter({ field, op: 'contains', value }); }\n  matches(field: string, pattern: string): T { return this.filter({ field, op: 'matches', value: pattern }); }\n  in(field: string, values: unknown[]): T { return this.filter({ field, op: 'in', value: values }); }\n\n  sort(field: string, dir: SortDirection = 'asc'): T {\n    const clone = this.clone();\n    clone.sortField = field;\n    clone.sortDir = dir;\n    return clone as T;\n  }\n\n  limit(n: number): T {\n    const clone = this.clone();\n    clone.limitCount = n;\n    return clone as T;\n  }\n\n  offset(n: number): T {\n    const clone = this.clone();\n    clone.offsetCount = n;\n    return clone as T;\n  }\n\n  // Terminal methods\n  async toArray(): Promise<GraphNode[]> {\n    const result = await this.execute();\n    return this.applyPostFilters(result.nodes as GraphNode[]);\n  }\n\n  async toGraph(): Promise<GraphResult> {\n    const result = await this.execute();\n    return {\n      nodes: this.applyPostFilters(result.nodes as GraphNode[]),\n      edges: result.edges as GraphEdge[],\n    };\n  }\n\n  async toMarkdown(): Promise<string> {\n    const nodes = await this.toArray();\n    return this.formatAsMarkdown(nodes);\n  }\n\n  async toJSON(): Promise<string> {\n    const result = await this.toGraph();\n    return JSON.stringify(result, null, 2);\n  }\n\n  async count(): Promise<number> {\n    const nodes = await this.toArray();\n    return nodes.length;\n  }\n\n  async exists(): Promise<boolean> {\n    const count = await this.count();\n    return count > 0;\n  }\n\n  // 
Abstract: each scope implements its own query translation\n  protected abstract execute(): Promise<{ nodes: unknown[]; edges: unknown[] }>;\n  protected abstract clone(): T;\n  protected abstract formatAsMarkdown(nodes: GraphNode[]): string;\n\n  protected applyPostFilters(nodes: GraphNode[]): GraphNode[] {\n    return nodes.filter(node => {\n      for (const f of this.filters) {\n        if (f.field === '_func') continue; // Skip function filters for DB\n        const val = (node as any)[f.field];\n        if (!this.evaluateFilter(val, f)) return false;\n      }\n      // Apply function filters\n      for (const f of this.filters) {\n        if (f.field === '_func') {\n          if (!(f.value as Function)(node)) return false;\n        }\n      }\n      return true;\n    });\n  }\n\n  private evaluateFilter(val: unknown, f: FilterPredicate): boolean {\n    switch (f.op) {\n      case 'eq': return val === f.value;\n      case 'neq': return val !== f.value;\n      case 'contains': return typeof val === 'string' && val.includes(f.value as string);\n      case 'matches': return typeof val === 'string' && new RegExp(f.value as string).test(val);\n      case 'in': return Array.isArray(f.value) && f.value.includes(val);\n      case 'exists': return val !== null && val !== undefined;\n      case 'gt': return typeof val === 'number' && val > (f.value as number);\n      case 'gte': return typeof val === 'number' && val >= (f.value as number);\n      case 'lt': return typeof val === 'number' && val < (f.value as number);\n      case 'lte': return typeof val === 'number' && val <= (f.value as number);\n      default: return true;\n    }\n  }\n}\n\n// Public API entry point\nexport function createQuery(store: IStore, repoPath: string): RepoScope {\n  return new RepoScope(store, repoPath);\n}\n```\n\n### 2.7 RepoScope (Top-Level)\n\n```typescript\n// src/query/scopes/repo-scope.ts\n\nimport { QueryScope } from '../builder';\nimport { IStore } from 
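'../../storage/interface';\nimport { GraphNode } from '../../types/graph';\nimport { ModuleScope } from './module-scope';\nimport { FileScope } from './file-scope';\nimport { SymbolScope } from './symbol-scope';\n// Assumed module paths for the remaining scopes, following the sibling file naming\nimport { DocScope } from './doc-scope';\nimport { TableScope } from './table-scope';\nimport { CommitScope } from './commit-scope';\n\nexport class RepoScope extends QueryScope<RepoScope> {\n  protected async execute(): Promise<{ nodes: unknown[]; edges: unknown[] }> {\n    const query = `\n      SELECT * FROM repository \n      WHERE root = $repoPath\n      LIMIT 1\n    `;\n    const nodes = await this.store.query(query, { repoPath: this.repoPath });\n    return { nodes, edges: [] };\n  }\n\n  protected clone(): RepoScope {\n    // Copy accumulated query state; a bare new scope would drop earlier filters, sort, and paging\n    const copy = new RepoScope(this.store, this.repoPath);\n    copy.filters = [...this.filters];\n    copy.sortField = this.sortField;\n    copy.sortDir = this.sortDir;\n    copy.limitCount = this.limitCount;\n    copy.offsetCount = this.offsetCount;\n    return copy;\n  }\n\n  protected formatAsMarkdown(nodes: GraphNode[]): string {\n    if (nodes.length === 0) return 'Repository not indexed.';\n    const repo = nodes[0];\n    const stats = repo.stats as any;\n    return [\n      `# Repository: ${repo.name}`,\n      ``,\n      `- **Path:** ${repo.root}`,\n      `- **Files:** ${stats?.files ?? 'N/A'}`,\n      `- **Modules:** ${stats?.modules ?? 'N/A'}`,\n      `- **Symbols:** ${stats?.symbols ?? 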
'N/A'}`,\n      `- **Last Indexed:** ${repo.updated_at}`,\n    ].join('\\n');\n  }\n\n  // Navigation to sub-scopes\n  modules(): ModuleScope {\n    return new ModuleScope(this.store, this.repoPath, null);\n  }\n\n  files(): FileScope {\n    return new FileScope(this.store, this.repoPath, null);\n  }\n\n  symbols(): SymbolScope {\n    return new SymbolScope(this.store, this.repoPath, null);\n  }\n\n  docs(): DocScope {\n    return new DocScope(this.store, this.repoPath, null);\n  }\n\n  // Convenience: direct symbol lookup\n  symbol(name: string): SymbolScope {\n    return new SymbolScope(this.store, this.repoPath, null)\n      .eq('name', name);\n  }\n\n  table(name: string): TableScope {\n    return new TableScope(this.store, this.repoPath, null)\n      .eq('name', name);\n  }\n\n  commit(hash: string): CommitScope {\n    return new CommitScope(this.store, this.repoPath, null)\n      .eq('hash', hash);\n  }\n}\n```\n\n### 2.8 SymbolScope (With Graph Traversal)\n\n```typescript\n// src/query/scopes/symbol-scope.ts\n\nimport { QueryScope } from '../builder';\nimport { IStore } from '../../storage/interface';\nimport { GraphNode, GraphEdge } from '../../types/graph';\nimport { FileScope } from './file-scope';\n\nexport class SymbolScope extends QueryScope<SymbolScope> {\n  constructor(\n    store: IStore,\n    repoPath: string,\n    private moduleId: string | null\n  ) {\n    super(store, repoPath);\n  }\n\n  protected async execute(): Promise<{ nodes: unknown[]; edges: unknown[] }> {\n    // Traversal methods attach precomputed results; return those instead of re-querying\n    const pre = this as any;\n    if (pre._precomputedNodes) {\n      return { nodes: pre._precomputedNodes, edges: pre._precomputedEdges ?? [] };\n    }\n\n    let query = 'SELECT * FROM symbol';\n    const vars: Record<string, unknown> = {};\n    const conditions: string[] = [];\n\n    if (this.moduleId) {\n      // Join through file to filter by module\n      query = `\n        SELECT symbol.*, file.path as file_path, file.module_id \n        FROM symbol \n        INNER JOIN file ON symbol.file_id = file.id\n      `;\n      conditions.push('file.module_id = $moduleId');\n      vars.moduleId = this.moduleId;\n    }\n\n    // Apply filters\n    for (const f of this.filters) {\n   
   if (f.field === '_func') continue;\n      const param = `f_${f.field}`;\n      switch (f.op) {\n        case 'eq': conditions.push(`symbol.${f.field} = $${param}`); break;\n        case 'neq': conditions.push(`symbol.${f.field} != $${param}`); break;\n        case 'contains': conditions.push(`string::contains(symbol.${f.field}, $${param})`); break;\n        case 'matches': conditions.push(`string::matches(symbol.${f.field}, $${param})`); break;\n        case 'in': conditions.push(`symbol.${f.field} IN $${param}`); break;\n        case 'exists': conditions.push(`symbol.${f.field} != NONE`); break;\n      }\n      vars[param] = f.value;\n    }\n\n    if (conditions.length > 0) {\n      query += ` WHERE ${conditions.join(' AND ')}`;\n    }\n\n    if (this.sortField) {\n      query += ` ORDER BY symbol.${this.sortField} ${this.sortDir.toUpperCase()}`;\n    }\n\n    if (this.limitCount !== null) {\n      query += ` LIMIT ${this.limitCount}`;\n    }\n    if (this.offsetCount > 0) {\n      query += ` START ${this.offsetCount}`;\n    }\n\n    const nodes = await this.store.query(query, vars);\n    return { nodes, edges: [] };\n  }\n\n  // Graph traversal methods\n  async dependants(): Promise<SymbolScope> {\n    const symbols = await this.toArray();\n    if (symbols.length === 0) return this;\n    \n    const ids = symbols.map(s => s.id);\n    const result = await this.store.graphTraversal(\n      ids[0], // Start from first symbol\n      ['calls', 'imports', 'references'],\n      'inbound',\n      10, // max depth\n      undefined\n    );\n    \n    // Return new scope with traversed nodes\n    const newScope = new SymbolScope(this.store, this.repoPath, this.moduleId);\n    // Store pre-computed result\n    (newScope as any)._precomputedNodes = result.nodes;\n    (newScope as any)._precomputedEdges = result.edges;\n    return newScope;\n  }\n\n  async dependencies(): Promise<SymbolScope> {\n    const symbols = await this.toArray();\n    if (symbols.length === 0) return 
this;\n    \n    const result = await this.store.graphTraversal(\n      symbols[0].id,\n      ['calls', 'imports', 'references'],\n      'outbound',\n      10,\n      undefined\n    );\n    \n    const newScope = new SymbolScope(this.store, this.repoPath, this.moduleId);\n    (newScope as any)._precomputedNodes = result.nodes;\n    (newScope as any)._precomputedEdges = result.edges;\n    return newScope;\n  }\n\n  async callers(): Promise<SymbolScope> {\n    const symbols = await this.toArray();\n    if (symbols.length === 0) return this;\n    \n    const result = await this.store.graphTraversal(\n      symbols[0].id,\n      ['calls'],\n      'inbound',\n      10,\n      undefined\n    );\n    \n    const newScope = new SymbolScope(this.store, this.repoPath, this.moduleId);\n    (newScope as any)._precomputedNodes = result.nodes;\n    (newScope as any)._precomputedEdges = result.edges;\n    return newScope;\n  }\n\n  async callees(): Promise<SymbolScope> {\n    const symbols = await this.toArray();\n    if (symbols.length === 0) return this;\n    \n    const result = await this.store.graphTraversal(\n      symbols[0].id,\n      ['calls'],\n      'outbound',\n      10,\n      undefined\n    );\n    \n    const newScope = new SymbolScope(this.store, this.repoPath, this.moduleId);\n    (newScope as any)._precomputedNodes = result.nodes;\n    (newScope as any)._precomputedEdges = result.edges;\n    return newScope;\n  }\n\n  // Navigate to containing file\n  async file(): Promise<FileScope> {\n    const symbols = await this.toArray();\n    if (symbols.length === 0) return new FileScope(this.store, this.repoPath, null);\n    const fileId = (symbols[0] as any).file_id;\n    const fileScope = new FileScope(this.store, this.repoPath, null);\n    (fileScope as any)._precomputedFileId = fileId;\n    return fileScope;\n  }\n\n  protected clone(): SymbolScope {\n    // Copy accumulated query state; a bare new scope would drop earlier filters, sort, and paging\n    const copy = new SymbolScope(this.store, this.repoPath, this.moduleId);\n    copy.filters = [...this.filters];\n    copy.sortField = this.sortField;\n    copy.sortDir = this.sortDir;\n    copy.limitCount = this.limitCount;\n    copy.offsetCount = this.offsetCount;\n    // Keep any traversal results attached by dependants(), callers(), etc.\n    (copy as any)._precomputedNodes = (this as any)._precomputedNodes;\n    (copy as any)._precomputedEdges = (this as any)._precomputedEdges;\n    return copy;\n  }\n\n  protected formatAsMarkdown(nodes: 
GraphNode[]): string {\n    if (nodes.length === 0) return 'No symbols found.';\n    return nodes.map(n => {\n      const s = n as any;\n      const exportTag = s.is_exported ? 'exported' : 'internal';\n      const location = s.file_path ? `(${s.file_path}:${s.start_line})` : `(${s.start_line})`;\n      return `- **${s.name}** [${s.kind}] [${exportTag}] ${location}${s.signature ? `\\n  \\`${s.signature}\\`` : ''}${s.docstring ? `\\n  > ${s.docstring.split('\\n')[0]}` : ''}`;\n    }).join('\\n');\n  }\n}\n```\n\n### 2.9 MCP Tool Implementation Example\n\n```typescript\n// src/mcp/tools/impact.ts\n\nimport { Tool } from '@modelcontextprotocol/sdk/types.js';\nimport { IStore } from '../../storage/interface';\nimport { GraphNode } from '../../types/graph';\nimport { createQuery } from '../../query/builder';\nimport { TokenBudgetManager } from '../token-budget';\n\nexport function createImpactAnalysisTool(store: IStore, repoPath: string, budget: TokenBudgetManager): Tool {\n  return {\n    name: 'get_impact_analysis',\n    description: `Analyze the impact of changing a symbol. Returns all direct and transitive dependants: functions that call it, files that import it, modules that depend on it. 
Use this before making changes to understand blast radius.`,\n    inputSchema: {\n      type: 'object',\n      properties: {\n        symbol_name: {\n          type: 'string',\n          description: 'Name of the symbol to analyze',\n        },\n        symbol_kind: {\n          type: 'string',\n          enum: ['function', 'class', 'interface', 'type_alias', 'variable', 'table', 'column'],\n          description: 'Kind of symbol (optional, narrows search)',\n        },\n        file_path: {\n          type: 'string',\n          description: 'File path to disambiguate (optional)',\n        },\n        max_depth: {\n          type: 'number',\n          description: 'Max traversal depth for transitive dependants (default: 5)',\n          default: 5,\n        },\n        include_transitive: {\n          type: 'boolean',\n          description: 'Include transitive (indirect) dependants (default: true)',\n          default: true,\n        },\n      },\n      required: ['symbol_name'],\n    },\n    handler: async (params: any) => {\n      let q = createQuery(store, repoPath)\n        .symbol(params.symbol_name);\n\n      // Scopes are immutable: each filter call returns a new scope, so reassign\n      if (params.symbol_kind) q = q.eq('kind', params.symbol_kind);\n      if (params.file_path) q = q.eq('file_path', params.file_path);\n\n      const symbols = await q.toArray();\n      if (symbols.length === 0) {\n        return {\n          content: [{ type: 'text', text: JSON.stringify({ error: 'Symbol not found', symbol_name: params.symbol_name }) }],\n        };\n      }\n\n      const symbol = symbols[0];\n      const depth = params.max_depth ?? 
5;\n\n      // Get dependants via graph traversal\n      const result = await store.graphTraversal(\n        symbol.id,\n        ['calls', 'imports', 'references', 'implements'],\n        'inbound',\n        depth,\n        undefined\n      );\n\n      // Organize by distance (direct vs transitive)\n      const direct = result.edges.filter(e => {\n        // Direct edges are those where the target is our symbol\n        return e.to === symbol.id;\n      }).map(e => result.nodes.find(n => n.id === e.from)!).filter(Boolean);\n\n      const transitive = result.nodes.filter(n => \n        n.id !== symbol.id && !direct.find(d => d.id === n.id)\n      );\n\n      // Group by file and module\n      const byFile = new Map<string, GraphNode[]>();\n      const byModule = new Map<string, GraphNode[]>();\n      \n      for (const node of result.nodes) {\n        const n = node as any;\n        if (n.file_path) {\n          if (!byFile.has(n.file_path)) byFile.set(n.file_path, []);\n          byFile.get(n.file_path)!.push(node);\n        }\n        if (n.module_id) {\n          if (!byModule.has(n.module_id)) byModule.set(n.module_id, []);\n          byModule.get(n.module_id)!.push(node);\n        }\n      }\n\n      const response = {\n        target: {\n          id: symbol.id,\n          name: (symbol as any).name,\n          kind: (symbol as any).kind,\n          file: (symbol as any).file_path,\n          line: (symbol as any).start_line,\n        },\n        impact_summary: {\n          total_dependants: result.nodes.length,\n          direct_dependants: direct.length,\n          transitive_dependants: transitive.length,\n          files_affected: byFile.size,\n          modules_affected: byModule.size,\n        },\n        direct_dependants: direct.map(n => ({\n          name: (n as any).name,\n          kind: (n as any).kind,\n          file: (n as any).file_path,\n          line: (n as any).start_line,\n          relationship: result.edges.find(e => e.from === n.id && 
e.to === symbol.id)?.type,\n        })),\n        affected_files: Object.fromEntries(\n          Array.from(byFile.entries()).map(([path, nodes]) => [\n            path,\n            nodes.map(n => ({ name: (n as any).name, kind: (n as any).kind, line: (n as any).start_line }))\n          ])\n        ),\n        affected_modules: Object.fromEntries(\n          Array.from(byModule.entries()).map(([id, nodes]) => [\n            id,\n            { symbol_count: nodes.length, kinds: [...new Set(nodes.map(n => (n as any).kind))] }\n          ])\n        ),\n        token_estimate: budget.estimate(JSON.stringify(result)),\n      };\n\n      // Apply token budget truncation if needed\n      const truncated = budget.truncate(response, params.max_tokens);\n\n      return {\n        content: [{ type: 'text', text: JSON.stringify(truncated, null, 2) }],\n      };\n    },\n  };\n}\n```\n\n### 2.10 Git Hook Implementation\n\n```typescript\n// src/hooks/pre-commit.ts\n\nimport * as path from 'node:path';\nimport { promises as fs } from 'node:fs';\nimport { simpleGit, SimpleGit } from 'simple-git';\nimport { IStore } from '../storage/interface';\nimport { ExtractorRegistry } from '../extractor/registry';\nimport { GraphDiffer } from '../engine/differ';\nimport { GraphMerger } from '../engine/merger';\nimport { Validator } from '../engine/validator';\nimport { contentHash } from '../utils/hash';\nimport { Logger } from '../utils/logger';\n\ninterface PreCommitResult {\n  status: 'pass' | 'warn' | 'fail';\n  parsed: number;\n  updated: number;\n  added: number;\n  removed: number;\n  errors: string[];\n  warnings: string[];\n}\n\nexport async function runPreCommit(\n  repoPath: string,\n  store: IStore,\n  config: { mode: 'warn' | 'block' | 'off' },\n  logger: Logger\n): Promise<PreCommitResult> {\n  const result: PreCommitResult = {\n    status: 'pass',\n    parsed: 0,\n    updated: 0,\n    added: 0,\n    removed: 0,\n    errors: [],\n    warnings: [],\n  };\n\n  const git: SimpleGit = simpleGit(repoPath);\n\n  // 1. 
Get staged files\n  const stagedFiles = await git.diff(['--cached', '--name-only', '--diff-filter=ACMR']);\n  const fileNames = stagedFiles.trim().split('\\n').filter(Boolean);\n\n  if (fileNames.length === 0) {\n    return result;\n  }\n\n  logger.info(`Pre-commit: ${fileNames.length} staged files`);\n\n  // 2. Filter to supported files\n  const registry = new ExtractorRegistry();\n  const supportedFiles = fileNames.filter(f => registry.supportsFile(f));\n\n  if (supportedFiles.length === 0) {\n    return result;\n  }\n\n  logger.info(`Pre-commit: ${supportedFiles.length} supported files to parse`);\n\n  // 3. Parse changed files\n  for (const filePath of supportedFiles) {\n    try {\n      const absolutePath = path.resolve(repoPath, filePath);\n      const content = await fs.readFile(absolutePath, 'utf-8');\n      const hash = contentHash(content);\n\n      // Check if content actually changed\n      const existingFile = await store.query(\n        'SELECT content_hash FROM file WHERE path = $path LIMIT 1',\n        { path: filePath }\n      );\n\n      if (existingFile.length > 0 && existingFile[0].content_hash === hash) {\n        continue; // No change\n      }\n\n      // Extract symbols\n      const extractor = registry.getExtractor(filePath);\n      const extraction = await extractor.extractFile(absolutePath, repoPath);\n\n      // Diff against existing graph\n      const oldSymbols = await store.query(\n        'SELECT * FROM symbol WHERE file_id = $fileId',\n        { fileId: `file:${filePath}` }\n      );\n\n      const diff = GraphDiffer.diff(oldSymbols, extraction.symbols);\n\n      // Merge into graph\n      await store.transaction(async (tx) => {\n        // Remove old symbols\n        for (const removed of diff.removed) {\n          await tx.deleteNode(removed.id);\n          await tx.deleteEdges(removed.id);\n          result.removed++;\n        }\n\n        // Update changed symbols\n        for (const changed of diff.changed) {\n          await 
tx.updateNode(changed.new.id, changed.new);\n          result.updated++;\n        }\n\n        // Add new symbols\n        for (const added of diff.added) {\n          await tx.createNode(added);\n          result.added++;\n        }\n\n        // Update edges\n        await tx.deleteEdges(`file:${filePath}`); // Remove old edges from this file\n        await tx.createEdges(extraction.edges.map(e => ({\n          ...e,\n          // Resolve file-level edges\n          from: e.from.startsWith('file:') ? `file:${filePath}` : e.from,\n        })));\n\n        // Update file node\n        const fileNode = {\n          id: `file:${filePath}`,\n          type: 'file',\n          path: filePath,\n          content_hash: hash,\n          parse_status: extraction.parseErrors.length === 0 ? 'parsed' : 'partial',\n          parse_error: extraction.parseErrors.length > 0 \n            ? extraction.parseErrors.map(e => `L${e.line}: ${e.message}`).join('; ') \n            : null,\n          last_parsed: new Date().toISOString(),\n          line_count: content.split('\\n').length,\n          size_bytes: Buffer.byteLength(content),\n        };\n        await tx.createNode(fileNode as any);\n      });\n\n      result.parsed++;\n\n      if (extraction.parseErrors.length > 0) {\n        result.warnings.push(\n          `${filePath}: ${extraction.parseErrors.length} parse errors`\n        );\n      }\n    } catch (err) {\n      // The caught value is unknown under strict TypeScript, so narrow it before reading .message\n      const message = err instanceof Error ? err.message : String(err);\n      result.errors.push(`${filePath}: ${message}`);\n      logger.error(`Pre-commit error for ${filePath}`, err);\n    }\n  }\n\n  // 4. 
Validate (if enabled)\n  if (config.mode !== 'off') {\n    const validation = await Validator.validate(store, repoPath);\n    result.warnings.push(...validation.warnings);\n    result.errors.push(...validation.errors);\n\n    if (result.errors.length > 0 && config.mode === 'block') {\n      result.status = 'fail';\n    } else if (result.warnings.length > 0 || result.errors.length > 0) {\n      result.status = 'warn';\n    }\n  }\n\n  // 5. Update repo stats\n  await updateRepoStats(store, repoPath);\n\n  return result;\n}\n```\n\n### 2.11 Workflow Template: Bug Fix\n\n```typescript\n// src/workflows/templates/bug-fix.ts\n\nimport { IStore } from '../../storage/interface';\nimport { createQuery } from '../../query/builder';\n\nexport interface BugFixInput {\n  error_message?: string;\n  stack_trace?: string[];\n  file_path?: string;\n  line_number?: number;\n  symbol_name?: string;\n  error_type?: string; // TypeError, ReferenceError, etc.\n}\n\nexport interface BugFixOutput {\n  root_candidates: RootCandidate[];\n  impact_radius: ImpactRadius;\n  related_tests: RelatedTest[];\n  recent_changes: RecentChange[];\n  suggested_investigation_order: string[];\n}\n\ninterface RootCandidate {\n  symbol_id: string;\n  symbol_name: string;\n  kind: string;\n  file_path: string;\n  line: number;\n  confidence: 'high' | 'medium' | 'low';\n  reason: string;\n}\n\ninterface ImpactRadius {\n  direct_callers: number;\n  transitive_callers: number;\n  affected_files: string[];\n  affected_modules: string[];\n}\n\nexport async function executeBugFixWorkflow(\n  store: IStore,\n  repoPath: string,\n  input: BugFixInput\n): Promise<BugFixOutput> {\n  const candidates: RootCandidate[] = [];\n\n  // Strategy 1: If we have a file + line, look up the symbol at that location\n  if (input.file_path && input.line_number) {\n    const symbols = await createQuery(store, repoPath)\n      .symbols() // Start from all symbols; symbol('') would filter on an empty name and match nothing\n      .eq('file_path', input.file_path)\n      
.toArray();\n\n    // Find symbol containing the line\n    const containing = symbols.find(s => {\n      const sym = s as any;\n      return sym.start_line <= input.line_number! && sym.end_line >= input.line_number!;\n    });\n\n    if (containing) {\n      candidates.push({\n        symbol_id: containing.id,\n        symbol_name: (containing as any).name,\n        kind: (containing as any).kind,\n        file_path: (containing as any).file_path,\n        line: (containing as any).start_line,\n        confidence: 'high',\n        reason: `Symbol at error location (${input.file_path}:${input.line_number})`,\n      });\n    }\n  }\n\n  // Strategy 2: If we have a symbol name from the error (e.g., \"Cannot read property 'foo' of undefined\")\n  if (input.symbol_name || input.error_message) {\n    const nameToSearch = input.symbol_name || extractPropertyName(input.error_message!);\n    if (nameToSearch) {\n      const matches = await createQuery(store, repoPath)\n        .symbol(nameToSearch)\n        .toArray();\n\n      for (const match of matches) {\n        // Don't duplicate if already found\n        if (candidates.find(c => c.symbol_id === match.id)) continue;\n\n        candidates.push({\n          symbol_id: match.id,\n          symbol_name: (match as any).name,\n          kind: (match as any).kind,\n          file_path: (match as any).file_path,\n          line: (match as any).start_line,\n          confidence: 'medium',\n          reason: `Name matches error reference: \"${nameToSearch}\"`,\n        });\n      }\n    }\n  }\n\n  // Strategy 3: If we have a stack trace, trace the call chain\n  if (input.stack_trace && input.stack_trace.length > 0) {\n    for (const frame of input.stack_trace) {\n      const parsed = parseStackFrame(frame);\n      if (!parsed) continue;\n\n      const symbols = await createQuery(store, repoPath)\n        .symbol(parsed.functionName)\n        .eq('file_path', parsed.filePath)\n        .toArray();\n\n      for (const sym of 
symbols) {\n        if (candidates.find(c => c.symbol_id === sym.id)) continue;\n        candidates.push({\n          symbol_id: sym.id,\n          symbol_name: (sym as any).name,\n          kind: (sym as any).kind,\n          file_path: (sym as any).file_path,\n          line: (sym as any).start_line,\n          confidence: parsed.filePath === input.file_path ? 'high' : 'medium',\n          reason: `Appears in stack trace: ${frame.trim()}`,\n        });\n      }\n    }\n  }\n\n  // Strategy 4: If error type suggests null/undefined, find recently changed symbols in the area\n  if (input.error_type && ['TypeError', 'ReferenceError'].includes(input.error_type)) {\n    // Find symbols modified in last 5 commits in the same file\n    if (input.file_path) {\n      const recentSymbols = await store.query(`\n        SELECT symbol.*, commit.hash, commit.date\n        FROM symbol\n        INNER JOIN modified_in ON symbol.file_id = modified_in.from\n        INNER JOIN commit ON modified_in.to = commit.id\n        WHERE symbol.file_path = $filePath\n        ORDER BY commit.date DESC\n        LIMIT 10\n      `, { filePath: input.file_path });\n\n      for (const rs of recentSymbols) {\n        if (candidates.find(c => c.symbol_id === rs.id)) continue;\n        candidates.push({\n          symbol_id: rs.id,\n          symbol_name: rs.name,\n          kind: rs.kind,\n          file_path: rs.file_path,\n          line: rs.start_line,\n          confidence: 'low',\n          reason: `Recently modified symbol in error file (commit ${rs.hash})`,\n        });\n      }\n    }\n  }\n\n  // Compute impact radius for top candidate\n  let impactRadius: ImpactRadius = {\n    direct_callers: 0,\n    transitive_callers: 0,\n    affected_files: [],\n    affected_modules: [],\n  };\n\n  if (candidates.length > 0) {\n    const topCandidate = candidates[0];\n    const result = await store.graphTraversal(\n      topCandidate.symbol_id,\n      ['calls', 'imports'],\n      'inbound',\n      10,\n   
   undefined\n    );\n    \n    const directEdges = result.edges.filter(e => e.to === topCandidate.symbol_id);\n    impactRadius.direct_callers = directEdges.length;\n    impactRadius.transitive_callers = result.nodes.length;\n    impactRadius.affected_files = [...new Set(result.nodes.map(n => (n as any).file_path).filter(Boolean))];\n    \n    // Resolve modules\n    for (const filePath of impactRadius.affected_files) {\n      const fileNode = await store.query(\n        'SELECT module_id FROM file WHERE path = $path LIMIT 1',\n        { path: filePath }\n      );\n      if (fileNode.length > 0 && fileNode[0].module_id) {\n        impactRadius.affected_modules.push(fileNode[0].module_id);\n      }\n    }\n    impactRadius.affected_modules = [...new Set(impactRadius.affected_modules)];\n  }\n\n  // Find related tests\n  const relatedTests: RelatedTest[] = [];\n  if (candidates.length > 0) {\n    for (const candidate of candidates.slice(0, 3)) {\n      const testSymbols = await store.query(`\n        SELECT * FROM symbol\n        WHERE name CONTAINS $testName\n          AND (kind = 'function' AND name LIKE '%test%')\n        LIMIT 5\n      `, { testName: candidate.symbol_name });\n\n      for (const test of testSymbols) {\n        relatedTests.push({\n          test_name: test.name,\n          file_path: test.file_path,\n          line: test.start_line,\n          linked_to: candidate.symbol_name,\n        });\n      }\n    }\n  }\n\n  // Suggest investigation order\n  const suggestedOrder = candidates\n    .sort((a, b) => {\n      const confOrder = { high: 0, medium: 1, low: 2 };\n      return confOrder[a.confidence] - confOrder[b.confidence];\n    })\n    .map(c => `${c.file_path}:${c.line} (${c.symbol_name})`);\n\n  return {\n    root_candidates: candidates,\n    impact_radius: impactRadius,\n    related_tests: relatedTests,\n    recent_changes: [], // Populated from git log\n    suggested_investigation_order: suggestedOrder,\n  };\n}\n\nfunction 
extractPropertyName(errorMessage: string): string | null {\n  // \"Cannot read properties of undefined (reading 'foo')\"\n  const readMatch = errorMessage.match(/reading '(\\w+)'/);\n  if (readMatch) return readMatch[1];\n  \n  // \"foo is not a function\"\n  const notFnMatch = errorMessage.match(/(\\w+) is not a function/);\n  if (notFnMatch) return notFnMatch[1];\n  \n  // \"foo is not defined\"\n  const notDefMatch = errorMessage.match(/(\\w+) is not defined/);\n  if (notDefMatch) return notDefMatch[1];\n\n  return null;\n}\n\nfunction parseStackFrame(frame: string): { functionName: string; filePath: string } | null {\n  // \"at functionName (/path/to/file.ts:10:5)\"\n  const match = frame.match(/at\\s+(\\w+)\\s+\\((.+):(\\d+):\\d+\\)/);\n  if (!match) return null;\n  return { functionName: match[1], filePath: match[2] };\n}\n```\n\n### 2.12 Token Budget Manager\n\n```typescript\n// src/mcp/token-budget.ts\n\nexport class TokenBudgetManager {\n  private maxTokens: number;\n\n  // Approximate tokens per character for different content types\n  private static RATES = {\n    code: 0.25,       // ~4 chars per token\n    markdown: 0.3,    // ~3.3 chars per token\n    json: 0.22,       // ~4.5 chars per token (compact)\n    text: 0.33,       // ~3 chars per token\n  };\n\n  constructor(maxTokens: number = 8000) {\n    this.maxTokens = maxTokens;\n  }\n\n  estimate(content: string, type: keyof typeof TokenBudgetManager.RATES = 'json'): number {\n    return Math.ceil(content.length * TokenBudgetManager.RATES[type]);\n  }\n\n  truncate<T>(data: T, requestedMax?: number): T & { _truncated: boolean; _token_count: number } {\n    const max = requestedMax ?? 
this.maxTokens;\n    const json = JSON.stringify(data);\n    const tokens = this.estimate(json);\n\n    if (tokens <= max) {\n      return {\n        ...data,\n        _truncated: false,\n        _token_count: tokens,\n      } as T & { _truncated: boolean; _token_count: number };\n    }\n\n    // Truncation strategy: keep structure, reduce detail\n    const truncated = this.smartTruncate(data, max);\n    const truncatedJson = JSON.stringify(truncated);\n    const truncatedTokens = this.estimate(truncatedJson);\n\n    return {\n      ...truncated,\n      _truncated: true,\n      _token_count: truncatedTokens,\n    } as T & { _truncated: boolean; _token_count: number };\n  }\n\n  private smartTruncate<T>(data: T, budget: number): T {\n    // Work on a shallow copy so the caller's object is not mutated\n    const obj = { ...(data as any) };\n\n    // Strategy 1: If it has an array of items, truncate the array\n    for (const key of Object.keys(obj)) {\n      if (Array.isArray(obj[key]) && obj[key].length > 0) {\n        // Remember the original length before slicing so the note reports it correctly\n        const originalLen = obj[key].length;\n        // Keep reducing until we're under budget\n        let len = originalLen;\n        while (len > 1) {\n          const testObj = { ...obj, [key]: obj[key].slice(0, len) };\n          const testJson = JSON.stringify(testObj);\n          if (this.estimate(testJson) <= budget * 0.9) { // 10% margin for metadata\n            obj[key] = obj[key].slice(0, len);\n            obj._truncation_note = `${key} truncated from ${originalLen} to ${len} items`;\n            return obj as T;\n          }\n          len = Math.floor(len * 0.7); // Reduce by 30% each iteration\n        }\n        // Nothing larger fits: keep a single item (this may still exceed the budget)\n        obj[key] = obj[key].slice(0, 1);\n        obj._truncation_note = `${key} truncated from ${originalLen} to 1 item`;\n        return obj as T;\n      }\n    }\n\n    // Strategy 2: Remove verbose fields\n    const verboseFields = ['signature', 'docstring', 'metadata', 'raw'];\n    for (const field of verboseFields) {\n      if (obj[field]) {\n        delete obj[field];\n        const testJson = JSON.stringify(obj);\n        if (this.estimate(testJson) <= budget * 0.9) {\n          return obj as T;\n        }\n      }\n    }\n\n    
// Strategy 3: Last resort - truncate string fields\n    for (const key of Object.keys(obj)) {\n      if (typeof obj[key] === 'string' && obj[key].length > 100) {\n        obj[key] = obj[key].slice(0, 100) + '...';\n      }\n    }\n\n    return obj as T;\n  }\n}\n```\n\n### 2.13 SurrealDB Schema Migration\n\n```typescript\n// src/storage/surreal/migrations.ts\n\nexport const SCHEMA_DEFINITION = `\n// ============================================\n// TOKENZIP GRAPH SCHEMA - SurrealDB v2\n// ============================================\n\n// --- NODE TYPES ---\n\nDEFINE TABLE repository SCHEMAFULL;\nDEFINE FIELD name ON repository TYPE string;\nDEFINE FIELD root ON repository TYPE string;\nDEFINE FIELD created_at ON repository TYPE datetime DEFAULT time::now();\nDEFINE FIELD updated_at ON repository TYPE datetime DEFAULT time::now();\nDEFINE FIELD stats ON repository TYPE object {\n  files: number,\n  modules: number, \n  symbols: number\n};\n\nDEFINE TABLE module SCHEMAFULL;\nDEFINE FIELD name ON module TYPE string;\nDEFINE FIELD path ON module TYPE string;\nDEFINE FIELD manifest_type ON module TYPE string;\nDEFINE FIELD language ON module TYPE string;\nDEFINE FIELD is_root ON module TYPE bool DEFAULT false;\nDEFINE FIELD metadata ON module TYPE object;\nDEFINE FIELD repository_id ON module TYPE record<repository>;\n\nDEFINE TABLE file SCHEMAFULL;\nDEFINE FIELD path ON file TYPE string;\nDEFINE FIELD module_id ON file TYPE record<module>;\nDEFINE FIELD language ON file TYPE string;\nDEFINE FIELD ext ON file TYPE string;\nDEFINE FIELD size_bytes ON file TYPE int;\nDEFINE FIELD content_hash ON file TYPE string;\nDEFINE FIELD line_count ON file TYPE int;\nDEFINE FIELD parse_status ON file TYPE string \n  ASSERT $value IN ['parsed', 'partial', 'failed', 'skipped'];\nDEFINE FIELD parse_error ON file TYPE option<string>;\nDEFINE FIELD last_parsed ON file TYPE datetime;\nDEFINE FIELD git_last_modified ON file TYPE option<datetime>;\nDEFINE FIELD git_blame_summary ON file 
TYPE option<object>;\n\nDEFINE TABLE symbol SCHEMAFULL;\nDEFINE FIELD file_id ON symbol TYPE record<file>;\nDEFINE FIELD name ON symbol TYPE string;\nDEFINE FIELD kind ON symbol TYPE string \n  ASSERT $value IN [\n    'function', 'method', 'constructor',\n    'class', 'interface', 'type_alias', 'enum',\n    'variable', 'constant', 'property',\n    'parameter', 'generic_param',\n    'decorator', 'annotation',\n    'table', 'view', 'column', 'index', 'constraint',\n    'foreign_key', 'stored_procedure',\n    'import', 'export', 're_export',\n    'namespace', 'module_decl',\n    'section', 'subsection',\n    'workflow_step', 'diagram_node',\n    'list_item', 'table_row'\n  ];\nDEFINE FIELD signature ON symbol TYPE option<string>;\nDEFINE FIELD return_type ON symbol TYPE option<string>;\nDEFINE FIELD start_line ON symbol TYPE int;\nDEFINE FIELD end_line ON symbol TYPE int;\nDEFINE FIELD start_col ON symbol TYPE int;\nDEFINE FIELD end_col ON symbol TYPE int;\nDEFINE FIELD docstring ON symbol TYPE option<string>;\nDEFINE FIELD is_exported ON symbol TYPE bool DEFAULT false;\nDEFINE FIELD is_async ON symbol TYPE option<bool>;\nDEFINE FIELD is_static ON symbol TYPE option<bool>;\nDEFINE FIELD visibility ON symbol TYPE option<string>\n  ASSERT $value IN [null, 'public', 'private', 'protected'];\nDEFINE FIELD modifiers ON symbol TYPE array;\nDEFINE FIELD parent_symbol_id ON symbol TYPE option<string>;\nDEFINE FIELD metadata ON symbol TYPE object;\n\nDEFINE TABLE commit SCHEMAFULL;\nDEFINE FIELD hash ON commit TYPE string;\nDEFINE FIELD short_hash ON commit TYPE string;\nDEFINE FIELD message ON commit TYPE string;\nDEFINE FIELD author ON commit TYPE string;\nDEFINE FIELD email ON commit TYPE string;\nDEFINE FIELD date ON commit TYPE datetime;\nDEFINE FIELD branch ON commit TYPE string;\nDEFINE FIELD tags ON commit TYPE array;\n\nDEFINE TABLE dependency SCHEMAFULL;\nDEFINE FIELD module_id ON dependency TYPE record<module>;\nDEFINE FIELD name ON dependency TYPE string;\nDEFINE 
FIELD version ON dependency TYPE string;\nDEFINE FIELD dev ON dependency TYPE bool DEFAULT false;\nDEFINE FIELD source ON dependency TYPE string;\n\n// --- EDGE TYPES ---\n\nDEFINE TABLE contains SCHEMAFULL TYPE RELATION FROM repository, module, file, symbol TO module, file, symbol;\nDEFINE TABLE imports SCHEMAFULL TYPE RELATION FROM file, symbol, module TO file, symbol, module;\nDEFINE FIELD is_type_only ON imports TYPE option<bool>;\nDEFINE FIELD is_default ON imports TYPE option<bool>;\nDEFINE FIELD alias ON imports TYPE option<string>;\nDEFINE FIELD specifiers ON imports TYPE option<array>;\n\nDEFINE TABLE exports SCHEMAFULL TYPE RELATION FROM file, symbol TO symbol, file;\nDEFINE FIELD is_default ON exports TYPE option<bool>;\nDEFINE FIELD is_reexport ON exports TYPE option<bool>;\nDEFINE FIELD alias ON exports TYPE option<string>;\nDEFINE FIELD name ON exports TYPE option<string>;\n\nDEFINE TABLE calls SCHEMAFULL TYPE RELATION FROM symbol TO symbol;\nDEFINE FIELD line ON calls TYPE option<int>;\nDEFINE FIELD is_async ON calls TYPE option<bool>;\nDEFINE FIELD call_type ON calls TYPE option<string>\n  ASSERT $value IN [null, 'direct', 'indirect', 'dynamic'];\n\nDEFINE TABLE implements SCHEMAFULL TYPE RELATION FROM symbol TO symbol;\nDEFINE FIELD is_partial ON implements TYPE option<bool>;\n\nDEFINE TABLE inherits SCHEMAFULL TYPE RELATION FROM symbol TO symbol;\nDEFINE FIELD is_interface_inheritance ON inherits TYPE option<bool>;\n\nDEFINE TABLE modifies SCHEMAFULL TYPE RELATION FROM symbol TO symbol;\nDEFINE TABLE reads SCHEMAFULL TYPE RELATION FROM symbol TO symbol;\nDEFINE TABLE references SCHEMAFULL TYPE RELATION FROM symbol TO symbol;\nDEFINE FIELD context ON references TYPE option<string>;\n\nDEFINE TABLE depends_on SCHEMAFULL TYPE RELATION FROM module, file TO module, file;\nDEFINE FIELD is_transitive ON depends_on TYPE option<bool>;\nDEFINE FIELD depth ON depends_on TYPE option<int>;\n\nDEFINE TABLE modified_in SCHEMAFULL TYPE RELATION FROM file TO 
commit;\nDEFINE FIELD change_type ON modified_in TYPE string\n  ASSERT $value IN ['added', 'modified', 'deleted', 'renamed'];\n\nDEFINE TABLE foreign_key SCHEMAFULL TYPE RELATION FROM symbol TO symbol;\nDEFINE FIELD constraint_name ON foreign_key TYPE option<string>;\nDEFINE FIELD on_delete ON foreign_key TYPE option<string>;\nDEFINE FIELD on_update ON foreign_key TYPE option<string>;\nDEFINE FIELD ref_column ON foreign_key TYPE option<string>;\n\nDEFINE TABLE column_of SCHEMAFULL TYPE RELATION FROM symbol TO symbol;\n\nDEFINE TABLE diagram_edge SCHEMAFULL TYPE RELATION FROM symbol TO symbol;\nDEFINE FIELD label ON diagram_edge TYPE option<string>;\nDEFINE FIELD style ON diagram_edge TYPE option<string>;\nDEFINE FIELD type ON diagram_edge TYPE option<string>;\nDEFINE FIELD sequence ON diagram_edge TYPE option<int>;\nDEFINE FIELD is_response ON diagram_edge TYPE option<bool>;\n\nDEFINE TABLE workflow_transition SCHEMAFULL TYPE RELATION FROM symbol TO symbol;\nDEFINE FIELD condition ON workflow_transition TYPE option<string>;\nDEFINE FIELD action ON workflow_transition TYPE option<string>;\n\n// --- INDEXES ---\n\nDEFINE INDEX idx_file_path ON file FIELDS path UNIQUE;\nDEFINE INDEX idx_file_hash ON file FIELDS content_hash;\nDEFINE INDEX idx_file_module ON file FIELDS module_id;\nDEFINE INDEX idx_symbol_name ON symbol FIELDS name;\nDEFINE INDEX idx_symbol_kind ON symbol FIELDS kind;\nDEFINE INDEX idx_symbol_file ON symbol FIELDS file_id;\nDEFINE INDEX idx_symbol_export ON symbol FIELDS is_exported;\nDEFINE INDEX idx_module_path ON module FIELDS path UNIQUE;\nDEFINE INDEX idx_commit_hash ON commit FIELDS hash UNIQUE;\nDEFINE INDEX idx_dep_name ON dependency FIELDS name, module_id;\n`;\n```\n\n### 2.14 Error Handling Strategy\n\n```typescript\n// src/utils/errors.ts\n\nexport class TokenZipError extends Error {\n  constructor(\n    message: string,\n    public readonly code: ErrorCode,\n    public readonly details?: Record<string, unknown>\n  ) {\n    super(message);\n 
   this.name = 'TokenZipError';\n  }\n}\n\nexport enum ErrorCode {\n  // Storage errors (1xxx)\n  DB_CONNECTION_FAILED = 'E1001',\n  DB_QUERY_FAILED = 'E1002',\n  DB_MIGRATION_FAILED = 'E1003',\n  DB_CORRUPTED = 'E1004',\n\n  // Parser errors (2xxx)\n  PARSE_FAILED = 'E2001',\n  GRAMMAR_NOT_FOUND = 'E2002',\n  PARTIAL_PARSE = 'E2003',\n\n  // Git errors (3xxx)\n  GIT_NOT_REPOSITORY = 'E3001',\n  GIT_HOOK_INSTALL_FAILED = 'E3002',\n  GIT_DIFF_FAILED = 'E3003',\n\n  // MCP errors (4xxx)\n  MCP_TRANSPORT_FAILED = 'E4001',\n  MCP_TOOL_NOT_FOUND = 'E4002',\n  MCP_INVALID_PARAMS = 'E4003',\n  MCP_TOKEN_BUDGET_EXCEEDED = 'E4004',\n\n  // Config errors (5xxx)\n  CONFIG_NOT_FOUND = 'E5001',\n  CONFIG_INVALID = 'E5002',\n\n  // Indexer errors (6xxx)\n  INDEX_INTERRUPTED = 'E6001',\n  INDEX_FILE_TOO_LARGE = 'E6002',\n  INDEX_BINARY_FILE = 'E6003',\n}\n\n// Global error handler for MCP tools\nexport function mcpErrorHandler(error: unknown): { content: Array<{ type: 'text'; text: string }>; isError: boolean } {\n  if (error instanceof TokenZipError) {\n    return {\n      content: [{\n        type: 'text',\n        text: JSON.stringify({\n          error: error.message,\n          code: error.code,\n          details: error.details,\n        }),\n      }],\n      isError: true,\n    };\n  }\n\n  if (error instanceof Error) {\n    return {\n      content: [{\n        type: 'text',\n        text: JSON.stringify({\n          error: error.message,\n          code: 'E9999',\n          stack: process.env.NODE_ENV === 'development' ? 
error.stack : undefined,\n        }),\n      }],\n      isError: true,\n    };\n  }\n\n  return {\n    content: [{ type: 'text', text: JSON.stringify({ error: 'Unknown error' }) }],\n    isError: true,\n  };\n}\n```\n\n### 2.15 Testing Strategy\n\n```typescript\n// tests/unit/extractor/typescript.test.ts\n\nimport { describe, it, expect, beforeEach } from 'vitest';\nimport { TypeScriptExtractor } from '../../../src/extractor/code/typescript';\nimport { createMockContext } from '../../helpers';\n\ndescribe('TypeScriptExtractor', () => {\n  let extractor: TypeScriptExtractor;\n\n  beforeEach(() => {\n    extractor = new TypeScriptExtractor();\n  });\n\n  describe('function extraction', () => {\n    it('extracts a simple exported function', () => {\n      const code = `\nexport function addUser(name: string, age: number): User {\n  return { name, age, id: crypto.randomUUID() };\n}\n`;\n      const ctx = createMockContext('src/user.ts', code, 'module-1');\n      const result = extractor.extract(ctx);\n\n      expect(result.symbols).toHaveLength(1);\n      expect(result.symbols[0]).toMatchObject({\n        name: 'addUser',\n        kind: 'function',\n        isExported: true,\n        isAsync: false,\n        startLine: 2,\n        endLine: 4,\n      });\n      expect(result.symbols[0].metadata.params).toEqual([\n        { name: 'name', type: 'string' },\n        { name: 'age', type: 'number' },\n      ]);\n      expect(result.symbols[0].returnType).toBe('User');\n    });\n\n    it('extracts async arrow function assigned to const', () => {\n      const code = `\nexport const fetchUser = async (id: string): Promise<User> => {\n  const res = await fetch(\\`/api/users/\\${id}\\`);\n  return res.json();\n};\n`;\n      const ctx = createMockContext('src/api.ts', code, 'module-1');\n      const result = extractor.extract(ctx);\n\n      expect(result.symbols).toHaveLength(1);\n      expect(result.symbols[0]).toMatchObject({\n        name: 'fetchUser',\n        kind: 
'function',\n        isExported: true,\n        isAsync: true,\n      });\n      expect(result.symbols[0].metadata.isArrow).toBe(true);\n    });\n\n    it('extracts class with a property, methods, and an implemented interface', () => {\n      const code = `\nexport class UserRepository implements IRepository<User> {\n  private cache: Map<string, User> = new Map();\n\n  async findById(id: string): Promise<User | null> {\n    return this.cache.get(id) ?? null;\n  }\n\n  async save(user: User): Promise<void> {\n    this.cache.set(user.id, user);\n  }\n}\n`;\n      const ctx = createMockContext('src/repo.ts', code, 'module-1');\n      const result = extractor.extract(ctx);\n\n      // 1 class + 1 property + 2 methods\n      expect(result.symbols).toHaveLength(4);\n\n      const classSym = result.symbols.find(s => s.kind === 'class')!;\n      expect(classSym.name).toBe('UserRepository');\n      expect(classSym.isExported).toBe(true);\n      expect(classSym.metadata.implements).toEqual(['IRepository<User>']);\n\n      const methods = result.symbols.filter(s => s.kind === 'method');\n      expect(methods).toHaveLength(2);\n      expect(methods.map(m => m.name)).toEqual(['findById', 'save']);\n\n      // Check implements edge\n      const implEdge = result.edges.find(e => e.type === 'implements');\n      expect(implEdge).toBeDefined();\n    });\n\n    it('extracts interface with generics and members', () => {\n      const code = `\nexport interface IRepository<T extends { id: string }> {\n  findById(id: string): Promise<T | null>;\n  save(entity: T): Promise<void>;\n  delete(id: string): Promise<boolean>;\n}\n`;\n      const ctx = createMockContext('src/types.ts', code, 'module-1');\n      const result = extractor.extract(ctx);\n\n      expect(result.symbols).toHaveLength(1);\n      expect(result.symbols[0]).toMatchObject({\n        name: 'IRepository',\n        kind: 'interface',\n        isExported: true,\n      });\n      
expect(result.symbols[0].metadata.generics).toEqual(['T extends { id: string }']);\n      expect(result.symbols[0].metadata.members).toHaveLength(3);\n    });\n\n    it('extracts imports with type-only and default', () => {\n      const code = `\nimport type { User } from './types';\nimport React, { useState, useEffect } from 'react';\nimport { formatDate } from './utils';\n`;\n      const ctx = createMockContext('src/component.tsx', code, 'module-1');\n      const result = extractor.extract(ctx);\n\n      const imports = result.symbols.filter(s => s.kind === 'import');\n      expect(imports).toHaveLength(3);\n      \n      expect(imports[0].metadata.isTypeOnly).toBe(true);\n      expect(imports[0].metadata.source).toBe('./types');\n      \n      expect(imports[1].metadata.isDefault).toBe(true);\n      expect(imports[1].metadata.source).toBe('react');\n      expect(imports[1].metadata.specifiers).toContain('useState');\n    });\n\n    it('handles parse errors gracefully', () => {\n      const code = `\nexport function broken(\n  // Missing closing paren and body\n`;\n      const ctx = createMockContext('src/broken.ts', code, 'module-1');\n      const result = extractor.extract(ctx);\n\n      expect(result.parseErrors.length).toBeGreaterThan(0);\n      // Should still return partial results if any\n      expect(result.symbols).toBeDefined();\n    });\n  });\n});\n```\n\n```typescript\n// tests/integration/full-parse.test.ts\n\nimport { describe, it, expect, beforeAll, afterAll } from 'vitest';\nimport { MemoryStore } from '../../src/storage/memory/store';\nimport { Indexer } from '../../src/engine/indexer';\nimport { createQuery } from '../../src/query/builder';\nimport path from 'path';\n\ndescribe('Full Parse Integration', () => {\n  let store: MemoryStore;\n  let indexer: Indexer;\n  const fixturePath = path.join(__dirname, '../fixtures/ts-monorepo');\n\n  beforeAll(async () => {\n    store = new MemoryStore();\n    await store.initialize();\n    await 
store.migrate();\n    indexer = new Indexer(store, fixturePath);\n    await indexer.fullIndex();\n  });\n\n  afterAll(async () => {\n    await store.close();\n  });\n\n  it('indexes all modules in the monorepo', async () => {\n    const modules = await createQuery(store, fixturePath).modules().toArray();\n    expect(modules.length).toBeGreaterThanOrEqual(3); // apps/web, apps/api, packages/shared\n  });\n\n  it('extracts all TypeScript symbols', async () => {\n    const symbols = await createQuery(store, fixturePath)\n      .symbols()\n      .eq('kind', 'function')\n      .toArray();\n    expect(symbols.length).toBeGreaterThan(10);\n  });\n\n  it('resolves cross-module imports', async () => {\n    // Find a symbol in packages/shared that's imported by apps/web\n    const sharedExports = await createQuery(store, fixturePath)\n      .modules()\n      .eq('path', 'packages/shared')\n      .files()\n      .symbols()\n      .eq('is_exported', true)\n      .toArray();\n\n    expect(sharedExports.length).toBeGreaterThan(0);\n\n    // Check that at least one has an imports edge from apps/web\n    const importEdges = await store.getEdgesTo(sharedExports[0].id, 'imports');\n    // At least the file-level import should exist\n  });\n\n  it('chainable query: modules → files → symbols → filters', async () => {\n    const result = await createQuery(store, fixturePath)\n      .modules()\n      .eq('language', 'typescript')\n      .files()\n      .eq('ext', '.ts')\n      .symbols()\n      .eq('kind', 'class')\n      .eq('is_exported', true)\n      .toArray();\n\n    expect(result.length).toBeGreaterThan(0);\n    for (const sym of result) {\n      expect((sym as any).kind).toBe('class');\n      expect((sym as any).is_exported).toBe(true);\n    }\n  });\n\n  it('graph traversal: find all callers of an exported function', async () => {\n    const targetFunc = await createQuery(store, fixturePath)\n      .symbol('formatDate')\n      .eq('kind', 'function')\n      .toArray();\n\n    if 
(targetFunc.length === 0) return; // Skip if fixture doesn't have this\n\n    const callers = await createQuery(store, fixturePath)\n      .symbol('formatDate')\n      .callers()\n      .toArray();\n\n    // Should find at least one caller\n    expect(callers.length).toBeGreaterThan(0);\n  });\n\n  it('formats query result as markdown', async () => {\n    const md = await createQuery(store, fixturePath)\n      .modules()\n      .limit(3)\n      .toMarkdown();\n\n    expect(md).toContain('#');\n    expect(md).toContain('packages/shared'); // Based on fixture\n  });\n});\n```\n\n### 2.16 Configuration Schema\n\n```typescript\n// src/types/config.ts\n\nexport interface TokenZipConfig {\n  // Project-level config (.tokenzip/config.json)\n  version: string;\n  \n  storage: {\n    engine: 'surrealdb' | 'sqlite' | 'auto';\n    path: string; // relative to project root, default: .tokenzip/db\n    surrealdb?: {\n      binary_path?: string; // custom surrealdb binary\n      memory?: boolean; // use memory backend instead of RocksDB\n    };\n  };\n\n  languages: {\n    enabled: string[]; // ['typescript', 'javascript', 'python', 'sql', 'markdown']\n    disabled: string[];\n    custom: Record<string, {\n      extensions: string[];\n      grammar_path?: string; // path to custom tree-sitter WASM\n      extractor_path?: string; // path to custom extractor JS\n    }>;\n  };\n\n  exclude: {\n    paths: string[]; // glob patterns: ['**/node_modules/**', '**/dist/**', '**/.git/**']\n    files: string[]; // exact filenames: ['package-lock.json', 'yarn.lock']\n    max_file_size_kb: number; // default: 500\n  };\n\n  hooks: {\n    pre_commit: 'warn' | 'block' | 'off';\n    post_commit: 'on' | 'off';\n    validate_on_commit: boolean; // run reference integrity checks\n  };\n\n  mcp: {\n    max_tokens: number; // default: 8000\n    transport: 'stdio' | 'sse';\n    port: number; // for SSE, default: 3777\n    include_source: boolean; // include source code in responses\n    
source_max_lines: number; // max lines of source per symbol, default: 50\n  };\n\n  indexing: {\n    worker_threads: number; // default: os.cpus().length - 1, min 1\n    batch_size: number; // files per batch, default: 100\n    git_history_depth: number; // commits to index, default: 100\n  };\n\n  workflows: {\n    enabled: string[]; // ['create-module', 'update-module', 'implement-feature', 'upgrade-feature', 'bug-fix']\n  };\n}\n\nexport const DEFAULT_CONFIG: TokenZipConfig = {\n  version: '2.0.0',\n  storage: {\n    engine: 'auto',\n    path: '.tokenzip/db',\n  },\n  languages: {\n    enabled: ['typescript', 'javascript', 'python', 'sql', 'go', 'rust', 'java', 'kotlin', 'markdown'],\n    disabled: [],\n    custom: {},\n  },\n  exclude: {\n    paths: [\n      '**/node_modules/**',\n      '**", "url": "https://wpnews.pro/news/tokenzip-v2-prd-hld-lld", "canonical_source": "https://gist.github.com/devilankur18/ee2402e656fa4eaa076bdf2c79fcc6b8", "published_at": "2026-05-11 08:22:43+00:00", "updated_at": "2026-05-23 14:07:30.182701+00:00", "lang": "en", "topics": ["developer-tools", "large-language-models", "artificial-intelligence", "open-source", "products"], "entities": ["TokenZip", "Karpathy", "Claude Code", "Codex", "OpenCode", "Kilo Code", "MCP"], "alternates": {"html": "https://wpnews.pro/news/tokenzip-v2-prd-hld-lld", "markdown": "https://wpnews.pro/news/tokenzip-v2-prd-hld-lld.md", "text": "https://wpnews.pro/news/tokenzip-v2-prd-hld-lld.txt", "jsonld": "https://wpnews.pro/news/tokenzip-v2-prd-hld-lld.jsonld"}}