# TokenZip v2 — PRD, HLD, LLD

> Source: <https://gist.github.com/devilankur18/ee2402e656fa4eaa076bdf2c79fcc6b8>
> Published: 2026-05-11 08:22:43+00:00

---

# 📋 PRD — Product Requirements Document

## 1. Executive Summary

**TokenZip v2** turns Karpathy's LLM wiki concept into a gzip-like **token compression engine** that sits on top of an entire codebase and can reduce LLM input token cost by up to 95% when used with coding copilots such as Claude Code and Codex. Instead of generating a flat text summary, it builds a multi-level, queryable, chainable knowledge graph (repo → modules → files → symbols) that is stored locally in `.tokenzip/db`, exposed as an **MCP server** for any AI copilot, and kept fresh via **git hooks**.

## 2. Problem Statement

| Problem | Impact |
|---|---|
| AI copilots lack structural awareness of large codebases | They hallucinate imports, miss dependencies, suggest changes in wrong modules |
| Text-based token references are flat and non-queryable | Cannot ask "which functions depend on this interface?" or "what modules does this feature span?" |
| No persistent code intelligence layer | Every session re-parses from scratch, wasting tokens and time |
| Documentation (PRD/HLD/LLD/README) is unstructured | AI can't extract workflows, sequence diagrams, or release plans from markdown |
| Cross-language dependency tracking is manual | A SQL schema change affecting 3 TS files is invisible until runtime |
| Cross-repository dependency tracking is manual | The current repository has no awareness of dependent or upstream repositories, including shared interfaces, API contracts, endpoint usage, schema dependencies, or cross-repo integrations, which makes impact analysis and coordinated changes error-prone |
| Version-aware dependency conflicts are difficult to detect | AI copilots and developers lack visibility into incompatible interface versions, breaking API/schema changes, SDK mismatches, or transitive dependency drift across repositories, which causes silent integration failures and upgrade risks |


## POC Results

### Under 30 seconds indexing time for a codebase with ~1950 files
<img width="1639" height="855" alt="image" src="https://gist.github.com/user-attachments/assets/f19d00a0-19c2-490f-86a6-f67452b6452f" />

### Under 1 second lookup
<img width="909" height="637" alt="image" src="https://gist.github.com/user-attachments/assets/5d25d6b3-c34a-46f1-9cde-857c8e6a69ee" />

  

## 3. Target Users

### Primary
- **AI Copilot Users** (Claude Code, Codex, OpenCode, Kilo Code) — need structured context without token waste
- **Full-stack Developers** working in monorepos with 50+ modules

### Secondary
- **Tech Leads** auditing codebase structure and dependency health
- **Onboarding Engineers** who need a rapid mental model of the codebase

## 4. Product Vision

> *"Your codebase as a queryable graph — not a text dump. Ask structural questions, get precise answers, zero hallucination."*

## 5. Feature Specification

### 5.1 Multi-Level Code Graph

```
Repository
  └── Module (auto-detected: package.json, pyproject.toml, go.mod, Cargo.toml, etc.)
        └── File
              └── Symbol (function, class, interface, variable, table, column, etc.)
```

**Acceptance Criteria:**
- [ ] Auto-detect module boundaries by presence of manifest files
- [ ] Support nested modules (monorepo: repo → apps/web → src/components)
- [ ] Each node has a stable UUID that survives renames (content-hash + path-hash hybrid)
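
Manifest-based module detection could look roughly like the sketch below. It is only an illustration under stated assumptions: the function names `detectModuleRoots` and `owningModule` are hypothetical, the manifest list is the one shown in the hierarchy above, and paths are assumed to be repo-relative with `/` separators.

```typescript
// Manifest names taken from the hierarchy above; illustrative, not exhaustive.
const MANIFESTS = new Set(["package.json", "pyproject.toml", "go.mod", "Cargo.toml"]);

// Return the directories ("." for the repo root) that contain a manifest file.
// Lexicographic sorting places a parent directory before its nested modules.
function detectModuleRoots(paths: string[]): string[] {
  const roots = new Set<string>();
  for (const p of paths) {
    const slash = p.lastIndexOf("/");
    const dir = slash === -1 ? "." : p.slice(0, slash);
    const base = slash === -1 ? p : p.slice(slash + 1);
    if (MANIFESTS.has(base)) roots.add(dir);
  }
  return Array.from(roots).sort();
}

// Assign a file to its nearest enclosing module (the longest matching root).
function owningModule(filePath: string, roots: string[]): string | undefined {
  let best: string | undefined;
  let bestLen = -1;
  for (const root of roots) {
    const prefix = root === "." ? "" : root + "/";
    if (filePath.startsWith(prefix) && prefix.length > bestLen) {
      best = root;
      bestLen = prefix.length;
    }
  }
  return best;
}

// Usage with a small monorepo layout.
const repoPaths = [
  "package.json",
  "README.md",
  "apps/web/package.json",
  "apps/web/src/components/Button.tsx",
  "services/api/go.mod",
  "services/api/main.go",
];
const moduleRoots = detectModuleRoots(repoPaths);
console.log(moduleRoots); // [ '.', 'apps/web', 'services/api' ]
console.log(owningModule("apps/web/src/components/Button.tsx", moduleRoots)); // apps/web
```

Choosing the longest matching root is what lets nested modules coexist with a root-level manifest: a file under `apps/web` belongs to `apps/web`, not to the repository root.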

### 5.2 Tree-Sitter Metadata Extraction

| Language | Extracted Artifacts |
|---|---|
| `.js`, `.mjs` | Functions, classes, exports, imports, global vars, JSDoc |
| `.ts`, `.tsx` | Above + interfaces, type aliases, generics, enums, decorators, namespace exports |
| `.py` | Functions, classes, decorators, type hints, imports, async defs |
| `.sql` | Tables, views, columns, constraints, indexes, foreign keys, stored procedures |
| `.go` | Functions, structs, interfaces, methods, packages, imports |
| `.rs` | Functions, structs, traits, impls, enums, mods, use statements |
| `.java`, `.kt` | Classes, interfaces, methods, annotations, packages |
| `.md` (special) | Headings, lists, code blocks, mermaid diagrams, tables, frontmatter |

**Acceptance Criteria:**
- [ ] Each symbol stored as a node with: name, kind, signature, line range, hash, docstring
- [ ] Relationships: `CALLS`, `IMPLEMENTS`, `INHERITS`, `IMPORTS`, `EXPORTS`, `MODIFIES`, `READS`
- [ ] Incremental parse: only re-parse files whose content hash changed
- [ ] Parse errors stored as node metadata (not silently dropped)
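
The incremental-parse criterion can be sketched as a hash comparison. This is a minimal sketch, assuming file contents are available as strings and stored hashes are kept in a path-keyed map; `contentHash` and `filesToReparse` are hypothetical names, and only Node's built-in `node:crypto` module is used.

```typescript
import { createHash } from "node:crypto";

// SHA-256 of the file content, hex-encoded (64 characters).
function contentHash(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

// Return the paths whose current content hash differs from the stored one.
// New files count as changed because they have no stored hash yet.
function filesToReparse(
  current: Map<string, string>, // path -> file content
  stored: Map<string, string>,  // path -> previously stored SHA-256
): string[] {
  const changed: string[] = [];
  current.forEach((content, path) => {
    if (stored.get(path) !== contentHash(content)) changed.push(path);
  });
  return changed;
}

// Usage: b.ts was edited since the last index, c.ts is new, a.ts is untouched.
const storedHashes = new Map<string, string>([
  ["a.ts", contentHash("export const a = 1;")],
  ["b.ts", contentHash("export const b = 1;")],
]);
const workingTree = new Map<string, string>([
  ["a.ts", "export const a = 1;"],
  ["b.ts", "export const b = 2;"],
  ["c.ts", "export const c = 3;"],
]);
console.log(filesToReparse(workingTree, storedHashes)); // [ 'b.ts', 'c.ts' ]
```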

### 5.3 Documentation Intelligence

For structured markdown files (`.prd.md`, `.hld.md`, `.lld.md`, `README.md`, `CHANGELOG.md`, `ADR/*.md`):

| Section Type | Extracted Structure |
|---|---|
| `## Workflow` / `## Flow` | Ordered step graph with actors and actions |
| `## Sequence Diagram` | Parsed mermaid `sequenceDiagram` into actor→message→actor edges |
| `## Flowchart` | Parsed mermaid `flowchart` into decision/action node graph |
| `## Release Plan` | Timeline with milestones, versions, dates |
| `## API` | Endpoint → method → params → response schema |
| `## Architecture` / `## Components` | Component hierarchy with responsibility and tech stack |
| `## Decision` (ADR) | Context → Decision → Consequences as structured tuple |
| Standard lists | Typed list items (checkbox, numbered, bullet) with nesting |
| Tables | Columnar data as records |

**Acceptance Criteria:**
- [ ] Mermaid blocks parsed into graph nodes, not stored as raw text
- [ ] Section-level linking: a workflow step can reference a function symbol node
- [ ] Cross-reference resolution: `[see ModuleX]` in PRD links to Module node in graph
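
Cross-reference resolution might be approached as in the sketch below. It assumes references use the literal `[see Name]` form shown above and that module IDs can be looked up by name; `extractSeeRefs`, `resolveSeeRefs`, and the `module:` ID format are hypothetical.

```typescript
interface RefResolution {
  resolved: Array<{ ref: string; moduleId: string }>;
  unresolved: string[]; // kept so broken references can be reported, not dropped
}

// Collect the names inside every `[see Name]` marker.
function extractSeeRefs(markdown: string): string[] {
  const refs: string[] = [];
  const pattern = /\[see\s+([^\]]+)\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    refs.push(match[1].trim());
  }
  return refs;
}

// Map each reference to a module ID, separating hits from misses.
function resolveSeeRefs(
  markdown: string,
  moduleIdsByName: Map<string, string>,
): RefResolution {
  const result: RefResolution = { resolved: [], unresolved: [] };
  for (const ref of extractSeeRefs(markdown)) {
    const moduleId = moduleIdsByName.get(ref);
    if (moduleId === undefined) result.unresolved.push(ref);
    else result.resolved.push({ ref, moduleId });
  }
  return result;
}

// Usage: one reference resolves, one does not.
const prdText =
  "Login is handled by the gateway [see api-gateway]; billing is TBD [see billing].";
const knownModules = new Map<string, string>([["api-gateway", "module:api-gateway"]]);
const refLinks = resolveSeeRefs(prdText, knownModules);
console.log(refLinks.resolved);   // [ { ref: 'api-gateway', moduleId: 'module:api-gateway' } ]
console.log(refLinks.unresolved); // [ 'billing' ]
```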

### 5.4 Chainable Query API

```typescript
// Level 1: Repository
const repo = tz.repo('.');

// Level 2: Modules (filterable, chainable)
const feModules = repo.modules().filter(m => m.language === 'typescript');

// Level 3: Files within modules
const tsFiles = feModules.files().filter(f => f.ext === '.tsx');

// Level 4: Symbols within files
const exportedComponents = tsFiles.symbols()
  .filter(s => s.kind === 'class' && s.isExported && s.extends('React.Component'));

// Cross-cutting queries
const dependants = tz.repo('.').symbol('UserService.authenticate')
  .dependants()                    // who calls this?
  .withinModule('api-gateway')     // scope it
  .withKind('function');           // filter

const impact = tz.repo('.').table('users')
  .columns()                       // what columns
  .referencedBy()                  // where are they referenced
  .files();                        // which files

const workflow = tz.repo('.').doc('prd.md')
  .section('Workflow: User Onboarding')
  .steps()                         // ordered steps
  .linkedSymbols();                 // what code implements each step
```

**Acceptance Criteria:**
- [ ] Every level returns a query builder, not raw data (lazy evaluation)
- [ ] `.toArray()`, `.toGraph()`, `.toMarkdown()`, `.toJSON()` terminal methods
- [ ] Queries translate to SurrealDB graph traversal queries
- [ ] Response < 100ms for repos up to 100K files
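
The lazy-evaluation criterion can be illustrated with a small in-memory scope. This is a sketch rather than the actual builder: the `Scope` class and `FileNode` shape are hypothetical, and the `source` callback merely stands in for a database query.

```typescript
// Hypothetical minimal node shape for illustration.
interface FileNode {
  path: string;
  ext: string;
}

// A scope records filters and limits; it reads from the source only in a terminal method.
class Scope<T> {
  private readonly source: () => T[];
  private readonly predicates: Array<(item: T) => boolean>;
  private readonly max: number | undefined;

  constructor(source: () => T[], predicates: Array<(item: T) => boolean> = [], max?: number) {
    this.source = source;
    this.predicates = predicates;
    this.max = max;
  }

  // Chainable: returns a new scope, nothing is evaluated.
  filter(predicate: (item: T) => boolean): Scope<T> {
    return new Scope<T>(this.source, this.predicates.concat([predicate]), this.max);
  }

  limit(n: number): Scope<T> {
    return new Scope<T>(this.source, this.predicates, n);
  }

  // Terminal: this is the only place the source is read.
  toArray(): T[] {
    const rows = this.source().filter((item) => this.predicates.every((p) => p(item)));
    return this.max === undefined ? rows : rows.slice(0, this.max);
  }

  count(): number {
    return this.toArray().length;
  }
}

// Usage: count how often the "database" is actually read.
let sourceReads = 0;
const allFiles: FileNode[] = [
  { path: "a.tsx", ext: ".tsx" },
  { path: "b.ts", ext: ".ts" },
  { path: "c.tsx", ext: ".tsx" },
];
const fileScope = new Scope<FileNode>(() => {
  sourceReads += 1;
  return allFiles;
});
const firstTsx = fileScope.filter((f) => f.ext === ".tsx").limit(1);
console.log(sourceReads);        // 0, building the chain reads nothing
console.log(firstTsx.toArray()); // [ { path: 'a.tsx', ext: '.tsx' } ]
console.log(sourceReads);        // 1
```

A production builder would presumably compile the recorded filters into a single database query instead of filtering in memory; the point here is only that intermediate calls stay side-effect free.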

### 5.5 Graph Database Storage

- **Engine:** SurrealDB (embedded via RocksDB storage)
- **Location:** `<project_root>/.tokenzip/db/`
- **Schema:** Schemaful (strict types per node kind)
- **Persistence:** WAL-enabled, crash-safe

**Acceptance Criteria:**
- [ ] `.tokenzip/` added to `.gitignore` automatically
- [ ] DB size < 10% of source code size for typical repos
- [ ] Cold start (first full parse) completes at > 500 files/second
- [ ] Hot start (incremental) completes at > 2000 files/second

### 5.6 Git Hook Integration

```text
# Installed via: tokenzip init
# Creates .git/hooks/pre-commit and .git/hooks/post-commit

pre-commit:
  1. Detect staged files (git diff --cached --name-only)
  2. Parse changed files with tree-sitter
  3. Diff new AST against stored graph
  4. Validate: no broken exports, no orphan imports
  5. Update graph with new symbol nodes/edges
  6. If validation fails: warn (configurable: warn/block)

post-commit:
  1. Store commit metadata (hash, message, author, timestamp)
  2. Create COMMIT → MODIFIED → FILE edges
  3. Update file-level git history nodes
```

**Acceptance Criteria:**
- [ ] Hook installation is non-destructive (appends to existing hooks)
- [ ] Hook execution adds < 500ms to commit time for typical changes (< 10 files)
- [ ] `tokenzip init --no-hooks` flag for CI environments
- [ ] `tokenzip status` shows graph health (stale files, broken references)
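
Non-destructive installation could work along the lines of the sketch below, which appends a marked block to whatever hook script already exists. The marker strings and the `withTokenzipHook` helper are hypothetical, and the exact command the hook would run is not specified in this document; `tokenzip parse` is used only as a placeholder.

```typescript
// Hypothetical markers that make the installed block easy to find and remove.
const MARK_BEGIN = "# >>> tokenzip hook >>>";
const MARK_END = "# <<< tokenzip hook <<<";

// Return the new hook script content without modifying any existing lines.
function withTokenzipHook(existing: string | undefined, command: string): string {
  const block = [MARK_BEGIN, command, MARK_END].join("\n") + "\n";
  if (existing === undefined || existing.trim() === "") {
    return "#!/bin/sh\n" + block; // no hook yet: create a fresh script
  }
  if (existing.includes(MARK_BEGIN)) {
    return existing; // already installed: running init twice changes nothing
  }
  const separator = existing.endsWith("\n") ? "" : "\n";
  return existing + separator + block;
}

// Usage: an existing lint hook is preserved and the TokenZip block is appended.
const existingHook = "#!/bin/sh\nnpx lint-staged";
const updatedHook = withTokenzipHook(existingHook, "tokenzip parse");
console.log(updatedHook);
```

One caveat of plain appending: if the existing script ends with an explicit `exit`, the appended block never runs, so a real installer would need to detect that case.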

### 5.7 MCP Server

```jsonc
// Exposed to any MCP-compatible client
{
  "tools": [
    "query_repo_structure",
    "query_module", 
    "query_file",
    "query_symbol",
    "get_dependencies",
    "get_dependants",
    "search_symbols",
    "get_git_history",
    "get_workflow",
    "get_impact_analysis",
    "execute_workflow_template"
  ],
  "resources": [
    "tokenzip://repo/structure",
    "tokenzip://module/{name}/overview",
    "tokenzip://file/{path}/symbols",
    "tokenzip://symbol/{id}/detail"
  ]
}
```

**Acceptance Criteria:**
- [ ] MCP server starts in < 200ms
- [ ] All tools return structured JSON (never raw text dumps)
- [ ] Token budget aware: responses include `token_count` metadata
- [ ] Works with Claude Code, Codex, OpenCode, Kilo Code without config changes
- [ ] Concurrent tool calls supported (SurrealDB connection pooling)

### 5.8 Workflow Templates

| Workflow | Input | Output | Graph Operations |
|---|---|---|---|
| **Create Module** | module name, type, dependencies | Scaffolded structure + graph nodes | CREATE module, CREATE files, CREATE IMPORTS edges |
| **Update Module** | module name, change description | Affected files + symbols list | READ dependencies, READ dependants, DIFF graph |
| **Implement Feature** | feature description, target module | Files to create/modify, symbol gaps | SEARCH related symbols, PATH analysis, IMPACT query |
| **Upgrade Feature** | feature name, upgrade description | Migration plan + affected modules | SUBGRAPH extraction, DEPENDENCY chain analysis |
| **Bug Fix** | error message / stack trace | Root cause candidates + impact radius | TRACE call chain, FIND modified symbols in git blame range |

**Acceptance Criteria:**
- [ ] Each workflow is a deterministic graph query sequence, not LLM-generated
- [ ] Workflows return structured data that an LLM can act on (not final answers)
- [ ] Workflow results are cached and timestamped in the graph

## 6. Non-Functional Requirements

| Category | Requirement |
|---|---|
| **Performance** | Full index of 100K file repo < 3 minutes; incremental update < 2 seconds |
| **Memory** | MCP server idle < 50MB; parsing peak < 500MB |
| **Reliability** | Never corrupt the graph on crash; WAL recovery on restart |
| **Compatibility** | Node.js 20+, macOS 12+, Ubuntu 22.04+, Windows WSL2 |
| **Security** | No network calls; all data local; no code execution from graph |
| **Extensibility** | New language support via plugin (tree-sitter grammar + extractor config) |

## 7. Success Metrics

| Metric | Target |
|---|---|
| Copilot context accuracy (relevant vs irrelevant tokens) | > 85% (vs ~40% with text dump) |
| Time to first useful query after `tokenzip init` | < 5 minutes for 50K file repo |
| Hook overhead per commit | < 500ms |
| MCP tool call latency (p95) | < 200ms |
| Graph size efficiency | < 10% of source size |

## 8. Out of Scope (v2)

- Remote graph synchronization (multi-developer shared graph)
- LLM-powered code generation (this is a context layer, not a code writer)
- Runtime analysis (only static analysis via tree-sitter)
- Binary file parsing (images, compiled artifacts)
- IDE plugin (VS Code extension is v3)

## 9. Release Phases

| Phase | Scope | Timeline |
|---|---|---|
| **Alpha** | Core graph + JS/TS parsing + MCP server + basic queries | Week 1-3 |
| **Beta** | All languages + git hooks + documentation intelligence | Week 4-6 |
| **RC** | Workflow templates + chainable API polish + perf tuning | Week 7-8 |
| **GA** | Stability hardening + plugin system + docs | Week 9-10 |

---

# 🏗️ HLD — High-Level Design

## 1. Architecture Overview

TokenZip v2 is a **local-first, static-analysis graph engine** with four layers:

```
┌─────────────────────────────────────────────────────────────────┐
│                    LAYER 4: INTEGRATION                         │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌───────────────┐  │
│  │ Claude   │  │ Codex    │  │ OpenCode │  │ Kilo Code     │  │
│  │ Code     │  │          │  │          │  │               │  │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘  └──────┬────────┘  │
│       │              │              │                │           │
│       └──────────────┴──────┬───────┴────────────────┘           │
│                             │ MCP Protocol (stdio/SSE)          │
├─────────────────────────────┼───────────────────────────────────┤
│                    LAYER 3: API & QUERY                         │
│  ┌──────────────────────────┴──────────────────────────────┐   │
│  │                    MCP Server                            │   │
│  │  ┌─────────────────┐  ┌──────────────────────────────┐  │   │
│  │  │  Tool Registry  │  │  Resource Registry            │  │   │
│  │  └────────┬────────┘  └──────────────┬───────────────┘  │   │
│  │           └──────────┬───────────────┘                  │   │
│  │              ┌───────┴────────┐                         │   │
│  │              │ Chainable Query│                         │   │
│  │              │ Builder (CQB)  │                         │   │
│  │              └───────┬────────┘                         │   │
│  └──────────────────────┼──────────────────────────────────┘   │
├──────────────────────────┼──────────────────────────────────────┤
│                    LAYER 2: ENGINE                              │
│  ┌───────────────────────┼──────────────────────────────────┐  │
│  │  ┌────────────┐  ┌────┴─────┐  ┌──────────┐  ┌───────┐  │  │
│  │  │ Tree-Sitter│  │ Markdown │  │ Workflow │  │ Graph │  │  │
│  │  │ Extractor  │  │ Parser   │  │ Engine   │  │ Query │  │  │
│  │  │ (per lang) │  │ (struct) │  │ (tpl)    │  │ Planner│  │  │
│  │  └─────┬──────┘  └────┬─────┘  └────┬─────┘  └───┬───┘  │  │
│  │        └──────────────┼──────────────┼────────────┘      │  │
│  │              ┌───────┴──────────────┴───────┐            │  │
│  │              │     Graph Mutation Engine     │            │  │
│  │              │  (diff, merge, validate)      │            │  │
│  │              └───────────────┬───────────────┘            │  │
│  └──────────────────────────────┼────────────────────────────┘  │
├──────────────────────────────┼─────────────────────────────────┤
│                    LAYER 1: STORAGE                            │
│  ┌───────────────────────────┼─────────────────────────────┐  │
│  │              ┌────────────┴────────────┐                 │  │
│  │              │   Storage Abstraction   │                 │  │
│  │              │   (IStore interface)    │                 │  │
│  │              └────────────┬────────────┘                 │  │
│  │        ┌──────────────────┼──────────────────┐           │  │
│  │  ┌─────┴──────┐    ┌─────┴──────┐    ┌─────┴──────┐     │  │
│  │  │ SurrealDB  │    │  SQLite    │    │  In-Memory │     │  │
│  │  │ (primary)  │    │ (fallback) │    │  (tests)   │     │  │
│  │  └────────────┘    └────────────┘    └────────────┘     │  │
│  └──────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│                    SIDE CHANNELS                                │
│  ┌──────────────┐  ┌───────────────┐  ┌────────────────────┐  │
│  │ Git Hooks    │  │ File Watcher  │  │ CLI (tokenzip)     │  │
│  │ pre-commit   │  │ (optional)    │  │ init, parse, query │  │
│  │ post-commit  │  │ chokidar      │  │ status, serve      │  │
│  └──────────────┘  └───────────────┘  └────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
```

## 2. Component Design

### 2.1 Tree-Sitter Extractor

```
                    ┌─────────────────────┐
                    │  File Input Stream  │
                    └──────────┬──────────┘
                               │
                    ┌──────────┴──────────┐
                    │  Language Detector  │
                    │  (extension + shebang│
                    │   + .editorconfig)  │
                    └──────────┬──────────┘
                               │
              ┌────────────────┼────────────────┐
              │                │                │
     ┌────────┴──────┐ ┌──────┴──────┐ ┌──────┴──────┐
     │ Code Extractor│ │ SQL Extract.│ │ MD Extractor│
     │ (JS/TS/Py/Go │ │ (Tables,    │ │ (Sections,  │
     │  /Rs/Java/Kt) │ │  Columns,   │ │  Mermaid,   │
     │               │ │  FKs, SPs)  │ │  Lists,     │
     │               │ │             │ │  Tables)    │
     └───────┬───────┘ └──────┬──────┘ └──────┬──────┘
             │                │                │
             └────────────────┼────────────────┘
                              │
                    ┌─────────┴──────────┐
                    │  Symbol Graph      │
                    │  (nodes + edges)   │
                    └────────────────────┘
```

**Key Design Decision:** Extractors produce an **intermediate representation (IR)** — a flat list of `SymbolNode` and `SymbolEdge` objects — regardless of source language. This decouples parsing from storage.
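
To make the IR idea concrete, here is a rough sketch of what language-agnostic output could look like. The field selection is a trimmed assumption (the full symbol node is defined in the data model), the edge names come from the PRD relationship list, and `mergeIR` and `danglingEdges` are hypothetical helpers.

```typescript
type EdgeType =
  | "CALLS" | "IMPLEMENTS" | "INHERITS" | "IMPORTS" | "EXPORTS" | "MODIFIES" | "READS";

interface SymbolNode {
  id: string;
  name: string;
  kind: string; // e.g. "function", "table"
  filePath: string;
  startLine: number;
  endLine: number;
}

interface SymbolEdge {
  from: string; // SymbolNode id
  to: string;   // SymbolNode id
  type: EdgeType;
}

interface ExtractionIR {
  nodes: SymbolNode[];
  edges: SymbolEdge[];
}

// Concatenate per-file IR; the consumer never needs to know the source language.
function mergeIR(parts: ExtractionIR[]): ExtractionIR {
  const merged: ExtractionIR = { nodes: [], edges: [] };
  for (const part of parts) {
    merged.nodes.push(...part.nodes);
    merged.edges.push(...part.edges);
  }
  return merged;
}

// Edges whose endpoints are missing from the node list (for example, unresolved references).
function danglingEdges(ir: ExtractionIR): SymbolEdge[] {
  const ids = new Set(ir.nodes.map((n) => n.id));
  return ir.edges.filter((e) => !ids.has(e.from) || !ids.has(e.to));
}

// Usage: a TypeScript function that writes to a SQL table, extracted separately.
const fromTypeScript: ExtractionIR = {
  nodes: [{ id: "sym:createUser", name: "createUser", kind: "function",
            filePath: "src/users.ts", startLine: 10, endLine: 24 }],
  edges: [{ from: "sym:createUser", to: "sym:users", type: "MODIFIES" }],
};
const fromSql: ExtractionIR = {
  nodes: [{ id: "sym:users", name: "users", kind: "table",
            filePath: "db/schema.sql", startLine: 1, endLine: 8 }],
  edges: [],
};
console.log(danglingEdges(fromTypeScript).length);                     // 1
console.log(danglingEdges(mergeIR([fromTypeScript, fromSql])).length); // 0
```

Because both extractors emit the same shape, the cross-language `MODIFIES` edge only resolves once the two results are merged, which is the kind of link the storage layer can then persist without any parser-specific logic.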

### 2.2 Chainable Query Builder (CQB)

```
QueryBuilder
  ├── .repo(path)          → RepoScope
  │     ├── .modules()     → ModuleScope
  │     │     ├── .files() → FileScope
  │     │     │     ├── .symbols() → SymbolScope
  │     │     │     ├── .tables()  → TableScope
  │     │     │     └── .sections()→ SectionScope
  │     │     ├── .dependencies()  → ModuleScope (external deps)
  │     │     └── .dependants()    → ModuleScope
  │     ├── .files()       → FileScope (all files, no module filter)
  │     ├── .symbols()     → SymbolScope (global search)
  │     ├── .tables()      → TableScope
  │     └── .docs()        → DocScope
  ├── .symbol(name)        → SymbolScope (direct lookup)
  ├── .table(name)         → TableScope
  ├── .commit(hash)        → CommitScope
  └── .workflow(name)      → WorkflowScope

Every Scope has:
  ├── .filter(predicate)   → same Scope (adds WHERE clause)
  ├── .sort(field, dir)    → same Scope
  ├── .limit(n)            → same Scope
  ├── .offset(n)           → same Scope
  └── Terminal methods:
        ├── .toArray()     → SymbolNode[]
        ├── .toGraph()     → { nodes: [], edges: [] }
        ├── .toMarkdown()  → string
        ├── .toJSON()      → string
        ├── .count()       → number
        └── .exists()      → boolean
```

### 2.3 MCP Server Architecture

```
┌─────────────────────────────────────────────┐
│              MCP Server                      │
│                                              │
│  ┌─────────────────────────────────────┐    │
│  │         Transport Layer              │    │
│  │  ┌──────────┐    ┌───────────────┐  │    │
│  │  │  stdio   │    │  SSE/HTTP     │  │    │
│  │  │ (default)│    │ (optional)    │  │    │
│  │  └────┬─────┘    └──────┬────────┘  │    │
│  └───────┼──────────────────┼───────────┘    │
│          └──────────┬───────┘                │
│              ┌─────┴──────┐                  │
│              │  Protocol  │                  │
│              │  Handler   │                  │
│              └─────┬──────┘                  │
│                    │                         │
│  ┌─────────────────┼─────────────────────┐  │
│  │            Tool Dispatcher            │  │
│  │  ┌──────────┐ ┌──────────┐ ┌────────┐ │  │
│  │  │ Structure│ │ Search   │ │ Impact │ │  │
│  │  │ Tools    │ │ Tools    │ │ Tools  │ │  │
│  │  └────┬─────┘ └────┬─────┘ └───┬────┘ │  │
│  │       └─────────────┼───────────┘      │  │
│  │              ┌──────┴──────┐           │  │
│  │              │    CQB      │           │  │
│  │              │  (shared)   │           │  │
│  │              └──────┬──────┘           │  │
│  └─────────────────────┼──────────────────┘  │
│                        │                     │
│  ┌─────────────────────┼──────────────────┐  │
│  │          Token Budget Manager          │  │
│  │  - Estimates response token count      │  │
│  │  - Truncates if over budget            │  │
│  │  - Prioritizes: symbols > files > mods │  │
│  └─────────────────────────────────────────┘  │
└─────────────────────────────────────────────┘
```
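
The Token Budget Manager's behaviour might be approximated as below. This sketch assumes a crude estimate of four characters per token (a real implementation would use the target model's tokenizer) and drops whole items rather than cutting text mid-item; `estimateTokens` and `fitToBudget` are hypothetical names.

```typescript
type ItemKind = "symbol" | "file" | "module";

interface ContextItem {
  kind: ItemKind;
  text: string;
}

// Lower number = higher priority, matching "symbols > files > mods" above.
const PRIORITY: Record<ItemKind, number> = { symbol: 0, file: 1, module: 2 };

// Assumption: roughly 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep items in priority order while they fit; report the resulting token count.
function fitToBudget(
  items: ContextItem[],
  maxTokens: number,
): { items: ContextItem[]; token_count: number; truncated: boolean } {
  const ordered = items.slice().sort((a, b) => PRIORITY[a.kind] - PRIORITY[b.kind]);
  const kept: ContextItem[] = [];
  let used = 0;
  for (const item of ordered) {
    const cost = estimateTokens(item.text);
    if (used + cost > maxTokens) continue; // skip; a smaller later item may still fit
    kept.push(item);
    used += cost;
  }
  return { items: kept, token_count: used, truncated: kept.length < items.length };
}

// Usage: 10 + 5 + 4 estimated tokens against a budget of 10.
const candidates: ContextItem[] = [
  { kind: "module", text: "m".repeat(40) }, // ~10 tokens
  { kind: "file", text: "f".repeat(20) },   // ~5 tokens
  { kind: "symbol", text: "s".repeat(16) }, // ~4 tokens
];
const budgeted = fitToBudget(candidates, 10);
console.log(budgeted.items.map((i) => i.kind)); // [ 'symbol', 'file' ]
console.log(budgeted.token_count);              // 9
console.log(budgeted.truncated);                // true
```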

### 2.4 Git Hook Pipeline

```
pre-commit trigger
       │
       ▼
┌──────────────────┐
│ git diff --cached │
│ --name-only       │
└───────┬──────────┘
        │ staged file paths
        ▼
┌──────────────────┐
│ Content Hash     │  ← SHA256 of file content
│ Check            │  ← Compare with stored hash
└───────┬──────────┘
        │ changed files only
        ▼
┌──────────────────┐
│ Tree-Sitter      │  ← Parallel parse (worker threads)
│ Batch Parse      │
└───────┬──────────┘
        │ new symbol IR
        ▼
┌──────────────────┐
│ Graph Diff       │  ← Old symbols vs new symbols
│ & Merge          │  ← Update nodes, edges, hashes
└───────┬──────────┘
        │
        ▼
┌──────────────────┐
│ Validation       │  ← Check: broken exports, orphan imports,
│ (optional)       │     missing type references
└───────┬──────────┘
        │
   ┌────┴────┐
   │         │
   ▼         ▼
PASS      FAIL
   │         │
   ▼         ▼
Continue   Warn/Block
Commit     (configurable)
```
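
The "Graph Diff & Merge" step compares stored symbols with freshly parsed ones. A minimal sketch, assuming each symbol can be identified by a stable key within its file and carries a content hash; the `SymbolRecord` shape and `diffSymbols` function are hypothetical.

```typescript
interface SymbolRecord {
  key: string;  // hypothetical stable key, e.g. "function:login"
  hash: string; // hash of the symbol's source text
}

interface SymbolDiff {
  added: string[];
  removed: string[];
  changed: string[];
}

// Classify symbols by comparing the stored set with the newly parsed set.
function diffSymbols(oldSymbols: SymbolRecord[], newSymbols: SymbolRecord[]): SymbolDiff {
  const oldByKey = new Map<string, string>();
  for (const s of oldSymbols) oldByKey.set(s.key, s.hash);

  const diff: SymbolDiff = { added: [], removed: [], changed: [] };
  const seen = new Set<string>();
  for (const s of newSymbols) {
    seen.add(s.key);
    const previous = oldByKey.get(s.key);
    if (previous === undefined) diff.added.push(s.key);
    else if (previous !== s.hash) diff.changed.push(s.key);
  }
  for (const s of oldSymbols) {
    if (!seen.has(s.key)) diff.removed.push(s.key);
  }
  return diff;
}

// Usage: login was edited, logout was deleted, refresh is new, Session is untouched.
const storedSymbols: SymbolRecord[] = [
  { key: "function:login", hash: "h1" },
  { key: "function:logout", hash: "h2" },
  { key: "class:Session", hash: "h3" },
];
const parsedSymbols: SymbolRecord[] = [
  { key: "function:login", hash: "h1-edited" },
  { key: "class:Session", hash: "h3" },
  { key: "function:refresh", hash: "h4" },
];
console.log(diffSymbols(storedSymbols, parsedSymbols));
// { added: [ 'function:refresh' ], removed: [ 'function:logout' ], changed: [ 'function:login' ] }
```

The `removed` list is what a validation step could use to look for broken exports: any edge still pointing at a removed symbol is a candidate warning.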

## 3. Data Model (Graph Schema)

### 3.1 Node Types

```
┌─────────────────────────────────────────────────────────────────┐
│ NODE: repository                                                 │
│   id:        string (record ID)                                  │
│   name:      string                                              │
│   root:      string (absolute path)                              │
│   created_at: datetime                                           │
│   updated_at: datetime                                           │
│   stats:     { files: number, modules: number, symbols: number } │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ NODE: module                                                     │
│   id:            string                                          │
│   name:          string                                          │
│   path:          string (relative to repo root)                  │
│   manifest_type: string (package.json | pyproject.toml | ...)    │
│   language:      string (primary language)                       │
│   is_root:       bool                                            │
│   metadata:      { name, version, description, ... }             │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ NODE: file                                                       │
│   id:          string                                            │
│   path:        string (relative to repo root)                    │
│   module_id:   string (reference to module)                      │
│   language:    string                                            │
│   ext:         string                                            │
│   size_bytes:  number                                            │
│   content_hash: string (SHA256)                                  │
│   line_count:  number                                            │
│   parse_status: string (parsed | partial | failed | skipped)     │
│   parse_error:  option<string>                                   │
│   last_parsed: datetime                                          │
│   git_last_modified: option<datetime>                            │
│   git_blame_summary: option<{ author, date, commit_count }>      │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ NODE: symbol (polymorphic by kind)                               │
│   id:            string                                          │
│   file_id:       string                                          │
│   name:          string                                          │
│   kind:          enum {                                          │
│     function, method, constructor,                               │
│     class, interface, type_alias, enum,                          │
│     variable, constant, property,                                │
│     parameter, generic_param,                                    │
│     decorator, annotation,                                       │
│     table, view, column, index, constraint,                      │
│     foreign_key, stored_procedure,                               │
│     import, export, re_export,                                   │
│     namespace, module_decl,                                      │
│     section, subsection,                                         │
│     workflow_step, diagram_node,                                 │
│     list_item, table_row                                         │
│   }                                                             │
│   signature:     option<string>  (full signature text)           │
│   return_type:   option<string>                                  │
│   start_line:    number                                          │
│   end_line:      number                                          │
│   start_col:     number                                          │
│   end_col:       number                                          │
│   docstring:     option<string>                                  │
│   is_exported:   bool                                            │
│   is_async:      option<bool>                                    │
│   is_static:     option<bool>                                    │
│   visibility:    option<enum { public, private, protected }>     │
│   modifiers:     array<string>                                   │
│   parent_symbol_id: option<string> (for nested symbols)          │
│   metadata:      object (language-specific extras)               │
│     // For tables: { schema, engine, columns: [...] }           │
│     // For classes: { implements: [...], extends: ... }         │
│     // For functions: { params: [...], generics: [...] }        │
│     // For sections: { level, anchor_id }                       │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ NODE: commit                                                     │
│   id:        string                                             │
│   hash:      string (full SHA)                                  │
│   short_hash: string (7 char)                                   │
│   message:   string                                             │
│   author:    string                                             │
│   email:     string                                             │
│   date:      datetime                                           │
│   branch:    string                                             │
│   tags:      array<string>                                      │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ NODE: dependency (external)                                      │
│   id:          string                                            │
│   module_id:   string (which module depends on it)               │
│   name:        string (npm package name, pip package, etc.)      │
│   version:     string (resolved version)                         │
│   dev:         bool                                              │
│   source:      string (npm, pip, cargo, go modules, maven)       │
└─────────────────────────────────────────────────────────────────┘
```

### 3.2 Edge Types

```
EDGE: contains
  FROM: repository  → TO: module
  FROM: module      → TO: file
  FROM: file        → TO: symbol
  FROM: symbol      → TO: symbol (nested: class → method)

EDGE: imports
  FROM: file    → TO: file       (file-level import)
  FROM: module  → TO: module     (module-level dependency)
  FROM: symbol  → TO: symbol     (symbol-level import)
  METADATA: { is_type_only: bool, is_default: bool, alias: option<string> }

EDGE: exports
  FROM: file   → TO: symbol
  FROM: symbol → TO: symbol       (re-export chain)
  METADATA: { is_default: bool, is_reexport: bool, alias: option<string> }

EDGE: calls
  FROM: symbol (function/method) → TO: symbol (function/method)
  METADATA: { line: number, is_async: bool, call_type: enum { direct, indirect, dynamic } }

EDGE: implements
  FROM: symbol (class) → TO: symbol (interface)
  METADATA: { is_partial: bool }

EDGE: inherits
  FROM: symbol (class/interface) → TO: symbol (class/interface)
  METADATA: { is_interface_inheritance: bool }

EDGE: modifies
  FROM: symbol (function) → TO: symbol (variable/table/column)

EDGE: reads
  FROM: symbol (function) → TO: symbol (variable/table/column)

EDGE: references
  FROM: symbol → TO: symbol (generic "uses" relationship)
  METADATA: { context: string }

EDGE: depends_on
  FROM: module → TO: module (transitive closure of imports)
  FROM: file   → TO: file
  METADATA: { is_transitive: bool, depth: number }

EDGE: depended_by  (computed reverse of depends_on)

EDGE: modified_in
  FROM: file   → TO: commit
  METADATA: { change_type: enum { added, modified, deleted, renamed } }

EDGE: authored_by
  FROM: file/symbol → TO: commit (latest commit touching this artifact)

EDGE: belongs_to_workflow
  FROM: symbol → TO: symbol (workflow_step)

EDGE: workflow_transition
  FROM: symbol (workflow_step) → TO: symbol (workflow_step)
  METADATA: { condition: option<string>, action: option<string> }

EDGE: diagram_edge
  FROM: symbol (diagram_node) → TO: symbol (diagram_node)
  METADATA: { label: string, style: string, type: enum { solid, dashed, dotted, bold } }

EDGE: foreign_key
  FROM: symbol (column) → TO: symbol (table)
  METADATA: { constraint_name: string, on_delete: string, on_update: string }

EDGE: column_of
  FROM: symbol (column/index/constraint) → TO: symbol (table)
```
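
The `depends_on` edge is described as the transitive closure of imports, with `is_transitive` and `depth` metadata. One way this could be computed is a breadth-first walk over direct imports, sketched here under the assumption that direct imports are available as an adjacency map; `dependsOnClosure` is a hypothetical name.

```typescript
interface DependsOnEdge {
  from: string;
  to: string;
  depth: number;          // 1 = direct import
  is_transitive: boolean; // true when depth > 1
}

// Breadth-first traversal, so each target is recorded at its shortest depth.
function dependsOnClosure(
  start: string,
  directImports: Map<string, string[]>,
): DependsOnEdge[] {
  const edges: DependsOnEdge[] = [];
  const visited = new Set<string>([start]); // guards against import cycles
  let frontier = [start];
  let depth = 0;
  while (frontier.length > 0) {
    depth += 1;
    const next: string[] = [];
    for (const current of frontier) {
      for (const target of directImports.get(current) ?? []) {
        if (visited.has(target)) continue;
        visited.add(target);
        edges.push({ from: start, to: target, depth, is_transitive: depth > 1 });
        next.push(target);
      }
    }
    frontier = next;
  }
  return edges;
}

// Usage: web imports ui and api-client; both import shared; shared imports web (a cycle).
const importGraph = new Map<string, string[]>([
  ["web", ["ui", "api-client"]],
  ["ui", ["shared"]],
  ["api-client", ["shared"]],
  ["shared", ["web"]],
]);
console.log(dependsOnClosure("web", importGraph));
// ui and api-client at depth 1 (direct), shared at depth 2 (transitive)
```

Running the same walk over an inverted adjacency map would yield the reverse `depended_by` relation.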

### 3.3 Indexes

```
DEFINE INDEX idx_file_path      ON file   FIELDS path         UNIQUE;
DEFINE INDEX idx_file_hash      ON file   FIELDS content_hash;
DEFINE INDEX idx_file_module    ON file   FIELDS module_id;
DEFINE INDEX idx_symbol_name    ON symbol FIELDS name;
DEFINE INDEX idx_symbol_kind    ON symbol FIELDS kind;
DEFINE INDEX idx_symbol_file    ON symbol FIELDS file_id;
DEFINE INDEX idx_symbol_export  ON symbol FIELDS is_exported;
DEFINE INDEX idx_module_path    ON module FIELDS path          UNIQUE;
DEFINE INDEX idx_commit_hash    ON commit FIELDS hash          UNIQUE;
DEFINE INDEX idx_dep_name       ON dependency FIELDS name, module_id;
```

## 4. Technology Stack

| Component | Technology | Rationale |
|---|---|---|
| Runtime | Node.js 20+ (ESM) | Universal, tree-sitter bindings available, MCP SDK native |
| Tree-Sitter | `tree-sitter` + language grammars | Industry standard, incremental parsing, multi-language |
| Graph DB | SurrealDB v2 (embedded/RocksDB) | Native graph queries, schemaful, embedded mode, no server |
| Fallback DB | better-sqlite3 | Zero-config fallback if SurrealDB unavailable |
| MCP | `@modelcontextprotocol/sdk` | Official SDK, stdio + SSE transport |
| CLI | `commander` | Battle-tested CLI framework |
| Git | `simple-git` | Promise-based git operations |
| File Watch | `chokidar` | Cross-platform, efficient |
| Logging | `pino` | Structured, fast |
| Testing | `vitest` + `memfs` | Fast, in-memory FS for unit tests |
| Bundling | `tsup` | ESM + CJS dual output, tree-shaking |
| Markdown | `unified` + `remark` + `rehype` | Pluggable markdown AST pipeline |
| Mermaid | `mermaid` (headless) | Parse mermaid diagrams to structured data |

## 5. Integration Architecture

### 5.1 MCP Integration Points

```
Claude Code / Codex / OpenCode
         │
         │  MCP Protocol (JSON-RPC 2.0 over stdio)
         │
    ┌────┴─────┐
    │ MCP      │
    │ Server   │
    └────┬─────┘
         │
    ┌────┴──────────────────────────────────┐
    │              Tool Calls               │
    │                                       │
    │  1. query_repo_structure              │
    │     → Returns module tree + stats     │
    │                                       │
    │  2. query_symbol { name, scope }      │
    │     → Symbol node + edges             │
    │                                       │
    │  3. get_impact_analysis { symbol_id } │
    │     → Dependents + transitive closure │
    │                                       │
    │  4. search_symbols { query, filters } │
    │     → Fuzzy match on name/signature   │
    │                                       │
    │  5. get_workflow { doc, section }     │
    │     → Structured workflow + links     │
    │                                       │
    │  6. get_git_history { path, limit }   │
    │     → Commit chain for file/symbol    │
    │                                       │
    │  7. execute_workflow_template {       │
    │       type, params }                  │
    │     → Structured analysis result      │
    │                                       │
    │  8. get_dependencies { module_id }    │
    │     → Internal + external deps        │
    │                                       │
    │  9. get_dependants { symbol_id }      │
    │     → Reverse dependency chain        │
    │                                       │
    │  10. get_context_for_files {          │
    │        paths, max_tokens }            │
    │      → Token-budget-aware context     │
    │                                       │
    └───────────────────────────────────────┘
```
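
For orientation, a tool call from the copilot to the server is an ordinary JSON-RPC 2.0 request using MCP's `tools/call` method. The sketch below builds one such message for `query_symbol`. The helper names (`buildToolCall`, `frame`) and the argument values are hypothetical; in practice the MCP SDK client constructs and frames these messages.

```typescript
// JSON-RPC 2.0 request shape used by MCP's `tools/call` method.
interface ToolCallRequest {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: { name: string; arguments: Record<string, unknown> };
}

let nextRequestId = 1;

// Hypothetical helper: wraps a tool name and its arguments in a request.
function buildToolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: '2.0',
    id: nextRequestId++,
    method: 'tools/call',
    params: { name, arguments: args },
  };
}

// Over the stdio transport, each message is one line of JSON.
function frame(request: ToolCallRequest): string {
  return JSON.stringify(request) + '\n';
}

// Example values are illustrative only.
const symbolRequest = buildToolCall('query_symbol', { name: 'IStore', scope: 'src/storage' });
console.log(frame(symbolRequest));
```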

### 5.2 Claude Code MCP Config (auto-generated)

```json
{
  "mcpServers": {
    "tokenzip": {
      "command": "npx",
      "args": ["tokenzip", "serve", "--cwd", "/path/to/project"],
      "env": {}
    }
  }
}
```
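
Since this config is auto-generated, the generator should probably merge the `tokenzip` entry into any existing `mcpServers` map rather than overwrite it. A minimal sketch under that assumption follows; `buildMcpConfig` and the type names are hypothetical, and where the file is written is left out because the document does not specify it.

```typescript
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

interface McpConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Hypothetical generator: adds or replaces only the `tokenzip` entry and
// leaves any other configured servers untouched.
function buildMcpConfig(
  projectRoot: string,
  existing: McpConfig = { mcpServers: {} },
): McpConfig {
  return {
    ...existing,
    mcpServers: {
      ...existing.mcpServers,
      tokenzip: {
        command: 'npx',
        args: ['tokenzip', 'serve', '--cwd', projectRoot],
        env: {},
      },
    },
  };
}

// Serialized with two-space indentation to match the example above.
const generatedConfig = JSON.stringify(buildMcpConfig('/path/to/project'), null, 2);
console.log(generatedConfig);
```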

## 6. Security Considerations

- **No network**: All data stays local. If an HTTP transport is used, SurrealDB binds to `127.0.0.1` only.
- **No code execution**: Graph stores metadata only. No eval, no require from stored data.
- **Path traversal protection**: All file paths resolved and canonicalized before storage.
- **Git hook safety**: Hooks are read-only from git's perspective (never force-push, never amend).
- **`.tokenzip/` in `.gitignore`**: Automatically appended, never committed.
- **Token budget**: MCP responses capped at configurable token limit to prevent context overflow.
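
Two of the points above, path traversal protection and the token budget, can be sketched in a few lines. This is an illustration under stated assumptions, not the project's implementation: `resolveInsideRoot` and `capToTokenBudget` are hypothetical names, the token estimate uses the chars/4 heuristic mentioned for `utils/tokens.ts`, and symlinks are not resolved here (a real implementation would likely also call `fs.realpath`).

```typescript
import * as path from 'node:path';

// Resolve `candidate` against `root` and reject anything that escapes it.
// Returns a POSIX-style path relative to the root, suitable for storage.
function resolveInsideRoot(root: string, candidate: string): string {
  const absRoot = path.resolve(root);
  const abs = path.resolve(absRoot, candidate);
  const rel = path.relative(absRoot, abs);
  if (rel === '..' || rel.startsWith('..' + path.sep) || path.isAbsolute(rel)) {
    throw new Error(`Path escapes project root: ${candidate}`);
  }
  return rel.split(path.sep).join('/');
}

// Rough token estimate: about four characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Truncate a response so its estimated size stays within the budget.
function capToTokenBudget(text: string, maxTokens: number): string {
  if (estimateTokens(text) <= maxTokens) return text;
  return text.slice(0, maxTokens * 4);
}
```

Character-based truncation can cut a JSON payload mid-structure, so a production version would more likely drop whole nodes or edges until the estimate fits.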

## 7. Deployment Model

```
Local Developer Machine
│
├── ~/.tokenzip/
│   ├── config.json          # Global config
│   ├── surrealdb/           # Shared SurrealDB binary (if not system-installed)
│   └── cache/               # Cross-project cache
│
└── <project-root>/
    ├── .tokenzip/
    │   ├── db/              # SurrealDB data directory
    │   │   ├── data.db      # RocksDB storage
    │   │   └── lock         # Process lock
    │   ├── config.json      # Project-specific config
    │   │   ├── languages: [...]
    │   │   ├── excluded: [...]
    │   │   ├── hooks: { preCommit: "warn" | "block" | "off" }
    │   │   └── mcp: { maxTokens: 8000, transport: "stdio" }
    │   └── state.json       # Parse state, last commit, version
    │
    ├── .git/
    │   └── hooks/
    │       ├── pre-commit   # Appended tokenzip hook
    │       └── post-commit  # Appended tokenzip hook
    │
    └── .gitignore           # Contains .tokenzip/
```
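
The keys shown under the project-level `config.json` suggest a shape like the one below. This is a sketch only: `ProjectConfig`, `DEFAULT_CONFIG`, and `mergeConfig` are hypothetical names, and every default other than `maxTokens: 8000` and `transport: "stdio"` is an assumption.

```typescript
type HookMode = 'warn' | 'block' | 'off';

interface ProjectConfig {
  languages: string[];
  excluded: string[];
  hooks: { preCommit: HookMode };
  mcp: { maxTokens: number; transport: 'stdio' | 'sse' };
}

// Per-project overrides may omit any key, including nested ones.
interface ProjectConfigOverrides {
  languages?: string[];
  excluded?: string[];
  hooks?: Partial<ProjectConfig['hooks']>;
  mcp?: Partial<ProjectConfig['mcp']>;
}

const DEFAULT_CONFIG: ProjectConfig = {
  languages: [],                                   // assumed: empty means auto-detect
  excluded: ['node_modules', '.git', '.tokenzip'], // assumed defaults
  hooks: { preCommit: 'warn' },                    // assumed default mode
  mcp: { maxTokens: 8000, transport: 'stdio' },    // values from the layout above
};

// Shallow-merge each nested section so a partial override keeps the
// remaining defaults of that section.
function mergeConfig(overrides: ProjectConfigOverrides): ProjectConfig {
  return {
    languages: overrides.languages ?? DEFAULT_CONFIG.languages,
    excluded: overrides.excluded ?? DEFAULT_CONFIG.excluded,
    hooks: { ...DEFAULT_CONFIG.hooks, ...overrides.hooks },
    mcp: { ...DEFAULT_CONFIG.mcp, ...overrides.mcp },
  };
}
```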

---

# 🔧 LLD — Low-Level Design

## 1. Module Structure

```
tokenzip/
├── src/
│   ├── index.ts                    # Public API entry point
│   │
│   ├── cli/                        # CLI layer
│   │   ├── index.ts                # Commander setup
│   │   ├── commands/
│   │   │   ├── init.ts             # tokenzip init
│   │   │   ├── parse.ts            # tokenzip parse [--full | --incremental]
│   │   │   ├── query.ts            # tokenzip query <cqb-expression>
│   │   │   ├── status.ts           # tokenzip status
│   │   │   ├── serve.ts            # tokenzip serve [--transport stdio|sse] [--port 3000]
│   │   │   ├── hooks.ts            # tokenzip hooks install|uninstall
│   │   │   └── clean.ts            # tokenzip clean
│   │   └── utils/
│   │       └── spinner.ts
│   │
│   ├── mcp/                        # MCP server layer
│   │   ├── server.ts               # MCP server creation & setup
│   │   ├── transport/
│   │   │   ├── stdio.ts
│   │   │   └── sse.ts
│   │   ├── tools/
│   │   │   ├── registry.ts         # Tool registration
│   │   │   ├── structure.ts        # query_repo_structure, query_module
│   │   │   ├── symbol.ts           # query_symbol, search_symbols
│   │   │   ├── dependency.ts       # get_dependencies, get_dependants
│   │   │   ├── impact.ts           # get_impact_analysis
│   │   │   ├── git.ts              # get_git_history
│   │   │   ├── workflow.ts         # get_workflow, execute_workflow_template
│   │   │   └── context.ts          # get_context_for_files
│   │   ├── resources/
│   │   │   ├── registry.ts
│   │   │   ├── repo.ts
│   │   │   ├── module.ts
│   │   │   ├── file.ts
│   │   │   └── symbol.ts
│   │   └── token-budget.ts         # Token estimation & truncation
│   │
│   ├── query/                      # Chainable Query Builder
│   │   ├── builder.ts              # Base QueryBuilder class
│   │   ├── scopes/
│   │   │   ├── repo-scope.ts
│   │   │   ├── module-scope.ts
│   │   │   ├── file-scope.ts
│   │   │   ├── symbol-scope.ts
│   │   │   ├── table-scope.ts
│   │   │   ├── commit-scope.ts
│   │   │   ├── doc-scope.ts
│   │   │   └── workflow-scope.ts
│   │   ├── filters.ts              # Filter predicate parser
│   │   ├── translators/
│   │   │   ├── surrealql.ts        # CQB → SurrealQL translation
│   │   │   └── sql.ts              # CQB → SQL translation (SQLite fallback)
│   │   └── types.ts
│   │
│   ├── engine/                     # Core engine layer
│   │   ├── indexer.ts              # Full & incremental indexing orchestrator
│   │   ├── differ.ts               # Graph diff: old symbols vs new symbols
│   │   ├── merger.ts               # Merge diff into graph
│   │   ├── validator.ts            # Reference integrity validation
│   │   ├── module-detector.ts      # Detect module boundaries
│   │   └── language-detector.ts    # Detect language from extension + content
│   │
│   ├── extractor/                  # Tree-sitter extraction layer
│   │   ├── base-extractor.ts       # Abstract extractor interface
│   │   ├── registry.ts             # Language → extractor mapping
│   │   ├── code/
│   │   │   ├── javascript.ts       # JS/JSX extractor
│   │   │   ├── typescript.ts       # TS/TSX extractor
│   │   │   ├── python.ts
│   │   │   ├── go.ts
│   │   │   ├── rust.ts
│   │   │   ├── java.ts
│   │   │   └── kotlin.ts
│   │   ├── sql/
│   │   │   └── sql.ts              # SQL extractor (tables, columns, FKs)
│   │   ├── markdown/
│   │   │   ├── markdown.ts         # Markdown structure extractor
│   │   │   ├── mermaid.ts          # Mermaid diagram parser
│   │   │   └── sections.ts         # Section type classifier
│   │   └── types.ts                # SymbolIR, EdgeIR types
│   │
│   ├── storage/                    # Storage abstraction layer
│   │   ├── interface.ts            # IStore interface
│   │   ├── surreal/
│   │   │   ├── connection.ts       # Connection pool & lifecycle
│   │   │   ├── migrations.ts       # Schema migration
│   │   │   ├── queries/
│   │   │   │   ├── nodes.ts
│   │   │   │   ├── edges.ts
│   │   │   │   ├── graph.ts
│   │   │   │   └── search.ts
│   │   │   └── store.ts            # SurrealStore implements IStore
│   │   ├── sqlite/
│   │   │   ├── schema.ts           # Table creation
│   │   │   ├── queries/
│   │   │   │   ├── nodes.ts
│   │   │   │   ├── edges.ts
│   │   │   │   └── graph.ts
│   │   │   └── store.ts            # SQLiteStore implements IStore
│   │   ├── memory/
│   │   │   └── store.ts            # MemoryStore for testing
│   │   └── factory.ts              # StoreFactory: config → IStore
│   │
│   ├── hooks/                      # Git hook layer
│   │   ├── installer.ts            # Install hooks into .git/hooks/
│   │   ├── pre-commit.ts           # Pre-commit logic
│   │   ├── post-commit.ts          # Post-commit logic
│   │   └── detector.ts             # Detect staged files
│   │
│   ├── workflows/                  # Workflow template engine
│   │   ├── engine.ts               # Workflow executor
│   │   ├── registry.ts             # Workflow template registry
│   │   └── templates/
│   │       ├── create-module.ts
│   │       ├── update-module.ts
│   │       ├── implement-feature.ts
│   │       ├── upgrade-feature.ts
│   │       └── bug-fix.ts
│   │
│   ├── utils/
│   │   ├── logger.ts
│   │   ├── hash.ts                 # Content hashing (SHA256)
│   │   ├── path.ts                 # Path resolution & normalization
│   │   ├── tokens.ts               # Token estimation (chars/4 for code)
│   │   ├── workers.ts              # Worker thread pool for parsing
│   │   └── version.ts
│   │
│   └── types/
│       ├── graph.ts                # All node & edge types
│       ├── extractor.ts            # Extractor IR types
│       ├── query.ts                # Query builder types
│       └── config.ts               # Configuration types
│
├── grammars/                       # Tree-sitter WASM grammars (bundled)
│   ├── tree-sitter-javascript.wasm
│   ├── tree-sitter-typescript.wasm
│   ├── tree-sitter-python.wasm
│   ├── tree-sitter-go.wasm
│   ├── tree-sitter-rust.wasm
│   ├── tree-sitter-java.wasm
│   ├── tree-sitter-kotlin.wasm
│   └── tree-sitter-sql.wasm
│
├── tests/
│   ├── unit/
│   │   ├── extractor/
│   │   │   ├── javascript.test.ts
│   │   │   ├── typescript.test.ts
│   │   │   ├── python.test.ts
│   │   │   ├── sql.test.ts
│   │   │   └── markdown.test.ts
│   │   ├── query/
│   │   │   └── builder.test.ts
│   │   ├── engine/
│   │   │   ├── differ.test.ts
│   │   │   ├── merger.test.ts
│   │   │   └── module-detector.test.ts
│   │   ├── storage/
│   │   │   └── memory-store.test.ts
│   │   └── hooks/
│   │       └── detector.test.ts
│   ├── integration/
│   │   ├── full-parse.test.ts
│   │   ├── incremental-parse.test.ts
│   │   ├── mcp-server.test.ts
│   │   └── git-hook.test.ts
│   ├── fixtures/
│   │   ├── js-project/
│   │   ├── ts-monorepo/
│   │   ├── python-project/
│   │   ├── sql-project/
│   │   └── mixed-project/
│   └── e2e/
│       └── claude-code.test.ts
│
├── package.json
├── tsconfig.json
├── tsup.config.ts
└── vitest.config.ts
```
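
As an illustration of how `extractor/registry.ts` might map languages to extractors, the sketch below indexes extractors by file extension. The class and method names are hypothetical, and `ExtractorInfo` stands in for the real extractor class.

```typescript
import { extname } from 'node:path';

// Stand-in for an extractor: only the fields needed for lookup.
interface ExtractorInfo {
  language: string;
  extensions: string[];
}

// Hypothetical registry: one extractor per file extension.
class ExtractorRegistry {
  private readonly byExtension = new Map<string, ExtractorInfo>();

  register(info: ExtractorInfo): void {
    for (const ext of info.extensions) {
      const key = ext.toLowerCase();
      const existing = this.byExtension.get(key);
      if (existing) {
        throw new Error(`Extension ${key} is already claimed by ${existing.language}`);
      }
      this.byExtension.set(key, info);
    }
  }

  // Returns undefined for files that no extractor handles.
  forFile(filePath: string): ExtractorInfo | undefined {
    return this.byExtension.get(extname(filePath).toLowerCase());
  }
}

const extractorRegistry = new ExtractorRegistry();
extractorRegistry.register({ language: 'typescript', extensions: ['.ts', '.tsx', '.mts', '.cts'] });
extractorRegistry.register({ language: 'sql', extensions: ['.sql'] });
console.log(extractorRegistry.forFile('src/storage/interface.ts')?.language);
```

Failing fast on a duplicate extension keeps the mapping unambiguous; content-based detection, as hinted at by `language-detector.ts`, would sit in front of this lookup.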

## 2. Detailed Component Design

### 2.1 Storage Abstraction (IStore)

```typescript
// src/storage/interface.ts

import type { 
  RepositoryNode, ModuleNode, FileNode, SymbolNode, 
  CommitNode, DependencyNode,
  ContainsEdge, ImportsEdge, ExportsEdge, CallsEdge,
  ImplementsEdge, InheritsEdge, ModifiesEdge, ReadsEdge,
  ReferencesEdge, DependsOnEdge, ModifiedInEdge,
  ForeignKeyEdge, ColumnOfEdge,
  // ... all edge types
} from '../types/graph';

export interface GraphNode {
  id: string;
  type: 'repository' | 'module' | 'file' | 'symbol' | 'commit' | 'dependency';
  [key: string]: unknown;
}

export interface GraphEdge {
  id: string;
  type: string;
  from: string;
  to: string;
  [key: string]: unknown;
}

export interface GraphResult {
  nodes: GraphNode[];
  edges: GraphEdge[];
}

export interface StoreStats {
  nodeCount: Record<string, number>;
  edgeCount: Record<string, number>;
  dbSizeBytes: number;
}

export interface IStore {
  // Lifecycle
  initialize(): Promise<void>;
  close(): Promise<void>;
  migrate(): Promise<void>;
  clear(): Promise<void>;
  stats(): Promise<StoreStats>;

  // Node CRUD
  createNode<T extends GraphNode>(node: T): Promise<T>;
  createNodes<T extends GraphNode>(nodes: T[]): Promise<T[]>;
  getNode<T extends GraphNode>(id: string): Promise<T | null>;
  getNodes(ids: string[]): Promise<GraphNode[]>;
  updateNode<T extends GraphNode>(id: string, patch: Partial<T>): Promise<T>;
  deleteNode(id: string): Promise<void>;
  deleteNodes(ids: string[]): Promise<void>;

  // Edge CRUD
  createEdge<T extends GraphEdge>(edge: T): Promise<T>;
  createEdges<T extends GraphEdge>(edges: T[]): Promise<T[]>;
  getEdges(from: string, type?: string): Promise<GraphEdge[]>;
  getEdgesTo(to: string, type?: string): Promise<GraphEdge[]>;
  deleteEdges(from: string, type?: string): Promise<void>;

  // Graph Queries
  query(surrealQL: string, vars?: Record<string, unknown>): Promise<unknown[]>;
  graphTraversal(
    startId: string,
    edgeTypes: string[],
    direction: 'outbound' | 'inbound' | 'both',
    depth?: number,
    filter?: string
  ): Promise<GraphResult>;

  // Bulk Operations
  batchUpsert(nodes: GraphNode[], edges: GraphEdge[]): Promise<void>;
  
  // Search
  searchNodes(
    type: string, 
    field: string, 
    query: string, 
    limit?: number
  ): Promise<GraphNode[]>;

  // Transactions
  transaction<T>(fn: (store: IStore) => Promise<T>): Promise<T>;
}
```
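
To make the intended semantics of `graphTraversal` concrete, here is a minimal in-memory sketch of the kind a `MemoryStore` might use: a breadth-first walk over an edge list, filtered by edge type and direction and limited by depth. The `traverse` function and its simplified `Edge` type are assumptions for illustration, and the optional `filter` argument is omitted.

```typescript
interface Edge {
  type: string;
  from: string;
  to: string;
}

type Direction = 'outbound' | 'inbound' | 'both';

// Breadth-first traversal from `startId`, following only `edgeTypes`.
// `depth` is the maximum number of hops; the default is unlimited.
function traverse(
  edges: Edge[],
  startId: string,
  edgeTypes: string[],
  direction: Direction,
  depth: number = Infinity,
): { nodeIds: string[]; edges: Edge[] } {
  const allowed = new Set(edgeTypes);
  const visited = new Set<string>([startId]);
  const used = new Set<Edge>();
  let frontier: string[] = [startId];

  for (let level = 0; level < depth && frontier.length > 0; level++) {
    const next: string[] = [];
    for (const id of frontier) {
      for (const edge of edges) {
        if (!allowed.has(edge.type)) continue;
        const neighbors: string[] = [];
        if (direction !== 'inbound' && edge.from === id) neighbors.push(edge.to);
        if (direction !== 'outbound' && edge.to === id) neighbors.push(edge.from);
        if (neighbors.length === 0) continue;
        used.add(edge);
        for (const neighbor of neighbors) {
          if (!visited.has(neighbor)) {
            visited.add(neighbor); // the visited set also guards against cycles
            next.push(neighbor);
          }
        }
      }
    }
    frontier = next;
  }
  return { nodeIds: Array.from(visited), edges: Array.from(used) };
}
```

This scans every edge for every frontier node, which is fine for tests but not for a real store; an adjacency index keyed by node ID would be the obvious next step.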

### 2.2 Tree-Sitter Extractor Interface

```typescript
// src/extractor/base-extractor.ts

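import { createHash } from 'node:crypto';
import Parser from 'tree-sitter'; // default export; node types live on the Parser namespace
import type { SymbolIR, EdgeIR } from './types';

type Tree = Parser.Tree;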

export interface ExtractionResult {
  symbols: SymbolIR[];
  edges: EdgeIR[];
  parseErrors: ParseError[];
}

export interface ParseError {
  line: number;
  column: number;
  message: string;
}

export interface ExtractorContext {
  filePath: string;
  relativePath: string;
  content: string;
  contentHash: string;
  tree: Tree;
  language: string;
  moduleId: string;
}

export abstract class BaseExtractor {
  abstract readonly language: string;
  abstract readonly extensions: string[];

  /**
   * Extract symbols and edges from a parsed tree.
   * Called after tree-sitter has parsed the file.
   */
  abstract extract(ctx: ExtractorContext): ExtractionResult;

  /**
   * Post-process extraction results.
   * Resolve internal references, compute derived edges.
   * Default implementation does nothing; subclasses can override.
   */
  postProcess(
    symbols: SymbolIR[], 
    edges: EdgeIR[], 
    ctx: ExtractorContext
  ): { symbols: SymbolIR[]; edges: EdgeIR[] } {
    return { symbols, edges };
  }

  /**
   * Generate a stable ID for a symbol.
   * Must be deterministic for the same symbol in the same file.
   */
  generateSymbolId(
    filePath: string, 
    symbolName: string, 
    kind: string, 
    startLine: number
  ): string {
    // Format: sym:<filepath-hash>:<name>:<kind>:<line>
    const pathHash = this.hashPath(filePath);
    return `sym:${pathHash}:${symbolName}:${kind}:${startLine}`;
  }

  private hashPath(filePath: string): string {
    // First 8 chars of SHA256 of relative path
    return createHash('sha256')
      .update(filePath)
      .digest('hex')
      .slice(0, 8);
  }

  /**
   * Walk the tree-sitter AST with a visitor pattern.
   * Utility method for subclasses.
   */
  protected walk(
    node: Parser.SyntaxNode, 
    visitors: Record<string, (node: Parser.SyntaxNode) => void>
  ): void {
    const visitor = visitors[node.type];
    if (visitor) {
      visitor(node);
    }
    for (let i = 0; i < node.childCount; i++) {
      this.walk(node.child(i)!, visitors);
    }
  }

  /**
   * Extract docstring/JSDoc/comment attached to a node.
   */
  protected extractDocstring(node: Parser.SyntaxNode, content: string): string | null {
    // Look for preceding comment nodes
    const prev = node.previousNamedSibling;
    if (prev && (prev.type === 'comment' || prev.type === 'block_comment' 
        || prev.type === 'docstring' || prev.type === 'jsdoc')) {
      return content.slice(prev.startIndex, prev.endIndex).trim();
    }
    return null;
  }
}
```
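
The ID scheme in `generateSymbolId` can be exercised on its own. The sketch below repeats the format from the class above as a free function and adds a hypothetical `parseSymbolId` helper. The parser reads the kind and line from the end of the string, because a symbol name may itself contain `:` (for example an import source such as `node:crypto`).

```typescript
import { createHash } from 'node:crypto';

// Same format as BaseExtractor.generateSymbolId:
// sym:<first 8 hex chars of SHA256(path)>:<name>:<kind>:<line>
function generateSymbolId(
  filePath: string,
  symbolName: string,
  kind: string,
  startLine: number,
): string {
  const pathHash = createHash('sha256').update(filePath).digest('hex').slice(0, 8);
  return `sym:${pathHash}:${symbolName}:${kind}:${startLine}`;
}

// Hypothetical inverse. Kind and line are taken from the end so that
// colons inside the symbol name are preserved.
function parseSymbolId(id: string) {
  const parts = id.split(':');
  if (parts.length < 5 || parts[0] !== 'sym') return null;
  const startLine = Number(parts[parts.length - 1]);
  if (!Number.isInteger(startLine)) return null;
  return {
    pathHash: parts[1],
    name: parts.slice(2, -2).join(':'),
    kind: parts[parts.length - 2],
    startLine,
  };
}

const exampleId = generateSymbolId('src/user.ts', 'getUser', 'function', 12);
console.log(exampleId, parseSymbolId(exampleId));
```

Because the start line is part of the ID, moving a declaration within its file yields a new ID even when nothing else changed, so the differ has to match old and new symbols by something other than ID alone.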

### 2.3 TypeScript Extractor (Detailed Example)

```typescript
// src/extractor/code/typescript.ts

import { BaseExtractor, ExtractorContext, ExtractionResult, ParseError } from '../base-extractor';
import type { SymbolIR, EdgeIR } from '../types';

export class TypeScriptExtractor extends BaseExtractor {
  language = 'typescript';
  extensions = ['.ts', '.tsx', '.mts', '.cts'];

  extract(ctx: ExtractorContext): ExtractionResult {
    const symbols: SymbolIR[] = [];
    const edges: EdgeIR[] = [];
    const parseErrors: ParseError[] = [];

    // Collect parse errors
    this.collectErrors(ctx.tree.rootNode, parseErrors, ctx.content);

    // Visit top-level and nested declarations
    this.walk(ctx.tree.rootNode, {
      // Functions
      'function_declaration': (node) => {
        const name = this.getName(node);
        if (!name) return;
        symbols.push({
          id: this.generateSymbolId(ctx.relativePath, name, 'function', node.startPosition.row + 1),
          fileId: `file:${ctx.relativePath}`,
          name,
          kind: 'function',
          signature: this.getSignature(node, ctx.content),
          returnType: this.getReturnType(node),
          startLine: node.startPosition.row + 1,
          endLine: node.endPosition.row + 1,
          startCol: node.startPosition.column,
          endCol: node.endPosition.column,
          docstring: this.extractDocstring(node, ctx.content),
          isExported: this.isExported(node),
          isAsync: this.hasModifier(node, 'async'),
          isStatic: false,
          visibility: this.getVisibility(node),
          modifiers: this.getModifiers(node),
          metadata: {
            params: this.extractParams(node, ctx.content),
            generics: this.extractGenerics(node, ctx.content),
            typeParams: this.extractTypeParams(node),
          },
        });
      },

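      // Arrow functions / function expressions assigned to variables.
      // Keyed on `variable_declarator` so that both `const`/`let`
      // (lexical_declaration) and `var` (variable_declaration) are covered.
      'variable_declarator': (declarator) => {
        const value = declarator.childForFieldName('value');
        if (!value || (value.type !== 'arrow_function' && value.type !== 'function_expression')) return;

        const name = this.getName(declarator);
        if (!name) return;

        // The enclosing declaration carries the position, docstring, and modifiers
        const node = declarator.parent ?? declarator;
        // Both forms are stored as kind 'function'; metadata.isArrow tells them apart
        const funcKind = 'function';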
        symbols.push({
          id: this.generateSymbolId(ctx.relativePath, name, funcKind, node.startPosition.row + 1),
          fileId: `file:${ctx.relativePath}`,
          name,
          kind: funcKind,
          signature: this.getSignature(value, ctx.content),
          returnType: this.getReturnType(value),
          startLine: node.startPosition.row + 1,
          endLine: node.endPosition.row + 1,
          startCol: node.startPosition.column,
          endCol: node.endPosition.column,
          docstring: this.extractDocstring(node, ctx.content),
          isExported: this.isExported(node),
          isAsync: this.hasModifier(value, 'async'),
          isStatic: false,
          visibility: this.getVisibility(node),
          modifiers: this.getModifiers(node),
          metadata: {
            isArrow: value.type === 'arrow_function',
            params: this.extractParams(value, ctx.content),
            generics: this.extractGenerics(value, ctx.content),
          },
        });
      },

      // Classes
      'class_declaration': (node) => {
        const name = this.getName(node);
        if (!name) return;
        
        const heritage = this.extractHeritage(node); // extends, implements
        const symbolId = this.generateSymbolId(ctx.relativePath, name, 'class', node.startPosition.row + 1);

        symbols.push({
          id: symbolId,
          fileId: `file:${ctx.relativePath}`,
          name,
          kind: 'class',
          signature: this.getSignature(node, ctx.content),
          startLine: node.startPosition.row + 1,
          endLine: node.endPosition.row + 1,
          startCol: node.startPosition.column,
          endCol: node.endPosition.column,
          docstring: this.extractDocstring(node, ctx.content),
          isExported: this.isExported(node),
          isStatic: false,
          visibility: this.getVisibility(node),
          modifiers: this.getModifiers(node),
          metadata: {
            extends: heritage.extends,
            implements: heritage.implements,
            generics: this.extractGenerics(node, ctx.content),
          },
        });

        // Create inheritance edges
        if (heritage.extends) {
          edges.push({
            type: 'inherits',
            from: symbolId,
            to: `sym:unknown:${heritage.extends}:class:0`, // resolved later
            metadata: { is_interface_inheritance: false },
            isResolved: false,
          });
        }
        for (const impl of heritage.implements) {
          edges.push({
            type: 'implements',
            from: symbolId,
            to: `sym:unknown:${impl}:interface:0`,
            metadata: { is_partial: false },
            isResolved: false,
          });
        }
      },

      // Interfaces
      'interface_declaration': (node) => {
        const name = this.getName(node);
        if (!name) return;

        const extendsList = this.extractInterfaceExtends(node);
        const symbolId = this.generateSymbolId(ctx.relativePath, name, 'interface', node.startPosition.row + 1);

        symbols.push({
          id: symbolId,
          fileId: `file:${ctx.relativePath}`,
          name,
          kind: 'interface',
          signature: this.getSignature(node, ctx.content),
          startLine: node.startPosition.row + 1,
          endLine: node.endPosition.row + 1,
          startCol: node.startPosition.column,
          endCol: node.endPosition.column,
          docstring: this.extractDocstring(node, ctx.content),
          isExported: this.isExported(node),
          isStatic: false,
          visibility: 'public',
          modifiers: this.getModifiers(node),
          metadata: {
            extends: extendsList,
            generics: this.extractGenerics(node, ctx.content),
            members: this.extractInterfaceMembers(node, ctx.content, ctx.relativePath),
          },
        });

        for (const ext of extendsList) {
          edges.push({
            type: 'inherits',
            from: symbolId,
            to: `sym:unknown:${ext}:interface:0`,
            metadata: { is_interface_inheritance: true },
            isResolved: false,
          });
        }
      },

      // Type aliases
      'type_alias_declaration': (node) => {
        const name = this.getName(node);
        if (!name) return;
        symbols.push({
          id: this.generateSymbolId(ctx.relativePath, name, 'type_alias', node.startPosition.row + 1),
          fileId: `file:${ctx.relativePath}`,
          name,
          kind: 'type_alias',
          signature: this.getTypeAliasBody(node, ctx.content),
          startLine: node.startPosition.row + 1,
          endLine: node.endPosition.row + 1,
          startCol: node.startPosition.column,
          endCol: node.endPosition.column,
          docstring: this.extractDocstring(node, ctx.content),
          isExported: this.isExported(node),
          isStatic: false,
          visibility: 'public',
          modifiers: [],
          metadata: {
            generics: this.extractGenerics(node, ctx.content),
          },
        });
      },

      // Enums
      'enum_declaration': (node) => {
        const name = this.getName(node);
        if (!name) return;
        const members = this.extractEnumMembers(node, ctx.content);
        symbols.push({
          id: this.generateSymbolId(ctx.relativePath, name, 'enum', node.startPosition.row + 1),
          fileId: `file:${ctx.relativePath}`,
          name,
          kind: 'enum',
          startLine: node.startPosition.row + 1,
          endLine: node.endPosition.row + 1,
          startCol: node.startPosition.column,
          endCol: node.endPosition.column,
          docstring: this.extractDocstring(node, ctx.content),
          isExported: this.isExported(node),
          isStatic: false,
          visibility: 'public',
          modifiers: this.getModifiers(node),
          metadata: { members },
        });
      },

      // Imports (file-level)
      'import_statement': (node) => {
        const importInfo = this.extractImport(node, ctx.content);
        if (!importInfo) return;
        
        // Store as symbol for tracking
        symbols.push({
          id: this.generateSymbolId(ctx.relativePath, importInfo.source, 'import', node.startPosition.row + 1),
          fileId: `file:${ctx.relativePath}`,
          name: importInfo.source,
          kind: 'import',
          startLine: node.startPosition.row + 1,
          endLine: node.endPosition.row + 1,
          startCol: node.startPosition.column,
          endCol: node.endPosition.column,
          isExported: false,
          modifiers: [],
          metadata: {
            source: importInfo.source,
            specifiers: importInfo.specifiers,
            isTypeOnly: importInfo.isTypeOnly,
            isDefault: importInfo.isDefault,
          },
        });

        // Create import edge
        edges.push({
          type: 'imports',
          from: `file:${ctx.relativePath}`,
          to: `file:${this.resolveImportPath(ctx.relativePath, importInfo.source)}`,
          metadata: {
            is_type_only: importInfo.isTypeOnly,
            is_default: importInfo.isDefault,
            specifiers: importInfo.specifiers,
          },
          isResolved: false,
        });
      },

      // Export statements
      'export_statement': (node) => {
        // Handle: export { foo, bar } from './module'
        const exportInfo = this.extractReExport(node, ctx.content);
        if (exportInfo) {
          for (const spec of exportInfo.specifiers) {
            edges.push({
              type: 'exports',
              from: `file:${ctx.relativePath}`,
              to: `file:${this.resolveImportPath(ctx.relativePath, exportInfo.source)}`,
              metadata: {
                is_reexport: true,
                is_default: spec.isDefault,
                alias: spec.alias,
                name: spec.name,
              },
              isResolved: false,
            });
          }
        }
      },

      // Method definitions inside classes
      'method_definition': () => {
        // Intentionally empty: methods are meant to be captured by the
        // class_declaration visitor (for parent_symbol_id linking).
        // That member extraction is not shown in this excerpt.
      },

      // Property definitions inside classes
      'public_field_definition': () => {
        // Intentionally empty: handled together with methods, as above
      },
    });

    // Post-process: resolve parent_symbol_id for nested symbols
    // Post-process: mark exported symbols
    const processed = this.postProcess(symbols, edges, ctx);

    return {
      symbols: processed.symbols,
      edges: processed.edges,
      parseErrors,
    };
  }

  // ... helper methods (getName, getSignature, extractParams, etc.)
  // Each is ~10-20 lines using tree-sitter child navigation
}
```
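
The extractor emits edges whose targets are placeholders of the form `sym:unknown:<name>:<kind>:0`, marked `isResolved: false` and "resolved later". A minimal sketch of what that later pass could look like follows. `resolveEdges` and its simplified types are assumptions, and it resolves only by name and kind, ignoring import scoping, which a real resolver would need to consider.

```typescript
interface SymbolRef {
  id: string;
  name: string;
  kind: string;
}

interface PendingEdge {
  type: string;
  from: string;
  to: string;
  isResolved: boolean;
}

// Hypothetical resolution pass: replaces placeholder targets with real
// symbol IDs when exactly one symbol matches the name and kind.
function resolveEdges(edges: PendingEdge[], symbols: SymbolRef[]): PendingEdge[] {
  const index = new Map<string, SymbolRef[]>();
  for (const symbol of symbols) {
    const key = `${symbol.name}:${symbol.kind}`;
    const bucket = index.get(key) ?? [];
    bucket.push(symbol);
    index.set(key, bucket);
  }

  return edges.map((edge) => {
    if (edge.isResolved) return edge;
    // Placeholder format: sym:unknown:<name>:<kind>:0
    const match = /^sym:unknown:(.+):([^:]+):0$/.exec(edge.to);
    if (!match) return edge;
    const candidates = index.get(`${match[1]}:${match[2]}`) ?? [];
    // Ambiguous or missing targets stay unresolved rather than guessed.
    const only = candidates.length === 1 ? candidates[0] : undefined;
    if (!only) return edge;
    return { ...edge, to: only.id, isResolved: true };
  });
}
```

Leaving ambiguous matches unresolved keeps the graph honest: a query can still report the dangling reference instead of pointing at the wrong class.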

### 2.4 SQL Extractor

```typescript
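// src/extractor/sql/sql.ts

import { BaseExtractor, ExtractorContext, ExtractionResult, ParseError } from '../base-extractor';
import type { SymbolIR, EdgeIR } from '../types';
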
export class SQLExtractor extends BaseExtractor {
  language = 'sql';
  extensions = ['.sql'];

  extract(ctx: ExtractorContext): ExtractionResult {
    const symbols: SymbolIR[] = [];
    const edges: EdgeIR[] = [];
    const parseErrors: ParseError[] = [];

    this.walk(ctx.tree.rootNode, {
      'create_table': (node) => {
        const tableName = this.getTableName(node);
        if (!tableName) return;
        
        const tableId = this.generateSymbolId(
          ctx.relativePath, tableName, 'table', node.startPosition.row + 1
        );

        // Extract columns
        const columns = this.extractColumns(node, ctx.content, ctx.relativePath, tableId);
        const constraints = this.extractConstraints(node, ctx.content, ctx.relativePath, tableId);
        const indexes = this.extractIndexes(node, ctx.content, ctx.relativePath, tableId);

        symbols.push({
          id: tableId,
          fileId: `file:${ctx.relativePath}`,
          name: tableName,
          kind: 'table',
          signature: this.getTableSignature(node, ctx.content),
          startLine: node.startPosition.row + 1,
          endLine: node.endPosition.row + 1,
          startCol: node.startPosition.column,
          endCol: node.endPosition.column,
          docstring: this.extractTableComment(node, ctx.content),
          isExported: false,
          modifiers: [],
          metadata: {
            schema: this.getSchemaName(node),
            engine: this.getEngine(node),
            columns: columns.map(c => c.name),
            columnCount: columns.length,
          },
        });

        symbols.push(...columns, ...constraints, ...indexes);

        // Create column_of edges
        for (const col of columns) {
          edges.push({ type: 'column_of', from: col.id, to: tableId });
        }
        for (const idx of indexes) {
          edges.push({ type: 'column_of', from: idx.id, to: tableId });
        }
        for (const con of constraints) {
          edges.push({ type: 'column_of', from: con.id, to: tableId });
        }

        // Extract foreign keys and create FK edges
        const fks = this.extractForeignKeys(node, ctx.content);
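        for (const fk of fks) {
          // Reuse the ID of the column symbol extracted above; regenerating it
          // with a placeholder line number would never match the stored column ID.
          const fromCol = columns.find(c => c.name === fk.column);
          if (!fromCol) continue;
          const fromColId = fromCol.id;
          const toTableId = `sym:unknown:${fk.refTable}:table:0`;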
          edges.push({
            type: 'foreign_key',
            from: fromColId,
            to: toTableId,
            metadata: {
              constraint_name: fk.name,
              on_delete: fk.onDelete,
              on_update: fk.onUpdate,
              ref_column: fk.refColumn,
            },
            isResolved: false,
          });
        }
      },

      'create_view': (node) => {
        const viewName = this.getViewName(node);
        if (!viewName) return;
        symbols.push({
          id: this.generateSymbolId(ctx.relativePath, viewName, 'view', node.startPosition.row + 1),
          fileId: `file:${ctx.relativePath}`,
          name: viewName,
          kind: 'view',
          signature: this.getViewQuery(node, ctx.content),
          startLine: node.startPosition.row + 1,
          endLine: node.endPosition.row + 1,
          startCol: node.startPosition.column,
          endCol: node.endPosition.column,
          docstring: this.extractViewComment(node, ctx.content),
          isExported: false,
          modifiers: [],
          metadata: { schema: this.getSchemaName(node) },
        });
      },

      'create_procedure': (node) => {
        // Stored procedures / functions
      },
    });

    return { symbols, edges, parseErrors };
  }
}
```

### 2.5 Markdown Extractor

```typescript
// src/extractor/markdown/markdown.ts

import { unified } from 'unified';
import remarkParse from 'remark-parse';
import remarkGfm from 'remark-gfm';
import { visit } from 'unist-util-visit';
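import type { Root, Heading, Code, List, Table, ListItem } from 'mdast';
import { BaseExtractor, ExtractorContext, ExtractionResult } from '../base-extractor';
import type { SymbolIR, EdgeIR } from '../types';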

export class MarkdownExtractor extends BaseExtractor {
  language = 'markdown';
  extensions = ['.md', '.mdx', '.markdown'];

  extract(ctx: ExtractorContext): ExtractionResult {
    const symbols: SymbolIR[] = [];
    const edges: EdgeIR[] = [];

    const tree = unified()
      .use(remarkParse)
      .use(remarkGfm)
      .parse(ctx.content) as Root;

    let currentSection: string | null = null;
    let sectionCounter = 0;
    let workflowStepCounter = 0;
    let diagramNodeCounter = 0;

    visit(tree, (node) => {
      // Headings → sections
      if (node.type === 'heading') {
        const heading = node as Heading;
        const text = this.getTextContent(heading);
        const level = heading.depth;
        const sectionId = this.generateSymbolId(
          ctx.relativePath, text, 'section', heading.position?.start.line || 0
        );

        const sectionSymbol: SymbolIR = {
          id: sectionId,
          fileId: `file:${ctx.relativePath}`,
          name: text,
          kind: 'section',
          startLine: heading.position?.start.line || 0,
          endLine: heading.position?.end.line || 0,
          startCol: heading.position?.start.column || 0,
          endCol: heading.position?.end.column || 0,
          isExported: false,
          modifiers: [],
          metadata: {
            level,
            anchor_id: this.slugify(text),
            section_type: this.classifySection(text),
          },
        };
        symbols.push(sectionSymbol);

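        // Link to the preceding heading (an approximation of the parent:
        // a sibling heading is also linked from the heading before it)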
        if (currentSection && level > 1) {
          edges.push({
            type: 'contains',
            from: currentSection,
            to: sectionId,
          });
        }
        currentSection = sectionId;
        sectionCounter++;
      }

      // Code blocks → check for mermaid
      if (node.type === 'code') {
        const code = node as Code;
        if (code.lang === 'mermaid' && code.value) {
          const diagramResult = this.parseMermaid(code.value, ctx);
          symbols.push(...diagramResult.symbols);
          edges.push(...diagramResult.edges);

          // Link diagram to current section
          if (currentSection) {
            for (const sym of diagramResult.symbols) {
              edges.push({ type: 'contains', from: currentSection, to: sym.id });
            }
          }
        }
      }

      // Lists → structured list items
      if (node.type === 'list') {
        const list = node as List;
        this.extractListItems(list, symbols, edges, ctx, currentSection);
      }

      // Tables → structured rows
      if (node.type === 'table') {
        const table = node as Table;
        const tableResult = this.extractTable(table, ctx, currentSection);
        symbols.push(...tableResult.symbols);
        edges.push(...tableResult.edges);
      }
    });

    return { symbols, edges, parseErrors: [] };
  }

  private classifySection(heading: string): string {
    const lower = heading.toLowerCase();
    if (/workflow|flow|process|pipeline/.test(lower)) return 'workflow';
    if (/sequence\s*diagram/.test(lower)) return 'sequence_diagram';
    if (/flowchart/.test(lower)) return 'flowchart';
    if (/release\s*plan|roadmap|timeline/.test(lower)) return 'release_plan';
    if (/api|endpoint/.test(lower)) return 'api';
    if (/architecture|component|system\s*design/.test(lower)) return 'architecture';
    if (/decision|adr/.test(lower)) return 'decision';
    if (/requirement|user\s*story|acceptance/.test(lower)) return 'requirement';
    return 'general';
  }

  private parseMermaid(mermaidCode: string, ctx: ExtractorContext): 
    { symbols: SymbolIR[]; edges: EdgeIR[] } {
    
    const symbols: SymbolIR[] = [];
    const edges: EdgeIR[] = [];

    // Detect diagram type
    const typeMatch = mermaidCode.match(/^(sequenceDiagram|flowchart\s+\w+|stateDiagram|erDiagram|classDiagram|gantt)/m);
    const diagramType = typeMatch?.[1] || 'unknown';

    if (diagramType === 'sequenceDiagram') {
      return this.parseSequenceDiagram(mermaidCode, ctx);
    }
    if (diagramType.startsWith('flowchart')) {
      return this.parseFlowchart(mermaidCode, ctx);
    }
    if (diagramType === 'erDiagram') {
      return this.parseERDiagram(mermaidCode, ctx);
    }
    if (diagramType === 'classDiagram') {
      return this.parseClassDiagram(mermaidCode, ctx);
    }

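    // Fallback: store as raw diagram node.
    // The id is built from the diagram type and source length rather than the
    // clock, so re-indexing an unchanged file produces the same id.
    symbols.push({
      id: this.generateSymbolId(ctx.relativePath, `diagram-${diagramType}-${mermaidCode.length}`, 'section', 0),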
      fileId: `file:${ctx.relativePath}`,
      name: `Mermaid ${diagramType}`,
      kind: 'section',
      startLine: 0,
      endLine: 0,
      startCol: 0,
      endCol: 0,
      isExported: false,
      modifiers: [],
      metadata: { diagram_type: diagramType, raw: mermaidCode },
    });

    return { symbols, edges };
  }

  private parseSequenceDiagram(code: string, ctx: ExtractorContext): 
    { symbols: SymbolIR[]; edges: EdgeIR[] } {
    // Parse:
    //   participant A as Actor A
    //   A->>B: Message
    //   B-->>A: Response
    //
    // Creates: diagram_node per participant
    // Creates: diagram_edge per message (with label, style)
    
    const symbols: SymbolIR[] = [];
    const edges: EdgeIR[] = [];
    const participants = new Map<string, string>(); // alias → full name
    const baseLine = 0; // Would need actual line from parent

    // Allow leading indentation, which is how Mermaid sources are usually written
    const participantRe = /^[ \t]*participant\s+(\w+)(?:\s+as\s+(.+))?$/gm;
    let match;
    while ((match = participantRe.exec(code)) !== null) {
      const alias = match[1];
      const fullName = match[2] || alias;
      participants.set(alias, fullName);
      
      const id = this.generateSymbolId(ctx.relativePath, alias, 'diagram_node', baseLine);
      symbols.push({
        id,
        fileId: `file:${ctx.relativePath}`,
        name: fullName,
        kind: 'diagram_node',
        startLine: baseLine,
        endLine: baseLine,
        startCol: 0,
        endCol: 0,
        isExported: false,
        modifiers: [],
        metadata: {
          diagram_type: 'sequence_diagram',
          role: 'participant',
          alias,
        },
      });
    }

    // Parse messages: A->>B: text  or  A-->>B: text (leading indentation allowed)
    const msgRe = /^[ \t]*(\w+)\s*(->>|-->>|->|-->)\s*(\w+)\s*:\s*(.+)$/gm;
    let msgMatch;
    let msgCounter = 0;
    while ((msgMatch = msgRe.exec(code)) !== null) {
      const fromAlias = msgMatch[1];
      const arrowStyle = msgMatch[2];
      const toAlias = msgMatch[3];
      const message = msgMatch[4];

      const fromId = this.generateSymbolId(ctx.relativePath, fromAlias, 'diagram_node', baseLine);
      const toId = this.generateSymbolId(ctx.relativePath, toAlias, 'diagram_node', baseLine);

      // Register participants if not explicitly declared
      if (!participants.has(fromAlias)) {
        participants.set(fromAlias, fromAlias);
        symbols.push({
          id: fromId,
          fileId: `file:${ctx.relativePath}`,
          name: fromAlias,
          kind: 'diagram_node',
          startLine: baseLine, endLine: baseLine,
          startCol: 0, endCol: 0,
          isExported: false, modifiers: [],
          metadata: { diagram_type: 'sequence_diagram', role: 'participant', alias: fromAlias },
        });
      }
      if (!participants.has(toAlias)) {
        participants.set(toAlias, toAlias);
        symbols.push({
          id: toId,
          fileId: `file:${ctx.relativePath}`,
          name: toAlias,
          kind: 'diagram_node',
          startLine: baseLine, endLine: baseLine,
          startCol: 0, endCol: 0,
          isExported: false, modifiers: [],
          metadata: { diagram_type: 'sequence_diagram', role: 'participant', alias: toAlias },
        });
      }

      edges.push({
        type: 'diagram_edge',
        from: fromId,
        to: toId,
        metadata: {
          label: message,
          // In Mermaid a double dash (`-->`, `-->>`) draws a dotted line; a single dash is solid
          style: arrowStyle.startsWith('--') ? 'dashed' : 'solid',
          type: 'solid',
          sequence: msgCounter++,
          is_response: arrowStyle.includes('--'),
        },
      });
    }

    return { symbols, edges };
  }

  // ... parseFlowchart, parseERDiagram, parseClassDiagram, extractListItems, extractTable
}
```
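
The extractor above links each heading to whichever heading came immediately before it, so a sibling `##` ends up "contained" by the previous `##`. One way to get true parent/child containment is to keep a stack of open headings ordered by depth. The sketch below is a standalone illustration under that assumption; `HeadingInfo`, `ContainsEdge`, and `linkSections` are hypothetical names, not part of the design above.

```typescript
// Hypothetical standalone sketch: derive parent -> child `contains` edges
// from a flat, document-ordered list of headings.
interface HeadingInfo {
  id: string;    // symbol id of the section
  depth: number; // 1 for '#', 2 for '##', ...
}

interface ContainsEdge {
  type: 'contains';
  from: string;
  to: string;
}

function linkSections(headings: HeadingInfo[]): ContainsEdge[] {
  const edges: ContainsEdge[] = [];
  const stack: HeadingInfo[] = []; // currently open ancestor headings

  for (const heading of headings) {
    // Close every open heading that is at the same depth or deeper.
    let top = stack[stack.length - 1];
    while (top !== undefined && top.depth >= heading.depth) {
      stack.pop();
      top = stack[stack.length - 1];
    }
    // Whatever remains on top of the stack is the nearest shallower heading.
    if (top !== undefined) {
      edges.push({ type: 'contains', from: top.id, to: heading.id });
    }
    stack.push(heading);
  }
  return edges;
}
```

With this approach a `##` heading that follows a `###` is attached to the enclosing `#`, not to the `###` before it.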

### 2.6 Chainable Query Builder — Core

```typescript
// src/query/builder.ts

import { IStore } from '../storage/interface';
import { GraphNode, GraphEdge, GraphResult } from '../types/graph';
import { RepoScope } from './scopes/repo-scope';

export type SortDirection = 'asc' | 'desc';
export type TerminalFormat = 'array' | 'graph' | 'markdown' | 'json';

export interface FilterPredicate {
  field: string;
  op: 'eq' | 'neq' | 'gt' | 'gte' | 'lt' | 'lte' | 'contains' | 'matches' | 'in' | 'exists';
  value: unknown;
}

export abstract class QueryScope<T extends QueryScope<T>> {
  protected filters: FilterPredicate[] = [];
  protected sortField: string | null = null;
  protected sortDir: SortDirection = 'asc';
  protected limitCount: number | null = null;
  protected offsetCount: number = 0;

  constructor(protected store: IStore, protected repoPath: string) {}

  filter(predicate: FilterPredicate | ((item: GraphNode) => boolean)): T {
    const clone = this.clone();
    if (typeof predicate === 'function') {
      // Function filters are applied post-hoc (for in-memory operations)
      clone.filters.push({ field: '_func', op: 'eq', value: predicate } as any);
    } else {
      clone.filters.push(predicate);
    }
    return clone as T;
  }

  // Shorthand filters
  eq(field: string, value: unknown): T { return this.filter({ field, op: 'eq', value }); }
  neq(field: string, value: unknown): T { return this.filter({ field, op: 'neq', value }); }
  contains(field: string, value: string): T { return this.filter({ field, op: 'contains', value }); }
  matches(field: string, pattern: string): T { return this.filter({ field, op: 'matches', value: pattern }); }
  in(field: string, values: unknown[]): T { return this.filter({ field, op: 'in', value: values }); }

  sort(field: string, dir: SortDirection = 'asc'): T {
    const clone = this.clone();
    clone.sortField = field;
    clone.sortDir = dir;
    return clone as T;
  }

  limit(n: number): T {
    const clone = this.clone();
    clone.limitCount = n;
    return clone as T;
  }

  offset(n: number): T {
    const clone = this.clone();
    clone.offsetCount = n;
    return clone as T;
  }

  // Terminal methods
  async toArray(): Promise<GraphNode[]> {
    const result = await this.execute();
    return this.applyPostFilters(result.nodes as GraphNode[]);
  }

  async toGraph(): Promise<GraphResult> {
    const result = await this.execute();
    return {
      nodes: this.applyPostFilters(result.nodes as GraphNode[]),
      edges: result.edges as GraphEdge[],
    };
  }

  async toMarkdown(): Promise<string> {
    const nodes = await this.toArray();
    return this.formatAsMarkdown(nodes);
  }

  async toJSON(): Promise<string> {
    const result = await this.toGraph();
    return JSON.stringify(result, null, 2);
  }

  async count(): Promise<number> {
    const nodes = await this.toArray();
    return nodes.length;
  }

  async exists(): Promise<boolean> {
    const count = await this.count();
    return count > 0;
  }

  // Abstract: each scope implements its own query translation
  protected abstract execute(): Promise<{ nodes: unknown[]; edges: unknown[] }>;
  protected abstract clone(): T;
  protected abstract formatAsMarkdown(nodes: GraphNode[]): string;

  protected applyPostFilters(nodes: GraphNode[]): GraphNode[] {
    return nodes.filter(node => {
      for (const f of this.filters) {
        if (f.field === '_func') continue; // Skip function filters for DB
        const val = (node as any)[f.field];
        if (!this.evaluateFilter(val, f)) return false;
      }
      // Apply function filters
      for (const f of this.filters) {
        if (f.field === '_func') {
          if (!(f.value as Function)(node)) return false;
        }
      }
      return true;
    });
  }

  private evaluateFilter(val: unknown, f: FilterPredicate): boolean {
    switch (f.op) {
      case 'eq': return val === f.value;
      case 'neq': return val !== f.value;
      case 'contains': return typeof val === 'string' && val.includes(f.value as string);
      case 'matches': return typeof val === 'string' && new RegExp(f.value as string).test(val);
      case 'in': return Array.isArray(f.value) && f.value.includes(val);
      case 'exists': return val !== null && val !== undefined;
      case 'gt': return typeof val === 'number' && val > (f.value as number);
      case 'gte': return typeof val === 'number' && val >= (f.value as number);
      case 'lt': return typeof val === 'number' && val < (f.value as number);
      case 'lte': return typeof val === 'number' && val <= (f.value as number);
      default: return true;
    }
  }
}

// Public API entry point
export function createQuery(store: IStore, repoPath: string): RepoScope {
  return new RepoScope(store, repoPath);
}
```
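
Because every chain method works on `this.clone()`, the builder only behaves as intended if `clone()` carries the accumulated filters, sort, and paging state over to the copy. A minimal in-memory sketch of that contract follows. `MiniScope` and `RowData` are hypothetical names used only for illustration, and the real scopes translate filters into store queries instead of filtering arrays.

```typescript
type RowData = Record<string, unknown>;

class MiniScope {
  private filters: Array<(row: RowData) => boolean> = [];
  private limitCount: number | null = null;

  constructor(private readonly rows: RowData[]) {}

  // Copy accumulated state so chained calls never mutate the receiver.
  private clone(): MiniScope {
    const copy = new MiniScope(this.rows);
    copy.filters = [...this.filters];
    copy.limitCount = this.limitCount;
    return copy;
  }

  eq(field: string, value: unknown): MiniScope {
    const copy = this.clone();
    copy.filters.push((row) => row[field] === value);
    return copy;
  }

  limit(n: number): MiniScope {
    const copy = this.clone();
    copy.limitCount = n;
    return copy;
  }

  toArray(): RowData[] {
    const matched = this.rows.filter((row) => this.filters.every((f) => f(row)));
    return this.limitCount === null ? matched : matched.slice(0, this.limitCount);
  }
}

const base = new MiniScope([
  { name: 'parse', kind: 'function' },
  { name: 'Parser', kind: 'class' },
  { name: 'emit', kind: 'function' },
]);
const functions = base.eq('kind', 'function'); // derived scope, `base` is untouched
const firstFunction = functions.limit(1);      // keeps the `kind` filter and adds a limit
```

Here `base` still yields all three rows after `functions` and `firstFunction` are derived from it, which is the property that makes scopes safe to reuse across queries.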

### 2.7 RepoScope (Top-Level)

```typescript
// src/query/scopes/repo-scope.ts

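import { QueryScope } from '../builder';
import { IStore } from '../../storage/interface';
import { GraphNode } from '../../types/graph';
import { ModuleScope } from './module-scope';
import { FileScope } from './file-scope';
import { SymbolScope } from './symbol-scope';
// Module paths below are assumed to follow the naming convention of the imports above
import { DocScope } from './doc-scope';
import { TableScope } from './table-scope';
import { CommitScope } from './commit-scope';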

export class RepoScope extends QueryScope<RepoScope> {
  protected async execute(): Promise<{ nodes: unknown[]; edges: unknown[] }> {
    const query = `
      SELECT * FROM repository 
      WHERE root = $repoPath
      LIMIT 1
    `;
    const nodes = await this.store.query(query, { repoPath: this.repoPath });
    return { nodes, edges: [] };
  }

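  protected clone(): RepoScope {
    // Carry the accumulated query state over so chained calls are not lost
    const copy = new RepoScope(this.store, this.repoPath);
    copy.filters = [...this.filters];
    copy.sortField = this.sortField;
    copy.sortDir = this.sortDir;
    copy.limitCount = this.limitCount;
    copy.offsetCount = this.offsetCount;
    return copy;
  }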

  protected formatAsMarkdown(nodes: GraphNode[]): string {
    if (nodes.length === 0) return 'Repository not indexed.';
    const repo = nodes[0];
    const stats = repo.stats as any;
    return [
      `# Repository: ${repo.name}`,
      ``,
      `- **Path:** ${repo.root}`,
      `- **Files:** ${stats?.files ?? 'N/A'}`,
      `- **Modules:** ${stats?.modules ?? 'N/A'}`,
      `- **Symbols:** ${stats?.symbols ?? 'N/A'}`,
      `- **Last Indexed:** ${repo.updated_at}`,
    ].join('\n');
  }

  // Navigation to sub-scopes
  modules(): ModuleScope {
    return new ModuleScope(this.store, this.repoPath, null);
  }

  files(): FileScope {
    return new FileScope(this.store, this.repoPath, null);
  }

  symbols(): SymbolScope {
    return new SymbolScope(this.store, this.repoPath, null);
  }

  docs(): DocScope {
    return new DocScope(this.store, this.repoPath, null);
  }

  // Convenience: direct symbol lookup
  symbol(name: string): SymbolScope {
    return new SymbolScope(this.store, this.repoPath, null)
      .eq('name', name);
  }

  table(name: string): TableScope {
    return new TableScope(this.store, this.repoPath, null)
      .eq('name', name);
  }

  commit(hash: string): CommitScope {
    return new CommitScope(this.store, this.repoPath, null)
      .eq('hash', hash);
  }
}
```

### 2.8 SymbolScope (With Graph Traversal)

```typescript
// src/query/scopes/symbol-scope.ts

import { QueryScope } from '../builder';
import { IStore } from '../../storage/interface';
import { GraphNode, GraphEdge } from '../../types/graph';

export class SymbolScope extends QueryScope<SymbolScope> {
  constructor(
    store: IStore,
    repoPath: string,
    private moduleId: string | null
  ) {
    super(store, repoPath);
  }

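  protected async execute(): Promise<{ nodes: unknown[]; edges: unknown[] }> {
    // Scopes produced by a graph traversal carry their result with them
    const precomputedNodes = (this as any)._precomputedNodes as unknown[] | undefined;
    if (precomputedNodes) {
      return { nodes: precomputedNodes, edges: (this as any)._precomputedEdges ?? [] };
    }

    let query = 'SELECT * FROM symbol';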
    const vars: Record<string, unknown> = {};
    const conditions: string[] = [];

    if (this.moduleId) {
      // Join through file to filter by module
      query = `
        SELECT symbol.*, file.path as file_path, file.module_id 
        FROM symbol 
        INNER JOIN file ON symbol.file_id = file.id
      `;
      conditions.push('file.module_id = $moduleId');
      vars.moduleId = this.moduleId;
    }

    // Apply filters. The parameter name includes the filter index so that two
    // filters on the same field do not overwrite each other's bound value.
    this.filters.forEach((f, i) => {
      if (f.field === '_func') return;
      const param = `f${i}_${f.field}`;
      switch (f.op) {
        case 'eq': conditions.push(`symbol.${f.field} = $${param}`); break;
        case 'neq': conditions.push(`symbol.${f.field} != $${param}`); break;
        case 'contains': conditions.push(`string::contains(symbol.${f.field}, $${param})`); break;
        case 'matches': conditions.push(`string::matches(symbol.${f.field}, $${param})`); break;
        case 'in': conditions.push(`symbol.${f.field} IN $${param}`); break;
        case 'exists': conditions.push(`symbol.${f.field} != NONE`); break;
      }
      vars[param] = f.value;
    });

    if (conditions.length > 0) {
      query += ` WHERE ${conditions.join(' AND ')}`;
    }

    if (this.sortField) {
      query += ` ORDER BY symbol.${this.sortField} ${this.sortDir.toUpperCase()}`;
    }

    if (this.limitCount !== null) {
      query += ` LIMIT ${this.limitCount}`;
    }
    if (this.offsetCount > 0) {
      query += ` START ${this.offsetCount}`;
    }

    const nodes = await this.store.query(query, vars);
    return { nodes, edges: [] };
  }

  // Graph traversal methods. Each one starts from the first matched symbol
  // and follows the given edge types up to a depth of 10.
  async dependants(): Promise<SymbolScope> {
    return this.traverse(['calls', 'imports', 'references'], 'inbound');
  }

  async dependencies(): Promise<SymbolScope> {
    return this.traverse(['calls', 'imports', 'references'], 'outbound');
  }

  async callers(): Promise<SymbolScope> {
    return this.traverse(['calls'], 'inbound');
  }

  async callees(): Promise<SymbolScope> {
    return this.traverse(['calls'], 'outbound');
  }

  // Shared implementation for the four traversal methods above
  private async traverse(
    edgeTypes: string[],
    direction: 'inbound' | 'outbound'
  ): Promise<SymbolScope> {
    const symbols = await this.toArray();
    if (symbols.length === 0) return this;

    const result = await this.store.graphTraversal(
      symbols[0].id, // Start from first symbol
      edgeTypes,
      direction,
      10, // max depth
      undefined
    );

    // Return new scope with the pre-computed traversal result
    const newScope = new SymbolScope(this.store, this.repoPath, this.moduleId);
    (newScope as any)._precomputedNodes = result.nodes;
    (newScope as any)._precomputedEdges = result.edges;
    return newScope;
  }

  // Navigate to containing file
  async file(): Promise<FileScope> {
    const symbols = await this.toArray();
    if (symbols.length === 0) return new FileScope(this.store, this.repoPath, null);
    const fileId = (symbols[0] as any).file_id;
    const fileScope = new FileScope(this.store, this.repoPath, null);
    (fileScope as any)._precomputedFileId = fileId;
    return fileScope;
  }

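  protected clone(): SymbolScope {
    // Carry the accumulated query state (and any traversal result) over to the copy
    const copy = new SymbolScope(this.store, this.repoPath, this.moduleId);
    copy.filters = [...this.filters];
    copy.sortField = this.sortField;
    copy.sortDir = this.sortDir;
    copy.limitCount = this.limitCount;
    copy.offsetCount = this.offsetCount;
    (copy as any)._precomputedNodes = (this as any)._precomputedNodes;
    (copy as any)._precomputedEdges = (this as any)._precomputedEdges;
    return copy;
  }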

  protected formatAsMarkdown(nodes: GraphNode[]): string {
    if (nodes.length === 0) return 'No symbols found.';
    return nodes.map(n => {
      const s = n as any;
      const exportTag = s.is_exported ? 'exported' : 'internal';
      const location = s.file_path ? `(${s.file_path}:${s.start_line})` : `(${s.start_line})`;
      return `- **${s.name}** [${s.kind}] [${exportTag}] ${location}${s.signature ? `\n  \`${s.signature}\`` : ''}${s.docstring ? `\n  > ${s.docstring.split('\n')[0]}` : ''}`;
    }).join('\n');
  }
}
```
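
`dependants()`, `callers()`, and the other traversal methods delegate to `store.graphTraversal`, whose implementation is not shown in this section. As a rough mental model, it can be thought of as a depth-bounded breadth-first search over typed edges. The sketch below assumes a simple in-memory edge list; `TypedEdge`, `TraversalDirection`, and `traverse` are hypothetical names, and a real store would likely push this work into the database.

```typescript
type TraversalDirection = 'inbound' | 'outbound';

interface TypedEdge {
  type: string;
  from: string;
  to: string;
}

// Depth-bounded BFS. Returns each reached node id mapped to its hop distance
// from `start` (the start node itself is not included).
function traverse(
  edges: TypedEdge[],
  start: string,
  edgeTypes: string[],
  direction: TraversalDirection,
  maxDepth: number
): Map<string, number> {
  const distance = new Map<string, number>();
  let frontier: string[] = [start];

  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const current of frontier) {
      for (const edge of edges) {
        if (!edgeTypes.includes(edge.type)) continue;
        // Inbound: follow edges pointing at `current` back to their source.
        const near = direction === 'inbound' ? edge.to : edge.from;
        const far = direction === 'inbound' ? edge.from : edge.to;
        if (near !== current || far === start || distance.has(far)) continue;
        distance.set(far, depth);
        next.push(far);
      }
    }
    frontier = next;
  }
  return distance;
}

const sampleEdges: TypedEdge[] = [
  { type: 'calls', from: 'handler', to: 'service' },
  { type: 'calls', from: 'service', to: 'repo' },
  { type: 'imports', from: 'cli', to: 'handler' },
  { type: 'contains', from: 'module', to: 'repo' }, // ignored: not a requested edge type
];
const inbound = traverse(sampleEdges, 'repo', ['calls', 'imports'], 'inbound', 10);
```

Nodes at distance 1 correspond to direct dependants; anything further out is transitive. Recording the distance during the search avoids having to reconstruct it from the edge list afterwards.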

### 2.9 MCP Tool Implementation Example

```typescript
// src/mcp/tools/impact.ts

import { Tool } from '@modelcontextprotocol/sdk/types.js';
import { IStore } from '../../storage/interface';
import { GraphNode } from '../../types/graph';
import { createQuery } from '../../query/builder';
import { TokenBudgetManager } from '../token-budget';

export function createImpactAnalysisTool(store: IStore, repoPath: string, budget: TokenBudgetManager): Tool {
  return {
    name: 'get_impact_analysis',
    description: `Analyze the impact of changing a symbol. Returns all direct and transitive dependants — functions that call it, files that import it, modules that depend on it. Use this before making changes to understand blast radius.`,
    inputSchema: {
      type: 'object',
      properties: {
        symbol_name: {
          type: 'string',
          description: 'Name of the symbol to analyze',
        },
        symbol_kind: {
          type: 'string',
          enum: ['function', 'class', 'interface', 'type_alias', 'variable', 'table', 'column'],
          description: 'Kind of symbol (optional, narrows search)',
        },
        file_path: {
          type: 'string',
          description: 'File path to disambiguate (optional)',
        },
        max_depth: {
          type: 'number',
          description: 'Max traversal depth for transitive dependants (default: 5)',
          default: 5,
        },
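        include_transitive: {
          type: 'boolean',
          description: 'Include transitive (indirect) dependants (default: true)',
          default: true,
        },
        max_tokens: {
          type: 'number',
          description: 'Token budget for the response (optional); larger results are truncated',
        },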
      },
      required: ['symbol_name'],
    },
    handler: async (params: any) => {
      // Scope methods return a new scope, so each result must be reassigned
      let q = createQuery(store, repoPath)
        .symbol(params.symbol_name);

      if (params.symbol_kind) q = q.eq('kind', params.symbol_kind);
      if (params.file_path) q = q.eq('file_path', params.file_path);

      const symbols = await q.toArray();
      if (symbols.length === 0) {
        return {
          content: [{ type: 'text', text: JSON.stringify({ error: 'Symbol not found', symbol_name: params.symbol_name }) }],
        };
      }

      const symbol = symbols[0];
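      // A depth of 1 restricts the traversal to direct dependants
      const depth = params.include_transitive === false ? 1 : (params.max_depth ?? 5);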

      // Get dependants via graph traversal
      const result = await store.graphTraversal(
        symbol.id,
        ['calls', 'imports', 'references', 'implements'],
        'inbound',
        depth,
        undefined
      );

      // Organize by distance (direct vs transitive)
      const direct = result.edges.filter(e => {
        // Direct edges are those where the target is our symbol
        return e.to === symbol.id;
      }).map(e => result.nodes.find(n => n.id === e.from)!).filter(Boolean);

      const transitive = result.nodes.filter(n => 
        n.id !== symbol.id && !direct.find(d => d.id === n.id)
      );

      // Group by file and module
      const byFile = new Map<string, GraphNode[]>();
      const byModule = new Map<string, GraphNode[]>();
      
      for (const node of result.nodes) {
        const n = node as any;
        if (n.file_path) {
          if (!byFile.has(n.file_path)) byFile.set(n.file_path, []);
          byFile.get(n.file_path)!.push(node);
        }
        if (n.module_id) {
          if (!byModule.has(n.module_id)) byModule.set(n.module_id, []);
          byModule.get(n.module_id)!.push(node);
        }
      }

      const response = {
        target: {
          id: symbol.id,
          name: (symbol as any).name,
          kind: (symbol as any).kind,
          file: (symbol as any).file_path,
          line: (symbol as any).start_line,
        },
        impact_summary: {
          total_dependants: result.nodes.length,
          direct_dependants: direct.length,
          transitive_dependants: transitive.length,
          files_affected: byFile.size,
          modules_affected: byModule.size,
        },
        direct_dependants: direct.map(n => ({
          name: (n as any).name,
          kind: (n as any).kind,
          file: (n as any).file_path,
          line: (n as any).start_line,
          relationship: result.edges.find(e => e.from === n.id && e.to === symbol.id)?.type,
        })),
        affected_files: Object.fromEntries(
          Array.from(byFile.entries()).map(([path, nodes]) => [
            path,
            nodes.map(n => ({ name: (n as any).name, kind: (n as any).kind, line: (n as any).start_line }))
          ])
        ),
        affected_modules: Object.fromEntries(
          Array.from(byModule.entries()).map(([id, nodes]) => [
            id,
            { symbol_count: nodes.length, kinds: [...new Set(nodes.map(n => (n as any).kind))] }
          ])
        ),
        token_estimate: budget.estimate(JSON.stringify(result)),
      };

      // Apply token budget truncation if needed
      const truncated = budget.truncate(response, params.max_tokens);

      return {
        content: [{ type: 'text', text: JSON.stringify(truncated, null, 2) }],
      };
    },
  };
}
```
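
`TokenBudgetManager` is referenced above but not defined in this section. As an illustration of what `estimate` and `truncate` could do, the sketch below uses a crude characters-per-token heuristic (about four characters per token, which is an assumption and not a real tokenizer) and drops trailing list items until the serialized payload fits. `estimateTokens`, `truncateList`, and the `Truncated` shape are hypothetical.

```typescript
// Rough heuristic only: real tokenizers vary by model and by content.
const CHARS_PER_TOKEN = 4;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

interface Truncated<T> {
  items: T[];
  truncated: boolean;
  omitted: number;
}

// Keep the longest prefix of `items` whose JSON form fits within `maxTokens`.
function truncateList<T>(items: T[], maxTokens: number): Truncated<T> {
  let kept = items.length;
  while (kept > 0 && estimateTokens(JSON.stringify(items.slice(0, kept))) > maxTokens) {
    kept--;
  }
  return {
    items: items.slice(0, kept),
    truncated: kept < items.length,
    omitted: items.length - kept,
  };
}

const dependants = ['alpha', 'beta', 'gamma', 'delta'];
const fitted = truncateList(dependants, 5); // keeps 'alpha' and 'beta', omits 2
```

Reporting `omitted` alongside the kept items lets a caller see that the answer is partial and request a narrower query if needed.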

### 2.10 Git Hook Implementation

```typescript
// src/hooks/pre-commit.ts

import * as path from 'node:path';
import * as fs from 'node:fs/promises';
import { simpleGit, SimpleGit } from 'simple-git';
import { IStore } from '../storage/interface';
import { ExtractorRegistry } from '../extractor/registry';
import { GraphDiffer } from '../engine/differ';
import { GraphMerger } from '../engine/merger';
import { Validator } from '../engine/validator';
import { contentHash } from '../utils/hash';
import { Logger } from '../utils/logger';

interface PreCommitResult {
  status: 'pass' | 'warn' | 'fail';
  parsed: number;
  updated: number;
  added: number;
  removed: number;
  errors: string[];
  warnings: string[];
}

export async function runPreCommit(
  repoPath: string,
  store: IStore,
  config: { mode: 'warn' | 'block' | 'off' },
  logger: Logger
): Promise<PreCommitResult> {
  const result: PreCommitResult = {
    status: 'pass',
    parsed: 0,
    updated: 0,
    added: 0,
    removed: 0,
    errors: [],
    warnings: [],
  };

  const git: SimpleGit = simpleGit(repoPath);

  // 1. Get staged files
  const stagedFiles = await git.diff(['--cached', '--name-only', '--diff-filter=ACMR']);
  const fileNames = stagedFiles.trim().split('\n').filter(Boolean);

  if (fileNames.length === 0) {
    return result;
  }

  logger.info(`Pre-commit: ${fileNames.length} staged files`);

  // 2. Filter to supported files
  const registry = new ExtractorRegistry();
  const supportedFiles = fileNames.filter(f => registry.supportsFile(f));

  if (supportedFiles.length === 0) {
    return result;
  }

  logger.info(`Pre-commit: ${supportedFiles.length} supported files to parse`);

  // 3. Parse changed files
  for (const filePath of supportedFiles) {
    try {
      const absolutePath = path.resolve(repoPath, filePath);
      const content = await fs.readFile(absolutePath, 'utf-8');
      const hash = contentHash(content);

      // Check if content actually changed
      const existingFile = await store.query(
        'SELECT content_hash FROM file WHERE path = $path LIMIT 1',
        { path: filePath }
      );

      if (existingFile.length > 0 && existingFile[0].content_hash === hash) {
        continue; // No change
      }

      // Extract symbols
      const extractor = registry.getExtractor(filePath);
      const extraction = await extractor.extractFile(absolutePath, repoPath);

      // Diff against existing graph
      const oldSymbols = await store.query(
        'SELECT * FROM symbol WHERE file_id = $fileId',
        { fileId: `file:${filePath}` }
      );

      const diff = GraphDiffer.diff(oldSymbols, extraction.symbols);

      // Merge into graph
      await store.transaction(async (tx) => {
        // Remove old symbols
        for (const removed of diff.removed) {
          await tx.deleteNode(removed.id);
          await tx.deleteEdges(removed.id);
          result.removed++;
        }

        // Update changed symbols
        for (const changed of diff.changed) {
          await tx.updateNode(changed.new.id, changed.new);
          result.updated++;
        }

        // Add new symbols
        for (const added of diff.added) {
          await tx.createNode(added);
          result.added++;
        }

        // Update edges
        await tx.deleteEdges(`file:${filePath}`); // Remove old edges from this file
        await tx.createEdges(extraction.edges.map(e => ({
          ...e,
          // Resolve file-level edges
          from: e.from.startsWith('file:') ? `file:${filePath}` : e.from,
        })));

        // Update file node
        const fileNode = {
          id: `file:${filePath}`,
          type: 'file',
          path: filePath,
          content_hash: hash,
          parse_status: extraction.parseErrors.length === 0 ? 'parsed' : 'partial',
          parse_error: extraction.parseErrors.length > 0 
            ? extraction.parseErrors.map(e => `L${e.line}: ${e.message}`).join('; ') 
            : null,
          last_parsed: new Date().toISOString(),
          line_count: content.split('\n').length,
          size_bytes: Buffer.byteLength(content),
        };
        await tx.createNode(fileNode as any);
      });

      result.parsed++;

      if (extraction.parseErrors.length > 0) {
        result.warnings.push(
          `${filePath}: ${extraction.parseErrors.length} parse errors`
        );
      }
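    } catch (err) {
      // `err` is typed `unknown` under strict TypeScript settings
      const message = err instanceof Error ? err.message : String(err);
      result.errors.push(`${filePath}: ${message}`);
      logger.error(`Pre-commit error for ${filePath}`, err);
    }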
  }

  // 4. Validate (if enabled)
  if (config.mode !== 'off') {
    const validation = await Validator.validate(store, repoPath);
    result.warnings.push(...validation.warnings);
    result.errors.push(...validation.errors);

    if (result.errors.length > 0 && config.mode === 'block') {
      result.status = 'fail';
    } else if (result.warnings.length > 0 || result.errors.length > 0) {
      result.status = 'warn';
    }
  }

  // 5. Update repo stats
  await updateRepoStats(store, repoPath);

  return result;
}
```
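
`GraphDiffer.diff` is used above without being shown. A minimal sketch of the idea follows, assuming symbols are matched by `id` and that a per-symbol fingerprint is available to detect changes. The `hash` field and the `diffSymbols` name are hypothetical; the result shape mirrors how the hook consumes `diff.added`, `diff.removed`, and `diff.changed[].new`.

```typescript
interface SymbolLike {
  id: string;
  hash: string; // assumed fingerprint of the symbol's signature and body
}

interface SymbolDiff<T> {
  added: T[];
  removed: T[];
  changed: Array<{ old: T; new: T }>;
}

function diffSymbols<T extends SymbolLike>(oldSymbols: T[], newSymbols: T[]): SymbolDiff<T> {
  // Index the previous state by id; entries are deleted as they are matched.
  const unmatched = new Map<string, T>();
  for (const symbol of oldSymbols) unmatched.set(symbol.id, symbol);

  const diff: SymbolDiff<T> = { added: [], removed: [], changed: [] };

  for (const next of newSymbols) {
    const prev = unmatched.get(next.id);
    if (prev === undefined) {
      diff.added.push(next);
      continue;
    }
    if (prev.hash !== next.hash) {
      diff.changed.push({ old: prev, new: next });
    }
    unmatched.delete(next.id);
  }

  // Anything never matched no longer exists in the new extraction.
  diff.removed = Array.from(unmatched.values());
  return diff;
}
```

One consequence of matching by `id`: if ids embed the start line, as the `generateSymbolId(...)` calls earlier suggest, a symbol that merely moves within a file would show up as one removal plus one addition, not as a change.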

### 2.11 Workflow Template: Bug Fix

```typescript
// src/workflows/templates/bug-fix.ts

import { IStore } from '../../storage/interface';
import { createQuery } from '../../query/builder';

export interface BugFixInput {
  error_message?: string;
  stack_trace?: string[];
  file_path?: string;
  line_number?: number;
  symbol_name?: string;
  error_type?: string; // TypeError, ReferenceError, etc.
}

export interface BugFixOutput {
  root_candidates: RootCandidate[];
  impact_radius: ImpactRadius;
  related_tests: RelatedTest[];
  recent_changes: RecentChange[];
  suggested_investigation_order: string[];
}

interface RootCandidate {
  symbol_id: string;
  symbol_name: string;
  kind: string;
  file_path: string;
  line: number;
  confidence: 'high' | 'medium' | 'low';
  reason: string;
}

interface ImpactRadius {
  direct_callers: number;
  transitive_callers: number;
  affected_files: string[];
  affected_modules: string[];
}

export async function executeBugFixWorkflow(
  store: IStore,
  repoPath: string,
  input: BugFixInput
): Promise<BugFixOutput> {
  const candidates: RootCandidate[] = [];

  // Strategy 1: If we have a file + line, look up the symbol at that location
  if (input.file_path && input.line_number) {
    // All symbols in the file (no name filter)
    const symbols = await createQuery(store, repoPath)
      .symbols()
      .eq('file_path', input.file_path)
      .toArray();

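    // Find the innermost (smallest) symbol containing the line
    const containing = symbols
      .filter(s => {
        const sym = s as any;
        return sym.start_line <= input.line_number! && sym.end_line >= input.line_number!;
      })
      .sort((a, b) =>
        ((a as any).end_line - (a as any).start_line) - ((b as any).end_line - (b as any).start_line)
      )[0];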

    if (containing) {
      candidates.push({
        symbol_id: containing.id,
        symbol_name: (containing as any).name,
        kind: (containing as any).kind,
        file_path: (containing as any).file_path,
        line: (containing as any).start_line,
        confidence: 'high',
        reason: `Symbol at error location (${input.file_path}:${input.line_number})`,
      });
    }
  }

  // Strategy 2: If we have a symbol name from the error (e.g., "Cannot read property 'foo' of undefined")
  if (input.symbol_name || input.error_message) {
    const nameToSearch = input.symbol_name || extractPropertyName(input.error_message!);
    if (nameToSearch) {
      const matches = await createQuery(store, repoPath)
        .symbol(nameToSearch)
        .toArray();

      for (const match of matches) {
        // Don't duplicate if already found
        if (candidates.find(c => c.symbol_id === match.id)) continue;

        candidates.push({
          symbol_id: match.id,
          symbol_name: (match as any).name,
          kind: (match as any).kind,
          file_path: (match as any).file_path,
          line: (match as any).start_line,
          confidence: 'medium',
          reason: `Name matches error reference: "${nameToSearch}"`,
        });
      }
    }
  }

  // Strategy 3: If we have a stack trace, trace the call chain
  if (input.stack_trace && input.stack_trace.length > 0) {
    for (const frame of input.stack_trace) {
      const parsed = parseStackFrame(frame);
      if (!parsed) continue;

      const symbols = await createQuery(store, repoPath)
        .symbol(parsed.functionName)
        .eq('file_path', parsed.filePath)
        .toArray();

      for (const sym of symbols) {
        if (candidates.find(c => c.symbol_id === sym.id)) continue;
        candidates.push({
          symbol_id: sym.id,
          symbol_name: (sym as any).name,
          kind: (sym as any).kind,
          file_path: (sym as any).file_path,
          line: (sym as any).start_line,
          confidence: parsed.filePath === input.file_path ? 'high' : 'medium',
          reason: `Appears in stack trace: ${frame.trim()}`,
        });
      }
    }
  }

  // Strategy 4: If error type suggests null/undefined, find recently changed symbols in the area
  if (input.error_type && ['TypeError', 'ReferenceError'].includes(input.error_type)) {
    // Find symbols in the same file, most recently committed first (capped at 10 rows)
    if (input.file_path) {
      const recentSymbols = await store.query(`
        SELECT symbol.*, commit.hash, commit.date
        FROM symbol
        INNER JOIN modified_in ON symbol.file_id = modified_in.from
        INNER JOIN commit ON modified_in.to = commit.id
        WHERE symbol.file_path = $filePath
        ORDER BY commit.date DESC
        LIMIT 10
      `, { filePath: input.file_path });

      for (const rs of recentSymbols) {
        if (candidates.find(c => c.symbol_id === rs.id)) continue;
        candidates.push({
          symbol_id: rs.id,
          symbol_name: rs.name,
          kind: rs.kind,
          file_path: rs.file_path,
          line: rs.start_line,
          confidence: 'low',
          reason: `Recently modified symbol in error file (commit ${rs.hash})`,
        });
      }
    }
  }

  // Compute impact radius for top candidate
  let impactRadius: ImpactRadius = {
    direct_callers: 0,
    transitive_callers: 0,
    affected_files: [],
    affected_modules: [],
  };

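  if (candidates.length > 0) {
    // Pick the highest-confidence candidate rather than whichever was found first
    const rank = { high: 0, medium: 1, low: 2 };
    const topCandidate = [...candidates].sort(
      (a, b) => rank[a.confidence] - rank[b.confidence]
    )[0];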
    const result = await store.graphTraversal(
      topCandidate.symbol_id,
      ['calls', 'imports'],
      'inbound',
      10,
      undefined
    );
    
    const directEdges = result.edges.filter(e => e.to === topCandidate.symbol_id);
    impactRadius.direct_callers = directEdges.length;
    impactRadius.transitive_callers = result.nodes.length;
    impactRadius.affected_files = [...new Set(result.nodes.map(n => (n as any).file_path).filter(Boolean))];
    
    // Resolve modules
    for (const filePath of impactRadius.affected_files) {
      const fileNode = await store.query(
        'SELECT module_id FROM file WHERE path = $path LIMIT 1',
        { path: filePath }
      );
      if (fileNode.length > 0 && fileNode[0].module_id) {
        impactRadius.affected_modules.push(fileNode[0].module_id);
      }
    }
    impactRadius.affected_modules = [...new Set(impactRadius.affected_modules)];
  }

  // Find related tests
  const relatedTests: RelatedTest[] = [];
  if (candidates.length > 0) {
    for (const candidate of candidates.slice(0, 3)) {
      const testSymbols = await store.query(`
        SELECT * FROM symbol
        WHERE name CONTAINS $testName
          AND kind = 'function'
          AND string::lowercase(name) CONTAINS 'test'
        LIMIT 5
      `, { testName: candidate.symbol_name });

      for (const test of testSymbols) {
        relatedTests.push({
          test_name: test.name,
          file_path: test.file_path,
          line: test.start_line,
          linked_to: candidate.symbol_name,
        });
      }
    }
  }

  // Suggest investigation order
  const suggestedOrder = candidates
    .sort((a, b) => {
      const confOrder = { high: 0, medium: 1, low: 2 };
      return confOrder[a.confidence] - confOrder[b.confidence];
    })
    .map(c => `${c.file_path}:${c.line} (${c.symbol_name})`);

  return {
    root_candidates: candidates,
    impact_radius: impactRadius,
    related_tests: relatedTests,
    recent_changes: [], // Populated from git log
    suggested_investigation_order: suggestedOrder,
  };
}

function extractPropertyName(errorMessage: string): string | null {
  // "Cannot read properties of undefined (reading 'foo')"
  const readMatch = errorMessage.match(/reading '(\w+)'/);
  if (readMatch) return readMatch[1];
  
  // "foo is not a function"
  const notFnMatch = errorMessage.match(/(\w+) is not a function/);
  if (notFnMatch) return notFnMatch[1];
  
  // "foo is not defined"
  const notDefMatch = errorMessage.match(/(\w+) is not defined/);
  if (notDefMatch) return notDefMatch[1];

  return null;
}

function parseStackFrame(frame: string): { functionName: string; filePath: string } | null {
  // "at functionName (/path/to/file.ts:10:5)"
  const match = frame.match(/at\s+(\w+)\s+\((.+):(\d+):\d+\)/);
  if (!match) return null;
  return { functionName: match[1], filePath: match[2] };
}
```
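
The `parseStackFrame` helper above only matches frames of the form `at name (path:line:col)` where the name is a plain `\w+` identifier. As a rough sketch of how it might be widened, the hypothetical `parseStackFrameLoose` below (not part of the original design) also accepts qualified names such as `UserRepository.findById`, `async` and `new` prefixes, and anonymous frames. It assumes V8-style stack traces.

```typescript
// Hypothetical, more permissive variant of parseStackFrame (an assumption, not the original helper).
interface ParsedFrame {
  functionName: string | null; // null for anonymous frames
  filePath: string;
  line: number;
  column: number;
}

function parseStackFrameLoose(frame: string): ParsedFrame | null {
  // Named frame: "at [async] [new] Some.qualified.name (/path/file.ts:10:5)"
  const named = frame.match(
    /^\s*at\s+(?:async\s+)?(?:new\s+)?([^\s(]+)\s+\((.+):(\d+):(\d+)\)\s*$/
  );
  if (named) {
    // Keep only the last segment so "UserRepository.findById" looks up "findById"
    const shortName = named[1].split('.').pop() ?? named[1];
    return {
      functionName: shortName,
      filePath: named[2],
      line: Number(named[3]),
      column: Number(named[4]),
    };
  }

  // Anonymous frame: "at /path/file.ts:10:5"
  const anon = frame.match(/^\s*at\s+(?:async\s+)?(.+):(\d+):(\d+)\s*$/);
  if (anon) {
    return {
      functionName: null,
      filePath: anon[1],
      line: Number(anon[2]),
      column: Number(anon[3]),
    };
  }

  return null; // not a stack frame (e.g. the "TypeError: ..." header line)
}
```

Returning the line number as well would let Strategy 3 fall back to a file-and-line lookup for anonymous frames, where no function name is available.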

### 2.12 Token Budget Manager

```typescript
// src/mcp/token-budget.ts

export class TokenBudgetManager {
  private maxTokens: number;

  // Approximate tokens per character for different content types
  private static RATES = {
    code: 0.25,       // ~4 chars per token
    markdown: 0.3,    // ~3.3 chars per token
    json: 0.22,       // ~4.5 chars per token (compact)
    text: 0.33,       // ~3 chars per token
  };

  constructor(maxTokens: number = 8000) {
    this.maxTokens = maxTokens;
  }

  estimate(content: string, type: keyof typeof TokenBudgetManager.RATES = 'json'): number {
    return Math.ceil(content.length * TokenBudgetManager.RATES[type]);
  }

  truncate<T>(data: T, requestedMax?: number): T & { _truncated: boolean; _token_count: number } {
    const max = requestedMax ?? this.maxTokens;
    const json = JSON.stringify(data);
    const tokens = this.estimate(json);

    if (tokens <= max) {
      return {
        ...data,
        _truncated: false,
        _token_count: tokens,
      } as T & { _truncated: boolean; _token_count: number };
    }

    // Truncation strategy: keep structure, reduce detail
    const truncated = this.smartTruncate(data, max);
    const truncatedJson = JSON.stringify(truncated);
    const truncatedTokens = this.estimate(truncatedJson);

    return {
      ...truncated,
      _truncated: true,
      _token_count: truncatedTokens,
    } as T & { _truncated: boolean; _token_count: number };
  }

  private smartTruncate<T>(data: T, budget: number): T {
    const obj = { ...(data as any) }; // shallow copy so the caller's object is not mutated

    // Strategy 1: If it has an array of items, truncate the array
    for (const key of Object.keys(obj)) {
      if (Array.isArray(obj[key]) && obj[key].length > 0) {
        const originalLength = obj[key].length; // capture before the array is replaced
        // Keep reducing until we're under budget
        let len = originalLength;
        while (len > 1) {
          const testObj = { ...obj, [key]: obj[key].slice(0, len) };
          const testJson = JSON.stringify(testObj);
          if (this.estimate(testJson) <= budget * 0.9) { // 10% margin for metadata
            obj[key] = obj[key].slice(0, len);
            obj._truncation_note = `${key} truncated from ${originalLength} to ${len} items`;
            return obj as T;
          }
          len = Math.floor(len * 0.7); // Reduce by 30% each iteration
        }
        obj[key] = obj[key].slice(0, 1);
        obj._truncation_note = `${key} truncated from ${originalLength} to 1 item`;
        return obj as T;
      }
    }

    // Strategy 2: Remove verbose fields
    const verboseFields = ['signature', 'docstring', 'metadata', 'raw'];
    for (const field of verboseFields) {
      if (obj[field]) {
        delete obj[field];
        const testJson = JSON.stringify(obj);
        if (this.estimate(testJson) <= budget * 0.9) {
          return obj as T;
        }
      }
    }

    // Strategy 3: Last resort - truncate string fields
    for (const key of Object.keys(obj)) {
      if (typeof obj[key] === 'string' && obj[key].length > 100) {
        obj[key] = obj[key].slice(0, 100) + '...';
      }
    }

    return obj as T;
  }
}
```
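
`smartTruncate` shrinks the array by 30% per step, so it can stop well below the budget. One possible alternative, sketched here under assumptions, is a binary search for the longest prefix that still fits. The names `estimateJsonTokens` and `fitPrefixToBudget` are hypothetical, the rate mirrors the `json` entry in `RATES`, and the budget is assumed to be at least 1 token.

```typescript
// Assumed rate, mirroring the 'json' entry in RATES above (~4.5 chars per token).
const JSON_TOKENS_PER_CHAR = 0.22;

function estimateJsonTokens(value: unknown): number {
  return Math.ceil(JSON.stringify(value).length * JSON_TOKENS_PER_CHAR);
}

// Hypothetical helper: longest prefix of `items` whose JSON estimate fits `budget`.
// Relies on the estimate growing monotonically with the prefix length.
function fitPrefixToBudget<T>(items: readonly T[], budget: number): T[] {
  let lo = 0;            // longest prefix length known to fit
  let hi = items.length; // upper bound on the prefix length
  while (lo < hi) {
    const mid = Math.ceil((lo + hi) / 2);
    if (estimateJsonTokens(items.slice(0, mid)) <= budget) {
      lo = mid;      // prefix of length mid fits, try longer
    } else {
      hi = mid - 1;  // too large, try shorter
    }
  }
  return items.slice(0, lo);
}

// Usage: keep as many symbols as fit in a 100-token budget.
const sampleSymbols = Array.from({ length: 200 }, (_, i) => ({ id: i, name: `symbol_${i}` }));
const keptSymbols = fitPrefixToBudget(sampleSymbols, 100);
console.log(`kept ${keptSymbols.length} of ${sampleSymbols.length} symbols`);
```

The search needs about log2(n) serializations instead of one per 30% step, and it returns the maximal prefix rather than the first one that happens to fit.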

### 2.13 SurrealDB Schema Migration

```typescript
// src/storage/surreal/migrations.ts

export const SCHEMA_DEFINITION = `
// ============================================
// TOKENZIP GRAPH SCHEMA - SurrealDB v2
// ============================================

// --- NODE TYPES ---

DEFINE TABLE repository SCHEMAFULL;
DEFINE FIELD name ON repository TYPE string;
DEFINE FIELD root ON repository TYPE string;
DEFINE FIELD created_at ON repository TYPE datetime DEFAULT time::now();
DEFINE FIELD updated_at ON repository TYPE datetime DEFAULT time::now();
DEFINE FIELD stats ON repository TYPE object;
DEFINE FIELD stats.files ON repository TYPE int;
DEFINE FIELD stats.modules ON repository TYPE int;
DEFINE FIELD stats.symbols ON repository TYPE int;

DEFINE TABLE module SCHEMAFULL;
DEFINE FIELD name ON module TYPE string;
DEFINE FIELD path ON module TYPE string;
DEFINE FIELD manifest_type ON module TYPE string;
DEFINE FIELD language ON module TYPE string;
DEFINE FIELD is_root ON module TYPE bool DEFAULT false;
DEFINE FIELD metadata ON module FLEXIBLE TYPE object;
DEFINE FIELD repository_id ON module TYPE record<repository>;

DEFINE TABLE file SCHEMAFULL;
DEFINE FIELD path ON file TYPE string;
DEFINE FIELD module_id ON file TYPE record<module>;
DEFINE FIELD language ON file TYPE string;
DEFINE FIELD ext ON file TYPE string;
DEFINE FIELD size_bytes ON file TYPE int;
DEFINE FIELD content_hash ON file TYPE string;
DEFINE FIELD line_count ON file TYPE int;
DEFINE FIELD parse_status ON file TYPE string 
  ASSERT $value IN ['parsed', 'partial', 'failed', 'skipped'];
DEFINE FIELD parse_error ON file TYPE option<string>;
DEFINE FIELD last_parsed ON file TYPE datetime;
DEFINE FIELD git_last_modified ON file TYPE option<datetime>;
DEFINE FIELD git_blame_summary ON file FLEXIBLE TYPE option<object>;

DEFINE TABLE symbol SCHEMAFULL;
DEFINE FIELD file_id ON symbol TYPE record<file>;
DEFINE FIELD name ON symbol TYPE string;
DEFINE FIELD kind ON symbol TYPE string 
  ASSERT $value IN [
    'function', 'method', 'constructor',
    'class', 'interface', 'type_alias', 'enum',
    'variable', 'constant', 'property',
    'parameter', 'generic_param',
    'decorator', 'annotation',
    'table', 'view', 'column', 'index', 'constraint',
    'foreign_key', 'stored_procedure',
    'import', 'export', 're_export',
    'namespace', 'module_decl',
    'section', 'subsection',
    'workflow_step', 'diagram_node',
    'list_item', 'table_row'
  ];
DEFINE FIELD signature ON symbol TYPE option<string>;
DEFINE FIELD return_type ON symbol TYPE option<string>;
DEFINE FIELD start_line ON symbol TYPE int;
DEFINE FIELD end_line ON symbol TYPE int;
DEFINE FIELD start_col ON symbol TYPE int;
DEFINE FIELD end_col ON symbol TYPE int;
DEFINE FIELD docstring ON symbol TYPE option<string>;
DEFINE FIELD is_exported ON symbol TYPE bool DEFAULT false;
DEFINE FIELD is_async ON symbol TYPE option<bool>;
DEFINE FIELD is_static ON symbol TYPE option<bool>;
DEFINE FIELD visibility ON symbol TYPE option<string>
  ASSERT $value = NONE OR $value IN ['public', 'private', 'protected'];
DEFINE FIELD modifiers ON symbol TYPE array;
DEFINE FIELD parent_symbol_id ON symbol TYPE option<string>;
DEFINE FIELD metadata ON symbol FLEXIBLE TYPE object;

DEFINE TABLE commit SCHEMAFULL;
DEFINE FIELD hash ON commit TYPE string;
DEFINE FIELD short_hash ON commit TYPE string;
DEFINE FIELD message ON commit TYPE string;
DEFINE FIELD author ON commit TYPE string;
DEFINE FIELD email ON commit TYPE string;
DEFINE FIELD date ON commit TYPE datetime;
DEFINE FIELD branch ON commit TYPE string;
DEFINE FIELD tags ON commit TYPE array;

DEFINE TABLE dependency SCHEMAFULL;
DEFINE FIELD module_id ON dependency TYPE record<module>;
DEFINE FIELD name ON dependency TYPE string;
DEFINE FIELD version ON dependency TYPE string;
DEFINE FIELD dev ON dependency TYPE bool DEFAULT false;
DEFINE FIELD source ON dependency TYPE string;

// --- EDGE TYPES ---

DEFINE TABLE contains SCHEMAFULL TYPE RELATION FROM repository | module | file | symbol TO module | file | symbol;
DEFINE TABLE imports SCHEMAFULL TYPE RELATION FROM file | symbol | module TO file | symbol | module;
DEFINE FIELD is_type_only ON imports TYPE option<bool>;
DEFINE FIELD is_default ON imports TYPE option<bool>;
DEFINE FIELD alias ON imports TYPE option<string>;
DEFINE FIELD specifiers ON imports TYPE option<array>;

DEFINE TABLE exports SCHEMAFULL TYPE RELATION FROM file | symbol TO symbol | file;
DEFINE FIELD is_default ON exports TYPE option<bool>;
DEFINE FIELD is_reexport ON exports TYPE option<bool>;
DEFINE FIELD alias ON exports TYPE option<string>;
DEFINE FIELD name ON exports TYPE option<string>;

DEFINE TABLE calls SCHEMAFULL TYPE RELATION FROM symbol TO symbol;
DEFINE FIELD line ON calls TYPE option<int>;
DEFINE FIELD is_async ON calls TYPE option<bool>;
DEFINE FIELD call_type ON calls TYPE option<string>
  ASSERT $value = NONE OR $value IN ['direct', 'indirect', 'dynamic'];

DEFINE TABLE implements SCHEMAFULL TYPE RELATION FROM symbol TO symbol;
DEFINE FIELD is_partial ON implements TYPE option<bool>;

DEFINE TABLE inherits SCHEMAFULL TYPE RELATION FROM symbol TO symbol;
DEFINE FIELD is_interface_inheritance ON inherits TYPE option<bool>;

DEFINE TABLE modifies SCHEMAFULL TYPE RELATION FROM symbol TO symbol;
DEFINE TABLE reads SCHEMAFULL TYPE RELATION FROM symbol TO symbol;
DEFINE TABLE references SCHEMAFULL TYPE RELATION FROM symbol TO symbol;
DEFINE FIELD context ON references TYPE option<string>;

DEFINE TABLE depends_on SCHEMAFULL TYPE RELATION FROM module | file TO module | file;
DEFINE FIELD is_transitive ON depends_on TYPE option<bool>;
DEFINE FIELD depth ON depends_on TYPE option<int>;

DEFINE TABLE modified_in SCHEMAFULL TYPE RELATION FROM file TO commit;
DEFINE FIELD change_type ON modified_in TYPE string
  ASSERT $value IN ['added', 'modified', 'deleted', 'renamed'];

DEFINE TABLE foreign_key SCHEMAFULL TYPE RELATION FROM symbol TO symbol;
DEFINE FIELD constraint_name ON foreign_key TYPE option<string>;
DEFINE FIELD on_delete ON foreign_key TYPE option<string>;
DEFINE FIELD on_update ON foreign_key TYPE option<string>;
DEFINE FIELD ref_column ON foreign_key TYPE option<string>;

DEFINE TABLE column_of SCHEMAFULL TYPE RELATION FROM symbol TO symbol;

DEFINE TABLE diagram_edge SCHEMAFULL TYPE RELATION FROM symbol TO symbol;
DEFINE FIELD label ON diagram_edge TYPE option<string>;
DEFINE FIELD style ON diagram_edge TYPE option<string>;
DEFINE FIELD type ON diagram_edge TYPE option<string>;
DEFINE FIELD sequence ON diagram_edge TYPE option<int>;
DEFINE FIELD is_response ON diagram_edge TYPE option<bool>;

DEFINE TABLE workflow_transition SCHEMAFULL TYPE RELATION FROM symbol TO symbol;
DEFINE FIELD condition ON workflow_transition TYPE option<string>;
DEFINE FIELD action ON workflow_transition TYPE option<string>;

// --- INDEXES ---

DEFINE INDEX idx_file_path ON file FIELDS path UNIQUE;
DEFINE INDEX idx_file_hash ON file FIELDS content_hash;
DEFINE INDEX idx_file_module ON file FIELDS module_id;
DEFINE INDEX idx_symbol_name ON symbol FIELDS name;
DEFINE INDEX idx_symbol_kind ON symbol FIELDS kind;
DEFINE INDEX idx_symbol_file ON symbol FIELDS file_id;
DEFINE INDEX idx_symbol_export ON symbol FIELDS is_exported;
DEFINE INDEX idx_module_path ON module FIELDS path UNIQUE;
DEFINE INDEX idx_commit_hash ON commit FIELDS hash UNIQUE;
DEFINE INDEX idx_dep_name ON dependency FIELDS name, module_id;
`;
```
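
The listing above defines the schema text but not when it gets applied. One way a `migrate()` step could avoid re-running unchanged definitions is to store a checksum of the schema string, as in this minimal sketch. The `MigrationTarget` interface, `fnv1a`, and `migrateIfChanged` are all hypothetical names, and the real `IStore` may expose something different.

```typescript
// Hypothetical interface: an assumption about what a store would need to expose.
interface MigrationTarget {
  getSchemaChecksum(): Promise<string | null>;
  setSchemaChecksum(checksum: string): Promise<void>;
  execute(statements: string): Promise<void>;
}

// FNV-1a style 32-bit hash over UTF-16 code units.
// Used only to detect that the schema text changed, not for security.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, '0');
}

// Applies the schema only when its checksum differs from the stored one.
// Returns true if the schema was executed.
async function migrateIfChanged(target: MigrationTarget, schema: string): Promise<boolean> {
  const checksum = fnv1a(schema);
  if ((await target.getSchemaChecksum()) === checksum) {
    return false; // already applied, nothing to do
  }
  await target.execute(schema);
  await target.setSchemaChecksum(checksum);
  return true;
}
```

Depending on the SurrealDB version, re-running `DEFINE` statements against existing definitions may also require `OVERWRITE` or `IF NOT EXISTS`, so a changed checksum alone may not be enough to make the migration safe to repeat.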

### 2.14 Error Handling Strategy

```typescript
// src/utils/errors.ts

export class TokenZipError extends Error {
  constructor(
    message: string,
    public readonly code: ErrorCode,
    public readonly details?: Record<string, unknown>
  ) {
    super(message);
    this.name = 'TokenZipError';
  }
}

export enum ErrorCode {
  // Storage errors (1xxx)
  DB_CONNECTION_FAILED = 'E1001',
  DB_QUERY_FAILED = 'E1002',
  DB_MIGRATION_FAILED = 'E1003',
  DB_CORRUPTED = 'E1004',

  // Parser errors (2xxx)
  PARSE_FAILED = 'E2001',
  GRAMMAR_NOT_FOUND = 'E2002',
  PARTIAL_PARSE = 'E2003',

  // Git errors (3xxx)
  GIT_NOT_REPOSITORY = 'E3001',
  GIT_HOOK_INSTALL_FAILED = 'E3002',
  GIT_DIFF_FAILED = 'E3003',

  // MCP errors (4xxx)
  MCP_TRANSPORT_FAILED = 'E4001',
  MCP_TOOL_NOT_FOUND = 'E4002',
  MCP_INVALID_PARAMS = 'E4003',
  MCP_TOKEN_BUDGET_EXCEEDED = 'E4004',

  // Config errors (5xxx)
  CONFIG_NOT_FOUND = 'E5001',
  CONFIG_INVALID = 'E5002',

  // Indexer errors (6xxx)
  INDEX_INTERRUPTED = 'E6001',
  INDEX_FILE_TOO_LARGE = 'E6002',
  INDEX_BINARY_FILE = 'E6003',
}

// Global error handler for MCP tools
export function mcpErrorHandler(error: unknown): { content: Array<{ type: 'text'; text: string }>; isError: boolean } {
  if (error instanceof TokenZipError) {
    return {
      content: [{
        type: 'text',
        text: JSON.stringify({
          error: error.message,
          code: error.code,
          details: error.details,
        }),
      }],
      isError: true,
    };
  }

  if (error instanceof Error) {
    return {
      content: [{
        type: 'text',
        text: JSON.stringify({
          error: error.message,
          code: 'E9999',
          stack: process.env.NODE_ENV === 'development' ? error.stack : undefined,
        }),
      }],
      isError: true,
    };
  }

  return {
    content: [{ type: 'text', text: JSON.stringify({ error: 'Unknown error', code: 'E9999' }) }],
    isError: true,
  };
}
```
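
For `mcpErrorHandler` to act as a global handler, each tool handler has to route its exceptions through it. A minimal sketch of such a wrapper follows. `withErrorHandling`, `fallbackHandler`, and `lookupSymbol` are hypothetical names, and `fallbackHandler` is a stand-in for `mcpErrorHandler` so the snippet runs on its own.

```typescript
// Result shape mirrors the return type of mcpErrorHandler above.
type ToolResult = { content: Array<{ type: 'text'; text: string }>; isError?: boolean };
type ToolHandler<A> = (args: A) => Promise<ToolResult>;

// Hypothetical wrapper: converts any thrown error into a tool result via `onError`.
function withErrorHandling<A>(
  handler: ToolHandler<A>,
  onError: (error: unknown) => ToolResult
): ToolHandler<A> {
  return async (args: A) => {
    try {
      return await handler(args);
    } catch (error) {
      return onError(error);
    }
  };
}

// Stand-in for mcpErrorHandler (assumption: the real one would be passed here instead).
const fallbackHandler = (error: unknown): ToolResult => ({
  content: [{
    type: 'text',
    text: JSON.stringify({ error: error instanceof Error ? error.message : 'Unknown error' }),
  }],
  isError: true,
});

// Example tool: rejects empty input, succeeds otherwise.
const lookupSymbol = withErrorHandling(async (args: { name: string }) => {
  if (!args.name) throw new Error('name is required');
  return { content: [{ type: 'text' as const, text: `found ${args.name}` }] };
}, fallbackHandler);
```

Wrapping at registration time keeps the individual tool bodies free of `try`/`catch` blocks, and the caller always receives a well-formed result with `isError` set on failure.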

### 2.15 Testing Strategy

```typescript
// tests/unit/extractor/typescript.test.ts

import { describe, it, expect, beforeEach } from 'vitest';
import { TypeScriptExtractor } from '../../../src/extractor/code/typescript';
import { createMockContext } from '../../helpers';

describe('TypeScriptExtractor', () => {
  let extractor: TypeScriptExtractor;

  beforeEach(() => {
    extractor = new TypeScriptExtractor();
  });

  describe('function extraction', () => {
    it('extracts a simple exported function', () => {
      const code = `
export function addUser(name: string, age: number): User {
  return { name, age, id: crypto.randomUUID() };
}
`;
      const ctx = createMockContext('src/user.ts', code, 'module-1');
      const result = extractor.extract(ctx);

      expect(result.symbols).toHaveLength(1);
      expect(result.symbols[0]).toMatchObject({
        name: 'addUser',
        kind: 'function',
        isExported: true,
        isAsync: false,
        startLine: 2,
        endLine: 4,
      });
      expect(result.symbols[0].metadata.params).toEqual([
        { name: 'name', type: 'string' },
        { name: 'age', type: 'number' },
      ]);
      expect(result.symbols[0].returnType).toBe('User');
    });

    it('extracts async arrow function assigned to const', () => {
      const code = `
export const fetchUser = async (id: string): Promise<User> => {
  const res = await fetch(\`/api/users/\${id}\`);
  return res.json();
};
`;
      const ctx = createMockContext('src/api.ts', code, 'module-1');
      const result = extractor.extract(ctx);

      expect(result.symbols).toHaveLength(1);
      expect(result.symbols[0]).toMatchObject({
        name: 'fetchUser',
        kind: 'function',
        isExported: true,
        isAsync: true,
      });
      expect(result.symbols[0].metadata.isArrow).toBe(true);
    });

    it('extracts class with a property, methods, and an interface implementation', () => {
      const code = `
export class UserRepository implements IRepository<User> {
  private cache: Map<string, User> = new Map();

  async findById(id: string): Promise<User | null> {
    return this.cache.get(id) ?? null;
  }

  async save(user: User): Promise<void> {
    this.cache.set(user.id, user);
  }
}
`;
      const ctx = createMockContext('src/repo.ts', code, 'module-1');
      const result = extractor.extract(ctx);

      // 1 class + 1 property + 2 methods
      expect(result.symbols).toHaveLength(4);
      
      const classSym = result.symbols.find(s => s.kind === 'class')!;
      expect(classSym.name).toBe('UserRepository');
      expect(classSym.isExported).toBe(true);
      expect(classSym.metadata.implements).toEqual(['IRepository<User>']);

      const methods = result.symbols.filter(s => s.kind === 'method');
      expect(methods).toHaveLength(2);
      expect(methods.map(m => m.name)).toEqual(['findById', 'save']);

      // Check implements edge
      const implEdge = result.edges.find(e => e.type === 'implements');
      expect(implEdge).toBeDefined();
    });

    it('extracts interface with generics and members', () => {
      const code = `
export interface IRepository<T extends { id: string }> {
  findById(id: string): Promise<T | null>;
  save(entity: T): Promise<void>;
  delete(id: string): Promise<boolean>;
}
`;
      const ctx = createMockContext('src/types.ts', code, 'module-1');
      const result = extractor.extract(ctx);

      expect(result.symbols).toHaveLength(1);
      expect(result.symbols[0]).toMatchObject({
        name: 'IRepository',
        kind: 'interface',
        isExported: true,
      });
      expect(result.symbols[0].metadata.generics).toEqual(['T extends { id: string }']);
      expect(result.symbols[0].metadata.members).toHaveLength(3);
    });

    it('extracts imports with type-only and default', () => {
      const code = `
import type { User } from './types';
import React, { useState, useEffect } from 'react';
import { formatDate } from './utils';
`;
      const ctx = createMockContext('src/component.tsx', code, 'module-1');
      const result = extractor.extract(ctx);

      const imports = result.symbols.filter(s => s.kind === 'import');
      expect(imports).toHaveLength(3);
      
      expect(imports[0].metadata.isTypeOnly).toBe(true);
      expect(imports[0].metadata.source).toBe('./types');
      
      expect(imports[1].metadata.isDefault).toBe(true);
      expect(imports[1].metadata.source).toBe('react');
      expect(imports[1].metadata.specifiers).toContain('useState');
    });

    it('handles parse errors gracefully', () => {
      const code = `
export function broken(
  // Missing closing paren and body
`;
      const ctx = createMockContext('src/broken.ts', code, 'module-1');
      const result = extractor.extract(ctx);

      expect(result.parseErrors.length).toBeGreaterThan(0);
      // Should still return partial results if any
      expect(result.symbols).toBeDefined();
    });
  });
});
```
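
The unit tests above import `createMockContext` from `tests/helpers`, which is not shown. A minimal sketch of what it might look like follows. Only the three-argument call signature comes from the tests. The `MockExtractContext` fields and the extension-to-language table are assumptions.

```typescript
// Hypothetical shape: the real extractor context is defined elsewhere in the design.
interface MockExtractContext {
  filePath: string;
  source: string;
  moduleId: string;
  language: string; // inferred from the file extension
  lines: string[];  // source split on newlines, for line-based assertions
}

// Assumed mapping from file extension to language id.
const EXTENSION_LANGUAGES: Record<string, string> = {
  '.ts': 'typescript',
  '.tsx': 'typescript',
  '.js': 'javascript',
  '.jsx': 'javascript',
  '.py': 'python',
  '.sql': 'sql',
  '.md': 'markdown',
};

function createMockContext(filePath: string, source: string, moduleId: string): MockExtractContext {
  const dot = filePath.lastIndexOf('.');
  const ext = dot >= 0 ? filePath.slice(dot) : '';
  return {
    filePath,
    source,
    moduleId,
    language: EXTENSION_LANGUAGES[ext] ?? 'unknown',
    lines: source.split('\n'),
  };
}
```

Keeping the helper free of file-system access means each test can pass its source inline, as the cases above do.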

```typescript
// tests/integration/full-parse.test.ts

import { describe, it, expect, beforeAll, afterAll } from 'vitest';
import { MemoryStore } from '../../src/storage/memory/store';
import { Indexer } from '../../src/engine/indexer';
import { createQuery } from '../../src/query/builder';
import path from 'path';

describe('Full Parse Integration', () => {
  let store: MemoryStore;
  let indexer: Indexer;
  const fixturePath = path.join(__dirname, '../fixtures/ts-monorepo');

  beforeAll(async () => {
    store = new MemoryStore();
    await store.initialize();
    await store.migrate();
    indexer = new Indexer(store, fixturePath);
    await indexer.fullIndex();
  });

  afterAll(async () => {
    await store.close();
  });

  it('indexes all modules in the monorepo', async () => {
    const modules = await createQuery(store, fixturePath).modules().toArray();
    expect(modules.length).toBeGreaterThanOrEqual(3); // apps/web, apps/api, packages/shared
  });

  it('extracts all TypeScript symbols', async () => {
    const symbols = await createQuery(store, fixturePath)
      .symbols()
      .eq('kind', 'function')
      .toArray();
    expect(symbols.length).toBeGreaterThan(10);
  });

  it('resolves cross-module imports', async () => {
    // Find a symbol in packages/shared that's imported by apps/web
    const sharedExports = await createQuery(store, fixturePath)
      .modules()
      .eq('path', 'packages/shared')
      .files()
      .symbols()
      .eq('is_exported', true)
      .toArray();

    expect(sharedExports.length).toBeGreaterThan(0);

    // Check that at least one shared export has an inbound imports edge
    const edgeLists = await Promise.all(
      sharedExports.map(sym => store.getEdgesTo(sym.id, 'imports'))
    );
    expect(edgeLists.some(edges => edges.length > 0)).toBe(true);
  });

  it('chainable query: modules → files → symbols → filters', async () => {
    const result = await createQuery(store, fixturePath)
      .modules()
      .eq('language', 'typescript')
      .files()
      .eq('ext', '.ts')
      .symbols()
      .eq('kind', 'class')
      .eq('is_exported', true)
      .toArray();

    expect(result.length).toBeGreaterThan(0);
    for (const sym of result) {
      expect((sym as any).kind).toBe('class');
      expect((sym as any).is_exported).toBe(true);
    }
  });

  it('graph traversal: find all callers of an exported function', async () => {
    const targetFunc = await createQuery(store, fixturePath)
      .symbol('formatDate')
      .eq('kind', 'function')
      .toArray();

    if (targetFunc.length === 0) return; // Skip if fixture doesn't have this

    const callers = await createQuery(store, fixturePath)
      .symbol('formatDate')
      .callers()
      .toArray();

    // Should find at least one caller
    expect(callers.length).toBeGreaterThan(0);
  });

  it('formats query result as markdown', async () => {
    const md = await createQuery(store, fixturePath)
      .modules()
      .limit(3)
      .toMarkdown();

    expect(md).toContain('#');
    expect(md).toContain('packages/shared'); // Based on fixture
  });
});
```

### 2.16 Configuration Schema

```typescript
// src/types/config.ts

export interface TokenZipConfig {
  // Project-level config (.tokenzip/config.json)
  version: string;
  
  storage: {
    engine: 'surrealdb' | 'sqlite' | 'auto';
    path: string; // relative to project root, default: .tokenzip/db
    surrealdb?: {
      binary_path?: string; // custom surrealdb binary
      memory?: boolean; // use memory backend instead of RocksDB
    };
  };

  languages: {
    enabled: string[]; // ['typescript', 'javascript', 'python', 'sql', 'markdown']
    disabled: string[];
    custom: Record<string, {
      extensions: string[];
      grammar_path?: string; // path to custom tree-sitter WASM
      extractor_path?: string; // path to custom extractor JS
    }>;
  };

  exclude: {
    paths: string[]; // glob patterns: ['**/node_modules/**', '**/dist/**', '**/.git/**']
    files: string[]; // exact filenames: ['package-lock.json', 'yarn.lock']
    max_file_size_kb: number; // default: 500
  };

  hooks: {
    pre_commit: 'warn' | 'block' | 'off';
    post_commit: 'on' | 'off';
    validate_on_commit: boolean; // run reference integrity checks
  };

  mcp: {
    max_tokens: number; // default: 8000
    transport: 'stdio' | 'sse';
    port: number; // for SSE, default: 3777
    include_source: boolean; // include source code in responses
    source_max_lines: number; // max lines of source per symbol, default: 50
  };

  indexing: {
    worker_threads: number; // default: os.cpus().length - 1, min 1
    batch_size: number; // files per batch, default: 100
    git_history_depth: number; // commits to index, default: 100
  };

  workflows: {
    enabled: string[]; // ['create-module', 'update-module', 'implement-feature', 'upgrade-feature', 'bug-fix']
  };
}

export const DEFAULT_CONFIG: TokenZipConfig = {
  version: '2.0.0',
  storage: {
    engine: 'auto',
    path: '.tokenzip/db',
  },
  languages: {
    enabled: ['typescript', 'javascript', 'python', 'sql', 'go', 'rust', 'java', 'kotlin', 'markdown'],
    disabled: [],
    custom: {},
  },
  exclude: {
    paths: [
      '**/node_modules/**',
      '**
