{"slug": "show-hn-crespo-tree-sitter-ast-blueprints-instead-of-raw-code-for-llms", "title": "Show HN: Crespo – Tree-sitter AST blueprints instead of raw code for LLMs", "summary": "Crespo, a new open-source tool, uses Tree-sitter AST parsing to extract structural blueprints from codebases, reducing token usage by up to 89% while preserving architectural understanding for LLMs. The tool supports 10 programming languages and offers structure, summary, and concat modes for different use cases. In tests on real repositories, LLMs answered architectural questions correctly 2.75 out of 3 times on average using only the compressed blueprints.", "body_md": "*Stop burning your context window on raw source files. Crespo extracts what matters — and compresses everything else.*\n\n```\npip install crespo && crespo ./myproject\n```\n\nYou paste a codebase into any LLM. It hits the context limit, but your problem remains unsolved. You paste one file at a time. The AI loses the big picture. You give the AI the full code. The AI reads 40,000 tokens linearly and still misses the architecture.\n\n**Crespo solves this differently.**\n\nInstead of concatenating raw files, Crespo uses **Tree-sitter AST parsing** to extract only the structural DNA of your repository — imports, classes, functions, module connections — and emits a compact XML blueprint. Same architectural understanding. 
A fraction of the tokens.\n\n```\n# Structure mode (default)\ncrespo ./myproject\n\n# Summary mode — requires Groq key\ncrespo ./myproject --mode summary --groq YOUR_KEY\n\n# Save your Groq key for future runs\ncrespo --groq YOUR_KEY\n\n# Concat mode — full source, redacted\ncrespo ./myproject --mode concat\n\n# Analyse a GitHub repo directly\ncrespo --git https://github.com/user/repo\n\n# Custom output filename\ncrespo ./myproject --output blueprint.xml\n```\n\n```\nyour repo\n    │\n    ▼\n walker          respects .gitignore · skips tests · skips build artifacts\n    │\n    ▼\n tree-sitter     real AST parse · 10 languages · no regex\n    │\n    ▼\n extractor       imports · classes · functions · structs · enums\n    │\n    ▼\n blueprint XML   compact · structured · LLM-ready\n```\n\nNo heuristics. No regex scraping. Real language grammars via Tree-sitter — the same parser used by GitHub, Neovim, and Zed.\n\n| Mode | What it produces | Best for |\n|---|---|---|\n| `structure` | AST skeleton — imports, classes, functions | Architecture review, onboarding, LLM context |\n| `summary` | Structure + AI one-line descriptions per file | Deeper codebase understanding |\n| `concat` | Full source, secrets redacted, in structured XML | Passing entire repos to LLMs safely |\n\nSupported languages: Python · JavaScript · TypeScript · JSX · TSX · Rust · Go · Java · C · C++\n\n```\n<?xml version='1.0' encoding='utf-8'?>\n<repo n=\"kara\" s=\"Gesture-controlled PDF viewer using PyQt6, MediaPipe, and OpenCV.\">\n  <meta>\n    <dep>cv2,mediapipe,numpy,PyQt6,fitz,groq,markdown</dep>\n  </meta>\n  <files>\n    <f p=\"Ui.py\" e=\".py\" s=\"Main PyQt6 window coordinating PDF rendering, gesture input, and AI summarisation.\">\n      <imp>PyQt6,fitz,render,summarise,gesture,markdown</imp>\n      <cls n=\"Window\">\n        <fn n=\"summary\" p=\"(self)\" />\n        <fn n=\"startGest\" p=\"(self, state)\" />\n        <fn n=\"gestZoom\" p=\"(self, state: int)\" />\n      </cls>\n    </f>\n    <f p=\"gesture.py\" e=\".py\" 
s=\"MediaPipe hand tracking with gesture classification and debouncing.\">\n      <imp>mediapipe,cv2,numpy</imp>\n      <cls n=\"GestureController\">\n        <fn n=\"detect\" p=\"(self, frame)\" />\n        <fn n=\"classify\" p=\"(self, landmarks)\" />\n      </cls>\n    </f>\n  </files>\n</repo>\n```\n\nTested on real open-source repositories. Structure accuracy was evaluated by asking an LLM three questions that it had to answer from the blueprint alone, with no access to the original source.\n\n| Repo | Components & connections | Dependencies | Entry point | Score |\n|---|---|---|---|---|\n| Axios | ✅ correct and specific | ✅ correct and specific | ✅ correct and specific | 3/3 |\n| Express | ✅ correct and specific | ✅ correct and specific | ✅ correct and specific | 3/3 |\n| Kara | ✅ correct and specific | ✅ correct and specific | ✅ correct and specific | 3/3 |\n| Moodilist | ✅ correct and specific | ✅ correct and specific | ✅ correct and specific | 3/3 |\n| Requests | ✅ correct and specific | ✅ correct and specific | 2/3 | |\n| Urai | ✅ correct and specific | ✅ correct and specific | 2/3 | |\n| FastAPI | ✅ correct and specific | ✅ correct and specific | 2/3 | |\n| Flask | ✅ correct and specific | ✅ correct and specific | 2/3 | |\n| Average | | | | 2.75 / 3 |\n\n| Repo | Raw tokens | Blueprint tokens | Reduction |\n|---|---|---|---|\n| Kara | ~4,667 | ~934 | ~80% |\n| Moodilist | ~8,580 | ~1,396 | ~84% |\n| Axios | ~61,494 | ~6,989 | ~89% |\n| Express | ~17,222 | ~707 | ~96% |\n| FastAPI | ~145,606 | ~124,993 | ~14% |\n| Flask | ~77,402 | ~13,848 | ~82% |\n| Requests | ~49,556 | ~9,585 | ~81% |\n| Urai | ~17,418 | ~2,304 | ~87% |\n| Average | | | ~86% |\n\nFastAPI (14% reduction) is excluded from the average: as a framework repo, its structure IS the content. 
Crespo correctly preserves it rather than discarding it.\n\n**Compression Depends on Repo Type**\n\nFramework-heavy repos compress less because the preserved structure is genuinely useful — there is less noise to discard.\n\nToken counting is done using the `tiktoken` Python library.\n\nCrespo redacts secrets before writing any output. Patterns covered:\n\n- Quoted assignments — `api_key = \"...\"`, `token: '...'`\n- Raw `.env` style — `GROQ_KEY=abc123`\n- Known key prefixes — Groq (`gsk_`), OpenAI (`sk-`), Anthropic (`sk-ant-`), GitHub (`ghp_`), AWS (`AKIA`), Slack (`xox`)\n\nSummary mode uses [Groq](https://console.groq.com) to generate one-line descriptions per file and function. The free tier is more than enough.\n\n```\n# pass once — saved to ~/.crespo/config\ncrespo --groq YOUR_KEY\n\n# all future summary runs pick it up automatically\ncrespo ./myproject --mode summary\n```\n\nYour key is stored locally at `~/.crespo/config` and never sent anywhere except Groq's API.\n\n- Something for Humans coming soon!\n- More aggressive compression preset\n- More language support (Ruby, PHP, Swift, Kotlin)\n- `.crespoignore` support\n\nIf the `crespo` command isn't recognized after installation, this usually means Crespo was installed successfully, but the executable isn't on your system `PATH`.\n\n**Verify installation**\n\n```\npython -m pip show crespo\n```\n\nIf Crespo appears in the output but the `crespo` command still isn't recognized, it's a `PATH` issue — follow the steps below for your OS.\n\n**Windows**\n\nAdd your Python `Scripts` directory to `PATH` (typically):\n\n```\nC:\\Users\\<you>\\AppData\\Local\\Programs\\Python\\Python3x\\Scripts\n```\n\nRestart your terminal afterwards.\n\n**macOS / Linux**\n\nFind your user scripts directory:\n\n```\npython3 -m site --user-base\n```\n\nThen add its `bin` folder to your shell configuration:\n\n```\nexport PATH=\"$HOME/.local/bin:$PATH\"\n```\n\nRestart your shell and try again.\n\n**Multiple Python 
installations**\n\nIf you have multiple Python versions installed, make sure installation and execution use the same interpreter. Check which `python`/`pip` you're actually using:\n\n```\nwhere python      # Windows\nwhich -a python3  # macOS / Linux\n```\n\nReinstall using that same interpreter explicitly if needed:\n\n```\npython -m pip install --force-reinstall crespo\n```\n\n**Recommended: use pipx**\n\nFor CLI tools, `pipx` avoids most PATH-related issues entirely:\n\n```\npipx install crespo\ncrespo ./myproject\n```\n\nContributions are welcome. If you have ideas for new output modes, better parsing, or additional language support, open an issue or PR.\n\nMIT © [Hrudul Krishna K V](https://github.com/hrudulmmn)", "url": "https://wpnews.pro/news/show-hn-crespo-tree-sitter-ast-blueprints-instead-of-raw-code-for-llms", "canonical_source": "https://github.com/hrudulmmn/crespo", "published_at": "2026-06-22 03:28:11+00:00", "updated_at": "2026-06-22 03:40:47.455423+00:00", "lang": "en", "topics": ["developer-tools", "large-language-models", "ai-tools"], "entities": ["Crespo", "Tree-sitter", "Groq", "GitHub", "Neovim", "Zed", "Axios", "Express"], "alternates": {"html": "https://wpnews.pro/news/show-hn-crespo-tree-sitter-ast-blueprints-instead-of-raw-code-for-llms", "markdown": "https://wpnews.pro/news/show-hn-crespo-tree-sitter-ast-blueprints-instead-of-raw-code-for-llms.md", "text": "https://wpnews.pro/news/show-hn-crespo-tree-sitter-ast-blueprints-instead-of-raw-code-for-llms.txt", "jsonld": "https://wpnews.pro/news/show-hn-crespo-tree-sitter-ast-blueprints-instead-of-raw-code-for-llms.jsonld"}}