How I Built a Zero-Dependency Token Compressor for AI Coding Agents (During My High School Exams) A high school student in Italy built TITAN (Token Intelligence Through Agent Narrowing), a zero-dependency CLI framework that compresses AI coding agent token consumption by 70% to 85% without degrading reasoning quality. The framework uses three orthogonal compression layers including a 'Caveman Engine' for telegraphese output, a 'Ponytail' system for minimal solutions, and context compression for system prompts. TITAN is open source and available on npm. as developers, we are spending more and more time working alongside AI coding agents like Cursor , Claude Code , GitHub Copilot , Windsurf , or Cline . But as your session grows, you quickly run into two major problems: To solve this, I built TITAN Token Intelligence Through Agent Narrowing : a universal, zero-dependency CLI framework designed to compress AI agent token consumption by 70% to 85% without degrading reasoning quality. And to make things interesting, I wrote and shipped it this week entirely on my own, right in the middle of my high school final exams la maturità here in Italy . Here is how it works under the hood. TITAN approaches token optimization not as a single post-processing step, but as three orthogonal, multiplicative layers: Total Savings = 1 - 1 - L1 Savings 1 - L2 Savings 1 - L3 Savings Instead of letting the LLM output standard verbose English prose pleasantries, hedging, filler words, technical narrations , the Caveman Engine instructs the model to use a dense, telegraphese grammar: basically , actually , likely , probably $\to$ removed. the , a , an $\to$ removed when safe . "Component re-renders" instead of "The component is re-rendering" .Before the agent writes a single line of code, it must traverse a 6-rung logical ladder to guarantee the laziest, most minimal solution: Every deliberate simplification is documented inline: // ponytail: