as developers, we are spending more and more time working alongside AI coding agents like Cursor, Claude Code, GitHub Copilot, Windsurf, or Cline.
But as your session grows, you quickly run into two major problems:
To solve this, I built TITAN (Token Intelligence Through Agent Narrowing): a universal, zero-dependency CLI framework designed to compress AI agent token consumption by 70% to 85% without degrading reasoning quality.
And to make things interesting, I wrote and shipped it this week entirely on my own, right in the middle of my high school final exams (la maturità here in Italy).
Here is how it works under the hood.
TITAN approaches token optimization not as a single post-processing step, but as three orthogonal, multiplicative layers:
Total Savings = 1 - ( (1 - L1_Savings) * (1 - L2_Savings) * (1 - L3_Savings) )
Instead of letting the LLM output standard verbose English prose (pleasantries, hedging, filler words, technical narrations), the Caveman Engine instructs the model to use a dense, telegraphese grammar:
basically
, actually
, likely
, probably
$\to$ removed.the
, a
, an
$\to$ removed (when safe)."Component re-renders"
instead of "The component is re-rendering"
.Before the agent writes a single line of code, it must traverse a 6-rung logical ladder to guarantee the laziest, most minimal solution:
Every deliberate simplification is documented inline: // ponytail: <ceiling>, <upgrade path>
(e.g. // ponytail: local memory cache, use Redis if multi-node setup is required
).
CLAUDE.md
) are compressed post-hoc to strip prose while keeping code conventions exact, saving up to 45% input tokens on every turn.
npm run build 2>&1 | titan filter
Following the structural (L2) rule of using the standard library, TITAN has zero external npm dependencies.
It uses Node.js native features (fs
, path
, readline
, child_process
, https
) for everything:
|
and >
).node:test
and node:assert
modules.To verify that compressing prompts doesn't degrade the AI's coding and reasoning capabilities, I built an evaluation harness into TITAN to measure Usable Intelligence Density (UID):
$$\text{UID} = \frac{\text{Avg Accuracy %}}{\text{Avg Total Tokens}} \times 1000$$
Here is how the variants perform under mock and empirical LLM runs over a 5-task suite (Coding, Debugging, Logic, Refactoring, and Code Review):
| Variant | Avg Accuracy | Avg In Tok | Avg Out Tok | Avg Tot Tok | UID (Density) | Status |
|---|---|---|---|---|---|---|
| Baseline | ||||||
| 100% | 50 | 198 | 248 | 403.2 | Reliable | |
| Caveman | ||||||
| 100% | 120 | 78 | 198 | 505.1 | Reliable | |
| Ponytail | ||||||
| 86% | 115 | 67 | 182 | 472.5 | Reliable | |
| TITAN Balanced | ||||||
| 100% | 1500 | 80 | 1580 | 63.3 | Reliable | |
| TITAN Lite | ||||||
| 100% | 425 | 91 | 516 | 193.8 | Reliable | |
| TITAN Aggressive | ||||||
| 79% | 400 | 50 | 450 | 175.7 | ⚠ Degraded |
TITAN
prompt reflects the cost of the full master ruleset. The titan_lite
variant balances prompt size and output compression beautifully.You can install TITAN globally from npm:
npm install -g titan-agent-cli
Then initialize the ruleset for your editor. For instance, to generate Cursor rules (.cursor/rules/titan.mdc
):
titan init --agent=cursor
titan init --agent=cursor --lite
To run the native unit tests locally:
titan test
And to scan your codebase for active technical debt ponytail comments:
titan debt
TITAN is fully open source. I’d love to get your thoughts, contributions, or a star on GitHub!
If you have any feedback on the standard library YAML parser or ideas on expanding adapters for new IDEs, let me know in the comments below!