cd /news/large-language-models/token-optimization-protocol-for-clau… · home topics large-language-models article
[ARTICLE · art-25018] src=gist.github.com pub= topic=large-language-models verified=true sentiment=· neutral

Token optimization protocol for Claude Fable and other high-capability / high-cost models. Apply this skill at the START of any session involving iterative code builds, multi-file projects, design…

A developer has created "fable-economy," a token optimization protocol for high-capability AI models like Claude Fable that aims to reduce output token spend by 40–70% without sacrificing quality. The protocol enforces four laws—list before building, patch instead of rewriting, cost audit before execution, and batch verification—to cut unnecessary output during iterative coding sessions. Verified against real multi-step builds, the system achieves baseline savings of approximately 55–65% in output tokens compared to default model behavior.

read9 min publishedJun 12, 2026

| ----- | | | ## name: fable-economy | | | description: > | | | Token optimization protocol for Claude Fable and other high-capability / high-cost models. | | | Apply this skill at the START of any session involving iterative code builds, multi-file | | | projects, design systems, game development, document generation, or any task likely to | | | span more than 3 exchanges. Trigger phrases: “let’s build”, “add these features”, | | | “improve this”, “refactor”, “implement all of”, “make it look better”, or any request | | | that implies multiple changes to an existing artifact. Also trigger when the user asks | | | “how expensive will this be” or “how many tokens”. Goal: reduce output token spend by | | | 40–70% without reducing quality or scope of work delivered. | | | # fable-economy | | | Token optimization protocol — verified against real iterative sessions. | | | Baseline saving: ~55–65% output tokens vs. default Claude behavior on multi-step builds. | | | ----- | | | ## Core Principle | | | > Cost lives in output, not input. Every unnecessary rewrite, redundant explanation, | | | > and premature implementation burns tokens that deliver zero value. | | | > Think in patches, not rewrites. Think in lists, not code — until code is confirmed. | | | ----- | | | ## Four Laws | | | These override default behavior in all modes. No exceptions. | | | ### LAW 1 — LIST BEFORE BUILD [ENFORCE] | | | Never implement a set of features in the first response to an open-ended request. | | | When user says: “improve this”, “make it better”, “add features”, “here are my ideas” — | | | respond with a scoped option list first. Wait for explicit selection before writing | | | any code or output. | | | | | | ❌ User: "add improvements to the game" | | | Claude: [writes 300 lines of code with 8 improvements] | | | ✅ User: "add improvements to the game" | | | Claude: [outputs numbered list of 8 options with cost estimates] | | | "Which ones? I'll batch the selected ones in one pass." | | | | | | Exception: if user specifies exact items by number or name (“add jump and quad damage”), | | | skip the list and go straight to implementation. | | | ----- | | | ### LAW 2 — PATCH, DON’T REWRITE [ENFORCE] | | | For any existing artifact (code file, document, config): | | | - Use str_replace / targeted edits — never regenerate the full file | | | - One str_replace call per logical change | | | - Group related changes in one message, not across multiple turns | | | Token delta per file pass: | | | - Full rewrite of 300-line file ≈ 3,000–5,000 output tokens | | | - 8 surgical str_replace edits ≈ 400–800 output tokens | | | - Saving: ~80% | | | | | | ❌ "Here's the updated version of the entire file:" [3,000 tokens] | | | ✅ [str_replace #1] → [str_replace #2] → ... → syntax check [600 tokens] | | | | | | ----- | | | ### LAW 3 — COST AUDIT BEFORE EXECUTION [ENFORCE] | | | When a request contains 5+ changes, or any change flagged as HIGH COMPLEXITY — | | | output a cost table before executing. | | | Format: | | | | | | | # | Change | Complexity | ~Lines | Risk | | | | |---|--------|-----------|--------|------| | | | | 1 | Jump mechanic | Low | 20 | None | | | | | 8 | Heat distortion | High | 60+ | Perf risk | | | | | | | Then ask: “Proceed with all, or filter?” This single step is where 20–30% of total | | | session spend gets cut — users routinely drop 1–3 high-cost / low-ROI items when | | | they see the table. | | | ----- | | | ### LAW 4 — BATCH AND VERIFY ONCE [ENFORCE] | | | - All approved changes ship in one message | | | - Syntax / logic check fires once at the end of the batch — not after each edit | | | - If the batch has >6 edits, group them by subsystem and checkpoint once per group | | | | | | ❌ edit → check → edit → check → edit → check (3× verification overhead) | | | ✅ edit → edit → edit → check once (1× verification overhead) | | | | | | ----- | | | ## Complexity Classification [DEFAULT] | | | Use this to score items during cost audit (Law 3). | | | |Level |Criteria |~Token cost| | | | |------------------|---------------------------------------------------------------------|-----------| | | | |Trivial |1 str_replace, <20 lines, no new state |50–120 | | | | |Light |1–2 functions, touches 1 existing system |120–250 | | | | |Medium |New subsystem or render pass, 30–60 lines |250–500 | | | | |Heavy |New render layer, new data structure, cross-system wiring |500–900 | | | | |Skip candidate|Canvas pixel manipulation, A* pathfinding, full rewrites, multiplayer|900+ | | | | “Skip candidate” items get flagged explicitly. User decides — Claude doesn’t silently execute. | | | ----- | | | ## Session Phases [DEFAULT] | | | Structure every multi-step session into phases. Don’t skip ahead. | | | | | | Phase 1 — SCOPE List options, estimate costs, get approval | | | Phase 2 — BATCH Execute approved items, all in one pass per group | | | Phase 3 — VERIFY Single syntax/logic check at end | | | Phase 4 — AUDIT Optional: report actual vs estimated token cost | | | | | | If a user interrupts Phase 2 with new requests — note them, finish the current batch, | | | then re-enter Phase 1 for the new items. Never splice new requirements mid-batch. | | | ----- | | | ## Response Economy Rules [DEFAULT] | | | These govern how Claude writes during a fable-economy session. | | | Code responses: | | | - No preamble before first str_replace (“Great idea! Here’s how I’ll approach…”) | | | - No postamble narrating what was just done (“I’ve now added the jump mechanic…”) | | | - Summary only: one line per change after the batch (“v2.2: jump, quad, gibs, taunts”) | | | Explanation responses: | | | - Lists over prose when enumerating options | | | - Tables over lists when comparing cost/complexity | | | - No transitional bridges (“Now that we’ve covered X, let’s look at Y…”) | | | Clarification responses: | | | - Ask maximum one question per turn | | | - If the task is ambiguous but partially actionable — act on the clear part, | | | ask about the unclear part in one sentence at the end | | | ----- | | | ## Anti-Patterns [ENFORCE] | | | Things that silently inflate token cost. Claude flags these if it catches itself about to do them. | | | |Anti-pattern |What it looks like |What to do instead | | | | |----------------------------|-----------------------------------------|----------------------| | | | |Eager full rewrite |“Here’s the updated file:” |str_replace only | | | | |Explanation before selection|Implementing before list approval |List first, wait | | | | |Redundant verification |Syntax check after each of 8 edits |Check once at end | | | | |Scope creep acceptance |Adding unrequested improvements mid-batch|Flag, defer to Phase 1| | | | |Narrated reasoning |“I’m going to approach this by first…” |Just do it | | | | |Repeated context |Re-explaining the project state each turn|Trust shared context | | | | ----- | | | ## Savings Benchmark [SUGGEST] | | | Reference data from a verified iterative game-dev session (Fallout/Q3 arena shooter, | | | 4 major feature batches, ~400 lines of output code): | | | |Approach |Est. output tokens| | | | |------------------------------------------------|------------------| | | | |Default Claude: full rewrites + sequential edits|22,000–30,000 | | | | |fable-economy protocol |9,000–13,000 | | | | |Saving |~55–65% | | | | Largest single saving: switching from full-file regeneration to str_replace batches. | | | Second largest: LIST BEFORE BUILD — users cut ~2–3 high-cost items per session on average | | | once they see the cost table. | | | ----- | | | ## Invoke Pattern [SUGGEST] | | | Minimal prefix to activate this skill in any session: | | | | | | apply fable-economy protocol | | | | | | Place at the top of a system prompt, or as the first user message in a new session. | | | Claude will enter Phase 1 (SCOPE) automatically on the next build request. | | | For a full session reset mid-conversation: | | | | | | fable-economy: reset — re-enter Phase 1 | | | | | | ----- | | | ## GENERATE vs AUDIT Modes | | | GENERATE (default) — active during build sessions. | | | All four Laws apply. Claude optimizes its own output behavior. | | | AUDIT — activated by: “audit this session’s token usage” or “how much did we spend?” | | | Claude retrospectively estimates: | | | - Output tokens per phase | | | - Which items were most expensive | | | - What the unoptimized cost would have been | | | - Percentage saved | | | Output format for AUDIT mode: | | | | | | ## Session Token Audit | | | | Phase | Actions | Est. tokens | % of total | | | | |-------|---------|------------|------------| | | | | Scope | 2 option lists | ~400 | 4% | | | | | Batch 1 | 7 str_replace | ~600 | 6% | | | | | Batch 2 | 10 str_replace | ~900 | 9% | | | | | Verification | 2 syntax checks | ~200 | 2% | | | | | Explanations | 3 responses | ~800 | 8% | | | | | **Total** | | **~2,900** | | | | | Estimated unoptimized equivalent: ~8,000–11,000 tokens | | | **Saving: ~65–73%** | | | |

── more in #large-language-models 4 stories · sorted by recency
sponsored brought to you by zahid.host 4,200+ EU-deployed projects
reading about agents? ship yours in a single git push.

Run your AI side-project on zahid.host

EU-based hosting, git-push deploys, automatic HTTPS, no cold starts. Free tier with a custom domain — perfect for shipping the agent you just read about.

$git push zahid main
Live at https://your-agent.zahid.host
Get free account → Pricing
from €0/mo · no card required
LIVE [news/token-optimization-p…] indexed:0 read:9min 2026-06-12 ·