cd /news/large-language-models/changes-that-cut-our-llm-pipeline-co… · home topics large-language-models article
[ARTICLE · art-34862] src=news.ycombinator.com ↗ pub= topic=large-language-models verified=true sentiment=↑ positive

Changes that cut our LLM pipeline costs more than model-switching did

A developer reports that switching from JSON to TOON for structured output, using condensed markdown, and replacing long instruction lists with multi-shot examples reduced their LLM pipeline costs by 60%, surpassing savings from model-switching alone.

read1 min views1 publishedJun 20, 2026

I have been building multiple LLM systems and for our Organization biggest cost savings weren't from prompt-wordsmithing or model switchings. Sharing useful to anyone watching their token bill :

  1. JSON → TOON for structured output: JSON was not made for LLMs. well you can implement your own verison that fits for your needs that reduce tokens usage but what worked for us was TOON. TOON cut output our tokens by ~30% same information, way less syntax tax.

  2. Full markdown/HTML → condensed markdown: Using markdown for writing your prompts, getting intermediate results or communication between your Agents eats a lot of tokens. We swithced to condesed markdown and short system prompts that replicate Caveman. this alone cut just on input token costs ~50% on calls that pass prior context forward which can be implemented between Agent Calls.

  3. Long Do/Don't instruction lists → 2-3 multi-shot examples: Counterintuitive one - replacing a large lists of DO's and Don'ts for agents rules don't help. rather couple of concrete examples that convers major and all cases actually improved output quality more reliably and it's usually fewer tokens once the instruction list gets long enough to cover real edge cases.

I have seen most people on this sub reddit talk about using open-source or cheaper models. Like we were spending thousands of dollar's but this all changes alone helped reduce cost by 60%.

edit: Open to Discussion, anyone whether something similar would help their setup.

Comments URL: [https://news.ycombinator.com/item?id=48608978](https://news.ycombinator.com/item?id=48608978)

Points: 2

── more in #large-language-models 4 stories · sorted by recency
── more on @toon 3 stories trending now
sponsored brought to you by zahid.host 4,200+ EU-deployed projects
reading about agents? ship yours in a single git push.

Run your AI side-project on zahid.host

EU-based hosting, git-push deploys, automatic HTTPS, no cold starts. Free tier with a custom domain — perfect for shipping the agent you just read about.

$git push zahid main
Live at https://your-agent.zahid.host
Get free account → Pricing
from €0/mo · no card required
LIVE [news/changes-that-cut-our…] indexed:0 read:1min 2026-06-20 ·