Changes that cut our LLM pipeline costs more than model-switching did

wpnews.pro

cd /news/large-language-models/changes-that-cut-our-llm-pipeline-co… · home › topics › large-language-models › article

[ARTICLE · art-34862] src=news.ycombinator.com ↗ pub=2026-06-20T13:05Z topic=large-language-models verified=true sentiment=↑ positive

Changes that cut our LLM pipeline costs more than model-switching did

A developer reports that switching from JSON to TOON for structured output, using condensed markdown, and replacing long instruction lists with multi-shot examples reduced their LLM pipeline costs by 60%, surpassing savings from model-switching alone.

read1 min views1 publishedJun 20, 2026

I have been building multiple LLM systems and for our Organization biggest cost savings weren't from prompt-wordsmithing or model switchings. Sharing useful to anyone watching their token bill :

JSON → TOON for structured output: JSON was not made for LLMs. well you can implement your own verison that fits for your needs that reduce tokens usage but what worked for us was TOON. TOON cut output our tokens by ~30% same information, way less syntax tax.
Full markdown/HTML → condensed markdown: Using markdown for writing your prompts, getting intermediate results or communication between your Agents eats a lot of tokens. We swithced to condesed markdown and short system prompts that replicate Caveman. this alone cut just on input token costs ~50% on calls that pass prior context forward which can be implemented between Agent Calls.
Long Do/Don't instruction lists → 2-3 multi-shot examples: Counterintuitive one - replacing a large lists of DO's and Don'ts for agents rules don't help. rather couple of concrete examples that convers major and all cases actually improved output quality more reliably and it's usually fewer tokens once the instruction list gets long enough to cover real edge cases.

I have seen most people on this sub reddit talk about using open-source or cheaper models. Like we were spending thousands of dollar's but this all changes alone helped reduce cost by 60%.

edit: Open to Discussion, anyone whether something similar would help their setup.

Comments URL: [https://news.ycombinator.com/item?id=48608978](https://news.ycombinator.com/item?id=48608978)

Points: 2

source & further reading

news.ycombinator.com — original article Convert your landing pages to powerful visuals for social media Fli -a tiny (18KB) easy to read file listing tool. Rust no_std and Libc Ask HN: Will we start seeing tools for LLM use?

~/api · this article 200

$curl api.wpnews.pro/v1/news/changes-that-cut-our-llm…

Read original on news.ycombinator.com → news.ycombinator.com/item?id=48608978

mentioned entities

TOON

metadata

slugchanges-that-cut-our-llm-pipeline-costs-more-than-model-switching-did

topic#large-language-models

secondary2 topics

sentimentpositive

canonicalnews.ycombinator.com

navigation

← prevAg.ide Index, rank, and refactor…

next →When pytest Said "Passed," It Wa…

── more in #large-language-models 4 stories · sorted by recency

technicalstrat.com · 20 Jun · #large-language-models

Two production Next.js apps, built solo with Cursor+Claude, $13,945

devclubhouse.com · 20 Jun · #large-language-models

Apple Core AI and the Local LLM Tax

augmentedswe.com · 20 Jun · #large-language-models

Claude Code learning hub

dev.to · 20 Jun · #large-language-models

"I Stopped Pretending Every AI Provider Was the Same"

── more on @toon 3 stories trending now

wpnews · 19 Jun · #artificial-intelligence

From Dream Job to 'The Gulag': Inside Staff Revolt Zuckerberg's Brutal AI Push

wpnews · 19 Jun · #artificial-intelligence

Stop Guessing Which Library to Use — I Built an AI Capability Discovery Engine

wpnews · 19 Jun · #large-language-models

I Cut My AI Agent's Token Bill by 62% in One Weekend. Here's the Receipts.

sponsored brought to you by zahid.host 4,200+ EU-deployed projects

reading about agents? ship yours in a single git push.

Run your AI side-project on zahid.host

EU-based hosting, git-push deploys, automatic HTTPS, no cold starts. Free tier with a custom domain — perfect for shipping the agent you just read about.

$git push zahid main

→ Live at https://your-agent.zahid.host ✓

Get free account → Pricing

from €0/mo · no card required