23:48
2026-06-15
dev.to
large-language-models
Prompt caching vs the long LLM conversation: where your input bill actually hides
A developer built PromptCrunch, a drop-in proxy that reduces input token costs in long multi-turn LLM conversations by deduplicating code, compacting stale tool output, and summarizing old turns. In t…