{"slug": "i-spent-788-on-an-ai-coding-agent-in-one-day-here-s-the-breakdown", "title": "I spent $788 on an AI coding agent in one day. Here's the breakdown.", "summary": "A developer spent $788 on API calls from an AI coding agent in a single day, running 3,572 calls across four models. The breakdown reveals that 73% of calls went to the most expensive model, Fable 5, costing $617, while cheaper models could have handled most tasks. The developer built an open-source list of AI gateways with a reproducible cost benchmark to help others optimize model routing.", "body_md": "I left an AI coding agent running for one day. Then I read the invoice.\n\n**$788. In about 13 hours.**\n\nI'm posting the real breakdown because I think a lot of people are quietly running up this kind of bill without seeing where it goes — and the fix is boring and effective.\n\nOne day, 10:21–23:05. 11 sessions, **3,572 API calls** across 4 models:\n\n| Model | Calls | Output tokens | Cache-read tokens | Cost |\n|---|---|---|---|---|\n| Fable 5 ($10/$50) | 2,613 | 1.04M | 448M | ~$617 |\n| Opus 4.8 ($5/$25) | 671 | 769K | 248M | ~$168 |\n| Haiku 4.5 ($1/$5) | 242 | 27K | 9M | ~$1.70 |\n| Sonnet 4.6 ($3/$15) | 46 | 6K | 2M | ~$0.90 |\nTotal |\n3,572 | ~$788 |\n\nTwo numbers reframed how I think about this:\n\nThat's not a 2× or 3× gap. Per call it's a **~360× difference**, and I was sending almost everything to the expensive end out of pure default-laziness.\n\nNotice 448M + 248M = ~700M **cache-read** tokens. Agentic coding re-sends a big context every turn; cache reads are billed at ~0.1× input, which is the only reason this was $788 and not several thousand. The flip side: anything that breaks your cache (a changed timestamp, reordered tool list, a proxy that normalizes prompts) silently re-bills at full input price. On this volume, a broken cache is a 10× event.\n\nI didn't conclude \"stop using good models.\" I concluded \"stop sending *everything* to them.\" The pattern:\n\nThis is exactly what an **AI gateway / model router** does — it's the layer that lets you express \"cheap by default, escalate when it's hard\" once, instead of hard-coding a model everywhere. I've since taken the flagship out of the default path, and the same workload now lands in the low tens of dollars a day.\n\nWhile digging into routing I built an open-source, pain-point-organized list of AI gateways — with a **reproducible cost benchmark** that prices concrete workloads (including a coding scenario with reasoning tokens) across 11 models, computed by a unit-tested script. Plug in your own token mix and see your real number before the invoice does:\n\n** github.com/cuihuan/awesome-ai-gateway** ·\n\nIf you're running agents daily — have you actually looked at your per-model breakdown? I'd bet most of the bill is one model doing work a cheaper one could.", "url": "https://wpnews.pro/news/i-spent-788-on-an-ai-coding-agent-in-one-day-here-s-the-breakdown", "canonical_source": "https://dev.to/_7a561cb4673b6d2a455c5/i-spent-788-on-an-ai-coding-agent-in-one-day-heres-the-breakdown-4bom", "published_at": "2026-06-13 08:18:21+00:00", "updated_at": "2026-06-13 08:47:43.700258+00:00", "lang": "en", "topics": ["ai-agents", "ai-tools", "developer-tools", "large-language-models", "ai-infrastructure"], "entities": ["Fable 5", "Opus 4.8", "Haiku 4.5", "Sonnet 4.6", "github.com/cuihuan/awesome-ai-gateway"], "alternates": {"html": "https://wpnews.pro/news/i-spent-788-on-an-ai-coding-agent-in-one-day-here-s-the-breakdown", "markdown": "https://wpnews.pro/news/i-spent-788-on-an-ai-coding-agent-in-one-day-here-s-the-breakdown.md", "text": "https://wpnews.pro/news/i-spent-788-on-an-ai-coding-agent-in-one-day-here-s-the-breakdown.txt", "jsonld": "https://wpnews.pro/news/i-spent-788-on-an-ai-coding-agent-in-one-day-here-s-the-breakdown.jsonld"}}