Source: dev.to

60% of My $312 Anthropic Bill Came From One Missing Pattern: Compensating Actions

A developer reduced their monthly Anthropic bill from $312 to $156 by fixing a missing compensating action pattern in a multi-step agent workflow. The pipeline, which runs Claude Sonnet for ad analytics, was restarting from scratch on failure, duplicating LLM calls and storage writes. The fix adds checkpointed caching and rollback functions to skip completed steps on retry.

2 min read · published Jun 24, 2026

Last month's Anthropic invoice was $312. After one architectural change, May came in at $156 — exactly half. The culprit wasn't prompt bloat or model choice. It was the absence of compensating actions in my multi-step agent workflow.

The pattern is embarrassingly common: a 5-step pipeline fails at Step 4, so you restart from the top. Every restart re-runs every LLM call before the failure point. My ad analytics SaaS runs Claude Sonnet to summarize raw data in Step 2. That step averages ~8K input tokens per advertiser. At $3/M input tokens (Sonnet 3.7), one restart costs $0.024 per advertiser. That is trivial alone, but I have 200+ advertisers and this pipeline was failing repeatedly. Step 2 alone burned $40–50 in duplicated calls over April.
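The arithmetic behind those numbers can be checked with a short sketch. It uses only the figures quoted above (8K input tokens, $3 per million, 200 advertisers as a lower bound) and counts input tokens only, so output tokens are not included. The function names are illustrative, not from the original pipeline.

```typescript
// Figures quoted in the article; everything else is derived from them.
const INPUT_TOKENS_PER_ADVERTISER = 8_000; // ~8K input tokens for Step 2
const USD_PER_MILLION_INPUT_TOKENS = 3;    // Sonnet input price cited above
const ADVERTISERS = 200;                   // "200+" in the article, so a lower bound

// Cost of re-running Step 2 once for one advertiser (input tokens only).
function costPerRerun(tokens: number, usdPerMillion: number): number {
  return (tokens / 1_000_000) * usdPerMillion;
}

// How many duplicated Step 2 calls a given wasted spend implies.
function impliedReruns(wastedUsd: number, perRerunUsd: number): number {
  return Math.round(wastedUsd / perRerunUsd);
}

const perRerun = costPerRerun(INPUT_TOKENS_PER_ADVERTISER, USD_PER_MILLION_INPUT_TOKENS);
const perFullSweep = perRerun * ADVERTISERS;

console.log(perRerun.toFixed(3));     // 0.024
console.log(perFullSweep.toFixed(2)); // 4.80
console.log(impliedReruns(40, perRerun), impliedReruns(50, perRerun)); // 1667 2083
```

Read this way, $40–50 of waste corresponds to roughly 1,700 to 2,100 duplicated Step 2 calls, or on the order of eight to ten full restarts across 200 advertisers.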

The deeper problem: I had no rollback mechanism at all. When Step 5 (a Slack webhook to an advertiser portal) failed with a 503 on a cold-start Worker, R2 already had the file, D1 already had the log row. Restarting the pipeline created duplicate files, duplicate database rows, and one advertiser asking why they got the same report twice. I'd assumed "restart = safe." That assumption was wrong.

The fix has two parts. First, I write a pipeline_runs row at the start of every run, updating it with a step_completed checkpoint and a step_output_ref (the actual R2 key or D1 row ID) after each step succeeds. Second, on failure, a rollbackPipelineRun() function reads those refs and deletes whatever was written: R2 file gone, D1 row gone, status flipped to rolled_back. On retry, the agent checks for an existing in-progress run and skips already-completed steps entirely:

let summary;
if (existingRun && existingRun.step_completed >= 2) {
  // Step 2 already succeeded in an earlier attempt: reuse its stored output.
  summary = existingRun.cached_summary; // no Claude call
} else {
  summary = await callClaude(data);
}
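To make the checkpoint-and-rollback idea concrete, here is a minimal sketch, not the author's implementation. The field names pipeline_runs, step_completed, step_output_ref, cached_summary, and rollbackPipelineRun() come from the article. The in-memory maps standing in for R2 and D1, the helpers startRun() and checkpoint(), and the choice to keep cached_summary after a rollback are all assumptions made for illustration.

```typescript
type RunStatus = "in_progress" | "completed" | "rolled_back";

interface PipelineRun {
  id: string;
  advertiserId: string;
  status: RunStatus;
  step_completed: number;                  // highest step that finished
  step_output_ref: Record<number, string>; // step number -> key of what it wrote
  cached_summary?: string;
}

// In-memory stand-ins (assumptions): `files` plays the role of R2, `logRows` of D1.
const files = new Map<string, string>();
const logRows = new Map<string, string>();
const pipelineRuns = new Map<string, PipelineRun>();

function startRun(id: string, advertiserId: string): PipelineRun {
  const run: PipelineRun = {
    id, advertiserId, status: "in_progress", step_completed: 0, step_output_ref: {},
  };
  pipelineRuns.set(id, run);
  return run;
}

// Record a checkpoint only after the step's write has succeeded.
function checkpoint(run: PipelineRun, step: number, ref?: string): void {
  run.step_completed = step;
  if (ref !== undefined) run.step_output_ref[step] = ref;
}

// Compensating action: delete whatever the recorded refs point at.
function rollbackPipelineRun(run: PipelineRun): void {
  for (const step of Object.keys(run.step_output_ref)) {
    const ref = run.step_output_ref[Number(step)];
    files.delete(ref);
    logRows.delete(ref);
  }
  run.step_output_ref = {};
  run.status = "rolled_back";
}

const run = startRun("run-1", "adv-42");
run.cached_summary = "summary text";
checkpoint(run, 2);
files.set("reports/run-1.txt", run.cached_summary);
checkpoint(run, 3, "reports/run-1.txt");
logRows.set("log-run-1", "report generated");
checkpoint(run, 4, "log-run-1");
// Step 5 (the webhook) fails here, so undo the writes from steps 3 and 4.
rollbackPipelineRun(run);
console.log(files.size, logRows.size, run.status); // 0 0 rolled_back
```

The sketch leaves cached_summary in place after the rollback so that a later retry could still skip the Claude call; whether the original code does the same is not stated.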

One thing idempotency keys don't solve here: they prevent duplicate side effects, but they don't prevent re-spending tokens on an identical LLM call. You need both — idempotency on the storage writes and checkpointed caching on the inference steps.
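A small sketch of how the two mechanisms might sit side by side, under stated assumptions: bucket and summaryCache are in-memory stand-ins for object storage and the checkpoint table, callClaude() is a placeholder that only counts invocations, and the key scheme reports/<runKey>.txt is hypothetical.

```typescript
// Hypothetical stand-ins: `bucket` for object storage, `summaryCache` for the checkpointed output.
const bucket = new Map<string, string>();
const summaryCache = new Map<string, string>();
let llmCalls = 0;

// Placeholder for the real model call; counts invocations so duplicates are visible.
async function callClaude(data: string): Promise<string> {
  llmCalls += 1;
  return `summary of ${data}`;
}

// Idempotent write: the key is derived from the run, so a retry overwrites instead of duplicating.
function writeReport(runKey: string, body: string): string {
  const key = `reports/${runKey}.txt`;
  bucket.set(key, body);
  return key;
}

// Checkpointed inference: reuse the stored summary when the run already has one.
async function summarizeOnce(runKey: string, data: string): Promise<string> {
  const cached = summaryCache.get(runKey);
  if (cached !== undefined) return cached;
  const summary = await callClaude(data);
  summaryCache.set(runKey, summary);
  return summary;
}

// Steps 2 and 3 of a simplified pipeline; safe to call again with the same runKey.
async function runPipeline(runKey: string, data: string): Promise<string> {
  const summary = await summarizeOnce(runKey, data);
  return writeReport(runKey, summary);
}
```

Calling runPipeline("run-1", rawData) twice in sequence leaves one object in bucket and llmCalls at 1. With only the idempotent write, the bucket would still hold one object, but the model would have been called twice.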

There are still rough edges: a race condition when two runs start simultaneously for the same advertiser (D1 doesn't fully guarantee serializable isolation between a SELECT and INSERT), and no clean answer for truly irreversible actions like sent emails or processed payments.
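The race is easier to see in a toy model. The sketch below is not a fix for D1 and does not use its API; activeRuns is an in-memory stand-in for the pipeline_runs table, and tick() simulates a database round trip. It only shows why a check and a write separated by a round trip can both succeed for two concurrent runs, while a single check-and-write step cannot.

```typescript
// In-memory stand-in for pipeline_runs: advertiser id -> id of the active run.
const activeRuns = new Map<string, string>();

// Simulates one round trip to the database.
const tick = (): Promise<void> => Promise.resolve();

// Racy version: the existence check and the insert are separate round trips.
async function claimWithSelectThenInsert(advertiserId: string, runId: string): Promise<boolean> {
  await tick();                                // SELECT arrives
  const exists = activeRuns.has(advertiserId);
  await tick();                                // INSERT arrives one round trip later
  if (exists) return false;
  activeRuns.set(advertiserId, runId);
  return true;
}

// Atomic version: check and write happen in one step with nothing in between.
async function claimAtomically(advertiserId: string, runId: string): Promise<boolean> {
  await tick();
  if (activeRuns.has(advertiserId)) return false;
  activeRuns.set(advertiserId, runId);
  return true;
}
```

Running two claimWithSelectThenInsert() calls for the same advertiser through Promise.all lets both return true, whereas two claimAtomically() calls return true and false. In a real database the single step would have to be something like one conditional write backed by a uniqueness constraint, which is an assumption here rather than something the article confirms.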

I wrote up the full breakdown — including the race condition I haven't fixed yet and why Durable Objects might be the answer — over on riversealab.com.
