60% of My $312 Anthropic Bill Came From One Silent Loop — Here's How I Found It

Source: dev.to · Published 2026-06-22T01:11Z

An engineer discovered that 60% of a $312 monthly Anthropic bill came from a single retry loop in a Claude Code agent. The culprit was found by shipping Workers logs to R2 via Logpush and querying with DuckDB, revealing that one worker consumed 58% of total input tokens. The engineer recommends using KV counters for multi-agent loops and notes an unresolved issue with intermittent schema drift in tool call responses.

Last month's Anthropic invoice: $312. Sixty percent of it traced back to a single retry pattern I couldn't see anywhere in my normal logs.

The agent was failing on tool calls, then re-entering the loop with the full context intact — 18K input tokens per invocation on a task that needs 3-4K. Claude Code's UI looked fine. Workers logs showed 200s. D1 writes were clean. The billing dashboard just said "tokens used" with no breakdown by worker or call chain.

I found the culprit only after shipping Workers logs to R2 via Logpush and querying with DuckDB:

-- Rank workers by total input tokens for the month.
SELECT
  worker_name,
  COUNT(*) AS call_count,
  AVG(input_tokens) AS avg_input,
  SUM(input_tokens) AS total_input
FROM read_parquet('s3://my-logs/workers/2026-05/*.parquet')
GROUP BY worker_name
ORDER BY total_input DESC;

One worker, ad-report-summarizer, was eating 58% of total input tokens. That query took me maybe 20 minutes to set up. The Logpush + R2 + DuckDB stack runs under $5/month.
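To make the "share of total" figure explicit, here is a minimal Python sketch of the same aggregation with a percentage column added. It assumes the log records have already been loaded as dictionaries carrying the worker_name and input_tokens fields used in the query; the sample records and the other-worker name are made up purely for illustration.

```python
from collections import defaultdict


def input_token_share(records):
    """Return (worker_name, call_count, avg_input, total_input, share_pct) rows,
    sorted by total input tokens, largest first."""
    totals = defaultdict(lambda: [0, 0])  # worker_name -> [call_count, total_input]
    for rec in records:
        entry = totals[rec["worker_name"]]
        entry[0] += 1
        entry[1] += rec["input_tokens"]

    grand_total = sum(total for _, total in totals.values())
    rows = []
    for name, (calls, total) in totals.items():
        share = 100.0 * total / grand_total if grand_total else 0.0
        rows.append((name, calls, total / calls, total, round(share, 1)))
    return sorted(rows, key=lambda row: row[3], reverse=True)


# Hypothetical sample records, only to show the shape of the result.
sample = [
    {"worker_name": "ad-report-summarizer", "input_tokens": 18000},
    {"worker_name": "ad-report-summarizer", "input_tokens": 18000},
    {"worker_name": "other-worker", "input_tokens": 4000},
]
rows = input_token_share(sample)
for row in rows:
    print(row)
```

A worker whose share is far out of proportion to its call count is the one worth tracing first.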

Once I had a suspect, I used Claude Code's --verbose flag to reconstruct the tool call chain. Most people treat --verbose as a log-level toggle. It's not: it dumps the full tool input/output JSON for every call in the session. Pipe it to a file, run jq on it, and you can replay the exact sequence that blew up your context.
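Once the dump is in a file, the retry pattern shows up as the same tool called with the same input several times in a row. The sketch below is one way to flag those runs in Python. It does not parse the actual --verbose output: it assumes you have already normalized the dump into a list of records with hypothetical "tool" and "input" keys, and the tool names in the sample are invented.

```python
import json


def call_key(call):
    """Identity of a call: tool name plus canonicalized input JSON."""
    return (call["tool"], json.dumps(call["input"], sort_keys=True))


def find_repeated_calls(calls, min_repeats=2):
    """Return (tool, repeat_count, start_index) for each run of
    consecutive identical calls at least min_repeats long."""
    runs = []
    i = 0
    while i < len(calls):
        key = call_key(calls[i])
        j = i + 1
        while j < len(calls) and call_key(calls[j]) == key:
            j += 1
        if j - i >= min_repeats:
            runs.append((calls[i]["tool"], j - i, i))
        i = j
    return runs


# Hypothetical, already-normalized session: three identical calls, then one other.
session = [
    {"tool": "fetch_report", "input": {"id": 7}},
    {"tool": "fetch_report", "input": {"id": 7}},
    {"tool": "fetch_report", "input": {"id": 7}},
    {"tool": "summarize", "input": {"id": 7}},
]
repeats = find_repeated_calls(session)
print(repeats)
```

Canonicalizing the input with sort_keys means two calls count as identical even if their JSON keys were serialized in a different order.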

For multi-agent loops specifically (I run 6 Slack bots coordinated through Workers), KV counters have been the single most reliable safeguard. I keep a counter keyed to the conversation thread, check it on every bot invocation, and store a last_actor field alongside it. When the counter approaches the limit, last_actor tells you immediately which bot is driving the chain. Six months in, it's almost always summarizer-bot triggering router-bot triggering summarizer-bot again.
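The counter logic itself is small. Here is a minimal Python sketch of the idea, not the production code: a plain dict stands in for the KV namespace, and the LoopGuard class, the check method, the thread ID, and the limit of 3 are all hypothetical names and values chosen for illustration.

```python
class LoopGuard:
    """Per-thread invocation counter with a last_actor field."""

    def __init__(self, store, limit=10):
        self.store = store  # dict standing in for a KV namespace (assumption)
        self.limit = limit  # hypothetical cap on invocations per thread

    def check(self, thread_id, actor):
        """Record one invocation; return (allowed, count, previous_actor)."""
        entry = self.store.get(thread_id, {"count": 0, "last_actor": None})
        previous_actor = entry["last_actor"]
        count = entry["count"] + 1
        self.store[thread_id] = {"count": count, "last_actor": actor}
        return count <= self.limit, count, previous_actor


guard = LoopGuard({}, limit=3)
chain = ["summarizer-bot", "router-bot", "summarizer-bot", "router-bot"]
results = [guard.check("thread-1", bot) for bot in chain]
for bot, (allowed, count, previous_actor) in zip(chain, results):
    print(bot, allowed, count, previous_actor)
```

The fourth call is rejected, and previous_actor names the bot that triggered it. One caveat for a real deployment: a read-then-write against an eventually consistent store is not atomic, so concurrent invocations can undercount. That is tolerable for a coarse circuit breaker but not for exact accounting.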

The harder, unsolved problem is intermittent schema drift in tool call responses: same prompt, same model, valid JSON, but a different structure. It's non-deterministic, doesn't reproduce on demand, and when it triggers a retry, that call costs double. I haven't confirmed whether it's a Sonnet serialization quirk or something in my Workers pipeline.
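One way to at least catch drift when it happens is to log a structural fingerprint of each response and compare it with the fingerprint seen before. The Python sketch below is an illustration of that idea, not a fix for the underlying cause; the shape and shape_key helpers and the sample payloads are hypothetical.

```python
import json


def shape(value):
    """Describe the structure of a parsed JSON value, ignoring concrete values."""
    if isinstance(value, dict):
        return {key: shape(value[key]) for key in sorted(value)}
    if isinstance(value, list):
        # Collapse a list to the distinct shapes of its elements.
        seen = []
        for item in value:
            item_shape = shape(item)
            if item_shape not in seen:
                seen.append(item_shape)
        return seen
    if isinstance(value, bool):  # check bool before number: bool is an int subclass
        return "bool"
    if value is None:
        return "null"
    if isinstance(value, (int, float)):
        return "number"
    return "string"


def shape_key(value):
    """Stable string fingerprint of a value's structure."""
    return json.dumps(shape(value), sort_keys=True)


# Hypothetical responses: a and b differ only in values, c differs in structure.
a = {"summary": "ok", "items": [{"id": 1}]}
b = {"summary": "different text", "items": [{"id": 2}, {"id": 3}]}
c = {"result": {"summary": "ok"}, "items": []}

print(shape_key(a) == shape_key(b))
print(shape_key(a) == shape_key(c))
```

Storing the fingerprint next to the prompt hash and model name in the logs would make the drifting responses queryable after the fact, even though they don't reproduce on demand.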

I wrote up the full breakdown, including the PostToolUse hook setup for snapshotting tool call sequences, the cf-ray correlation trick for tracing multi-worker chains, and the per-tool production evaluation table, over on riversealab.com.

Read original on dev.to → dev.to/riversea/60-of-my-312-anthropic-bill-came…
