[ARTICLE · art-40173] src=dev.to pub= topic=ai-infrastructure verified=true sentiment=negative

A one-line cache key bug cost me $187/month and leaked advertiser data across tenants

A developer at an ad analytics SaaS found that an MCP router cache key was missing the tenant ID, so warm Cloudflare Workers instances served cached Vectorize results from one advertiser to another. The author ties about 60% of a $312 monthly Anthropic bill to the bug. After the one-line fix, Sonnet input costs fell from about $187/month to about $94, and Vectorize queries fell about 40%. The root cause: a warm Worker instance keeps module-level state across requests, so a module-level cache object is shared between tenants unless its keys are tenant-scoped.

read 2 min · views 1 · published Jun 26, 2026

60% of my $312 Anthropic bill last month came from a single bug: an MCP router cache key that was missing a tenant ID.

The fix was literally this:

// before
const cacheKey = `mcp:context:${requestId}`;

// after
const cacheKey = `mcp:context:${tenantId}:${requestId}`;

That one missing segment meant warm Cloudflare Worker instances were serving cached Vectorize results from advertiser A into advertiser B's tool responses. In a production ad analytics SaaS. Not a demo.
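
One way to keep the tenant segment from going missing again is to build every key in a single helper. The following is a minimal sketch, not the implementation from the original write-up: `buildCacheKey` is a hypothetical name, and the choice to reject blank IDs and to encode each segment is an assumption about what "enforcement" could look like.

```typescript
// Hypothetical helper: the only place in the codebase that builds cache keys.
function buildCacheKey(tenantId: string, requestId: string): string {
  // Reject blank IDs so an unscoped key can never be produced by accident.
  if (tenantId.trim() === "") {
    throw new Error("cache key requires a non-empty tenantId");
  }
  if (requestId.trim() === "") {
    throw new Error("cache key requires a non-empty requestId");
  }
  // Encode each segment so a ":" inside an ID cannot blur the boundary
  // between tenant and request. Plain alphanumeric IDs are left unchanged.
  const tenant = encodeURIComponent(tenantId);
  const request = encodeURIComponent(requestId);
  // Same layout as the fixed key: mcp:context:<tenant>:<request>
  return `mcp:context:${tenant}:${request}`;
}
```

With a helper like this, a call site that forgets the tenant fails at the type level or at runtime instead of silently writing to a shared key.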

The counterintuitive part: I assumed V8 isolate boundaries protected me. They don't, at least not in the way most people think. Isolate-level isolation applies between separate Worker deployments, not between two concurrent requests hitting the same warm Worker instance. Module-scope variables survive across requests. So any context manager or cache object you initialize at module level is shared state, even on Workers.
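
The shared-state behavior can be reproduced without the Workers runtime. The sketch below is a simulation under stated assumptions: `sharedCache` stands in for any module-level cache, `handleRequest` is a hypothetical handler, the `results-for-...` string stands in for a Vectorize query, and request IDs are assumed to be able to repeat across tenants.

```typescript
// Module scope: created once, then reused by every request that the
// same warm instance handles.
const sharedCache = new Map<string, string>();

// Simulated request handler. `scoped` switches between the buggy key
// (no tenant segment) and the fixed key (tenant segment included).
function handleRequest(tenantId: string, requestId: string, scoped: boolean): string {
  const key = scoped
    ? `mcp:context:${tenantId}:${requestId}`
    : `mcp:context:${requestId}`;
  const hit = sharedCache.get(key);
  if (hit !== undefined) {
    return hit; // served from module-level state
  }
  const fresh = `results-for-${tenantId}`; // stand-in for a vector search
  sharedCache.set(key, fresh);
  return fresh;
}

// Buggy key: tenant b3c1 receives what was cached for tenant a9f2.
handleRequest("a9f2", "req-1", false);
const leaked = handleRequest("b3c1", "req-1", false);

// Fixed key: each tenant only ever sees its own entry.
sharedCache.clear();
handleRequest("a9f2", "req-1", true);
const isolated = handleRequest("b3c1", "req-1", true);

console.log({ leaked, isolated });
```

Here `leaked` holds tenant a9f2's data even though tenant b3c1 asked for it, while `isolated` holds tenant b3c1's own data.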

The failure mode was subtle enough to take 6 weeks to find. Vectorize query volume was 3× expected, and that was the first signal. Digging into logs, I found cache hits for tenant a9f2 being served to sessions belonging to tenant b3c1. The corrupted cache contained vector search results, so every bad hit triggered a downstream re-fetch chain. That cascade is what blew up the token spend: wrong cache data → Claude retries with fresh context → Sonnet input tokens accumulate fast.

After fixing the cache key namespace and adding a PostToolUse hook that throws on tenant ID mismatch in tool response metadata, Sonnet input costs dropped from ~$187/month to ~$94. Vectorize queries fell ~40% over the same period.
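
The core of a mismatch check like that can be sketched independently of any hook API. Everything below is an assumption made for illustration: the `ToolResponse` shape, the `metadata.tenantId` field, and the names `TenantMismatchError` and `assertTenantMatch` are hypothetical, and the wiring into an actual PostToolUse hook is left out because the article does not show it.

```typescript
// Hypothetical response shape; the real metadata layout is not given here.
interface ToolResponse {
  metadata: { tenantId?: string };
  content: unknown;
}

class TenantMismatchError extends Error {
  constructor(expected: string, actual: string | undefined) {
    super(`tenant mismatch: expected ${expected}, got ${actual ?? "none"}`);
    this.name = "TenantMismatchError";
  }
}

// Throws unless the response is tagged with the tenant that owns the session.
// A response with no tenant tag is treated as a mismatch, not as a pass.
function assertTenantMatch(sessionTenantId: string, response: ToolResponse): ToolResponse {
  const actual = response.metadata.tenantId;
  if (actual !== sessionTenantId) {
    throw new TenantMismatchError(sessionTenantId, actual);
  }
  return response;
}
```

Failing closed on untagged responses is a deliberate choice in this sketch: a cache entry written before tagging existed would otherwise pass the check.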

One thing worth flagging for anyone on a similar stack: this specific fix, scoping everything to Workers' ExecutionContext per request, doesn't translate cleanly to long-running Node processes on something like Fly.io. There, AsyncLocalStorage is the right primitive. Porting the Workers pattern directly will give you a false sense of safety.
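
For the Node case, a minimal sketch of per-request tenant scoping with `AsyncLocalStorage` from `node:async_hooks` might look like the following. The names `tenantContext`, `currentTenantId`, and `handle` are hypothetical, and the `setTimeout` delay only stands in for real async work such as a vector query.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// One storage object per process; each request gets its own value via run().
const tenantContext = new AsyncLocalStorage<{ tenantId: string }>();

// Reads the tenant for the current async call chain and fails loudly
// when called outside of a request scope.
function currentTenantId(): string {
  const store = tenantContext.getStore();
  if (store === undefined) {
    throw new Error("no tenant context: call inside tenantContext.run()");
  }
  return store.tenantId;
}

// Hypothetical handler: does some async work, then builds a cache key
// from whatever tenant is bound to this call chain.
async function handle(tenantId: string, delayMs: number): Promise<string> {
  return tenantContext.run({ tenantId }, async () => {
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    return `mcp:context:${currentTenantId()}:req-1`;
  });
}

// Two overlapping requests in one long-running process. The slower one
// finishes last but still sees its own tenant.
const keysPromise = Promise.all([handle("a9f2", 20), handle("b3c1", 5)]);
keysPromise.then((keys) => console.log(keys));
```

The point of the design is that the tenant travels with the async call chain rather than living in a module-level variable, so interleaved requests cannot overwrite each other's tenant.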

I wrote up the full breakdown, including the PostToolUse hook implementation, the KV/D1 cache key enforcement pattern, and the cases where this isolation design is overkill, over on riversealab.com.
