My agent dry-ran fine in staging 100 times — then wrecked production on the first real run

wpnews.pro

cd /news/ai-agents/my-agent-dry-ran-fine-in-staging-100… · home › topics › ai-agents › article

[ARTICLE · art-45817] src=dev.to ↗ pub=2026-07-01T01:12Z topic=ai-agents verified=true sentiment=↓ negative

My agent dry-ran fine in staging 100 times — then wrecked production on the first real run

A developer building AI agents on Cloudflare Workers discovered that staging-to-production data bleeds caused production outages despite 100 successful dry runs. The engineer found that environment drift and non-deterministic agent execution paths led to fresh tool call sequences in production, while mock responses in dry-run mode created orphaned records when writes to D1 were not scoped. The fix involved propagating a dry-run flag via KV across all subsequent writes in the same run, and the developer warns that hook failures can bypass dry-run protection entirely.

read2 min views1 publishedJul 1, 2026

A staging-to-production data bleed cost me 4 hours of rollback. That's what finally made dry-run a structural requirement, not an afterthought.

The common advice is: test in staging, promote when green. The problem is environment drift. My D1 schema changes once or twice a week, and a solo operator can't keep staging perfectly synchronized. Worse, agents don't have fixed execution paths — the same input can produce a different tool call sequence on the next run. I ran a flow 100 times in staging and still hit a fresh path on the first production execution.

The most surprising thing I learned after 6 months of running this: latency wasn't the problem I expected. KV writes averaged 12ms — basically imperceptible. The real problem was that mock responses fool the agent into treating skipped writes as real successes. I'd dry-run an R2 put

, the agent would believe the file was uploaded, and then proceed to write metadata to D1 — which was not in dry-run scope. Real write, orphaned record.

The fix: once any write tool in a run hits dry-run, propagate a flag for that runId

that forces all subsequent writes in the same run to dry-run too.

// after intercepting first dry-run write
await ctx.env.KV.put(`dryrun_active:${ctx.runId}`, "1", {
  expirationTtl: 3600,
});

// every subsequent hook checks this flag
const isDryRunActive =
  (await ctx.env.KV.get(`dryrun_active:${ctx.runId}`)) === "1";

One more thing that burned me: if the hook itself fails — say, KV goes temporarily unavailable — Claude Code's default behavior is fall-through. The tool call executes anyway, dry-run flag ignored. Last week a KV spike caused hook timeouts and 3 agents wrote directly to production. No data loss because those ops were idempotent, but it was luck. Hook failure needs its own alert, separate from agent failure.

I wrote up the full breakdown — including the dry-run propagation edge cases, R2 + D1 orphan scenarios, and where this pattern completely falls apart (read-modify-write loops, APIs with side-effectful reads) — over on riversealab.com.

source & further reading

dev.to — original article Maintaining WordPress sites behind HTTP Basic auth — Playwright, urllib, and encrypted credentials I built an AI-powered QA platform because manual testing tools haven't kept up — launching on Product Hunt today A FalkorDB Vector Search Gotcha: Why Won't db.idx.vector.queryNodes Work?

~/api · this article 200

$curl api.wpnews.pro/v1/news/my-agent-dry-ran-fine-in…

Read original on dev.to → dev.to/riversea/my-agent-dry-ran-fine-in-staging…

mentioned entities

Cloudflare Workers

Claude Code

riversealab.com

metadata

slugmy-agent-dry-ran-fine-in-staging-100-times-then-wrecked-production-on-the-first

topic#ai-agents

secondary2 topics

sentimentnegative

canonicaldev.to

navigation

← prevAMD unveils Versal Premium Gen2 …

next →Claude Sonnet 5 Launches: What t…

── more in #ai-agents 4 stories · sorted by recency

devclubhouse.com · 20 Jun · #ai-agents

The Agentic Sysadmin: Analyzing Cloudflare’s Temporary Accounts for AI

dev.to · 1 Jul · #ai-agents

`wrangler dev --remote` silently writes to your production KV namespace — here's the fix

latent.space · 30 Jun · #ai-agents

Ahmad Osman on why local AI is catching up

dev.to · 1 Jul · #ai-agents

"How to Stop AI Agent Skills, Hooks, and Cron Jobs from Silently Conflicting Over Where They Run and What Data They Trust"

── more on @cloudflare workers 3 stories trending now

wpnews · 30 May · #ai-tools

I was wasting 10 minutes every Claude session. So I built a fix.

wpnews · 27 May · #machine-learning

hunting for headroom on modded-nanoGPT (WR #82)

wpnews · 2 Jun · #ai-products

Microsoft launches Discovery platform for scientific R&D with Ginkgo Bioworks partnership

sponsored brought to you by zahid.host 4,200+ EU-deployed projects

reading about agents? ship yours in a single git push.

Run your AI side-project on zahid.host

EU-based hosting, git-push deploys, automatic HTTPS, no cold starts. Free tier with a custom domain — perfect for shipping the agent you just read about.

$git push zahid main

→ Live at https://your-agent.zahid.host ✓

Get free account → Pricing

from €0/mo · no card required