The Design Doc Was Wrong. AI Trusted It Anyway.

Source: dev.to · Published: 2026-06-21T12:54Z · Topic: artificial-intelligence

A developer discovered that an AI-generated design document contained an arithmetic error that was faithfully reproduced by AI implementation and review tools, leading to incorrect calculations in a side project. The formula for deriving a value from two stored fields used a divisor of 10,000 instead of the correct 1,000,000, causing results to be off by a factor of 100. The incident highlights the risk of treating design documents as verified artifacts in AI-assisted development workflows.

4 min read · Published Jun 21, 2026

I was building a new module for a side project, and the design document looked thorough. Claude had written it, I had reviewed it, and everything seemed reasonable. One section described how to calculate a derived value from two stored fields — both multiplied by 100 to preserve two decimal places — and gave the formula:

Result = fieldA × 100 × fieldB × 100 / 10000

The logic made sense on the surface. Both fields carried 100x precision, so the divisor cancels them out. DeepSeek implemented it. The review agent verified it. I approved it. Then I opened the simulator and saw a value that should have been 1,200 displayed as 120,000.

The correct divisor was 1,000,000, not 10,000. Cancelling the two 100x scale factors accounts for a factor of 10,000, but the result was still 100 times too large, which is why a value that should have been 1,200 showed up as 120,000. The design document had gotten this wrong, and everything downstream had faithfully reproduced that mistake.

That part was easy to fix once I saw it. What I kept thinking about afterward was the process that had failed.

A human engineer reading that formula would probably do something instinctive: substitute real numbers. Two quantities, each stored as integer × 100. Say fieldA = 10000 and fieldB = 1200. Multiply: 12,000,000. Divide by 10,000: 1,200. But the expected result should be 12. That doesn't add up. The error surfaces in about five seconds, not through formal verification, just through the habit of running a quick sanity check before trusting a formula.
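The substitution described above can be written out as a short Python sketch. The function name `derive` and its `divisor` parameter are illustrative assumptions, not names from the project; the inputs and the expected value are the ones quoted in the paragraph.

```python
def derive(stored_a: int, stored_b: int, divisor: int) -> int:
    """Multiply two stored integer fields and divide by the given divisor."""
    return stored_a * stored_b // divisor


# Values quoted in the text; both fields are stored as scaled integers.
stored_a = 10_000
stored_b = 1_200
expected = 12  # the result the author expected

as_documented = derive(stored_a, stored_b, 10_000)    # divisor from the design doc
as_corrected = derive(stored_a, stored_b, 1_000_000)  # divisor after the fix

print(as_documented, as_corrected, expected)  # prints: 1200 12 12
```

Only the corrected divisor reproduces the expected value, and a single substitution is enough to show it.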

None of the three systems in my workflow did this. DeepSeek read the design document and implemented the formula exactly as written. The review agent checked whether the implementation matched the design and confirmed that it did. I read both and approved them. Every statement was technically correct. The document was the source of truth, the code faithfully implemented the document, and the review verified the match. Nobody ran the numbers.
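One way to picture the gap is to separate two checks that are easy to conflate: whether the code matches the document, and whether the code matches an answer obtained independently of the document. The sketch below is a hypothetical reconstruction, not the project's code; every function name in it is an assumption made for illustration.

```python
SPEC_DIVISOR = 10_000  # the divisor as written in the design document


def spec_formula(stored_a: int, stored_b: int) -> int:
    """The formula transcribed from the design document."""
    return stored_a * stored_b // SPEC_DIVISOR


def implementation(stored_a: int, stored_b: int) -> int:
    """Code written to follow the design document exactly."""
    return stored_a * stored_b // 10_000


def conformance_check(a: int, b: int) -> bool:
    """Does the code agree with the document? This is what the review asked."""
    return implementation(a, b) == spec_formula(a, b)


def correctness_check(a: int, b: int, expected: int) -> bool:
    """Does the code agree with an independently known answer?"""
    return implementation(a, b) == expected


print(conformance_check(10_000, 1_200))      # True: code matches the document
print(correctness_check(10_000, 1_200, 12))  # False: the document itself is wrong
```

The first check can pass forever while the second fails, because it never consults anything outside the document.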

After I fixed the divisor, it was clear that the math was not what had actually failed. The arithmetic error was trivial once I looked at it carefully. What failed was a deeper assumption I had been making about how AI-assisted development works.

I had been treating design documents as verified artifacts. The workflow in my head was:

Design Document → AI Implementation → AI Review → Human Approval

The implicit assumption embedded in this workflow was that the design document itself was correct. If it wasn't, the whole chain would faithfully reproduce the error — which is exactly what happened.

A design document is not a verified fact. It is a claim made by whoever wrote it, and in this case that was Claude. Claude is very good at generating coherent, plausible-looking technical documentation. It is considerably less good at verifying whether the math embedded in that documentation is actually correct. I had been asking it to do both things at once — reason about architecture and verify arithmetic — without recognizing that it handles those two responsibilities very differently. It did the first well. It did the second poorly. And I hadn't thought to separate them.

The fix I've settled on is simple: any formula in a design document is treated as unverified until someone has substituted concrete numbers and checked the result by hand. Not because AI gets formulas wrong frequently, but because when it does, the error propagates cleanly and invisibly through every downstream step. The formula is wrong, the implementation is wrong, the review confirms the implementation matches the formula, and everything is consistent. Everything is wrong.
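A rule like this could be made mechanical by attaching hand-computed examples to each formula and refusing to call it verified without them. The following is a minimal sketch under that assumption; `WorkedExample`, `verify_formula`, and `is_verified` are hypothetical names, and the expected value must come from a calculation done independently of the formula being checked.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class WorkedExample:
    inputs: tuple[int, ...]
    expected: int  # computed by hand, not by running the formula


def verify_formula(formula: Callable[..., int],
                   examples: list[WorkedExample]) -> list[str]:
    """Return a description of each worked example the formula fails."""
    failures = []
    for ex in examples:
        actual = formula(*ex.inputs)
        if actual != ex.expected:
            failures.append(
                f"inputs={ex.inputs}: expected {ex.expected}, got {actual}"
            )
    return failures


def is_verified(formula: Callable[..., int],
                examples: list[WorkedExample]) -> bool:
    """A formula with no worked examples stays unverified."""
    return bool(examples) and not verify_formula(formula, examples)


# The numbers from the article, used as the single worked example.
examples = [WorkedExample(inputs=(10_000, 1_200), expected=12)]


def documented(a: int, b: int) -> int:
    return a * b // 10_000       # divisor from the design document


def corrected(a: int, b: int) -> int:
    return a * b // 1_000_000    # divisor after the fix


print(is_verified(documented, examples))  # False
print(is_verified(corrected, examples))   # True
print(is_verified(corrected, []))         # False: no examples, no trust
```

Treating an empty example list as unverified is the design choice that matters here: a formula nobody has checked should not pass by default.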

Most bugs in AI-generated code are implementation bugs — a condition is inverted, a null check is missing, an edge case is unhandled. Those are failures where the code doesn't implement the intent correctly. This was different. The code implemented the intent perfectly. The intent was wrong. That category of error is much harder to catch with code review, because code review asks whether the code does what it's supposed to do. The answer here was yes. Nobody asked whether what it was supposed to do was actually correct.

That question — is the specification right, not just the implementation — is increasingly where I think human judgment matters most in an AI-assisted workflow. AI is getting quite good at turning specifications into working code. It is getting reasonably good at catching implementation bugs. But it tends to treat the specification itself as ground truth, which means the work of questioning whether the spec makes sense in the first place is still entirely yours. AI will implement whatever you give it, with increasing reliability. The question is whether what you gave it was right.

And right now, that part is still on you.

Read original on dev.to → dev.to/antonio_zhu_e726fd856cd86/the-design-doc-…
