{"slug": "ai-doesn-t-write-bad-code-it-writes-plausible-code-so-i-tried-to-break-my-own-ai", "title": "AI doesn't write bad code. It writes plausible code — so I tried to break my own AI-built app", "summary": "A developer at a company building create-microservices-app deliberately broke their own AI-built booking app to test whether automated contract checks catch plausible-but-wrong code. The experiment showed that a machine-readable boundary—not a smarter model—can prevent AI-generated code from silently dropping critical business logic like slot-conflict guards. The developer advocates for embedding executable contracts and check gates into the agent workflow to catch such errors before production.", "body_md": "Disclosure: I work on one of the tools in this post (`create-microservices-app`). But the experiment, commands, and outputs below are real, and the pattern at the end works no matter what stack you're on. That's the part I actually want you to take away.\n\nIf you ship with Claude Code, Cursor, or Codex, you know the feeling. The agent gets you **70% of the way** in minutes. It compiles. The diff looks reasonable. You merge it.\n\nAnd then there's the quiet doubt: *did it actually get the hard 30% right* — auth boundaries, payments, tenant isolation, the booking logic that stops two people taking the same slot? Because AI doesn't usually write *obviously* bad code. It writes **plausible** code. 
And plausible-but-wrong is the expensive kind — it passes review and breaks in production on day three.\n\n(The data backs the doubt: 84% of devs use AI tools, only **29% trust the output**, and 45% of AI-generated apps ship an exploitable vulnerability — Veracode, 2025.)\n\nSo I ran an experiment: build a real app with an agent, then **deliberately make the mistake an agent makes every day**, and see what — if anything — catches it.\n\n```\nnpm create microservices-app@latest booking-demo -- --template booking-sveltekit\n```\n\nA full Cloudflare SvelteKit booking app — public flow, admin, D1, auth. The detail that matters for this experiment: it ships **its own contract** into the repo — `README.agent.md`, `docs/api-boundary.md`, and an executable spec, `microservices.check.mjs`. The layering rule is one line: *routes are thin adapters; domain logic lives in verified modules, not in your handlers.*\n\nBaseline:\n\n``` bash\n$ microservices check\nTemplate checks: pass\n```\n\nThe request an agent gets constantly: *\"simplify the bookings endpoint.\"* So I did the eager-agent thing — inlined the write straight to the DB and dropped the module:\n\n``` ts\n// src/routes/api/bookings/+server.ts — the \"simplified\" version\nimport { json } from '@sveltejs/kit';\nimport type { RequestHandler } from './$types';\n\nexport const POST: RequestHandler = async ({ request, locals }) => {\n  const body = await request.json();\n  // Writes straight to the repository, bypassing the createBooking use case,\n  // so nothing checks whether the slot is already taken.\n  await locals.bookingRepository.insert({\n    serviceId: body.serviceId,\n    startsAt: body.startsAt,\n    customerId: body.customerId\n  });\n  return json({ ok: true });\n};\n```\n\nIt type-checks. It runs. It would pass review. And it silently drops the slot-conflict guard the verified `createBooking` use case enforced — a double-booking waiting to happen. 
Classic plausible-but-wrong.\n\nThen I ran the check:\n\n``` bash\n$ microservices check\nError: One or more generated app checks failed.\n\n$ microservices check --json\nFAIL: spec:src/routes/api/bookings/+server.ts\n      — Booking API route stays a thin adapter over createBooking and injected repositories.\n```\n\nIt named the **exact file** and the **exact contract** I broke — not a vague lint warning, but \"you bypassed the verified booking use case.\" Restore the delegation to the module, and:\n\n``` bash\n$ microservices check\nTemplate checks: pass\n```\n\nGreen. The slot-conflict protection is back where it belongs.\n\nForget my tool for a second — the transferable idea is this:\n\n**The fix for plausible-but-wrong isn't a smarter model. It's a boundary your agent can't cross without a named, machine-readable failure.**\n\nThree moves you can apply on any stack:\n\nYou can roll this yourself with a test file and a grep. I happen to ship it as a contract + `check` for Cloudflare apps — but the move is the move.\n\nI ran the scaffold → contract → `check` → break → fix loop above for real. The parts that need your own machine — `npm install`, `npm run dev`, a deploy — are yours to run; I'm not going to claim outputs I didn't produce:\n\n```\nnpm create microservices-app@latest booking-demo -- --template booking-sveltekit\ncd booking-demo && npm install\nnpm run microservices -- check     # the gate — wire it into your agent loop\nnpm run dev\n```\n\n(If you ship apps *for clients* on Cloudflare, the same gate is what lets you hand the result to a security review without the 2am call — but that's a different post.)\n\nRepo + the rest of the modules: [https://microservices.sh](https://microservices.sh)\n\n**Genuinely curious:** how are you keeping your agent from quietly rewriting the dangerous 30%? Contract tests, review checklists, just vibes? 
What's caught a plausible-but-wrong change for you — and what slipped through?", "url": "https://wpnews.pro/news/ai-doesn-t-write-bad-code-it-writes-plausible-code-so-i-tried-to-break-my-own-ai", "canonical_source": "https://dev.to/favcrm/ai-doesnt-write-bad-code-it-writes-plausible-code-so-i-tried-to-break-my-own-ai-built-app-1307", "published_at": "2026-06-17 02:09:47+00:00", "updated_at": "2026-06-17 02:51:33.100477+00:00", "lang": "en", "topics": ["developer-tools", "artificial-intelligence", "ai-agents", "ai-safety", "large-language-models"], "entities": ["Claude Code", "Cursor", "Codex", "Cloudflare", "SvelteKit", "D1", "Veracode", "create-microservices-app"], "alternates": {"html": "https://wpnews.pro/news/ai-doesn-t-write-bad-code-it-writes-plausible-code-so-i-tried-to-break-my-own-ai", "markdown": "https://wpnews.pro/news/ai-doesn-t-write-bad-code-it-writes-plausible-code-so-i-tried-to-break-my-own-ai.md", "text": "https://wpnews.pro/news/ai-doesn-t-write-bad-code-it-writes-plausible-code-so-i-tried-to-break-my-own-ai.txt", "jsonld": "https://wpnews.pro/news/ai-doesn-t-write-bad-code-it-writes-plausible-code-so-i-tried-to-break-my-own-ai.jsonld"}}