AI doesn't write bad code. It writes plausible code — so I tried to break my own AI-built app

[ARTICLE · art-30415] src=dev.to ↗ pub=2026-06-17T02:09Z topic=developer-tools verified=true sentiment=· neutral

A developer at a company building create-microservices-app deliberately broke their own AI-built booking app to test whether automated contract checks catch plausible-but-wrong code. The experiment showed that a machine-readable boundary—not a smarter model—can prevent AI-generated code from silently dropping critical business logic like slot-conflict guards. The developer advocates for embedding executable contracts and check gates into the agent workflow to catch such errors before production.

3 min read · 30 views · published Jun 17, 2026

Disclosure: I work on one of the tools in this post (create-microservices-app). But the experiment, commands, and outputs below are real, and the pattern at the end works no matter what stack you're on. That's the part I actually want you to take.

If you ship with Claude Code, Cursor, or Codex, you know the feeling. The agent gets you 70% of the way in minutes. It compiles. The diff looks reasonable. You merge it.

And then there's the quiet doubt: did it actually get the hard 30% right — auth boundaries, payments, tenant isolation, the booking logic that stops two people taking the same slot? Because AI doesn't usually write obviously bad code. It writes plausible code. And plausible-but-wrong is the expensive kind — it passes review and breaks in production on day three.

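(The data backs the doubt: 84% of developers use or plan to use AI tools and only 29% trust the accuracy of the output, per Stack Overflow's 2025 Developer Survey, and 45% of AI-generated code samples introduced a security vulnerability, per Veracode's 2025 GenAI Code Security Report.)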

So I ran an experiment: build a real app with an agent, then deliberately make the mistake an agent makes every day, and see what — if anything — catches it.

npm create microservices-app@latest booking-demo -- --template booking-sveltekit

A full Cloudflare SvelteKit booking app: public flow, admin, D1, auth. The detail that matters for this experiment is that it ships its own contract into the repo: README.agent.md, docs/api-boundary.md, and an executable spec, microservices.check.mjs. The layering rule is one line: routes are thin adapters; domain logic lives in verified modules, not in your handlers.
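To make the layering rule concrete, here is a minimal sketch of what "thin adapter over a verified module" could look like. Everything in it is an assumption for illustration: the Booking and BookingRepository shapes, findBySlot, handlePost, and the in-memory repository are hypothetical names, and the code is synchronous for brevity. The template's real createBooking is not shown in this post.

```typescript
// Hypothetical shapes; the real template's types are not shown in this post.
interface Booking {
  serviceId: string;
  startsAt: string; // ISO timestamp
  customerId: string;
}

interface BookingRepository {
  findBySlot(serviceId: string, startsAt: string): Booking | undefined;
  insert(booking: Booking): void;
}

type BookingResult = { ok: true } | { ok: false; reason: "slot_taken" };

// Domain module: the slot-conflict guard lives here, not in the route.
function createBooking(repo: BookingRepository, input: Booking): BookingResult {
  if (repo.findBySlot(input.serviceId, input.startsAt)) {
    return { ok: false, reason: "slot_taken" };
  }
  repo.insert(input);
  return { ok: true };
}

// Thin adapter: delegate to the use case and map its result to a status code.
function handlePost(
  repo: BookingRepository,
  body: Booking
): { status: number; body: BookingResult } {
  const result = createBooking(repo, body);
  return { status: result.ok ? 201 : 409, body: result };
}

// In-memory repository so the sketch runs without a database.
function inMemoryRepository(): BookingRepository {
  const rows: Booking[] = [];
  return {
    findBySlot: (serviceId, startsAt) =>
      rows.find((b) => b.serviceId === serviceId && b.startsAt === startsAt),
    insert: (booking) => {
      rows.push(booking);
    },
  };
}
```

With this split, the handler has nothing to "simplify" except the delegation itself, which is exactly the change a check can look for.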

Baseline:

$ microservices check
Template checks: pass

The request an agent gets constantly: "simplify the bookings endpoint." So I did the eager-agent thing — inlined the write straight to the DB and dropped the module:

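// src/routes/api/bookings/+server.ts: the "simplified" version
import { json } from '@sveltejs/kit';
import type { RequestHandler } from './$types';

export const POST: RequestHandler = async ({ request, locals }) => {
  const body = await request.json();
  // Writes straight to the repository: createBooking is never called,
  // so nothing checks whether the slot is already taken.
  await locals.bookingRepository.insert({
    serviceId: body.serviceId,
    startsAt: body.startsAt,
    customerId: body.customerId
  });
  return json({ ok: true });
};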

It type-checks. It runs. It would pass review. And it silently drops the slot-conflict guard the verified createBooking use case enforced: a double-booking waiting to happen. Classic plausible-but-wrong.

Then I ran the check:

$ microservices check
Error: One or more generated app checks failed.

$ microservices check --json
FAIL: spec:src/routes/api/bookings/+server.ts
      — Booking API route stays a thin adapter over createBooking and injected repositories.

It named the exact file and the exact contract I broke — not a vague lint warning, but "you bypassed the verified booking use case." Restore the delegation to the module, and:

$ microservices check
Template checks: pass

Green. The slot-conflict protection is back where it belongs.

Forget my tool for a second — the transferable idea is this:

The fix for plausible-but-wrong isn't a smarter model. It's a boundary your agent can't cross without a named, machine-readable failure.

Three moves you can apply on any stack:

You can roll this yourself with a test file and a grep. I happen to ship it as a contract + check for Cloudflare apps, but the move is the move.
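As a rough idea of the "test file and a grep" version, the sketch below runs named text checks over a route's source and returns one failure line per broken rule. It is not how microservices.check.mjs works internally (the post does not show that); the ContractCheck type, the check ids, and checkSource are all hypothetical names made up for this example.

```typescript
// A named rule: the id is what an agent or CI job can match on.
interface ContractCheck {
  id: string;
  contract: string; // human-readable statement of the rule
  passes: (source: string) => boolean;
}

// Hypothetical checks for the booking route, based on plain text matching.
const bookingRouteChecks: ContractCheck[] = [
  {
    id: "booking-route-delegates",
    contract: "Booking API route calls createBooking.",
    passes: (source) => /\bcreateBooking\s*\(/.test(source),
  },
  {
    id: "booking-route-no-direct-write",
    contract: "Booking API route does not write to the repository directly.",
    passes: (source) => !/bookingRepository\s*\.\s*insert\s*\(/.test(source),
  },
];

// Returns one named failure line per rule the source breaks; empty means pass.
function checkSource(file: string, source: string, checks: ContractCheck[]): string[] {
  return checks
    .filter((check) => !check.passes(source))
    .map((check) => `FAIL: ${check.id}:${file} - ${check.contract}`);
}
```

In a real test file you would read the route's source from disk and fail the test when the returned list is non-empty. Text matching is crude and easy to dodge by renaming things, so treat it as a starting point; a check that parses the code is sturdier.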

I ran the scaffold → contract → check → break → fix loop above for real. The parts that need your own machine (npm install, npm run dev, a deploy) are yours to run; I'm not going to claim outputs I didn't produce:

npm create microservices-app@latest booking-demo -- --template booking-sveltekit
cd booking-demo && npm install
npm run microservices -- check     # the gate — wire it into your agent loop
npm run dev
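One way to wire the gate into an agent loop is a small wrapper that runs the check and hands the output back whenever it fails. This is a sketch under assumptions: runGate and GateResult are hypothetical names, it relies on Node's child_process module, and it assumes the check signals failure through a non-zero exit code, which this post does not state explicitly.

```typescript
import { spawnSync } from "node:child_process";

interface GateResult {
  passed: boolean;
  output: string; // stdout followed by stderr, to feed back to the agent
}

// Runs a command to completion and reports whether it exited with code 0.
function runGate(command: string, args: string[]): GateResult {
  const result = spawnSync(command, args, { encoding: "utf8" });
  const output = `${result.stdout ?? ""}${result.stderr ?? ""}`;
  // status is null if the process could not start or was killed by a signal;
  // that also counts as a failed gate.
  return { passed: result.status === 0, output };
}

// Example wiring (not executed here; assumes the npm script from this post):
// const gate = runGate("npm", ["run", "microservices", "--", "check"]);
// if (!gate.passed) {
//   console.error(gate.output); // give the named failure back to the agent
//   process.exit(1);
// }
```

On Windows, launching npm this way may need the shell option; the example leaves that out.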

(If you ship apps for clients on Cloudflare, the same gate is what lets you hand the result to a security review without the 2am call — but that's a different post.)

Repo + the rest of the modules: https://microservices.sh

Genuinely curious: how are you keeping your agent from quietly rewriting the dangerous 30%? Contract tests, review checklists, just vibes? What's caught a plausible-but-wrong change for you — and what slipped through?


Read original on dev.to → dev.to/favcrm/ai-doesnt-write-bad-code-it-writes…
