# AI doesn't write bad code. It writes plausible code — so I tried to break my own AI-built app

> Source: <https://dev.to/favcrm/ai-doesnt-write-bad-code-it-writes-plausible-code-so-i-tried-to-break-my-own-ai-built-app-1307>
> Published: 2026-06-17 02:09:47+00:00

Disclosure: I work on one of the tools in this post (`create-microservices-app`). But the experiment, commands, and outputs below are real, and the pattern at the end works no matter what stack you're on. That's the part I actually want you to take.

If you ship with Claude Code, Cursor, or Codex, you know the feeling. The agent gets you **70% of the way** in minutes. It compiles. The diff looks reasonable. You merge it.

And then there's the quiet doubt: *did it actually get the hard 30% right* — auth boundaries, payments, tenant isolation, the booking logic that stops two people taking the same slot? Because AI doesn't usually write *obviously* bad code. It writes **plausible** code. And plausible-but-wrong is the expensive kind — it passes review and breaks in production on day three.

(The data backs the doubt: 84% of devs use AI tools, only **29% trust the output**, and 45% of AI-generated apps ship an exploitable vulnerability — Veracode, 2025.)

So I ran an experiment: build a real app with an agent, then **deliberately make the mistake an agent makes every day**, and see what — if anything — catches it.

``` bash
npm create microservices-app@latest booking-demo -- --template booking-sveltekit
```

A full Cloudflare SvelteKit booking app: public flow, admin, D1, auth. The detail that matters for this experiment is that it ships **its own contract** into the repo: `README.agent.md`, `docs/api-boundary.md`, and an executable spec, `microservices.check.mjs`. The layering rule is one line: *routes are thin adapters; domain logic lives in verified modules, not in your handlers.*

Baseline:

``` bash
$ microservices check
Template checks: pass
```

The request an agent gets constantly: *"simplify the bookings endpoint."* So I did the eager-agent thing — inlined the write straight to the DB and dropped the module:

``` ts
// src/routes/api/bookings/+server.ts (the "simplified" version)
import { json } from '@sveltejs/kit';
import type { RequestHandler } from './$types';

export const POST: RequestHandler = async ({ request, locals }) => {
  const body = await request.json();
  // Writes straight to the repository: nothing here checks for a slot conflict.
  await locals.bookingRepository.insert({
    serviceId: body.serviceId,
    startsAt: body.startsAt,
    customerId: body.customerId
  });
  return json({ ok: true });
};
```

It type-checks. It runs. It would pass review. And it silently drops the slot-conflict guard the verified `createBooking` use case enforced: a double-booking waiting to happen. Classic plausible-but-wrong.
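If you haven't opened the template, here is a minimal sketch of the kind of guard the inlined version skips. It is an illustration under assumptions, not the template's actual code: the `Booking` shape, the `BookingRepository` interface, its `findByServiceAndStart` method, and the in-memory repository are all hypothetical names.

```ts
// Hypothetical sketch: NOT the template's real createBooking.
type Booking = { serviceId: string; startsAt: string; customerId: string };

// Assumed repository interface; the real one may look different.
interface BookingRepository {
  findByServiceAndStart(serviceId: string, startsAt: string): Promise<Booking | null>;
  insert(booking: Booking): Promise<void>;
}

type CreateBookingResult =
  | { ok: true; booking: Booking }
  | { ok: false; reason: "slot_taken" };

// The use case owns the rule: one booking per service per start time.
async function createBooking(
  repo: BookingRepository,
  input: Booking
): Promise<CreateBookingResult> {
  const existing = await repo.findByServiceAndStart(input.serviceId, input.startsAt);
  if (existing) return { ok: false, reason: "slot_taken" };
  await repo.insert(input);
  return { ok: true, booking: input };
}

// In-memory repository, only here to demonstrate the behavior.
function createInMemoryRepository(): BookingRepository {
  const rows: Booking[] = [];
  return {
    async findByServiceAndStart(serviceId, startsAt) {
      return rows.find((r) => r.serviceId === serviceId && r.startsAt === startsAt) ?? null;
    },
    async insert(booking) {
      rows.push(booking);
    },
  };
}
```

Calling `repo.insert(...)` directly from the route skips the `existing` lookup entirely, which is exactly what the "simplified" handler does. One caveat on the sketch itself: a check-then-insert like this is still racy under concurrent requests, so a unique index on the slot in the database is the usual backstop.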

Then I ran the check:

``` bash
$ microservices check
Error: One or more generated app checks failed.

$ microservices check --json
FAIL: spec:src/routes/api/bookings/+server.ts
      — Booking API route stays a thin adapter over createBooking and injected repositories.
```

It named the **exact file** and the **exact contract** I broke — not a vague lint warning, but "you bypassed the verified booking use case." Restore the delegation to the module, and:

``` bash
$ microservices check
Template checks: pass
```

Green. The slot-conflict protection is back where it belongs.
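As a rough picture of what "restore the delegation" means, a thin adapter does three things: parse the input, call the use case, map the result to a response. The sketch below is framework-agnostic and hypothetical; `handleCreateBooking`, the `Deps` shape, and the status codes are assumptions for illustration, not the template's actual route.

```ts
// Hypothetical thin adapter; names and status codes are assumptions.
type BookingInput = { serviceId: string; startsAt: string; customerId: string };
type UseCaseResult = { ok: true } | { ok: false; reason: "slot_taken" };

// The route receives the use case as a dependency instead of touching the DB.
type Deps = { createBooking: (input: BookingInput) => Promise<UseCaseResult> };

type HttpResult = { status: number; body: unknown };

function isNonEmptyString(value: unknown): value is string {
  return typeof value === "string" && value.length > 0;
}

async function handleCreateBooking(deps: Deps, rawBody: unknown): Promise<HttpResult> {
  // 1. Parse: reject malformed input before it reaches the domain.
  const { serviceId, startsAt, customerId } = (rawBody ?? {}) as Record<string, unknown>;
  if (!isNonEmptyString(serviceId) || !isNonEmptyString(startsAt) || !isNonEmptyString(customerId)) {
    return { status: 400, body: { ok: false, reason: "invalid_body" } };
  }
  // 2. Delegate: the conflict rule lives in the use case, not in the handler.
  const result = await deps.createBooking({ serviceId, startsAt, customerId });
  // 3. Map: translate the domain result into an HTTP-shaped response.
  return result.ok
    ? { status: 201, body: { ok: true } }
    : { status: 409, body: { ok: false, reason: result.reason } };
}
```

The handler has no way to write a booking except through `deps.createBooking`, so there is nothing to "simplify" away without visibly removing that call.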

Forget my tool for a second — the transferable idea is this:

**The fix for plausible-but-wrong isn't a smarter model. It's a boundary your agent can't cross without a named, machine-readable failure.**

Three moves you can apply on any stack:

You can roll this yourself with a test file and a grep. I happen to ship it as a contract + `check` for Cloudflare apps, but the move is the move.
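For a sense of what "a test file and a grep" could look like, here is a minimal sketch. It is not the internals of `microservices check`; the `checkThinAdapter` function, the `CheckResult` shape, and the two patterns it greps for are hypothetical. The source text is passed in as a string so the rule stays a pure function; in a real script you would read the file from disk and exit non-zero on failure.

```ts
// Hypothetical boundary check: a grep with a name attached.
type CheckResult = { id: string; ok: boolean; message: string };

// The contract, stated once, so a failure can quote it back verbatim.
const CONTRACT =
  "Booking API route stays a thin adapter over createBooking and injected repositories.";

function checkThinAdapter(file: string, source: string): CheckResult {
  const id = `spec:${file}`;
  // Rule 1: the route must call the verified use case.
  const delegates = /\bcreateBooking\s*\(/.test(source);
  // Rule 2: the route must not write to a repository directly.
  const writesDirectly = /Repository\s*\.\s*insert\s*\(/.test(source);
  const ok = delegates && !writesDirectly;
  return {
    id,
    ok,
    // The failure names the file and the contract, not just "lint error".
    message: ok ? `PASS: ${id}` : `FAIL: ${id}: ${CONTRACT}`,
  };
}
```

A grep this crude will trip on commented-out code and miss clever indirection. That's acceptable for a first gate: the point is that the failure carries a stable id and the sentence of the contract that was broken, which both a human and an agent can act on.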

I ran the scaffold → contract → `check` → break → fix loop above for real. The parts that need your own machine (`npm install`, `npm run dev`, a deploy) are yours to run; I'm not going to claim outputs I didn't produce:

``` bash
npm create microservices-app@latest booking-demo -- --template booking-sveltekit
cd booking-demo && npm install
npm run microservices -- check     # the gate — wire it into your agent loop
npm run dev
```
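"Wire it into your agent loop" can be as simple as this: run the check after every agent edit, and when it fails, hand the named failures back as the next instruction instead of merging. A minimal sketch follows, with the check result passed in so the decision logic stays testable. `gateStep`, `CheckRun`, `GateDecision`, and the feedback wording are hypothetical, and no particular agent API is assumed.

```ts
// Hypothetical gate for an agent loop; no real agent API is assumed.
type CheckRun = { exitCode: number; output: string };

type GateDecision =
  | { action: "accept" }
  | { action: "retry"; feedback: string }
  | { action: "stop"; reason: string };

function gateStep(run: CheckRun, attempt: number, maxAttempts: number): GateDecision {
  // Green check: the change may proceed to review or merge.
  if (run.exitCode === 0) return { action: "accept" };

  // Don't let the agent loop forever against a check it cannot satisfy.
  if (attempt >= maxAttempts) {
    return { action: "stop", reason: `check still failing after ${attempt} attempts` };
  }

  // Pull out the named failures so the agent sees the file and the contract.
  const failures = run.output
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.startsWith("FAIL:"));

  const details = failures.length > 0 ? failures : [run.output.trim()];
  const feedback = [
    "The boundary check failed. Fix the code, not the spec:",
    ...details,
  ].join("\n");

  return { action: "retry", feedback };
}
```

The "fix the code, not the spec" line matters: an eager agent's cheapest path to green is often editing the check itself, so the spec file is worth protecting in review as well.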

(If you ship apps *for clients* on Cloudflare, the same gate is what lets you hand the result to a security review without the 2am call — but that's a different post.)

Repo + the rest of the modules: [https://microservices.sh](https://microservices.sh)

**Genuinely curious:** how are you keeping your agent from quietly rewriting the dangerous 30%? Contract tests, review checklists, just vibes? What's caught a plausible-but-wrong change for you — and what slipped through?
