Source: dev.to · Topic: artificial-intelligence

Single-page Claude writes beautifully. At 5 pages it drifts. Here's the harness I built.

A developer built a harness with 14 gates, auto-retry, and handoff JSON to stop multi-page drift when generating React apps from Figma files using LLMs like Claude and Codex. The harness achieved a 100% build-green rate across 10 demos and 54 screens in four business domains. The developer identified four specific drift modes in multi-page output: world-state re-invention, route hallucination, store-key decay, and silent build failures.

Published Jun 18, 2026 · 3 min read

I gave Claude / Codex a Figma file + a PRD and asked for 5-10 React pages of a working app. Single-page output is great. Multi-page output drifts in 4 specific ways. I spent ~3 months building a harness with 14 gates × auto-retry × handoff JSON to stop the drift. 10 demos, 54 screens, 4 unrelated business domains, build-green rate 100%.

Code: https://github.com/JiuwenDragon/harness-mini

Every "Figma to code with AI" demo on Twitter shows one screen. That's a real result — Claude vision is genuinely good at single-page UI. I verified this many times during my research: giving Claude a screenshot + a paragraph of PRD produces a 70-80 point page in 30 seconds.

The promise breaks at 5+ screens. Here are the 4 drift modes I measured.

Screen 1              Screen 2            Screen 3
Username: Zhang San   Username: Li Si     Username: Test User

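Drift mode 1, world-state re-invention: the LLM doesn't carry a "world state" across page generations. Without explicit injection, it re-invents the user, the balance, and every other shared fact on each screen.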
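One way to do the explicit injection is to freeze a single fixture and embed it verbatim in every per-screen prompt. This is a minimal sketch, not the harness's actual code: the `WorldState` shape, the `accountId` field, and `buildScreenPrompt` are hypothetical names chosen for illustration.

```typescript
// Hypothetical fixture shape; the field names are illustrative, not from the repo.
interface WorldState {
  user: { name: string; accountId: string };
}

// One frozen fixture shared by every screen, so the username cannot drift.
const WORLD: Readonly<WorldState> = Object.freeze({
  user: { name: "Zhang San", accountId: "acct-001" },
});

// Builds the prompt for one screen with the shared fixture embedded verbatim.
function buildScreenPrompt(screen: string, prdExcerpt: string, world: WorldState): string {
  return [
    `Generate the React page for screen "${screen}".`,
    `PRD excerpt: ${prdExcerpt}`,
    "Use exactly this world state for all sample data. Do not invent users:",
    JSON.stringify(world, null, 2),
  ].join("\n");
}

// Every screen prompt now carries the same user.
const screenPrompts = ["home", "transfer", "history"].map((screen) =>
  buildScreenPrompt(screen, "Retail banking app.", WORLD),
);
```

The point of the design is that the fixture lives outside the conversation, so a fresh LLM session for screen 4 sees the same data as the session for screen 1.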

// Screen "transfer" generated:
<button onClick={() => router.push("/banking/home")}>  // ← /banking
// Screen "home" actually at:
app/bank/home/page.tsx                                  // ← /bank

This is drift mode 2, route hallucination. Single-page review never catches it, because each page looks correct in isolation. Only the click-through breaks.
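A cross-page check for this can be small. The sketch below assumes Next.js App Router paths of the form `app/.../page.tsx` and only looks at literal `router.push("...")` calls; route groups and dynamic segments are ignored. The function names are hypothetical and this is not the gate shipped in the repo.

```typescript
// Converts an App Router file path into the URL it serves,
// e.g. "app/bank/home/page.tsx" -> "/bank/home". Returns null for non-page files.
function routeFromPagePath(filePath: string): string | null {
  const match = /^app\/(?:(.*)\/)?page\.tsx$/.exec(filePath);
  if (match === null) return null;
  return "/" + (match[1] ?? "");
}

// Collects the literal targets of router.push("...") calls in one page source.
function pushTargets(source: string): string[] {
  const re = /router\.push\(\s*["'`]([^"'`]+)["'`]\s*\)/g;
  const targets: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(source)) !== null) {
    if (m[1] !== undefined) targets.push(m[1]);
  }
  return targets;
}

interface RouteError {
  file: string;
  target: string;
  knownRoutes: string[];
}

// Reports every push target that does not correspond to a generated page.
// `pages` maps file path -> file contents.
function routeGate(pages: Record<string, string>): RouteError[] {
  const known = new Set<string>();
  for (const file of Object.keys(pages)) {
    const route = routeFromPagePath(file);
    if (route !== null) known.add(route);
  }
  const errors: RouteError[] = [];
  for (const file of Object.keys(pages)) {
    for (const target of pushTargets(pages[file] ?? "")) {
      if (!known.has(target)) {
        errors.push({ file, target, knownRoutes: Array.from(known).sort() });
      }
    }
  }
  return errors;
}

const routeErrors = routeGate({
  "app/bank/home/page.tsx": "export default function Home() { return null; }",
  "app/bank/transfer/page.tsx": 'onClick={() => router.push("/banking/home")}',
});
```

Including `knownRoutes` in the error gives the model the valid alternatives when the report is fed back, rather than just telling it the route is wrong.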

Drift mode 3, store-key decay: take a zustand store with 5 keys (user, balance, lastTx, recent[], selected). By screen 4 the LLM forgets 2-3 of them and makes up new ones, so the same business concept ends up under three different variable names.
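A gate for this can compare the keys each page reads against the frozen list. The sketch below is an assumption about how such a check might look: it only recognizes simple selectors like `useBankStore((s) => s.balance)`, and `useBankStore` is a hypothetical hook name.

```typescript
// The five keys from the store described above; anything else counts as decay.
const STORE_KEYS = new Set<string>(["user", "balance", "lastTx", "recent", "selected"]);

// Finds keys read through simple selectors such as useBankStore((s) => s.balance).
// `hookName` is assumed to be a plain identifier.
function storeKeysRead(source: string, hookName: string): string[] {
  const re = new RegExp(
    hookName + "\\(\\s*\\(?\\s*(\\w+)\\s*\\)?\\s*=>\\s*\\1\\.(\\w+)",
    "g",
  );
  const keys: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(source)) !== null) {
    const key = m[2];
    if (key !== undefined) keys.push(key);
  }
  return keys;
}

// Returns one error line per read of a key that is not in the frozen list.
// `pages` maps file path -> file contents.
function storeKeyGate(pages: Record<string, string>, hookName: string): string[] {
  const errors: string[] = [];
  for (const file of Object.keys(pages)) {
    for (const key of storeKeysRead(pages[file] ?? "", hookName)) {
      if (!STORE_KEYS.has(key)) errors.push(`${file}: unknown store key "${key}"`);
    }
  }
  return errors;
}

const storeErrors = storeKeyGate(
  {
    "app/bank/home/page.tsx": "const balance = useBankStore((s) => s.balance);",
    "app/bank/history/page.tsx": "const tx = useBankStore((s) => s.lastTransaction);",
  },
  "useBankStore",
);
```

A regex is enough for a sketch; a production gate would more likely walk the TypeScript AST so destructured and aliased reads are covered too.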

> Codex: All 10 pages generated, ready to preview.
> me: npm run build
> 3 pages: red. 2 pages: empty <div /> stubs. 1 page: import path wrong.

Drift mode 4, silent build failure, is the most painful one. Without an external check, "claimed done" ≠ done.
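The external check can be as simple as trusting only the build's exit code and scanning for stub pages. In this sketch the build runner is injected so the gate can be exercised without a toolchain; in a Node script it could wrap `child_process.spawnSync("npm", ["run", "build"])`. The `GateReport` shape and the stub heuristic are assumptions, not the repo's format.

```typescript
interface BuildResult {
  exitCode: number;
  output: string;
}

interface GateReport {
  gate: string;
  passed: boolean;
  errors: string[];
}

// True when a component returns only an empty <div />, i.e. a placeholder stub.
function isStubPage(source: string): boolean {
  return /return\s*\(?\s*<div\s*(?:\/>|>\s*<\/div>)\s*\)?\s*;?\s*\}/.test(source);
}

// `runBuild` is injected; the gate never takes the model's word that it is done.
// `pages` maps file path -> file contents.
function buildGate(runBuild: () => BuildResult, pages: Record<string, string>): GateReport {
  const errors: string[] = [];
  const build = runBuild();
  if (build.exitCode !== 0) {
    errors.push(`build exited with code ${build.exitCode}: ${build.output.trim()}`);
  }
  for (const file of Object.keys(pages)) {
    if (isStubPage(pages[file] ?? "")) {
      errors.push(`${file}: page is an empty <div /> stub`);
    }
  }
  return { gate: "build", passed: errors.length === 0, errors };
}

// A build that compiles can still fail the gate if a page is a stub.
const buildReport = buildGate(
  () => ({ exitCode: 0, output: "" }),
  {
    "app/bank/home/page.tsx": "export default function Home() { return <main>Home</main>; }",
    "app/bank/cards/page.tsx": "export default function Cards() { return <div />; }",
  },
);
```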

Figma + PRD
    ↓ intake (fixture split)
    ↓ contract (frozen spec)
    ↓ generate (codex / claude / gemini)
    ↓ 14 gates (semantic / PRD / spec / UI hygiene / build / cross-canvas)
    ↓ visual review (human)
    ↓ web-preview (clickable)

Each gate is scoped to one constraint. Why? The Constraint Decay paper (arXiv 2605.06445) reports that stuffing 10+ constraints into one prompt drops LLM performance by 30 percentage points.

The retry loop: when a gate fails, the gate's structured error report (not a vague "try again") is fed back to the LLM. Reflexion-style.
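The loop described above can be sketched as follows. The types, the limit of three attempts, and the synchronous `generate` callback are simplifying assumptions for illustration; a real provider call would be asynchronous, and this is not the repo's implementation.

```typescript
interface GateReport {
  gate: string;
  passed: boolean;
  errors: string[];
}

type Gate = (code: string) => GateReport;
// `feedback` is null on the first attempt, then the report of the gate that failed.
type Generate = (prompt: string, feedback: GateReport | null) => string;

interface RetryResult {
  code: string;
  attempts: number;
  failed: GateReport | null;
}

// Runs gates in order and feeds the first failing report back into the next attempt.
function generateWithRetry(
  prompt: string,
  generate: Generate,
  gates: Gate[],
  maxAttempts = 3,
): RetryResult {
  let feedback: GateReport | null = null;
  let code = "";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    code = generate(prompt, feedback);
    feedback = null;
    for (const gate of gates) {
      const report = gate(code);
      if (!report.passed) {
        feedback = report;
        break;
      }
    }
    if (feedback === null) return { code, attempts: attempt, failed: null };
  }
  return { code, attempts: maxAttempts, failed: feedback };
}

// Stand-ins for a real gate and a real model, used only to exercise the loop.
const noBankingPrefix: Gate = (code) => {
  const bad = code.includes("/banking/");
  return {
    gate: "route",
    passed: !bad,
    errors: bad ? ['unknown route "/banking/home", expected "/bank/home"'] : [],
  };
};
const fakeModel: Generate = (_prompt, feedback) =>
  feedback === null ? 'router.push("/banking/home")' : 'router.push("/bank/home")';

const retryResult = generateWithRetry("transfer screen", fakeModel, [noBankingPrefix]);
```

Stopping at the first failing gate keeps each retry focused on one constraint, which matches the one-check-per-gate idea.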

The handoff: each stage emits *_status.json so a new operator (or a new LLM session) can pick up without reading the conversation.
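As an illustration of what a handoff file could carry, here is a sketch of a status record with a serializer and a parser. Only the `*_status.json` naming comes from the text above; the schema (`state`, `attempts`, `failedGates`, `nextAction`) is hypothetical, and the stage names are taken from the pipeline diagram.

```typescript
type Stage = "intake" | "contract" | "generate" | "gates" | "visual-review" | "web-preview";

// Hypothetical schema: enough for a newcomer to know where things stand.
interface StageStatus {
  stage: Stage;
  state: "passed" | "failed";
  attempts: number;
  failedGates: string[];
  nextAction: string;
}

function statusFileName(stage: Stage): string {
  return `${stage}_status.json`;
}

function serializeStatus(stageStatus: StageStatus): string {
  return JSON.stringify(stageStatus, null, 2) + "\n";
}

// Rejects files that are missing a field, so a bad handoff fails loudly.
function parseStatus(text: string): StageStatus {
  const raw = JSON.parse(text) as Partial<StageStatus>;
  if (
    typeof raw.stage !== "string" ||
    typeof raw.state !== "string" ||
    typeof raw.attempts !== "number" ||
    !Array.isArray(raw.failedGates) ||
    typeof raw.nextAction !== "string"
  ) {
    throw new Error("malformed status file");
  }
  return raw as StageStatus;
}

const gatesStatus: StageStatus = {
  stage: "gates",
  state: "failed",
  attempts: 2,
  failedGates: ["route"],
  nextAction: "regenerate transfer screen with the route report attached",
};
const roundTripped = parseStatus(serializeStatus(gatesStatus));
```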

Constraint Decay (arXiv 2605.06445) measured the drop directly.

Lost in the Middle (arXiv 2307.03172) shows that LLMs tend to miss information placed in the middle of long prompts, which is where buried constraints end up.

So I push one check per gate, max ~3 constraints per LLM round.
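The cap of roughly three constraints per round amounts to batching. A minimal sketch, where `batchConstraints` and the sample constraint strings are made up for illustration:

```typescript
// Splits a constraint list into rounds of at most `maxPerRound` items, preserving order.
function batchConstraints<T>(constraints: T[], maxPerRound = 3): T[][] {
  if (!Number.isInteger(maxPerRound) || maxPerRound < 1) {
    throw new RangeError("maxPerRound must be a positive integer");
  }
  const rounds: T[][] = [];
  for (let i = 0; i < constraints.length; i += maxPerRound) {
    rounds.push(constraints.slice(i, i + maxPerRound));
  }
  return rounds;
}

// Seven hypothetical constraints become three LLM rounds of 3, 3, and 1.
const rounds = batchConstraints([
  "use routes from the contract only",
  "use the frozen store keys only",
  "use the shared world state",
  "no empty stub pages",
  "follow the PRD copy exactly",
  "use the domain color token",
  "import paths must resolve",
]);
```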

Domain    Color      Screens   Build pass
Banking   Deep red   10        10/10
Fitness   Orange     3         3/3
Travel    Blue       3         3/3
Shoes     Black      3         3/3

Same 14 gates. Same Codex/Claude/Gemini providers swapped via contract. No per-domain prompt tuning.

Tool                        Strength                           Why it's not what I needed
Builder.io Visual Copilot   2M+ training data, Mitosis IR      SaaS, no PRD dimension, no audit trail
Locofy LDM                  Large Design Model                 SaaS, design system requires strict Auto Layout
Figma Make                  Highest fidelity (EPAM benchmark)  No public API, browser-only, $16/mo seat
v0 (Vercel)                 Tight shadcn/Next.js               Figma link silently downgrades to screenshot (loses metadata)

These are all great for "single dev makes a pretty page." None give me multi-page consistency + PRD enforcement + audit log + on-prem + provider swap, which is the actual enterprise need.

https://github.com/JiuwenDragon/harness-mini

scripts/

MIT license (I should add the file — open to PR).

Happy to answer questions in comments. The most useful feedback would be: "what other drift modes have you seen at >5 pages."
