[ARTICLE · art-13896] src=dev.to pub= topic=generative-ai verified=true sentiment=↓ negative

Vibe Coding Problems: 7 Visual Bugs AI Code Generators Always Ship

Testing by Jason Arbon found approximately 160 issues per AI-generated app, with the majority being visual bugs rather than functional errors. The seven most common categories are inconsistent spacing from Tailwind utility approximations, brand color substitutions, broken layouts at intermediate viewport widths, accessibility failures, missing font specifications, absent interactive states, and arbitrary z-index stacking. Arbon's analysis reported a p-value of 0.7199 when comparing Bolt.new and Lovable bug counts, meaning no statistically significant difference between the two tools was detected. The article attributes this to a shared architectural limitation: models work with tokens, not pixels.

3 min read · Published May 25, 2026

You shipped a Lovable app. It works. The buttons click, the forms submit, the data flows.

Then you open it on your phone and the hero section overlaps the nav. Or you squint at a button that's definitely not your brand blue. Or a screen reader announces nothing useful.

This isn't a "you" problem. Testing by Jason Arbon found approximately 160 issues per AI-generated app, and the majority aren't functional bugs. They're visual: layout, spacing, accessibility, color.

Here are the 7 categories that show up in every AI-generated codebase, regardless of which tool built it.

AI defaults to Tailwind utilities that approximate your design. A 24px spec becomes `gap-4` (16px) in one place and `gap-8` (32px) elsewhere.

The model picks the closest available class, not the correct value. Multiply this across 50+ components and your layout feels "off" without any single element being obviously wrong.
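One way to keep spec values from drifting is to derive the utility class from the pixel value instead of choosing it by eye. The sketch below assumes Tailwind v3's default spacing scale (one unit is 0.25rem, or 4px at a 16px root); `spacingClass` is a hypothetical helper, not part of Tailwind.

```typescript
// Keys of Tailwind v3's default spacing scale (1 unit = 0.25rem = 4px at a 16px root).
const SPACING_SCALE: readonly number[] = [
  0, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 5, 6, 7, 8, 9, 10, 11, 12,
  14, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52, 56, 60, 64, 72, 80, 96,
];

// Hypothetical helper: map a pixel value from the design spec to a utility class.
// Returns the exact scale class when one exists, otherwise an arbitrary-value
// class, so the spec value is never rounded to a neighbouring step.
function spacingClass(prefix: string, px: number): string {
  const units = px / 4;
  return SPACING_SCALE.indexOf(units) !== -1
    ? `${prefix}-${units}`
    : `${prefix}-[${px}px]`;
}

console.log(spacingClass("gap", 24)); // "gap-6"
console.log(spacingClass("gap", 18)); // "gap-[18px]"
```

On the default scale, a 24px spec corresponds to `gap-6`, so any `gap-4` or `gap-8` standing in for it is a mismatch that a lint step could flag.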

Brand colors outside standard palettes get substituted with nearest matches. Your brand blue (`#2563EB`) appears as three different shades across buttons, links, and nav elements.

This happens because models infer color from context rather than referencing a single source of truth. Each generation pass introduces fresh approximations.
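A rough way to surface these substitutions is to scan the generated source for hex colors that sit close to the brand color without matching it. This is a minimal sketch: `findNearBrandColors` and the tolerance of 60 are illustrative assumptions, and plain RGB distance is only a crude stand-in for perceptual difference.

```typescript
type RGB = [number, number, number];

// Parse a 6-digit hex color such as "#2563EB" into its RGB channels.
function hexToRgb(hex: string): RGB {
  const h = hex.replace("#", "");
  return [
    parseInt(h.slice(0, 2), 16),
    parseInt(h.slice(2, 4), 16),
    parseInt(h.slice(4, 6), 16),
  ];
}

// Euclidean distance in RGB space (0 means identical colors).
function colorDistance(a: string, b: string): number {
  const [r1, g1, b1] = hexToRgb(a);
  const [r2, g2, b2] = hexToRgb(b);
  return Math.sqrt((r1 - r2) ** 2 + (g1 - g2) ** 2 + (b1 - b2) ** 2);
}

// Hypothetical check: list hex colors that are near the brand color but not equal to it.
function findNearBrandColors(source: string, brand: string, tolerance = 60): string[] {
  const matches = source.match(/#[0-9a-fA-F]{6}\b/g) ?? [];
  const unique = Array.from(new Set(matches.map((m) => m.toUpperCase())));
  return unique.filter((hex) => {
    const d = colorDistance(hex, brand);
    return d > 0 && d <= tolerance;
  });
}

const sample = "bg-[#2563EB] text-[#3B82F6] border-[#1d4ed8] bg-[#FFFFFF]";
console.log(findNearBrandColors(sample, "#2563EB")); // the two near-miss blues
```

Defining the brand color once as a theme token and referencing it by name removes the need for this kind of scan in the first place.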

AI uses Tailwind's default breakpoints: 640px, 768px, 1024px, 1280px, and 1536px. Layouts break at intermediate widths like 834px (iPad portrait) or 900px (common laptop-with-sidebar viewport).

Unless you explicitly prompt for custom breakpoints, you'll only discover these gaps when a real user hits them.
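To find these gaps before users do, you can generate a list of viewport widths that deliberately fall on and between the breakpoints. The sketch below assumes Tailwind's default breakpoints (including the 1536px `2xl` breakpoint); `widthsToTest` is a hypothetical helper whose output could feed a screenshot or visual regression tool of your choice.

```typescript
// Tailwind's default min-width breakpoints, in pixels.
const TAILWIND_BREAKPOINTS: readonly number[] = [640, 768, 1024, 1280, 1536];

// Hypothetical helper: for every breakpoint, test one pixel below it, the
// breakpoint itself, and the midpoint to the next breakpoint. Extra widths
// (known device sizes, for example) are merged in and de-duplicated.
function widthsToTest(
  breakpoints: readonly number[],
  extra: readonly number[] = [],
): number[] {
  const widths = new Set<number>(extra);
  const sorted = breakpoints.slice().sort((a, b) => a - b);
  sorted.forEach((bp, i) => {
    widths.add(bp - 1);
    widths.add(bp);
    const next = sorted[i + 1];
    if (next !== undefined) widths.add(Math.round((bp + next) / 2));
  });
  return Array.from(widths).sort((a, b) => a - b);
}

console.log(widthsToTest(TAILWIND_BREAKPOINTS, [834, 900]));
```

The one-pixel-below values catch off-by-one layout switches, while the midpoints and extra device widths cover the ranges that breakpoint-only testing skips.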

The WebAIM Million study found 95.9% of homepages have WCAG failures, averaging 56.8 errors per page. AI-generated code underperforms even that baseline because models deprioritize semantic HTML and ARIA unless explicitly prompted.

Missing alt text, insufficient color contrast, unlabeled form inputs, broken focus order. These aren't edge cases. They're the default output.
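Color contrast is the easiest of these to check mechanically, because WCAG 2.x defines the calculation. The sketch below implements the WCAG relative luminance and contrast ratio formulas for 6-digit hex colors; the function names are illustrative, and the 4.5:1 and 3:1 thresholds are the WCAG AA minimums for normal and large text.

```typescript
// Convert an 8-bit sRGB channel to its linear value, per the WCAG 2.x definition.
function channelToLinear(value: number): number {
  const c = value / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a 6-digit hex color such as "#2563EB".
function relativeLuminance(hex: string): number {
  const h = hex.replace("#", "");
  const r = channelToLinear(parseInt(h.slice(0, 2), 16));
  const g = channelToLinear(parseInt(h.slice(2, 4), 16));
  const b = channelToLinear(parseInt(h.slice(4, 6), 16));
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio between two colors, from 1 (identical) to 21 (black on white).
function contrastRatio(fg: string, bg: string): number {
  const l1 = relativeLuminance(fg);
  const l2 = relativeLuminance(bg);
  return (Math.max(l1, l2) + 0.05) / (Math.min(l1, l2) + 0.05);
}

// WCAG AA: at least 4.5:1 for normal text, 3:1 for large text.
function meetsAA(fg: string, bg: string, largeText = false): boolean {
  return contrastRatio(fg, bg) >= (largeText ? 3 : 4.5);
}

console.log(contrastRatio("#2563EB", "#FFFFFF").toFixed(2));
console.log(meetsAA("#777777", "#FFFFFF")); // false: just under 4.5:1
```

Contrast is only one of the failures listed above; alt text, form labels, and focus order still need a dedicated accessibility audit.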

AI generates `text-base` (16px/24px) but ignores letter-spacing, font weights, or custom font imports. Your design might spec Inter at 500 weight with -0.02em tracking. You'll get system font at 400 weight with default tracking.

AI produces default component states but frequently omits hover, focus, active, and disabled variants. Buttons that don't respond to hover feel broken. Missing focus states leave keyboard users unable to see which element is currently focused.
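A simple lint-style check can flag interactive elements whose class list has no state variants at all. This sketch assumes Tailwind's `hover:`, `focus:`, `focus-visible:`, `active:`, and `disabled:` variant prefixes; `missingStates` is a hypothetical helper and does not handle arbitrary values that contain colons.

```typescript
// Each required state and the Tailwind variant prefixes that satisfy it.
const STATE_VARIANTS: ReadonlyArray<[string, readonly string[]]> = [
  ["hover", ["hover"]],
  ["focus", ["focus", "focus-visible"]],
  ["active", ["active"]],
  ["disabled", ["disabled"]],
];

// Hypothetical check: return the states that have no matching variant in the class list.
function missingStates(className: string): string[] {
  const used = new Set<string>();
  for (const token of className.split(/\s+/)) {
    const parts = token.split(":");
    // Everything before the last ":" is a variant prefix, e.g. "md:hover:bg-blue-700".
    for (const variant of parts.slice(0, -1)) used.add(variant);
  }
  return STATE_VARIANTS
    .filter(([, accepted]) => !accepted.some((v) => used.has(v)))
    .map(([state]) => state);
}

console.log(missingStates("bg-blue-600 text-white px-4 py-2"));
// every state is reported missing for this button
```

Running a check like this over every `button`, `a`, and form control in the generated markup gives a quick list of components that need a second pass.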

Z-index values lack a global stacking strategy. Modals render behind navbars. Tooltips clip behind adjacent sections. Dropdowns disappear under hero images.

Every component gets an arbitrary z-index instead of a coordinated system.
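A coordinated system can be as small as one ordered list of named layers from which every z-index is derived. The layer names, their order, and the step of 10 below are illustrative assumptions, not a standard.

```typescript
// Hypothetical layer order, lowest to highest. Components refer to a layer by
// name instead of inventing their own number.
const Z_LAYERS = ["base", "dropdown", "navbar", "overlay", "modal", "tooltip"] as const;
type Layer = (typeof Z_LAYERS)[number];

// Derive the numeric z-index from the layer's position, leaving gaps of 10
// so a new layer can be slotted in without renumbering everything.
function zIndex(layer: Layer): number {
  return Z_LAYERS.indexOf(layer) * 10;
}

console.log(zIndex("navbar")); // 20
console.log(zIndex("modal"));  // 40
```

Because the order lives in one place, a modal can no longer end up behind the navbar unless someone reorders the list on purpose.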

Arbon's testing showed a p-value of 0.7199 between Bolt.new and Lovable bug counts, meaning the difference between the two is not statistically significant. No single AI tool significantly outperforms the others because the architectural limitation is universal: models work with tokens, not pixels.

They can't render output in a browser. They can't compare against a design file. They optimize for syntactic validity, not visual fidelity.

Iterative prompting ("fix this, now fix that") costs 3-5 million tokens per cycle and introduces new regressions with each pass.

Instead:

A SmartBear survey found 68% of teams say faster AI-assisted development creates testing bottlenecks. The bottleneck isn't the building. It's verifying what was built matches what was designed.

What visual bugs have you hit with AI code generators? Curious if others are seeing the same patterns.
