{"slug": "uxray-i-built-an-ai-that-roasts-your-ui-like-a-senior-designer-would", "title": "UXRay: I Built an AI That Roasts Your UI Like a Senior Designer Would", "summary": "UXRay is a tool that uses the Gemma 4 E4B AI model to provide structured UX audits of user interfaces from a screenshot or URL. It analyzes the UI against established heuristics like Nielsen's principles and WCAG 2.1, returning a score, friction points, and recommendations. The tool is designed to run locally on a developer's CPU without a GPU, completing an audit in about 56 seconds.", "body_md": "*This is a submission for the Gemma 4 Challenge: Build with Gemma 4*\n\n## What I Built\n\n**UXRay**: drop a screenshot or paste a URL, get a full UX audit in about a minute.\n\nMost designers and developers ship UIs without a systematic critique. Hiring a UX consultant is expensive. Running a full user study takes weeks. UXRay closes that gap: it gives you the same structured, heuristic-based analysis a senior UX professional would produce, quickly, locally, and for free.\n\nYou give UXRay a UI (file upload or live URL) and it returns:\n\n- **Overall UX score** (0–100)\n- **Cognitive load analysis**: is the interface overwhelming users?\n- **Trust score**: what signals build or erode credibility?\n- **Friction points**: specific elements causing drop-off, each mapped to a Nielsen heuristic and rated critical / warning / info\n- **Prioritized recommendations**: actionable fixes sorted by urgency with effort and impact ratings\n- **Accessibility flags**: WCAG 2.1 violations visible in the screenshot\n- **Layout analysis**: fold content, visual hierarchy strength, whitespace quality, and scan pattern (Z vs F)\n\nThe analysis is grounded in established UX theory: Nielsen's 10 Usability Heuristics, Gestalt principles, Fogg's trust heuristics, Sweller's cognitive load theory, and WCAG 2.1. 
Every friction point cites the exact heuristic it violates so you know *why* something is a problem, not just *that* it is.\n\n**Stack:** Next.js 16 (App Router, TypeScript) · Tailwind v4 · Framer Motion · Gemma 4 E4B via Ollama · Playwright microservice for URL screenshots · Zod for structured output validation\n\n## Demo\n\nLive test: I pointed UXRay at dev.to. It captured a full-page screenshot, ran the Gemma 4 analysis, and returned a structured result — 85 overall score, 3 friction points, 3 prioritized recommendations — in about 56 seconds on CPU, no GPU required.\n\n## Code\n\n# UXRay — AI-Powered UX Analysis\n\nX-ray your interface through AI. Powered by Gemma 4 E4B.\n\nUXRay analyzes any UI screenshot like a behavioral psychologist — detecting cognitive load, trust signals, friction points, and actionable redesign recommendations. It uses **Gemma 4's native multimodal vision** to *see* the interface directly, not just process text descriptions.\n\nBuilt for the **Google Gemma 2026 Hackathon** on dev.to.\n\n## Demo\n\nUpload a screenshot or paste a URL → Gemma 4 analyzes it → structured UX critique appears:\n\n- **Overall UX Score** (0–100)\n- **Cognitive Load** gauge with specific issues\n- **Trust Score** with positive/negative signals\n- **Friction Points** with heuristic references (Nielsen, Gestalt, WCAG)\n- **Recommendations** sorted by priority with effort/impact ratings\n- **Accessibility Flags** and Layout Analysis\n\n## Prerequisites\n\n- **Ollama** installed and running:\n\n```\nbrew install ollama\nbrew services start ollama\n```\n\n- **Gemma 4 E4B** pulled:\n\n```\nollama pull gemma4:e4b\n```\n\n- **Node.js 18+**\n\n## Setup\n\n```\n# Clone the repo\ngit clone <repo-url>\ncd uxray\n# Install\n```\n\n…\n\nThe two key pieces of the pipeline:\n\n**1. 
Gemma 4 client (`web/lib/gemma.ts`)**\n\nSends the screenshot as a raw base64 image to Ollama's `/api/generate` endpoint with `format: \"json\"` enforced, streams the NDJSON response token-by-token, and validates the output against a strict Zod schema. If JSON parsing fails on the first pass, it automatically retries at a lower temperature (0.1) to coax a clean response.\n\n```js\nconst response = await fetch(`${OLLAMA_BASE_URL}/api/generate`, {\n  method: \"POST\",\n  headers: { \"Content-Type\": \"application/json\" },\n  body: JSON.stringify({\n    model: \"gemma4:e4b\",\n    prompt: SYSTEM_PROMPT + \"\\n\\n\" + USER_PROMPT,\n    images: [base64Image],   // raw base64, no data URI prefix\n    format: \"json\",          // enforces valid JSON output\n    stream: true,\n    options: {\n      temperature: 0.3,\n      num_ctx: 8192,\n    },\n  }),\n});\n```\n\n**2. Playwright screenshot service (`playwright-service/server.js`)**\n\nA small Express server that accepts a URL, spins up Chromium, captures a full-page screenshot, and returns it as base64. This lets UXRay analyze any live site without leaving the local pipeline.\n\nTo run it yourself:\n\n```\n# Pull the model first\nollama pull gemma4:e4b\n\n# Start both services (Next.js on :3000, Playwright on :3001)\nnpm install && npm run dev\n```\n\n## How I Used Gemma 4\n\nI chose **Gemma 4 E4B** (the 4-billion-parameter multimodal variant) for three reasons:\n\n### 1. Multimodal vision is load-bearing, not decorative\n\nUXRay's entire value proposition requires *seeing* the UI. The model has to identify specific elements — button labels, color contrast, spacing, typography — and reason about them in relation to UX principles. Gemma 4's vision capability handles this natively. There's no separate OCR step, no layout parsing pipeline, no element segmentation — the model just looks at the screenshot and reasons.\n\n### 2. E4B runs on CPU in a reasonable time\n\nThe 4B parameter count was a deliberate choice. 
I wanted UXRay to work on a developer's laptop without requiring a GPU. At ~56 seconds for a full audit on CPU, E4B hits the sweet spot: thorough enough to produce genuinely useful output, fast enough to feel interactive. The 31B Dense model would have been overkill for a local-first tool, and E2B felt too thin for the reasoning depth the structured output requires.\n\n### 3. JSON mode + structured output validation\n\nSetting `format: \"json\"` in the Ollama request pushes Gemma 4 to emit valid JSON directly, which I then validate with a Zod schema. The system prompt defines the exact schema (`frictionPoints`, `cognitiveLoad`, `trustScore`, `layoutAnalysis`) and the model follows it reliably. This makes the output directly renderable in the UI with zero post-processing.\n\nThe system prompt grounds every analysis in specific UX frameworks so the model doesn't just describe what it sees — it diagnoses *why* it's a problem and cites the principle being violated:\n\n```\nYou are UXRay, an expert UX analyst with deep knowledge of:\n- Nielsen's 10 Usability Heuristics\n- Gestalt principles of visual design\n- WCAG 2.1 accessibility guidelines\n- Cognitive load theory (Sweller)\n- Trust and credibility heuristics (Fogg's Persuasive Technology)\n- Conversion rate optimization (CRO)\n```\n\nA real friction point from the dev.to analysis looks like this:\n\n```\n{\n  \"id\": \"fp-1\",\n  \"location\": \"Primary CTA button\",\n  \"description\": \"Button label 'Get started' is generic — users cannot predict what commitment they're making, increasing hesitation at the conversion moment.\",\n  \"severity\": \"warning\",\n  \"heuristic\": \"Nielsen #6 — Recognition over recall\"\n}\n```\n\nGemma 4's ability to follow a complex, multi-section JSON schema while simultaneously reasoning about visual design principles across a real screenshot is what makes this whole approach viable. 
Swap it for a text-only model and UXRay doesn't exist.\n\n*Built with Gemma 4 E4B + Ollama + Next.js 16. Runs fully local — your screenshots never leave your machine.*", "url": "https://wpnews.pro/news/uxray-i-built-an-ai-that-roasts-your-ui-like-a-senior-designer-would", "canonical_source": "https://dev.to/pulkitgovrani/uxray-i-built-an-ai-that-roasts-your-ui-like-a-senior-designer-would-2gfl", "published_at": "2026-05-23 11:19:45+00:00", "updated_at": "2026-05-23 11:32:28.015191+00:00", "lang": "en", "topics": ["artificial-intelligence", "developer-tools", "products", "machine-learning", "large-language-models"], "entities": ["UXRay", "Gemma 4", "Ollama", "Playwright", "Zod", "Nielsen", "Fogg", "Sweller"], "alternates": {"html": "https://wpnews.pro/news/uxray-i-built-an-ai-that-roasts-your-ui-like-a-senior-designer-would", "markdown": "https://wpnews.pro/news/uxray-i-built-an-ai-that-roasts-your-ui-like-a-senior-designer-would.md", "text": "https://wpnews.pro/news/uxray-i-built-an-ai-that-roasts-your-ui-like-a-senior-designer-would.txt", "jsonld": "https://wpnews.pro/news/uxray-i-built-an-ai-that-roasts-your-ui-like-a-senior-designer-would.jsonld"}}