{"slug": "supercharge-your-web-app-with-free-ai-that-runs-in-your-users-browser", "title": "Supercharge your web app with free AI that runs in your users' browser", "summary": "Chrome now ships Gemini Nano, a language model accessible via the Prompt API, enabling on-device AI without API keys or inference costs. A developer integrated it into a free Mermaid diagram editor, using the model as a drafter and Mermaid's parser as a validator to ensure reliable output.", "body_md": "There is a class of feature that used to be impossible to ship for free: anything that needed a language model. You wired up an API key, you ate the per-token bill, and every prompt your users typed went off to someone else's server. For a small public tool, that math usually killed the idea before it started.\n\nThat changed. Recent versions of Chrome ship a language model, Gemini Nano, and expose it to any web page through the **Prompt API**. The model runs on the user's own machine. No API key. No inference bill. No data leaving the browser.\n\nWe put this into a real, live tool, a [free Mermaid diagram editor](https://bitvea.com/en/tools/mermaid-editor) where you describe a diagram in plain English and the browser writes the Mermaid code for you. This post is the developer's version of that story: how the API actually works, the code that makes a small on-device model trustworthy, and an honest accounting of what you gain and what you give up.\n\nThe important word is *built-in*. This is not WebGPU plus a 4 GB model you download and run yourself. The model ships with Chrome, and you talk to it through a small standard-track JavaScript API.\n\nAs of Chrome 148, the Prompt API is stable for web pages (it had been available to extensions since Chrome 138). 
It is the general-purpose member of a growing family of built-in APIs:\n\n- Prompt API (`LanguageModel`): general natural-language prompting, now multimodal (text, plus image and audio input).\n\nThe Prompt API is the one you reach for when you need something the task APIs don't cover, like \"turn this description into Mermaid source.\" So that is the one this post focuses on.\n\nHere is the whole happy path. Check availability, create a session, prompt it.\n\n```js\n// Feature-detect first. Old browsers won't have this at all.\nif ('LanguageModel' in self) {\n  const status = await LanguageModel.availability();\n\n  if (status !== 'unavailable') {\n    const session = await LanguageModel.create();\n    const answer = await session.prompt('Explain event loops in one sentence.');\n    console.log(answer);\n    session.destroy();\n  }\n}\n```\n\nThat is it. No keys, no SDK, no network call. The first time an origin uses the model, Chrome downloads it; after that it is local and works offline.\n\n`availability()` is the gate you build your UI around. It returns one of four states:\n\n- `\"unavailable\"`: the device can't run it (too little disk, no supported hardware, unsupported options).\n- `\"downloadable\"`: supported, but the model needs downloading first. Requires a user gesture to start.\n- `\"downloading\"`: a download is in progress.\n- `\"available\"`: ready right now.\n\nMermaid is a tiny text language: `A --> B` becomes a flowchart. It's great once you know it, and forgettable if you only touch it monthly. The obvious fix is to let people describe the diagram and have the model write the Mermaid. The non-obvious part is making a *small* model's output trustworthy.\n\nGemini Nano is small. Prompt it for code and it will sometimes wrap the output in markdown fences, add a chatty preamble, or emit a diagram with a subtle syntax error. 
If you pipe that straight into your renderer, you ship a tool that breaks every fifth try.\n\nThe fix is to treat the model as a drafter and put a real validator between it and the user. Mermaid ships its own parser, so we use it as the source of truth:\n\n```js\n// Strip any markdown fences the model adds despite the instructions.\nconst clean = (s) => s.replace(/```(?:mermaid)?/g, '').trim();\n\nasync function describeToMermaid(description) {\n  if ((await LanguageModel.availability()) === 'unavailable') return null;\n\n  const session = await LanguageModel.create({\n    initialPrompts: [{\n      role: 'system',\n      content:\n        'You write Mermaid diagram source. Output only valid Mermaid code. ' +\n        'No prose, no explanations, no markdown fences.',\n    }],\n  });\n\n  try {\n    let code = clean(await session.prompt(`Create a Mermaid diagram: ${description}`));\n\n    // Source of truth: Mermaid's own parser, not the model's confidence.\n    try {\n      await mermaid.parse(code);\n    } catch (err) {\n      // Exactly one self-correction pass. Hand the error back to the model.\n      code = clean(await session.prompt(\n        `That Mermaid failed to parse:\\n${err.message}\\n` +\n        `Return corrected Mermaid only.`\n      ));\n      await mermaid.parse(code); // still broken? this throws, caller handles it\n    }\n\n    return code;\n  } finally {\n    session.destroy(); // free the model; sessions are not free to hold open\n  }\n}\n```\n\nThat validate-and-retry loop is the difference between a demo and a tool. The model gets one chance to fix its own mistake. If it fails twice, we show a friendly message and leave the editor untouched rather than rendering garbage. The parser is the authority; the model is just a fast first draft.\n\nFor outputs that *are* structured, you don't have to hope. 
The Prompt API accepts a JSON Schema via `responseConstraint`, and the model is forced to match it:\n\n```js\nconst schema = { type: 'boolean' };\n\nconst result = await session.prompt(\n  `Is this text describing a sequence of steps?\\n\\n${input}`,\n  { responseConstraint: schema }\n);\nconsole.log(JSON.parse(result)); // true | false\n```\n\nMermaid source isn't cleanly expressible as JSON Schema, which is exactly why we lean on the parser instead. But for classification, extraction, or form-filling, structured output removes a whole category of cleanup code.\n\nThis is the part most people get wrong. On-device AI is a *bonus* for capable machines, not a baseline you can assume. So gate the feature, never the app.\n\nIn our editor, the entire tool (live preview, themes, export, sharing) works in any modern browser. The Generate-from-text box only appears when the model reports itself usable. Everyone else sees a normal editor and never knows a feature was missing.\n\n```js\nasync function setupAI(generateButton) {\n  if (!('LanguageModel' in self)) return; // not Chrome, or too old\n\n  const status = await LanguageModel.availability();\n  if (status === 'unavailable') return;\n\n  generateButton.hidden = false;\n\n  generateButton.onclick = async () => {\n    // Model download needs a user gesture; this click is it.\n    const session = await LanguageModel.create({\n      monitor(m) {\n        m.addEventListener('downloadprogress', (e) => {\n          showProgress(Math.round(e.loaded * 100)); // multi-GB first time\n        });\n      },\n    });\n    // ...use the session...\n  };\n}\n```\n\nTwo details that bite people:\n\n- When the model still has to download, `create()` must run inside a real user gesture (a click, key press, tap). Calling it on page load throws. 
Check `navigator.userActivation.isActive` if you're unsure.\n- Listen for `downloadprogress` and tell the user, or your \"Generate\" button looks frozen for minutes.\n\nThis is genuinely a new capability, and for the right feature it's hard to beat:\n\nEqually honest, because this is where the \"just use it everywhere\" dream dies:\n\n- `QuotaExceededError` and the `contextoverflow` event), and as of Chrome 149 the language model targets English, Spanish, Japanese, German, and French.\n- `temperature`/`topK` are extension-only for now, and it isn't available in Web Workers yet.\n\nThe decision is mostly about whether the feature is essential or a bonus, and how sensitive the data is.\n\nReach for on-device AI when the feature can be progressive enhancement, when privacy is a real selling point, when the workload is small and frequent (classify, extract, rewrite, draft), and when you'd rather not run a backend at all. That describes a surprising amount of \"nice to have\" AI.\n\nStay server-side when the feature is core to every user, when you need a large or frontier model, when output quality must be consistent across all hardware, or when you need it on mobile and Safari today. And you don't have to choose forever: a common pattern is **hybrid**, run on-device when available and fall back to a cloud model otherwise. Chrome's docs cover a polyfill and a Firebase AI Logic fallback for exactly this.\n\nFor our Mermaid editor the choice was easy. The diagram generator is a bonus, the people who can run it get something delightful and private, and everyone else gets a fully working editor. Nobody hits a wall.\n\nOne detail that cost us an afternoon and might save you one. 
Exporting the diagram to PNG meant drawing it onto a hidden canvas, and in Chrome it kept failing with `Tainted canvases may not be exported.` The cause: Mermaid was rendering text labels inside an embedded HTML element (a `foreignObject`), and the browser treats that as a security taint on the canvas, which blocks export. The fix was to configure Mermaid to render labels as real SVG `<text>` instead of embedded HTML. Bonus: the text now survives PNG export cleanly and stays selectable in the SVG. If you ever see a tainted-canvas error on an export that looked entirely local, check for `foreignObject` first.\n\nThe [Mermaid editor is live and free](https://bitvea.com/en/tools/mermaid-editor). If you're on a recent desktop Chrome, describe a diagram and watch the browser write it, with nothing leaving your machine. If you're not, you still get a fast editor with live preview, themes, and export.\n\nThe broader point: a meaningful slice of the AI features you've been quoting backend costs for can now run for free in the client, with better privacy than your server ever offered. It won't fit every case (the hardware bar and browser support see to that), but when it fits, it fits beautifully.\n\nThis is the pragmatic view of AI we bring to client work, too. We're far more interested in [AI that quietly does a real job](https://bitvea.com/en/services/ai-agents) than AI as a headline, and in [custom software built around how you actually work](https://bitvea.com/en/services/custom-software). If you've got a workflow that needs its own small, sharp tool, we like that kind of problem.\n\n*Built by Bitvea. You handle business. 
We handle IT.*", "url": "https://wpnews.pro/news/supercharge-your-web-app-with-free-ai-that-runs-in-your-users-browser", "canonical_source": "https://dev.to/petr_patek_12/supercharge-your-web-app-with-free-ai-that-runs-in-your-users-browser-2l2m", "published_at": "2026-06-20 21:30:07+00:00", "updated_at": "2026-06-20 21:39:00.957844+00:00", "lang": "en", "topics": ["large-language-models", "ai-tools", "developer-tools"], "entities": ["Chrome", "Gemini Nano", "Prompt API", "Mermaid"], "alternates": {"html": "https://wpnews.pro/news/supercharge-your-web-app-with-free-ai-that-runs-in-your-users-browser", "markdown": "https://wpnews.pro/news/supercharge-your-web-app-with-free-ai-that-runs-in-your-users-browser.md", "text": "https://wpnews.pro/news/supercharge-your-web-app-with-free-ai-that-runs-in-your-users-browser.txt", "jsonld": "https://wpnews.pro/news/supercharge-your-web-app-with-free-ai-that-runs-in-your-users-browser.jsonld"}}