I built a roguelike whose dungeon master is an LLM running 100% in the browser A developer built a roguelike game where the dungeon master is a large language model that runs entirely in the player's browser using WebLLM and WebGPU, eliminating server costs and enabling offline play. The architecture separates narrative generation from game logic to prevent the LLM from breaking rules, such as detecting player death through integer health tracking instead of prose analysis. The system supports multiple genres via configurable settings, allowing new game types to be added without code changes. Most "AI games" phone home. Every turn is an API round-trip, every player burns your tokens, and the whole thing dies the day the bill scares you. I wanted the opposite: a text roguelike where the dungeon master is an LLM that runs entirely in the player's browser — no server, no API key, no per-token cost, and it keeps working offline after the first load. Here's the architecture and the one bug that taught me the most. WebLLM https://github.com/mlc-ai/web-llm compiles quantized models to WebGPU, so inference runs on the player's GPU. There is no backend at all. js const cdn = "https://esm.run/@mlc-ai/web-llm"; const webllm = await import / webpackIgnore: true / cdn ; const engine = await webllm.CreateMLCEngine MODEL ID, { initProgressCallback: p = setLoading p.text , } ; First load pulls the weights once the browser caches them . After that every turn is local and free. A dungeon master should be creative, but it must not be allowed to break the rules. The split that made it stable: My most instructive bug: early on I let the prose drive death detection regex for "you die" , and the model cheerfully killed players on turn one with pure flavor text — "this could be the end of you" → game over. Moving death to an integer the engine owns if hp <= 0 fixed it instantly. Rule of thumb: the LLM writes the story; your code keeps the score. The tradeoff is model size: you run something small enough to load in a tab, so prompt design carries real weight. For a narrative game master that's a fair trade. Genre is just a config object — palette, HUD labels, seed scenarios, system prompt. Same engine, swap the config, ship a different game. Adding a genre is data , not a code change, which means a generator can author new ones. If you want to poke at a live one, the cyberpunk build NeonHeist and a few others are up under Games at bestpaid.app https://bestpaid.app — all running on-device. Happy to go deeper on the JSON-contract prompt or the WebGPU loading UX in the comments.