Turn any novel into a playable browser game in 30 minutes: meet the novel-game skill

ModelStudioAI today released a new Skill that transforms any novel or short story into a fully playable browser game in about 30 minutes. The novel-game Skill scaffolds a React single-page application with branching scenes, AI-generated portraits, cinematic cutscenes, optional text-to-speech narration, procedural Web Audio music, and save slots. It is the first Skill to combine video, image, speech, procedural audio, and frontend engineering into a single pipeline that outputs a runnable web app. Then you open localhost and play.

Repo: github.com/modelstudioai/skills · Skill path: `skills/novel-game/` · By ModelStudioAI

In one sentence: hand your Agent a novel, get back a playable browser game you can share with a friend.

This is not "generate a plot synopsis and call it interactive". It is not "here is a prompt, go paste it into some web UI". The novel-game skill literally:

- […] `bl` CLI: character portraits as 5-second looping videos or "breathing" still images, cutscenes as 1080p video or Ken-Burns-style stills, narration via TTS, BGM and SFX procedurally synthesized through Web Audio (no external audio files);
- […] `public/assets/` so the running game makes […]

A typical run produces 8–10 scenes, 3–5 branching endings, 6–8 AI portraits, 5–8 AI cutscenes, a procedural soundtrack, and three manual save slots plus autosave.

It is the first true end-to-end demo of the `bl` multimodal stack. Earlier first-party skills lean on a single modality: docs lookup, prompt studio, financial agent, single-shot short-form video. novel-game is the first that wires video + image + speech + procedural audio + frontend engineering into a single pipeline whose output you can click on.

It pushes "Agent Skill" from tool-call to product delivery. Most skills today expose an API. This one delegates a creative pipeline and hands back a runnable web app. That is a different abstraction altogether.

It is the cleanest "show, don't tell" demo we have. Walking a stakeholder through a live run beats ten slides: the Agent literally turned a novel into a game while we watched.

The full version lives in [SKILL.md](https://github.com/modelstudioai/skills/blob/main/skills/novel-game/SKILL.md). Condensed:

**1 · Requirements.** The Agent fires a single `AskUserQuestion` with seven decisions: source material (EPUB / TXT / freeform prompt), game type (visual novel / text adventure / text RPG), UI style (pixel / cyberpunk / ink-wash / minimal), narrative POV, asset mode (video / image / hybrid; hybrid recommended), audio mode (BGM only / +SFX / +TTS narration), target duration (15 / 30 / 60+ minutes).

**2 · Story design.** Extract 1–3 main lines, 3–5 branching choice points, 3–5 endings, 6–8 characters, 5–8 cutscene moments, plus unlockable codex entries.

**3 · Project scaffold.** `npx create-react-app` with the canonical layout: `components/`, `data/`, `hooks/`, `styles/`, `scripts/`, `public/assets/`.

**4 · Data model.** A single `story.js` describes the full scene graph: branching choices, flag mutations, cutscene triggers, codex unlocks, ending conditions. A sibling `generated-assets.json` indexes every AI-generated asset with its local path and type (video / image / mp3).

**5 · Implementation patterns.** Typewriter text via timed `setInterval` (40–50ms per char), choice panel with hover affordances, hash routing so any chapter is a deep link, `localStorage`-backed autosave plus three manual slots, portrait component that auto-detects video vs image, cutscene component with Ken Burns for stills, mobile-aware touch targets (≥44px and `touch-action: manipulation`), lazy video loading with explicit memory release on unmount.

**6 · Asset generation.**
`scripts/generate-assets.sh` drives the `bl` CLI:

- `bl video generate --download`: portrait or cutscene as 5s 720p video
- `bl image generate`: portrait or background, 768x1024 / 1920x1080
- `bl speech synthesize --voice longxiaochun`: TTS narration
- `bl video ref`: multi-reference video for character consistency

Video jobs take 2–5 minutes each, so the script submits in `--async` mode with 3–5 concurrent jobs, then batch-downloads. That is roughly a 4× wall-time speedup vs sequential.

Assuming a 30-minute / 15–18 scene mid-tier run in hybrid asset mode:

| Item | Count | Command / unit price | Subtotal |
|---|---|---|---|
| Character portraits (image) | 8 × 768×1024 | `bl image generate` | cents range |
| Key cutscenes (720p video) | 5 × 5s | ¥0.9/s × 5s × 5 | ≈ ¥22.5 |
| Scene backgrounds (image) | 8 × 1920×1080 | `bl image generate` | cents range |
| TTS narration (optional) | 15 × ~30s | `bl speech synthesize` | single-digit ¥ |

Authoritative pricing is on the ModelStudio console. New accounts get free credits, plenty to walk the demo end-to-end. The headline: roughly a cup of coffee, and you have your own playable visual novel.

1. Install the skill: `npx skills add modelstudioai/skills`. Pick novel-game from the prompt (or `--all` to grab the whole bundle).
2. Wire up `bl` and an API key: `npm i -g bailian-cli`, then `bl auth login`. Grab a key at [bailian.console.aliyun.com](https://bailian.console.aliyun.com/cn-beijing/?source_channel=key_github&tab=app#/api-key); new accounts get free credits.
3. Ask your favorite Agent (Claude Code / Qoder / Cursor / Cline / …) in natural language:

> Adapt the Ye Wenjie arc from "The Three-Body Problem" into a visual novel, ink-wash style, 30-minute playtime, hybrid assets.

The Agent owns the rest: ask the seven decisions, design the branches, scaffold the React project, write the code, generate every asset, and `npm start`.

**Prompt safety.** Video prompts containing weapons, smoking, or explicit violence get rejected.
The skill ships a built-in rewrite table: content lessons distilled from dozens of real runs.

**Procedural BGM.** Music is synthesized live via Web Audio, not pre-baked MP3s. Fixed MIDI pitch arrays, multi-voice layering, convolution reverb, ADSR envelopes, detuned pad sustains. Sounds composed; ships at zero file size.

**Mobile gotchas.** iOS Safari auto-zooms inputs whose font size is under 16px. `touch-action: manipulation` kills the 300ms click delay. `env(safe-area-inset-bottom)` leaves room for the home indicator. Dual-binding click + touchend recovers responsiveness on edge browsers. All wired in by default.

**Video memory release.** When leaving a scene, `video.pause(); video.removeAttribute('src'); video.load();` is required. Without it the WebView leaks frame buffers and a 30-minute play session ends in jank.

Our bar for first-party skills is one line: someone shipped real output with it, and the design is worth copying. novel-game clears both bars. Author [@lishengzxc](https://github.com/lishengzxc) used it to produce a complete novel-to-game adaptation: not a screenshot demo, an actual React project people have played. The pitfall guide in SKILL.md is paid-for-in-real-time wisdom.

If you build content, games, or interactive narratives, this is a low-cost weekend to spend. Share what you build over at [Issues](https://github.com/modelstudioai/skills/issues), or open a PR if you spot a missing tip in the SKILL.

Repo · [github.com/modelstudioai/skills](https://github.com/modelstudioai/skills)
Skill · `skills/novel-game/`
Try the model · [ModelStudio HappyHorse 1.1 playground](https://bailian.console.aliyun.com/cn-beijing?tab=demohouse&source_channel=hh_github#/experience/t2v)

[ModelStudioAI](https://github.com/modelstudioai) on GitHub
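
As a footnote to the video memory release tip above, here is a minimal sketch of how those three calls might be wrapped in one helper. The function name `releaseVideo`, the stand-in `createFakeVideo`, and the asset path are all hypothetical illustrations, not code taken from the skill; the fake element exists only so the sketch runs outside a browser.

```javascript
// Hypothetical helper: the name `releaseVideo` is an illustration, not from the skill.
// It applies the three calls quoted in the tip to any <video>-like object.
function releaseVideo(video) {
  if (!video) return false;      // nothing mounted, nothing to release
  video.pause();                 // stop playback
  video.removeAttribute('src');  // detach the media resource
  video.load();                  // reset the element so buffers can be dropped
  return true;
}

// Minimal stand-in for a <video> element (assumption: only the methods
// used above are modelled), recording the order in which they are called.
function createFakeVideo(src) {
  const calls = [];
  const attrs = { src };
  return {
    calls,
    pause() { calls.push('pause'); },
    removeAttribute(name) { calls.push('removeAttribute:' + name); delete attrs[name]; },
    load() { calls.push('load'); },
    hasAttribute(name) { return name in attrs; },
  };
}

// The asset path below is made up for the example.
const fake = createFakeVideo('/assets/cutscene-01.mp4');
releaseVideo(fake);
console.log(fake.calls.join(' -> ')); // prints "pause -> removeAttribute:src -> load"
```

In a React component, a helper like this would typically be called from the cleanup function of a `useEffect` that holds a ref to the `<video>` element, so the release happens on unmount.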