[ARTICLE · art-38920] src=dev.to ↗ pub= topic=generative-ai verified=true sentiment=↑ positive

Turn any novel into a playable browser game in 30 minutes — meet novel-game skill

ModelStudioAI released a new Skill that transforms any novel or short story into a fully playable browser game in about 30 minutes. The novel-game Skill scaffolds a React single-page application with branching scenes, AI-generated portraits, cinematic cutscenes, optional text-to-speech narration, procedural Web Audio music, and save slots. It is the first Skill to combine video, image, speech, procedural audio, and frontend engineering into a single pipeline that outputs a runnable web app.

5 min read · published Jun 25, 2026

A new ModelStudio Skill ships today: feed it any novel or short story and it scaffolds a full React SPA with branching scenes, AI-generated portraits, cinematic cutscenes, optional TTS narration, procedural Web Audio music, and save slots. Then you open localhost and play.

Repo: [github.com/modelstudioai/skills]

Skill path: skills/novel-game/

By [ModelStudioAI]

In one sentence: hand your Agent a novel, get back a playable browser game you can share with a friend.

This is not "generate a plot synopsis and call it interactive". It is not "here is a prompt, go paste it into some web UI". The novel-game skill literally produces the media itself. Through the bl CLI: character portraits as 5-second looping videos or "breathing" still images, cutscenes as 1080p video or Ken-Burns-style stills, and narration via TTS. BGM and SFX are procedurally synthesized through Web Audio (no external audio files). Generated assets are stored under public/assets/ for the running game.

A typical run produces 8–10 scenes, 3–5 branching endings, 6–8 AI portraits, 5–8 AI cutscenes, a procedural soundtrack, and three manual save slots plus autosave.

It is the first true end-to-end demo of the bl multimodal stack. Earlier first-party skills lean on a single modality: docs lookup, prompt studio, financial agent, single-shot short-form video. novel-game is the first that wires video + image + speech + procedural audio + frontend engineering into a single pipeline whose output you can click on.

It pushes "Agent Skill" from tool-call to product delivery. Most skills today expose an API. This one delegates a creative pipeline and hands back a runnable web app. That is a different abstraction altogether.

It is the cleanest "show, don't tell" demo we have. Walking a stakeholder through a live run beats ten slides — the Agent literally turned a novel into a game while we watched.

The full version lives in SKILL.md. Condensed:

1 · Requirements. The Agent fires a single AskUserQuestion with seven decisions: source material (EPUB / TXT / freeform prompt), game type (visual novel / text adventure / text RPG), UI style (pixel / cyberpunk / ink-wash / minimal), narrative POV, asset mode (video / image / hybrid, with hybrid recommended), audio mode (BGM only / +SFX / +TTS narration), target duration (15 / 30 / 60+ minutes).

2 · Story design. Extract 1–3 main lines, 3–5 branching choice points, 3–5 endings, 6–8 characters, 5–8 cutscene moments, plus unlockable codex entries.

3 · Project scaffold. npx create-react-app with the canonical layout: components/, data/, hooks/, styles/, scripts/, public/assets/.

4 · Data model. A single story.js describes the full scene graph: branching choices, flag mutations, cutscene triggers, codex unlocks, ending conditions. A sibling generated-assets.json indexes every AI-generated asset with its local path and type (video / image / mp3).
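To make the scene-graph idea concrete, here is a minimal sketch of what such a story.js could look like. The field names (`choices`, `setFlags`, `unlockCodex`, `endings`) and the `choose` / `resolveEnding` helpers are assumptions for illustration; the actual schema lives in SKILL.md and may differ.

```javascript
// Hypothetical shape for story.js; field names are illustrative, not taken from SKILL.md.
const story = {
  start: "intro",
  scenes: {
    intro: {
      text: "A letter arrives at dusk.",
      choices: [
        { label: "Open it", next: "letter", setFlags: { readLetter: true } },
        { label: "Burn it", next: "ending", setFlags: { readLetter: false } },
      ],
    },
    letter: {
      text: "The handwriting is familiar.",
      cutscene: "cutscene-letter", // hypothetical key into generated-assets.json
      unlockCodex: ["the-sender"],
      choices: [{ label: "Continue", next: "ending" }],
    },
    ending: {
      text: "Night falls.",
      // Ending conditions: the first rule whose predicate matches the flags wins.
      endings: [
        { id: "truth", when: (flags) => flags.readLetter === true },
        { id: "silence", when: () => true },
      ],
    },
  },
};

// Apply one choice and return the next state without mutating the old one.
function choose(state, choiceIndex) {
  const scene = story.scenes[state.sceneId];
  const choice = scene.choices?.[choiceIndex];
  if (!choice) {
    throw new RangeError(`No choice ${choiceIndex} in scene ${state.sceneId}`);
  }
  const next = story.scenes[choice.next];
  return {
    sceneId: choice.next,
    flags: { ...state.flags, ...(choice.setFlags ?? {}) },
    codex: [...new Set([...state.codex, ...(next.unlockCodex ?? [])])],
  };
}

// Resolve which ending applies, or null if the current scene is not an ending.
function resolveEnding(state) {
  const scene = story.scenes[state.sceneId];
  if (!scene.endings) return null;
  const match = scene.endings.find((rule) => rule.when(state.flags));
  return match ? match.id : null;
}

const initialState = { sceneId: story.start, flags: {}, codex: [] };
```

Keeping the graph as plain data with a pure `choose` function means the same state object can be serialized into a save slot or reconstructed from a deep link.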

5 · Implementation patterns. Typewriter text via timed setInterval (40–50ms per char), choice panel with hover affordances, hash routing so any chapter is a deep link, localStorage-backed autosave plus three manual slots, portrait component that auto-detects video vs image, cutscene component with Ken Burns for stills, mobile-aware touch targets (≥44px) and touch-action: manipulation, lazy video with explicit memory release on unmount.
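As one example of these patterns, the autosave-plus-three-slots idea can be sketched as a thin wrapper over the Web Storage interface. The key prefix, slot ids, and the `createSaveStore` / `createMemoryStorage` names are hypothetical; the skill's generated code may organize saves differently.

```javascript
// Hypothetical key names and slot ids; the real skill may lay these out differently.
const SAVE_PREFIX = "novel-game:save:";
const MANUAL_SLOTS = ["slot1", "slot2", "slot3"];
const AUTOSAVE_SLOT = "auto";

// `storage` is any object with the Web Storage getItem/setItem interface.
// In the browser, pass window.localStorage.
function createSaveStore(storage) {
  const isValidSlot = (slot) =>
    slot === AUTOSAVE_SLOT || MANUAL_SLOTS.includes(slot);

  return {
    save(slot, state) {
      if (!isValidSlot(slot)) throw new RangeError(`Unknown save slot: ${slot}`);
      const record = { savedAt: Date.now(), state };
      storage.setItem(SAVE_PREFIX + slot, JSON.stringify(record));
    },
    load(slot) {
      const raw = storage.getItem(SAVE_PREFIX + slot);
      if (raw === null) return null; // empty slot
      try {
        return JSON.parse(raw).state;
      } catch {
        return null; // corrupted entry: treat as empty rather than crash
      }
    },
  };
}

// Minimal in-memory stand-in so the sketch also runs outside a browser.
function createMemoryStorage() {
  const map = new Map();
  return {
    getItem: (key) => (map.has(key) ? map.get(key) : null),
    setItem: (key, value) => {
      map.set(key, String(value));
    },
  };
}
```

In the game, a call like `store.save("auto", state)` after every scene transition would give the autosave behavior, while the three manual slots are written only on explicit player action.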

6 · Asset generation. scripts/generate-assets.sh drives the bl CLI:

bl video generate --download   # portrait or cutscene as 5s 720p video
bl image generate              # portrait or background, 768x1024 / 1920x1080
bl speech synthesize --voice longxiaochun   # TTS narration
bl video ref                   # multi-reference video for character consistency

Video jobs take 2–5 minutes each, so the script submits in --async mode with 3–5 concurrent jobs, then batch-downloads. Roughly a 4× wall-time speedup vs sequential.
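The bounded-concurrency idea can be sketched in a few lines of JavaScript. This is not the skill's shell script: `runWithConcurrency` and `fakeVideoJob` are hypothetical names, and the job function is a timer stand-in rather than a real bl invocation.

```javascript
// Run async task functions with at most `limit` in flight at once.
// Results come back in the same order as `tasks`.
async function runWithConcurrency(tasks, limit) {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const results = new Array(tasks.length);
  let nextIndex = 0;

  async function worker() {
    while (nextIndex < tasks.length) {
      const index = nextIndex++; // claim a job; safe because JS runs this synchronously
      results[index] = await tasks[index]();
    }
  }

  const workerCount = Math.min(limit, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Stand-in for "submit one video job and wait for it to finish".
// A real script would call out to the bl CLI here instead of a timer.
function fakeVideoJob(name, ms) {
  return () =>
    new Promise((resolve) => setTimeout(() => resolve(`${name}.mp4`), ms));
}
```

With jobs of similar length, wall time is roughly the number of jobs divided by the limit, times one job's duration, which is where a speedup of about the concurrency level comes from.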

Assuming a 30-minute / 15–18 scene mid-tier run in hybrid asset mode:

| Item | Count | Unit | Subtotal |
| --- | --- | --- | --- |
| Character portraits (image) | 8 | 768×1024 | bl image generate, cents range |
| Key cutscenes (720p video) | 5 | 5s | ¥0.9/s × 5s × 5 ≈ ¥22.5 |
| Scene backgrounds (image) | 8 | 1920×1080 | bl image generate, cents range |
| TTS narration (optional) | 15 | ~30s | bl speech synthesize, single-digit ¥ |

Authoritative pricing is on the ModelStudio console. New accounts get free credits — plenty to walk the demo end-to-end.
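The video line dominates the estimate, so it is worth spelling out the arithmetic. The sketch below only restates the article's own figures (¥0.9 per second, five 5-second clips); the `videoCostYuan` helper is a hypothetical name, and the rate is not live pricing.

```javascript
// Figures copied from the estimate above; the per-second rate is the
// article's number, not an authoritative price.
function videoCostYuan(ratePerSecond, secondsPerClip, clipCount) {
  return ratePerSecond * secondsPerClip * clipCount;
}

const cutsceneCost = videoCostYuan(0.9, 5, 5);
console.log(`Cutscene video: about ¥${cutsceneCost.toFixed(1)}`);
// prints "Cutscene video: about ¥22.5"
```

Switching more portraits from image to video scales this line linearly, which is why hybrid mode keeps video for key cutscenes only.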

The headline: roughly a cup of coffee, and you have your own playable visual novel.

1. Install the skill:

npx skills add modelstudioai/skills

Pick novel-game from the prompt (or --all to grab the whole bundle).

2. Wire up bl and an API key:

npm i -g bailian-cli
bl auth login

Grab a key at bailian.console.aliyun.com — new accounts get free credits.

3. Ask your favorite Agent (Claude Code / Qoder / Cursor / Cline / …) in natural language:

Adapt the Ye Wenjie arc from "The Three-Body Problem" into a visual novel,
ink-wash style, 30-minute playtime, hybrid assets.

The Agent owns the rest: ask the seven decisions, design the branches, scaffold the React project, write the code, generate every asset, and npm start.

Prompt safety. Video prompts containing weapons, smoking, or explicit violence get rejected. The skill ships a built-in rewrite table — content lessons distilled from dozens of real runs.

Procedural BGM. Music is synthesized live via Web Audio, not pre-baked MP3s. Fixed MIDI pitch arrays, multi-voice layering, convolution reverb, ADSR envelopes, detuned pad sustains. Sounds composed; ships at zero file size.
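A rough sketch of two of these ingredients, a fixed MIDI pitch array and an ADSR envelope with a detuned second voice, using standard Web Audio calls. The motif, envelope timings, and function names are assumptions for illustration; the skill's real arrangement, including its convolution reverb, is not reproduced here.

```javascript
// Convert a MIDI note number to Hz (equal temperament, A4 = MIDI 69 = 440 Hz).
function midiToFrequency(note) {
  return 440 * Math.pow(2, (note - 69) / 12);
}

// Hypothetical motif; the skill's actual pitch arrays are not shown in the article.
const MOTIF = [57, 60, 64, 67]; // A3, C4, E4, G4

// Schedule one note with an ADSR gain envelope on an AudioContext `ctx`.
// Assumes duration >= attack + decay so the envelope stages stay in order.
function playNote(ctx, note, startTime, duration, opts = {}) {
  const {
    attack = 0.02, decay = 0.1, sustain = 0.6, release = 0.3,
    peak = 0.3, detuneCents = 0, type = "sine",
    destination = ctx.destination,
  } = opts;

  const osc = ctx.createOscillator();
  const gain = ctx.createGain();
  osc.type = type;
  osc.frequency.value = midiToFrequency(note);
  osc.detune.value = detuneCents;

  const noteOff = startTime + duration;
  gain.gain.setValueAtTime(0, startTime);
  gain.gain.linearRampToValueAtTime(peak, startTime + attack); // attack
  gain.gain.linearRampToValueAtTime(peak * sustain, startTime + attack + decay); // decay
  gain.gain.setValueAtTime(peak * sustain, noteOff); // hold sustain until note-off
  gain.gain.linearRampToValueAtTime(0, noteOff + release); // release

  osc.connect(gain);
  gain.connect(destination);
  osc.start(startTime);
  osc.stop(noteOff + release);
}

// Layer two voices per note: a main voice plus a quieter, slightly detuned one.
function playMotif(ctx, notes = MOTIF, beatSeconds = 0.5) {
  const t0 = ctx.currentTime + 0.05; // small lead time so nothing is scheduled in the past
  notes.forEach((note, i) => {
    const start = t0 + i * beatSeconds;
    playNote(ctx, note, start, beatSeconds, { type: "triangle" });
    playNote(ctx, note, start, beatSeconds, { detuneCents: 7, peak: 0.15 });
  });
}
```

In a browser, `playMotif(new AudioContext())` would need to run from a user gesture such as a click, since autoplay policies keep an audio context suspended until then.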

Mobile gotchas. iOS Safari auto-zooms inputs whose font size is under 16px. touch-action: manipulation to kill the 300ms click delay. env(safe-area-inset-bottom) for the home indicator. Dual-bind click + touchend to recover responsiveness on edge browsers. All wired in by default.
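One wrinkle with binding both click and touchend is that a single tap can fire both events, so the handler needs a guard. A minimal sketch under that assumption follows; `bindTap`, the 500ms window, and the injectable clock are hypothetical and not taken from the skill.

```javascript
// Bind one handler to both touchend and click, ignoring the follow-up click
// that browsers fire shortly after a touch. `now` is injectable for testing.
function bindTap(element, handler, { dedupeMs = 500, now = Date.now } = {}) {
  let lastTouchAt = -Infinity;

  const onTouchEnd = (event) => {
    lastTouchAt = now();
    handler(event);
  };
  const onClick = (event) => {
    if (now() - lastTouchAt < dedupeMs) return; // already handled via touchend
    handler(event);
  };

  element.addEventListener("touchend", onTouchEnd);
  element.addEventListener("click", onClick);

  // Return an unbind function for cleanup (for example, from a React effect).
  return () => {
    element.removeEventListener("touchend", onTouchEnd);
    element.removeEventListener("click", onClick);
  };
}
```

Mouse-only clicks still go through untouched, because no touchend precedes them.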

Video memory release. When leaving a scene, video.pause(); video.removeAttribute('src'); video.load(); is required. Without it the WebView leaks frame buffers and a 30-minute play session ends in jank.
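That teardown is easy to wrap in a helper so every scene uses the same sequence. The sketch below uses pause() as the first step, the conventional call before dropping src; `releaseVideo` is a hypothetical name, and the React usage in the comment is an assumption about how a generated component might call it.

```javascript
// Release a <video> element's media before its scene unmounts.
function releaseVideo(video) {
  if (!video) return; // ref may already be cleared
  video.pause(); // stop playback
  video.removeAttribute("src"); // drop the reference to the media file
  video.load(); // reset the element so the browser can free its buffers
}

// Hypothetical React usage (not executed here):
//   useEffect(() => {
//     const el = videoRef.current;
//     return () => releaseVideo(el);
//   }, []);
```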

Our bar for first-party skills is one line: someone shipped real output with it, and the design is worth copying.

novel-game clears both bars. Author @lishengzxc used it to produce a complete novel-to-game adaptation: not a screenshot demo, but an actual React project people have played. The pitfall guide in SKILL.md is paid-for-in-real-time wisdom.

If you build content, games, or interactive narratives, this is a low-cost weekend to spend.

Share what you build over at Issues, or open a PR if you spot a missing tip in the SKILL.

Repo · github.com/modelstudioai/skills

Skill · skills/novel-game/

Try the model · ModelStudio HappyHorse 1.1 playground

ModelStudioAI on GitHub
