[ARTICLE · art-41769] src=dev.to ↗ pub= topic=artificial-intelligence verified=true sentiment=↑ positive

Voilaa! — Turning Any YouTube Video into an Interactive Learning App with Google Gemini

A developer built Voilaa!, a full-stack educational app that uses Google Gemini AI to transform any YouTube video into an interactive learning experience. The app generates live quizzes, flashcards, and simulators by running a two-stage AI chain: first, a pedagogist model analyzes the video and outputs a structured spec; second, a code generation model produces a single-file HTML/CSS/JS application. The project highlights the importance of prompt engineering, particularly the constraint that the spec must be implementable by a junior developer in one HTML file.

4 min read · Published Jun 27, 2026

This post is my submission for DEV Education Track: Build Apps with Google AI Studio.

Voilaa! is a full-stack educational playground that transforms any YouTube video into a rich, interactive learning experience (live quizzes, flashcard decks, formula simulators, and data visualizations), all generated on the fly by Google Gemini AI.

The idea is simple: paste a YouTube URL, choose your academic depth, and within seconds Gemini analyzes the video's content and synthesizes a fully functional, self-contained interactive HTML learning app tailored to that exact lesson.

The magic is a two-stage AI chain running entirely on the server side:

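The first model acts as a pedagogist. It watches the video and produces a structured JSON payload containing:

spec: a detailed, self-contained spec for an interactive web app that reinforces the video's key ideas

flashcards: a list of at least 5 key terms with concise definitions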

The prompt I crafted for this stage was the most important piece of the whole project:

You are a pedagogist and product designer with deep expertise in crafting 
engaging learning experiences via interactive web apps.

Examine the contents of the attached video. Then, provide the following in JSON:
1. "spec": A detailed spec for an interactive web app designed to complement 
   the video and reinforce its key ideas. The spec must be thorough and 
   self-contained (must not mention it is based on a video).
2. "flashcards": A list of at least 5 key terms and concise definitions 
   extracted from the video.

The goal of the app is to enhance understanding through simple and playful 
design. A junior web developer should be able to implement it in a single 
HTML file (with all styles and scripts inline). The spec must clearly outline 
the core mechanics, and those mechanics must be highly effective in reinforcing 
the video's key ideas.
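
Because the prompt asks for JSON with two fields, the server can check the reply before passing it on. The following is a minimal sketch of such a check, not the app's actual code: the `Flashcard` field names `term` and `definition`, and the function name `parseAnalysis`, are assumptions made for illustration.

```typescript
// Assumed shape of the stage-1 payload. The prompt only asks for
// "key terms and concise definitions"; the field names are hypothetical.
interface Flashcard {
  term: string;
  definition: string;
}

interface VideoAnalysis {
  spec: string;
  flashcards: Flashcard[];
}

function isFlashcard(value: unknown): value is Flashcard {
  if (typeof value !== "object" || value === null) return false;
  const card = value as Record<string, unknown>;
  return typeof card.term === "string" && typeof card.definition === "string";
}

// Parses the model's raw text and rejects anything that does not match
// what the prompt asked for: a non-empty spec and at least 5 flashcards.
function parseAnalysis(raw: string): VideoAnalysis {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("Expected a JSON object");
  }
  const { spec, flashcards } = data as Record<string, unknown>;
  if (typeof spec !== "string" || spec.trim() === "") {
    throw new Error('Missing or empty "spec"');
  }
  if (!Array.isArray(flashcards) || !flashcards.every(isFlashcard)) {
    throw new Error('"flashcards" must be a list of {term, definition}');
  }
  if (flashcards.length < 5) {
    throw new Error("Expected at least 5 flashcards");
  }
  return { spec, flashcards };
}
```

Failing fast here keeps a malformed stage-1 reply from reaching the code generation stage.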

The second model receives the spec and generates a pristine, single-file HTML/CSS/JS application — no frameworks, no external dependencies — ready to run inside a sandboxed iframe.
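
The hand-off between the two stages can be sketched as below. This is an illustration under assumptions rather than the project's code: each model call is abstracted as a `GenerateText` function (prompt in, text out) so the sketch runs without the SDK, and the stage-2 prompt wording, `runChain`, and `stripCodeFence` are hypothetical.

```typescript
// A model call reduced to "prompt in, text out". In the real app this would
// wrap the @google/genai SDK; that wiring is not shown in the article.
type GenerateText = (prompt: string) => Promise<string>;

interface ChainResult {
  spec: string;
  html: string;
}

// Models sometimes wrap code in a Markdown fence; remove it if present.
const FENCE = "`".repeat(3);
const FENCED = new RegExp("^" + FENCE + "[a-zA-Z]*\\n([\\s\\S]*?)\\n?" + FENCE + "$");

function stripCodeFence(text: string): string {
  const trimmed = text.trim();
  const match = trimmed.match(FENCED);
  return (match?.[1] ?? trimmed).trim();
}

async function runChain(
  analysisPrompt: string,
  planner: GenerateText,
  coder: GenerateText,
): Promise<ChainResult> {
  // Stage 1: the planner returns JSON that contains the spec.
  const parsed = JSON.parse(await planner(analysisPrompt)) as { spec?: unknown } | null;
  const spec = parsed?.spec;
  if (typeof spec !== "string") {
    throw new Error('Stage 1 did not return a string "spec"');
  }
  // Stage 2: the coder sees only the spec, never the video.
  // The wording of this prompt is an assumption for illustration.
  const codePrompt =
    "Implement the following spec as a single self-contained HTML file " +
    "with inline CSS and JavaScript. Return only the code.\n\n" +
    spec;
  const html = stripCodeFence(await coder(codePrompt));
  return { spec, html };
}
```

Passing the model calls in as functions also makes the chain easy to exercise with stubbed responses.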

| Layer | Technology |
| --- | --- |
| Frontend | React 18 + Vite (SPA) |
| Styling | Tailwind CSS + motion animations |
| Code Editor | Monaco Editor (same engine as VS Code) |
| Charts | Recharts |
| Icons | Lucide React |
| AI | @google/genai TypeScript SDK |
| Backend | Node.js + Express 5 |
| Runtime | tsx (direct TypeScript execution) |

The Gemini API key lives exclusively on the server — never exposed to the client bundle.

Once a learning app is generated, users get a three-tab workspace:

A live sandboxed <iframe> running the generated app, fully interactive, no page reload required.

A full Monaco Editor showing (and letting you edit) the raw generated HTML/JS/CSS. Any saved changes hot-reload the preview instantly.

Inspect or edit the curriculum blueprint produced by the Semantic Analyst — great for prompting a regeneration with tweaks.

There's also a Zen Mode (fades surrounding UI to focus on the lesson) and Fullscreen Mode for distraction-free study.

🔗 Live App: https://voilaa-498153626537.us-west1.run.app/

→ Gemini analyzes chord progressions, tension, and resolution

→ Generates an interactive piano simulator with chord-click feedback

→ Flashcard deck covers: Tonic, Dominant, Leading Tone, Cadence, Voice Leading

→ Gemini generates a step-by-step animated bubble sort / merge sort visualizer

→ Flashcards cover time complexity, in-place sorting, stability, etc.

I expected the hardest part to be the frontend sandbox mechanics. It wasn't. The hardest part was prompt engineering the Semantic Analyst.

Early versions of the spec prompt produced specs that were either too vague ("make an interactive quiz") or too ambitious ("build a multi-page React app with a backend"). The breakthrough was adding the constraint:

"A junior web developer should be able to implement it in a single HTML file."

This single sentence dramatically improved output quality — Gemini started producing specs with clearly scoped, concrete mechanics instead of wishful thinking.

Two-model chains unlock quality you can't get from one prompt. Separating "think about what to build" from "write the code" produced dramatically better results. The planning model could focus entirely on pedagogy; the coding model could focus entirely on implementation.

Temperature matters more than model choice for creative educational content. A temperature of ~0.75 produced the most varied and playful learning apps, while staying coherent.
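
As a rough sketch of where that setting could live, the helper below builds a request object for the planning stage. It is written from my understanding of the parameter shape that the @google/genai SDK's `generateContent` accepts (`model`, `contents`, `config`), so check it against the SDK documentation before relying on it. The helper name `buildSpecRequest`, the accepted 0 to 2 range, and passing the YouTube URL as a `fileData` part are assumptions, not details from the project.

```typescript
// Plain-object shape for the planning request. It is meant to resemble the
// SDK's generateContent parameters, but it is an assumption, not a quote.
interface SpecRequest {
  model: string;
  contents: Array<{
    role: "user";
    parts: Array<{ text: string } | { fileData: { fileUri: string } }>;
  }>;
  config: { temperature: number; responseMimeType: string };
}

function buildSpecRequest(
  model: string,
  prompt: string,
  videoUrl: string,
  temperature: number = 0.75, // the value the article found worked best
): SpecRequest {
  // Assumed valid range; adjust to whatever the chosen model documents.
  if (!Number.isFinite(temperature) || temperature < 0 || temperature > 2) {
    throw new RangeError("temperature must be between 0 and 2");
  }
  return {
    model,
    contents: [
      {
        role: "user",
        parts: [{ text: prompt }, { fileData: { fileUri: videoUrl } }],
      },
    ],
    // Asking for JSON output keeps the stage-1 reply machine-readable.
    config: { temperature, responseMimeType: "application/json" },
  };
}
```

Keeping the temperature as a parameter with a default makes it easy to compare a few values on the same video.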

Keeping the API key server-side is non-negotiable. Even for a hackathon demo, having Express proxy all Gemini calls protects your quota and prevents key leakage.
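
One way to structure such a proxy is to keep the route logic in a framework-free function that an Express handler calls. The sketch below is an assumption about how this could look, not the project's code: `handleGenerate`, `CallGemini`, the `/api/generate` path, and the `GEMINI_API_KEY` variable name are all hypothetical.

```typescript
interface HandlerResult {
  status: number;
  body: Record<string, unknown>;
}

// The actual Gemini call is injected; it is the only code that sees the key.
type CallGemini = (apiKey: string, videoUrl: string) => Promise<string>;

async function handleGenerate(
  body: unknown,
  apiKey: string | undefined,
  callGemini: CallGemini,
): Promise<HandlerResult> {
  if (!apiKey) {
    return { status: 500, body: { error: "Server is missing its API key" } };
  }
  const videoUrl = (body as { videoUrl?: unknown } | null)?.videoUrl;
  if (typeof videoUrl !== "string" || videoUrl.trim() === "") {
    return { status: 400, body: { error: "videoUrl is required" } };
  }
  try {
    const html = await callGemini(apiKey, videoUrl);
    // Only the generated HTML goes back to the browser, never the key.
    return { status: 200, body: { html } };
  } catch {
    return { status: 502, body: { error: "Upstream model call failed" } };
  }
}

// Possible Express wiring (route path and env var name are assumptions):
//   app.post("/api/generate", async (req, res) => {
//     const r = await handleGenerate(req.body, process.env.GEMINI_API_KEY, callGemini);
//     res.status(r.status).json(r.body);
//   });
```

The client only ever talks to this route, so the key stays out of the bundle and out of every response.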

Sandboxed iframes are underrated. Running user-generated HTML inside <iframe sandbox="allow-scripts"> meant I could ship AI-generated code directly to the browser without worrying about XSS or DOM pollution.
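
To make the isolation concrete, here is a small sketch that builds such an iframe as a string with the generated page in `srcdoc`. The helper names are hypothetical, and in a React component you would more likely pass the HTML through the `srcDoc` prop. With `allow-scripts` but without `allow-same-origin`, the framed document gets an opaque origin, so its scripts run but cannot reach the parent page's DOM, cookies, or storage.

```typescript
// Escape the two characters that could break out of a double-quoted attribute.
function escapeAttribute(value: string): string {
  return value.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
}

// Builds the preview iframe markup. Deliberately omits allow-same-origin,
// which would let the generated code act with the host page's origin.
function buildPreviewIframe(generatedHtml: string): string {
  return (
    '<iframe sandbox="allow-scripts" srcdoc="' +
    escapeAttribute(generatedHtml) +
    '"></iframe>'
  );
}
```

The sandbox limits what the generated code can touch, but it does not vet the code itself, so it is still worth treating the output as untrusted.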

Voilaa! was a genuinely fun project to build. The combination of Gemini's multimodal understanding and the flexibility of the @google/genai SDK made what could have been a complex AI integration feel surprisingly clean. If you've got a YouTube rabbit hole you're currently lost in, try turning it into an interactive lesson instead. 🎬✨
