# HeyGen Biweekly Video skill for Claude Code — avatar-hosted engineering updates from GitHub activity. Drop into ~/.claude/skills/ and invoke with /heygen-biweekly-video.

> Source: <https://gist.github.com/miguel-heygen/9337897578a915f4645d39d9d0b20703>
> Published: 2026-05-30 00:50:57+00:00

| name | heygen-biweekly-video |
|---|---|
| description | Produce a launch-grade biweekly team-update video — avatar-hosted (HeyGen CLI), built in HyperFrames, with real captured product UI, real preview videos, kinetic captions, smooth camera moves, and music ducked under the voice. Use when: biweekly / sprint recap video, avatar-narrated dev update, changelog-as-video. |

A repeatable pipeline for an After-Effects-quality team update: a HeyGen avatar narrates
over **real captured product UI** + motion graphics, every beat synced to the avatar's voice.
Output: 1920×1080 / 60fps / ~90s.

- `heygen` CLI, authed (`heygen auth status`); get it at [https://github.com/heygen-com/skills](https://github.com/heygen-com/skills)
- `hyperframes` CLI (`npx hyperframes`): [https://github.com/heygen-com/hyperframes](https://github.com/heygen-com/hyperframes)
- `bun`, `ffmpeg`/`ffprobe`, `agent-browser`, `gh`, `jq`

- **Everything REAL, no mockups.** Capture actual product UI via website-to-html; use real preview videos; use real numbers from GitHub.
- **The avatar's voice is the single narration spine.** One continuous take; transcribe for word-level timings; sync every act and caption to those cues.
- **Verify by rendering frames.** Extract a frame (`ffmpeg -ss <t> ... -vframes 1`) and *look* before claiming any scene works.
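Syncing acts to the voice can be sketched as a lookup from marker phrases in the script to word start times. This is a minimal sketch, assuming the transcript has been loaded into an array of `{ text, start, end }` objects with times in seconds. That shape, and the names `findCueTime` and `planActs`, are hypothetical; adapt them to whatever `transcript.json` actually contains.

```js
// Assumed word shape (hypothetical): { text, start, end }, times in seconds.

// Normalise a token for matching: lowercase, strip punctuation.
const norm = (s) => s.toLowerCase().replace(/[^\p{L}\p{N}]+/gu, "");

// Return the start time of the first occurrence of `phrase`, or null.
function findCueTime(words, phrase) {
  const target = phrase.split(/\s+/).map(norm).filter(Boolean);
  if (target.length === 0) return null;
  for (let i = 0; i + target.length <= words.length; i++) {
    if (target.every((t, j) => norm(words[i + j].text) === t)) {
      return words[i].start;
    }
  }
  return null;
}

// Map each act to the moment its marker phrase is spoken.
function planActs(words, markers) {
  return markers.map(({ act, phrase }) => ({
    act,
    start: findCueTime(words, phrase),
  }));
}
```

A `null` start means the phrase was not found, which usually points at a transcription mishear worth fixing before wiring the act.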

| # | Phase | What to do |
|---|---|---|
| 0 | Reference | Study `heygen-com/hyperframes-launches`, the quality bar. Pick the closest launch, reuse its music/SFX. |
| 1 | Count | `gh search prs --author=@me --created=<start>..<end>` → cluster into 3 hero themes |
| 2 | Numbers | Pull public metrics (GitHub stars, PRs, releases, contributors, catalog size) |
| 3 | Avatar | Write ~250-word script → `heygen video create -d '{"type":"avatar","avatar_id":"<ID>","script":"...","output_format":"webm"}'` → transparent webm → transcribe for cues |
| 4 | Capture UI | Run your app locally → capture via agent-browser + CDP MHTML → convert to standalone HTML |
| 5 | Build acts | cold-open → intro → hero sections → stats → CTA/outro |
| 6 | Wire master | Avatar full→PiP→full, music + SFX + voice |
| 7 | Render + master | `hyperframes render --fps 60 --quality high` → duck music under voice → deliver |
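For Phase 1, the `<start>..<end>` range and the "3 hero themes" tally can be sketched as below. This is only an illustration: `biweeklyRange` and `topThemes` are hypothetical helper names, and the `themeOf` function is whatever grouping key you choose (title prefix, repository, label). Picking the actual themes remains an editorial call.

```js
// Format a Date as YYYY-MM-DD (UTC).
const ymd = (d) => d.toISOString().slice(0, 10);

// Build the "<start>..<end>" value for `--created`, ending at `end`.
function biweeklyRange(end, days = 14) {
  const start = new Date(end.getTime() - days * 24 * 60 * 60 * 1000);
  return `${ymd(start)}..${ymd(end)}`;
}

// Tally PRs by a caller-supplied theme key and keep the n largest groups.
function topThemes(prs, themeOf, n = 3) {
  const groups = new Map();
  for (const pr of prs) {
    const key = themeOf(pr);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(pr);
  }
  return [...groups.entries()]
    .map(([theme, items]) => ({ theme, count: items.length, items }))
    .sort((a, b) => b.count - a.count)
    .slice(0, n);
}
```

For example, `topThemes(prs, (pr) => pr.title.split(":")[0])` groups by a conventional-commit style prefix, assuming your PR titles follow that pattern.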

```
heygen auth status
heygen avatar looks get <LOOK_ID>    # confirm look + engines; group_id is NOT a 2nd avatar

# write script to script.txt (~250 words ≈ 100s at 150wpm; ~225 words for a 90s cut), then:
jq -Rs '{type:"avatar",avatar_id:"<ID>",script:.,output_format:"webm",aspect_ratio:"9:16",resolution:"1080p"}' \
  script.txt > req.json
heygen video create -d req.json      # → video_id; webm = transparent (alpha_mode=1)

# poll until done
VID=<video_id>
for i in $(seq 1 90); do
  s=$(heygen video get $VID | jq -r .data.status)
  [ "$s" = completed ] && break; sleep 20
done

heygen video download $VID --output-path assets/host.webm --force
ffmpeg -y -i assets/host.webm -vn -c:a aac -b:a 192k -ar 48000 assets/host-voice.m4a
npx hyperframes transcribe assets/host.webm --model base.en    # → word-level transcript.json
```

Fix Whisper mishears (product names like "HeyGen", "HyperFrames") before generating captions.
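A minimal sketch of that fix-then-caption step, assuming the word-level transcript is an array of `{ text, start, end }` objects with times in seconds. The shape, the `FIXES` entries (illustrative mishears only), and the `fixMishears` / `groupIntoCues` names are all hypothetical; check them against the real `transcript.json` before relying on this.

```js
// Assumed word shape (hypothetical): { text, start, end }, times in seconds.
// Illustrative corrections only; extend with the mishears you actually see.
const FIXES = new Map([
  ["haygen", "HeyGen"],
  ["heygen", "HeyGen"],
  ["hyperframes", "HyperFrames"],
]);

// Replace known mishears, keeping leading/trailing punctuation intact.
function fixMishears(words, fixes = FIXES) {
  return words.map((w) => {
    const [, lead, core, trail] = w.text.match(
      /^([^\p{L}\p{N}]*)([\s\S]*?)([^\p{L}\p{N}]*)$/u
    );
    const fixed = fixes.get(core.toLowerCase());
    return fixed ? { ...w, text: lead + fixed + trail } : w;
  });
}

// Group words into short caption cues; a long pause starts a new cue.
function groupIntoCues(words, maxWords = 4, maxGapSec = 0.6) {
  const cues = [];
  let cur = null;
  for (const w of words) {
    const gap = cur ? w.start - cur.end : 0;
    if (!cur || cur.count >= maxWords || gap > maxGapSec) {
      cur = { start: w.start, end: w.end, text: w.text, count: 1 };
      cues.push(cur);
    } else {
      cur.text += " " + w.text;
      cur.end = w.end;
      cur.count += 1;
    }
  }
  return cues.map(({ start, end, text }) => ({ start, end, text }));
}
```

Running `groupIntoCues(fixMishears(words))` yields cues whose `start`/`end` come straight from the voice track, so captions stay locked to the narration.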

```
# open your app
agent-browser set viewport 1920 1080
agent-browser open "http://localhost:5190/"
agent-browser wait --load networkidle

# drive to the exact state (click tabs, seek timeline, select elements)
agent-browser click @eN
agent-browser screenshot /tmp/state.png   # reference

# capture MHTML via CDP
agent-browser get cdp-url | grep -o 'ws://[^ ]*' > /tmp/cdp.txt
# grab-mhtml.mjs (bundled below): attaches to the page target, runs Page.captureSnapshot
bun grab-mhtml.mjs "$(cat /tmp/cdp.txt)" /tmp/page.mhtml

# convert to standalone HTML
bun mhtml-to-html.mjs /tmp/page.mhtml captures/page.html
perl -i -pe 's/[\w.-]*\@mhtml\.blink/about:blank/g' captures/page.html

# wrap into a HyperFrames sub-comp backdrop
bun build-studio-bg.mjs captures/page.html compositions/page-bg.html page-bg 15
```

**Light-themed captures (Next.js etc.) MUST be iframe-isolated**: their global CSS leaks and turns the whole video white. Use `<iframe src="../captures/page.html">` instead of inlining.

A flat `data-volume` on the music track is NOT enough: the track's body is far louder than its intro. Flatten dynamics first, then duck:

```
# master-audio.sh (bundled below):
bash master-audio.sh renders/final.mp4 assets/music.mp3 assets/host-voice.m4a <voiceStartSec> \
  ~/Downloads/output.mp4 0.03 0.22
# dynaudnorm flattens the music → envelope ducks to 3% under voice → loudnorm -14 LUFS → mux over video
```
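To make the ducking levels concrete, the step envelope that `master-audio.sh` passes to ffmpeg's `volume` filter can be modelled as below. This is a sketch of the arithmetic only, as read from the bundled script; `musicGain`, `toDb`, and `delayMs` are hypothetical names, not part of any tool.

```js
// Music gain at time t (seconds): intro level before the voice starts,
// ducked level from the voice start onward. Defaults mirror the script's.
function musicGain(t, voiceStartSec, intro = 0.22, duck = 0.05) {
  return t < voiceStartSec ? intro : duck;
}

// Convert a linear gain factor to decibels.
const toDb = (gain) => 20 * Math.log10(gain);

// Voice delay in whole milliseconds, truncated like awk's printf "%d".
const delayMs = (voiceStartSec) => Math.trunc(voiceStartSec * 1000);
```

With the values in the command above, a duck level of 0.03 is roughly -30 dB and the 0.22 intro level roughly -13 dB, which is why the music reads as a bed and not as a competing track.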

- **Light captures leak CSS globally** → iframe-isolate them
- **Use `gsap.fromTo()`, never `gsap.from()`** in sub-compositions: `immediateRender` breaks
- **Never CSS `transform` + GSAP transform on same element**: use `xPercent`/`yPercent` or flex centering
- `<video>`/`<audio>` need an `id` or the renderer reports FROZEN/SILENT
- **One stat per scene** for data acts: three at once overlap
- **Verify by extracting frames**: collisions are invisible until you look
- **Avatar look vs group ID**: they can look like two avatars but be one
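One way to make frame verification routine is to generate an ffmpeg argument list per act and grab a still from each act's midpoint. The sketch below only builds the argument arrays; the act shape `{ name, start, end }` and the helper names are hypothetical. `-frames:v 1` is the current spelling of the `-vframes 1` flag mentioned earlier.

```js
// Midpoint timestamp (seconds) for each act.
function actMidpoints(acts) {
  return acts.map(({ name, start, end }) => ({ name, t: (start + end) / 2 }));
}

// ffmpeg arguments that seek to t and write a single frame as PNG.
function frameGrabArgs(video, t, outPng) {
  return ["-y", "-ss", t.toFixed(2), "-i", video, "-frames:v", "1", outPng];
}

// One { name, args } entry per act, ready to hand to a process spawner.
function frameGrabPlan(video, acts, outDir) {
  return actMidpoints(acts).map(({ name, t }) => ({
    name,
    args: frameGrabArgs(video, t, `${outDir}/${name}.png`),
  }));
}
```

Each `args` array can be passed to `ffmpeg` with any process spawner, after which the PNGs should be opened and inspected by eye.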

`grab-mhtml.mjs`:

```js
#!/usr/bin/env bun
// Usage: bun grab-mhtml.mjs <cdp-ws-url> <out.mhtml>
// Attaches to the first localhost page target and saves a Page.captureSnapshot MHTML.
import { writeFileSync } from "node:fs";
const [url, outPath] = process.argv.slice(2);
const ws = new WebSocket(url);
const send = (o) => ws.send(JSON.stringify(o));
const done = (code) => { ws.close(); process.exit(code); };
ws.addEventListener("open", () => send({ id: 1, method: "Target.getTargets" }));
ws.addEventListener("message", (ev) => {
  const d = JSON.parse(ev.data);
  if (d.id === 1) {
    const t = d.result.targetInfos.find((t) => t.type === "page" && t.url.includes("localhost"));
    if (!t) { console.error("no page target"); done(1); return; }
    send({ id: 2, method: "Target.attachToTarget", params: { targetId: t.targetId, flatten: true } });
  }
  if (d.id === 2) send({ id: 3, sessionId: d.result.sessionId, method: "Page.captureSnapshot", params: { format: "mhtml" } });
  if (d.id === 3) {
    if (!d.result?.data) { console.error("snapshot returned no data"); done(1); return; }
    writeFileSync(outPath, d.result.data);
    console.log("MHTML OK:", d.result.data.length, "bytes");
    done(0);
  }
});
// Safety net: fail instead of hanging if the browser never answers.
setTimeout(() => { console.error("timed out waiting for CDP"); process.exit(1); }, 20000);
```

`mhtml-to-html.mjs`:

```js
#!/usr/bin/env bun
// Usage: bun mhtml-to-html.mjs <in.mhtml> <out.html>
// Inlines every MHTML resource as a data: URI and strips scripts and preload hints.
import { readFileSync, writeFileSync } from "node:fs";
const [inPath, outPath] = process.argv.slice(2);
const raw = readFileSync(inPath, "latin1");
const boundaryMatch = raw.match(/boundary="([^"]+)"/);
if (!boundaryMatch) { console.error("no MIME boundary found in", inPath); process.exit(1); }
const boundary = "--" + boundaryMatch[1];
const chunks = raw.split(boundary).filter((c) => c.trim() && !c.trim().startsWith("--"));
// Split one MIME part into lowercase-keyed headers and a raw body.
function parsePart(chunk) {
  const idx = chunk.search(/\r?\n\r?\n/);
  if (idx === -1) return null;
  const headerBlock = chunk.slice(0, idx);
  const sepLen = chunk.slice(idx).match(/^\r?\n\r?\n/)[0].length;
  let body = chunk.slice(idx + sepLen);
  const headers = {};
  for (const line of headerBlock.split(/\r?\n/)) {
    const m = line.match(/^([\w-]+):\s*(.*)$/);
    if (m) headers[m[1].toLowerCase()] = m[2].trim();
  }
  return { headers, body };
}
function decodeQuotedPrintable(str) {
  const noSoft = str.replace(/=\r?\n/g, ""); // drop soft line breaks
  const bytes = [];
  for (let i = 0; i < noSoft.length; i++) {
    if (noSoft[i] === "=" && i + 2 < noSoft.length && /^[0-9A-Fa-f]{2}$/.test(noSoft.slice(i + 1, i + 3))) {
      bytes.push(parseInt(noSoft.slice(i + 1, i + 3), 16)); i += 2;
    } else bytes.push(noSoft.charCodeAt(i) & 0xff);
  }
  return Buffer.from(bytes);
}
function decodeBody(headers, body) {
  const enc = (headers["content-transfer-encoding"] || "").toLowerCase();
  if (enc === "base64") return Buffer.from(body.replace(/\s+/g, ""), "base64");
  if (enc === "quoted-printable") return decodeQuotedPrintable(body);
  return Buffer.from(body, "latin1");
}
let htmlPart = null;
const resourceMap = new Map();
for (const chunk of chunks) {
  const part = parsePart(chunk);
  if (!part) continue;
  const ctype = (part.headers["content-type"] || "").split(";")[0].trim();
  const loc = part.headers["content-location"];
  const buf = decodeBody(part.headers, part.body);
  // The first text/html part is the page itself; everything else is a resource.
  if (ctype === "text/html" && htmlPart === null) { htmlPart = buf.toString("utf8"); continue; }
  if (loc) resourceMap.set(loc, `data:${ctype};base64,${buf.toString("base64")}`);
}
if (htmlPart === null) { console.error("no text/html part found in", inPath); process.exit(1); }
let html = htmlPart;
// Replace longest locations first so a short URL never clobbers a longer one.
for (const loc of [...resourceMap.keys()].sort((a, b) => b.length - a.length)) html = html.split(loc).join(resourceMap.get(loc));
html = html.replace(/<script[\s\S]*?<\/script>/gi, "");
html = html.replace(/<link[^>]*rel=["']?(?:preload|prefetch|modulepreload)["']?[^>]*>/gi, "");
writeFileSync(outPath, html, "utf8");
console.log(`wrote ${outPath} (${(html.length / 1024).toFixed(0)} KB), inlined ${resourceMap.size} resources`);
```

`master-audio.sh`:

```bash
#!/usr/bin/env bash
# Usage: master-audio.sh <video> <music> <voice> <voiceStartSec> <out> [duck=0.05] [intro=0.22]
set -euo pipefail
VIDEO=$1; MUSIC=$2; VOICE=$3; VSTART=$4; OUT=$5; DUCK=${6:-0.05}; INTRO=${7:-0.22}
DUR=$(ffprobe -v error -show_entries format=duration -of csv=p=0 "$VIDEO")
DELAY=$(awk "BEGIN{printf \"%d\", $VSTART*1000}")
# Scratch dir for the intermediate mix; removed on exit.
WORK=$(mktemp -d); trap 'rm -rf "$WORK"' EXIT
TMP="$WORK/master.m4a"
# Flatten the music, hold it at INTRO until the voice starts, then at DUCK; mix and normalize.
ffmpeg -y -v error -i "$MUSIC" -i "$VOICE" -filter_complex "\
[0:a]atrim=0:${DUR},dynaudnorm=f=200:g=15,volume='if(lt(t,${VSTART}),${INTRO},${DUCK})':eval=frame[bed]; \
[1:a]adelay=${DELAY}|${DELAY},apad=whole_dur=${DUR},volume=1.0[vox]; \
[bed][vox]amix=inputs=2:duration=longest:normalize=0[mix]; \
[mix]loudnorm=I=-14:TP=-1.5:LRA=11[out]" \
  -map "[out]" -t "$DUR" -c:a aac -b:a 256k "$TMP"
# Mux the mastered audio over the untouched video stream.
ffmpeg -y -v error -i "$VIDEO" -i "$TMP" -map 0:v -map 1:a -c:v copy -c:a aac -movflags +faststart "$OUT"
echo "wrote $OUT  (music ducked to ${DUCK} under voice, -14 LUFS)"
```

Quick way (installs the HeyGen + HyperFrames CLI skills that this workflow uses):

```
npx skills add heygen-com/skills
npx skills add heygen-com/hyperframes
```

Then for this skill:

Save this file as `~/.claude/skills/heygen-biweekly-video/SKILL.md`, then invoke with `/heygen-biweekly-video` in Claude Code.
