AI-powered podcast-to-video pipeline. Converts a diarized audio file (NotebookLM, podcast, interview) into a YouTube-ready MP4 with:
- Semantically matched Pexels B-roll per utterance (GPT-4o-mini picks the clip)
- Burned-in subtitles (no ffmpeg libass required — pure Pillow)
- Optional OpenAI TTS voice replacement (swap out NotebookLM / AI voices)
- YouTube upload + description/thumbnail update
- One-shot multi-platform social posting (Discord, Telegram, X, Moltbook, LinkedIn)
git clone https://github.com/spacepacket1/e3d-pod2vid.git
cd e3d-pod2vid
pip install -r requirements.txt
npm install
cp .env.example .env
$EDITOR .env
python3 pod2vid.py episode.m4a output/episode.mp4
This single command:
- Uploads audio to AssemblyAI for speaker diarization
- Asks GPT-4o-mini for a specific Pexels search query per utterance
- Downloads matching B-roll clips (cached per query)
- Renders each segment with burned-in subtitles
- Concatenates into a final MP4 + SRT subtitle file
Diarization results and Pexels queries are cached as JSON, so re-runs are fast.
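As an illustration of the final step, the SRT file could be built from the diarized utterances roughly as follows. This is a minimal sketch, not the code in pod2vid.py: the utterance layout (speaker, text, and start/end times in milliseconds) and both function names are assumptions made for the example.

```python
def srt_timestamp(ms: int) -> str:
    """Format milliseconds as an SRT timestamp: HH:MM:SS,mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def utterances_to_srt(utterances: list) -> str:
    """Build SRT text from utterance dicts (hypothetical layout, see lead-in)."""
    blocks = []
    for index, utt in enumerate(utterances, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(utt['start'])} --> {srt_timestamp(utt['end'])}\n"
            f"{utt['speaker']}: {utt['text']}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```

Each cue gets a running index, a time range, and the speaker-prefixed text; whether the real subtitles include the speaker name is not stated here, so treat that as a design choice of the sketch.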
If you want custom voices instead of the original audio (e.g. replace NotebookLM voices):
python3 tts_replace.py output/episode-diarization.json episode-tts
python3 pod2vid.py output/episode-tts.mp3 output/episode-tts.mp4
Default voices: onyx (Speaker A) and nova (Speaker B). Override with VOICE_A / VOICE_B.
Available voices: alloy, echo, fable, onyx, nova, shimmer.
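The override behavior can be pictured with a short sketch. It assumes the voice is read from the VOICE_A / VOICE_B environment variables with the defaults listed above; the function name and the validation step are hypothetical and may not match what tts_replace.py actually does.

```python
import os

AVAILABLE_VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
DEFAULT_VOICES = {"A": "onyx", "B": "nova"}


def resolve_voice(speaker: str, env=None) -> str:
    """Return the voice for speaker 'A' or 'B', honoring VOICE_A / VOICE_B."""
    env = os.environ if env is None else env
    voice = env.get(f"VOICE_{speaker}", DEFAULT_VOICES[speaker]).strip().lower()
    if voice not in AVAILABLE_VOICES:
        # Fail early instead of sending an unsupported voice name to the TTS API.
        raise ValueError(f"Unknown voice {voice!r}; choose from {sorted(AVAILABLE_VOICES)}")
    return voice
```

Passing a plain dict as `env` makes the lookup easy to try without touching the real environment, for example `resolve_voice("B", {"VOICE_B": "shimmer"})`.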
python3 make_thumbnail.py "Predictive GPS for Autonomous AI Agents" thumbnail.png /path/to/logo.png
Outputs a 1280×720 PNG with title, accent stripe, and optional logo overlay. Pure Pillow — no browser or design tool required.
First time: authorize your account
node yt_auth.js
The script prints a URL. Open it on any device (phone or browser; the machine running the script doesn't need a browser). After approving, paste the redirect URL back into the terminal. Tokens are saved to youtube-tokens.json.
Upload the video
node yt_upload.js output/episode-tts.mp4 "My Episode Title"
Prints the video URL and ID when done.
Update description and thumbnail
YT_DESCRIPTION="Check out maps.e3d.ai — AI-powered GPS for autonomous vehicles.
Follow us:
• X: @e3dmaps
• Discord: https://discord.gg/your-server" \
node yt_update.js VIDEO_ID thumbnail.png
node announce.js https://www.youtube.com/watch?v=VIDEO_ID "New episode: Predictive GPS for Autonomous AI Agents"
Posts simultaneously to all configured platforms. Platforms with no credentials are silently skipped.
| Platform | Credential(s) needed |
|---|---|
| Discord | DISCORD_BOT_TOKEN + DISCORD_CHANNEL_ID |
| Telegram | TELEGRAM_BOT_TOKEN + TELEGRAM_CHAT_ID |
| X (Twitter) | X_ACCESS_TOKEN |
| Moltbook | MOLTBOOK_API_KEY |
| LinkedIn | linkedin-tokens.json with person_urn (run node linkedin_auth.js) |
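The skip rule can be sketched as a credential check. announce.js is a Node script, so this Python version only illustrates the logic: the variable names come from the table above, while the function names and the exact checks are assumptions.

```python
import json
import os

# Environment variables each platform needs, taken from the table above.
REQUIRED_ENV = {
    "Discord": ["DISCORD_BOT_TOKEN", "DISCORD_CHANNEL_ID"],
    "Telegram": ["TELEGRAM_BOT_TOKEN", "TELEGRAM_CHAT_ID"],
    "X": ["X_ACCESS_TOKEN"],
    "Moltbook": ["MOLTBOOK_API_KEY"],
}


def linkedin_ready(token_file: str) -> bool:
    """LinkedIn needs a token file that contains a person_urn entry."""
    try:
        with open(token_file, encoding="utf-8") as fh:
            return bool(json.load(fh).get("person_urn"))
    except (OSError, ValueError, AttributeError):
        # Missing file, invalid JSON, or unexpected JSON shape: treat as not configured.
        return False


def configured_platforms(env=None, token_file="linkedin-tokens.json") -> list:
    """List the platforms whose credentials are all present and non-empty."""
    env = os.environ if env is None else env
    ready = [name for name, keys in REQUIRED_ENV.items()
             if all(env.get(key) for key in keys)]
    if linkedin_ready(token_file):
        ready.append("LinkedIn")
    return ready
```

A platform with only some of its variables set (for example a Discord token without a channel ID) is treated as unconfigured in this sketch.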
LinkedIn's API requires a few one-time setup steps before announce.js can post there.
Step 1 — Create a LinkedIn app
Go to linkedin.com/developers/apps and create an app. Under the Auth tab, add this as an authorized redirect URL:
https://www.linkedin.com/developers/tools/oauth/redirect
Step 2 — Add required products
Under the Products tab, request access to both:
- Share on LinkedIn: grants the w_member_social scope (post on behalf of user)
- Sign In with LinkedIn using OpenID Connect: grants the openid and profile scopes (needed to resolve your person URN)
Both are typically approved instantly for personal apps.
Step 3 — Verify company association (if prompted)
LinkedIn may ask you to verify a company page association. Open the verification URL while logged in as a Page Admin and approve it.
Step 4 — Authorize and get tokens
Add your app credentials to .env:
LINKEDIN_CLIENT_ID=your_client_id
LINKEDIN_CLIENT_SECRET=your_client_secret
Then run:
node linkedin_auth.js
Open the printed URL on any device. After approving, paste the redirect URL back. Tokens are saved to linkedin-tokens.json.
Step 5 — Add your person URN
LinkedIn's API requires your encoded person ID (not your numeric member ID). To find it:
- Go to your LinkedIn profile in a browser
- View Page Source (Cmd+U / Ctrl+U) and search for urn:li:member:
- Note the numeric ID (e.g. 4435724)
- Make a test API call; the error response will reveal your encoded person URN (e.g. urn:li:person:2KqUAyg4oY)
Or run this one-liner after getting a token:
node -e "
const https = require('https');
const fs = require('fs');
// Access token saved by linkedin_auth.js
const t = JSON.parse(fs.readFileSync('linkedin-tokens.json', 'utf8'));
// Replace MEMBER_ID with your numeric ID from page source
const body = JSON.stringify({author:'urn:li:member:MEMBER_ID',commentary:'test',visibility:'PUBLIC',distribution:{feedDistribution:'MAIN_FEED',targetEntities:[],thirdPartyDistributionChannels:[]},lifecycleState:'PUBLISHED',isReshareDisabledByAuthor:false});
const headers = {'Authorization':'Bearer '+t.access_token,'Content-Type':'application/json','Content-Length':Buffer.byteLength(body),'LinkedIn-Version':'202506','X-Restli-Protocol-Version':'2.0.0'};
// Print the first 300 characters of the response body
const r = https.request('https://api.linkedin.com/rest/posts', {method:'POST', headers}, res => {let d='';res.on('data',c=>d+=c);res.on('end',()=>console.log(d.slice(0,300)));});
r.on('error', e => console.error(e.message));
r.write(body);r.end();
"
The error message will contain your encoded URN. Save it:
node -e "
const fs = require('fs');
const t = JSON.parse(fs.readFileSync('linkedin-tokens.json'));
t.person_urn = 'urn:li:person:YOUR_ENCODED_ID';
fs.writeFileSync('linkedin-tokens.json', JSON.stringify(t, null, 2));
"
Once linkedin-tokens.json contains person_urn, announce.js will post to LinkedIn automatically.
Copy .env.example to .env and fill in the keys you need.
| Variable | Required for | Notes |
|---|---|---|
| ASSEMBLYAI_API_KEY | pod2vid.py | |
| OPENAI_API_KEY | pod2vid.py, tts_replace.py | |
| PEXELS_API_KEY | pod2vid.py | pexels.com/api (free) |
| DISCORD_BOT_TOKEN | announce.js | |
| DISCORD_CHANNEL_ID | announce.js | |
| TELEGRAM_BOT_TOKEN | announce.js | |
| TELEGRAM_CHAT_ID | announce.js | |
| X_ACCESS_TOKEN | announce.js | |
| MOLTBOOK_API_KEY | announce.js | |
| MOLTBOOK_SUBMOLT | announce.js | agentfinance |
| LINKEDIN_CLIENT_ID | linkedin_auth.js | LinkedIn Developer Portal |
| LINKEDIN_CLIENT_SECRET | linkedin_auth.js | |
| LINKEDIN_TOKEN_FILE | announce.js | linkedin-tokens.json; must contain person_urn |
| VOICE_A | tts_replace.py | onyx |
| VOICE_B | tts_replace.py | nova |
| SPEAKER_A_NAME | pod2vid.py | Host |
| SPEAKER_B_NAME | pod2vid.py | Guest |
| YT_PRIVACY | yt_upload.js | public / unlisted / private |
| YT_DESCRIPTION | yt_update.js | |
Instead of rotating through a fixed clip library, this pipeline asks GPT-4o-mini to generate a specific Pexels search query for each utterance:
"EZPass saved us 90 seconds at every toll plaza"
→ "toll booth highway payment"
"the dual-witness problem"
→ "courtroom judge testimony"
"machine learning position predictions"
→ "machine learning data training loop"
Queries are cached so re-runs or TTS voice swaps don't re-spend API credits. ~82 unique clips across a 90-segment episode is typical.
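A minimal sketch of how such a query cache might work is shown below. It assumes the cache file maps utterance text to its search query and that `generate_query` is a stand-in callable for the GPT-4o-mini request; both are illustrative assumptions, not the actual layout or code used by pod2vid.py.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List


def cached_queries(utterances: List[str], cache_path: Path,
                   generate_query: Callable[[str], str]) -> Dict[str, str]:
    """Return {utterance text: search query}, calling the model only on cache misses."""
    if cache_path.exists():
        cache = json.loads(cache_path.read_text(encoding="utf-8"))
    else:
        cache = {}
    for text in utterances:
        if text not in cache:
            # The only step that would spend API credits.
            cache[text] = generate_query(text)
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(json.dumps(cache, indent=2), encoding="utf-8")
    return cache
```

Because the cache is keyed on the utterance text rather than the audio, a TTS voice swap leaves every key unchanged and no new queries are generated. The number of clips to download is then the number of distinct values in the returned mapping, which is how a 90-segment episode can end up with fewer unique clips than segments.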
Python 3.8+
- Pillow >= 10.0
- python-dotenv >= 1.0
- ffmpeg (any version — subtitle rendering does not require libfreetype/libass)
Node.js 18+
- dotenv
External APIs
- AssemblyAI (diarization)
- OpenAI (GPT-4o-mini + TTS)
- Pexels (B-roll clips, free tier fine for personal use)
- YouTube Data API v3 (via Google Cloud Console)
- LinkedIn API (via LinkedIn Developer Portal) — optional, for posting
output/
  episode.mp4                final video
  episode.srt                subtitle file for YouTube CC
  episode-diarization.json   cached AssemblyAI result
  episode-queries.json       cached GPT Pexels queries
  broll/                     cached B-roll clips (one per unique query)
  tts-cache/                 cached TTS utterances (per voice+text hash)
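The "per voice+text hash" naming for tts-cache/ could look something like the sketch below. The hash function, the truncated digest, the file name pattern, and the .mp3 extension are all assumptions for illustration; the real cache may be keyed differently.

```python
import hashlib
from pathlib import Path


def tts_cache_path(voice: str, text: str, cache_dir: str = "output/tts-cache") -> Path:
    """Map a (voice, text) pair to a stable cache file path."""
    # The newline separator keeps ("ab", "c") and ("a", "bc") from sharing a key.
    digest = hashlib.sha256(f"{voice}\n{text}".encode("utf-8")).hexdigest()[:16]
    return Path(cache_dir) / f"{voice}-{digest}.mp3"
```

With a key like this, re-rendering the same utterance with the same voice hits the cache, while changing either the voice or the text produces a new file.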
Built by E3D Maps — AI-powered navigation for autonomous vehicles.
License: MIT