I indexed 936 Lex Fridman episodes into a RAG that cites its sources

wpnews.pro

Source: github.com · Published: 2026-06-15T04:24Z · Topic: artificial-intelligence

A developer built OmniPod, a RAG chatbot that indexes 936 Lex Fridman podcast episodes into 19,140 chunks and grounds every answer in verified transcripts, eliminating hallucinations. The system uses intent routing, bge-small-en-v1.5 embeddings, Qdrant vector search, and a groundedness verification step to provide cited answers for factual, synthetic, and generative queries.

3 min read · published Jun 15, 2026

Chat with 936 podcast episodes. Every answer cites its source.

Ask "What did Karpathy say about neural networks?" — get an answer with the exact transcript chunk it came from. No hallucinations. No guessing.

Most RAG chatbots hallucinate. You ask about a podcast, they invent quotes.

OmniPod doesn't. Every response is grounded — verified against the actual transcript before it reaches you. If the source doesn't support the answer, it says so.

Three query types, one pipeline:

Type	Example	Strategy
Factual	"What did Huberman say about sleep?"	Retrieve → Generate → Verify
Synthetic	"Compare AI safety views across guests"	Map-Reduce → Deduplicate → Synthesize
Generative	"Write an essay on consciousness from these episodes"	Plan → Draft → Ground
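A minimal sketch of how the three-way routing in the table could look. The project's router is named classify_intent(), but the keyword cues, the Intent enum, and the route() helper below are assumptions made for illustration; the real classifier may use an LLM or a trained model instead of keyword matching.

```python
from enum import Enum


class Intent(Enum):
    FACTUAL = "factual"
    SYNTHETIC = "synthetic"
    GENERATIVE = "generative"


# Hypothetical keyword cues (assumption, not taken from the project).
GENERATIVE_CUES = ("write", "essay", "draft", "compose")
SYNTHETIC_CUES = ("compare", "across", "contrast", "all guests")

# Strategy steps per intent, copied from the table above.
STRATEGIES = {
    Intent.FACTUAL: ["retrieve", "generate", "verify"],
    Intent.SYNTHETIC: ["map-reduce", "deduplicate", "synthesize"],
    Intent.GENERATIVE: ["plan", "draft", "ground"],
}


def classify_intent(query: str) -> Intent:
    """Pick an intent from simple substring cues; default to factual."""
    q = query.lower()
    if any(cue in q for cue in GENERATIVE_CUES):
        return Intent.GENERATIVE
    if any(cue in q for cue in SYNTHETIC_CUES):
        return Intent.SYNTHETIC
    return Intent.FACTUAL


def route(query: str) -> list[str]:
    """Return the ordered strategy steps for a query."""
    return STRATEGIES[classify_intent(query)]
```

With these cues, route("Compare AI safety views across guests") selects the map-reduce path, while a plain question about one guest falls through to the factual path.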

You ask a question
        │
        ▼
  ┌─────────────┐
  │   Router     │  classify_intent() — routes to the right handler
  │  LRU cache   │  avoids re-embedding repeated queries
  │  Semaphore   │  caps concurrent LLM calls at 5
  └──────┬──────┘
         │
         ▼
  ┌─────────────┐
  │  Retrieval   │  bge-small-en-v1.5 (384d) → Qdrant cosine
  │  19,140      │  chunks from 936 Lex Fridman episodes
  │  chunks      │  Guest filtering via known-guests index
  └──────┬──────┘
         │
         ▼
  ┌─────────────┐
  │  Generate +  │  DeepSeek V4 Flash via OpenCode API
  │  Verify      │  verify_groundedness() — rejects ungrounded answers
  └──────┬──────┘
         │
         ▼
  Cited answer in Chainlit UI (localhost:8000)
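
The router box above mentions an LRU cache for query embeddings and a semaphore that caps concurrent LLM calls at 5. A rough sketch of both ideas using only the standard library follows. embed_query() and call_llm() are placeholders rather than the project's functions, and the cache size of 1024 is an assumption.

```python
import asyncio
from functools import lru_cache

MAX_CONCURRENT_LLM_CALLS = 5  # cap taken from the diagram


@lru_cache(maxsize=1024)  # cache size is an assumption
def embed_query(text: str) -> tuple[float, ...]:
    # Placeholder embedding: stands in for the real bge-small-en-v1.5 call.
    return tuple(float(ord(c)) for c in text[:8])


async def call_llm(prompt: str, sem: asyncio.Semaphore, stats: dict) -> str:
    # At most MAX_CONCURRENT_LLM_CALLS coroutines run this body at once.
    async with sem:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)  # stands in for the network round trip
        stats["active"] -= 1
        return f"answer to: {prompt}"


async def answer_many(prompts: list[str]) -> tuple[list[str], int]:
    """Run all prompts concurrently and report the peak concurrency seen."""
    sem = asyncio.Semaphore(MAX_CONCURRENT_LLM_CALLS)
    stats = {"active": 0, "peak": 0}
    answers = await asyncio.gather(*(call_llm(p, sem, stats) for p in prompts))
    return list(answers), stats["peak"]
```

Calling asyncio.run(answer_many(prompts)) with more than five prompts should never report a peak above 5, and a repeated embed_query() call with the same text is served from the cache instead of being recomputed.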

Quick start:

git clone https://github.com/aranajhonny/omnipod && cd omnipod
python3.13 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
echo "OPENCODE_API_KEY=sk-your-key" > .env
docker run -d --name qdrant -p 6333:6333 qdrant/qdrant
python ingest.py --rebuild
chainlit run app.py

Metric	Value
Episodes indexed	936 Lex Fridman
Chunks	19,140 (512 chars, 128 overlap)
Embedding dim	384 (bge-small-en-v1.5, MPS GPU)
Query embedding	~100ms
Vector search	~50ms (cosine, 19K points)
Full answer	~2s on M1 Pro
Full ingest	~8 min
Codebase	1,138 lines Python, 9 files
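
The chunk settings in the table (512 chars, 128 overlap) can be illustrated with a plain sliding character window. This is a sketch under that assumption only: chunk_text() is a hypothetical helper, and the project's ingest step may split on sentence or speaker boundaries instead.

```python
def chunk_text(text: str, size: int = 512, overlap: int = 128) -> list[str]:
    """Split text into windows of `size` chars that share `overlap` chars."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap  # each window starts 384 chars after the previous one
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reached the end of the text
    return chunks
```

With the defaults, the last 128 characters of one chunk are repeated as the first 128 characters of the next, so a sentence cut at a chunk boundary still appears whole in at least one of the two neighbours when it is short enough.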

No YouTube API key needed. Two sources:

lexfridman.com: scrapes official transcript pages (requests + BeautifulSoup)
YouTube: uses the free proxy at youtubetranscript.pro for auto-captions

cd lex_podcast
pip install requests beautifulsoup4
python run.py pipeline  # scrapes all 936 episodes

Output lands in data/transcripts/

Example queries:

"What did Andrej Karpathy say about neural networks?"
"Compare views on AI safety across all guests"
"Write a short essay on human consciousness based on these episodes"
"Summarize what Andrew Huberman says about sleep"

Why bge-small-en-v1.5? 384-dim embeddings are fast to search and good enough for conversational podcast text. Runs locally on MPS GPU.

Why Qdrant over Chroma? Cosine search at 19K points in ~50ms. Filterable by guest metadata out of the box.

Why intent routing? Factual, synthetic, and generative queries need fundamentally different retrieval and generation strategies. A one-prompt-fits-all approach fails at scale.

Why groundedness verification? LLMs default to confident BS. verify_groundedness() forces the model to check its answer against the retrieved context before showing it to the user.
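
One way the groundedness check could be wired up, as a sketch only. The function name verify_groundedness() comes from the project, but the signature, the prompt wording, the SUPPORTED/UNSUPPORTED protocol, the injected judge callable (a stand-in for the LLM call), and the fallback message are all assumptions.

```python
from typing import Callable

# Hypothetical fallback shown when the answer is not supported by the sources.
FALLBACK = "The retrieved transcripts do not support an answer to that question."

# Hypothetical prompt; the project's actual wording is not known.
PROMPT_TEMPLATE = (
    "Context:\n{context}\n\n"
    "Answer:\n{answer}\n\n"
    "Is every claim in the answer supported by the context? "
    "Reply with exactly SUPPORTED or UNSUPPORTED."
)


def verify_groundedness(
    answer: str, chunks: list[str], judge: Callable[[str], str]
) -> bool:
    """Ask the judge whether the answer is backed by the retrieved chunks."""
    if not chunks:
        return False  # nothing retrieved means nothing to ground against
    prompt = PROMPT_TEMPLATE.format(context="\n---\n".join(chunks), answer=answer)
    verdict = judge(prompt).strip().upper()
    return verdict == "SUPPORTED"


def finalize(answer: str, chunks: list[str], judge: Callable[[str], str]) -> str:
    """Return the answer only if it passes verification, else the fallback."""
    return answer if verify_groundedness(answer, chunks, judge) else FALLBACK
```

Passing the judge in as a callable keeps the check testable without network access: a stub that returns a fixed verdict is enough to exercise both the accept and the reject path.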

License: MIT


Read original on github.com → github.com/aranajhonny/omnipod
