
I indexed 936 Lex Fridman episodes into a RAG that cites its sources

A developer built OmniPod, a RAG chatbot that indexes 936 Lex Fridman podcast episodes as 19,140 chunks and checks every answer against the retrieved transcripts to curb hallucinations. The system combines intent routing, bge-small-en-v1.5 embeddings, Qdrant vector search, and a groundedness verification step to return cited answers for factual, synthetic, and generative queries.

3 min read. Published Jun 15, 2026.

Chat with 936 podcast episodes. Every answer cites its source.

Ask "What did Karpathy say about neural networks?" and get an answer with the exact transcript chunk it came from. No hallucinations. No guessing.

Most RAG chatbots hallucinate. You ask about a podcast, they invent quotes.

OmniPod doesn't. Every response is grounded: it is verified against the actual transcript before it reaches you. If the source doesn't support the answer, it says so.
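
As a rough sketch of what such a check could look like, assume an LLM acts as the judge. Only the name `verify_groundedness()` comes from the project; the prompt text, the `judge` callable, the `grounded_answer()` helper, and the refusal message are hypothetical and chosen for illustration.

```python
from typing import Callable

# Hypothetical prompt; the project's actual wording is not shown in the README.
VERIFY_PROMPT = (
    "Context:\n{context}\n\n"
    "Answer:\n{answer}\n\n"
    "Is every claim in the answer supported by the context? "
    "Reply with exactly SUPPORTED or UNSUPPORTED."
)

# Hypothetical refusal text shown when verification fails.
REFUSAL = "The retrieved transcripts do not support an answer to this question."


def verify_groundedness(answer: str, chunks: list[str],
                        judge: Callable[[str], str]) -> bool:
    """Ask a judge model whether the answer is backed by the chunks."""
    if not chunks:
        return False  # nothing retrieved, so nothing can support the answer
    prompt = VERIFY_PROMPT.format(context="\n---\n".join(chunks), answer=answer)
    verdict = judge(prompt).strip().upper()
    return verdict.startswith("SUPPORTED")


def grounded_answer(answer: str, chunks: list[str],
                    judge: Callable[[str], str]) -> str:
    """Return the answer only if it passes verification, else a refusal."""
    return answer if verify_groundedness(answer, chunks, judge) else REFUSAL
```

Passing the judge in as a callable keeps the sketch independent of any particular LLM client, and lets the check be exercised with a stub in tests.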

Three query types, one pipeline:

| Type | Example | Strategy |
|---|---|---|
| Factual | "What did Huberman say about sleep?" | Retrieve → Generate → Verify |
| Synthetic | "Compare AI safety views across guests" | Map-Reduce → Deduplicate → Synthesize |
| Generative | "Write an essay on consciousness from these episodes" | Plan → Draft → Ground |

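
The mapping from query type to strategy can be sketched as a small router. The README only names `classify_intent()`; the keyword cues, the `Intent` enum, and the `STRATEGIES` dict below are assumptions for illustration, and the real classifier may work quite differently (for example, by asking an LLM).

```python
from enum import Enum


class Intent(Enum):
    FACTUAL = "factual"
    SYNTHETIC = "synthetic"
    GENERATIVE = "generative"


# Step names follow the three strategies listed for each query type;
# the dict itself is hypothetical.
STRATEGIES = {
    Intent.FACTUAL: ["retrieve", "generate", "verify"],
    Intent.SYNTHETIC: ["map_reduce", "deduplicate", "synthesize"],
    Intent.GENERATIVE: ["plan", "draft", "ground"],
}

# Hypothetical keyword cues; a real router could use an LLM or a classifier.
GENERATIVE_CUES = ("write", "essay", "draft", "compose")
SYNTHETIC_CUES = ("compare", "across", "summarize", "contrast")


def classify_intent(query: str) -> Intent:
    """Pick an intent from simple keyword cues, defaulting to FACTUAL."""
    words = query.lower().split()
    if any(cue in words for cue in GENERATIVE_CUES):
        return Intent.GENERATIVE
    if any(cue in words for cue in SYNTHETIC_CUES):
        return Intent.SYNTHETIC
    return Intent.FACTUAL


def plan_for(query: str) -> list[str]:
    """Return the ordered pipeline steps for a query."""
    return STRATEGIES[classify_intent(query)]
```

Defaulting to the factual path is a deliberate choice in this sketch: it is the only strategy that ends in an explicit verification step, so ambiguous queries fall through to the most conservative handler.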
You ask a question
        │
        ▼
  ┌──────────────┐
  │   Router     │  classify_intent(): routes to the right handler
  │  LRU cache   │  avoids re-embedding repeated queries
  │  Semaphore   │  caps concurrent LLM calls at 5
  └──────┬───────┘
         │
         ▼
  ┌──────────────┐
  │  Retrieval   │  bge-small-en-v1.5 (384d) → Qdrant cosine
  │  19,140      │  chunks from 936 Lex Fridman episodes
  │  chunks      │  Guest filtering via known-guests index
  └──────┬───────┘
         │
         ▼
  ┌──────────────┐
  │  Generate +  │  DeepSeek V4 Flash via OpenCode API
  │  Verify      │  verify_groundedness(): rejects ungrounded answers
  └──────┬───────┘
         │
         ▼
  Cited answer in Chainlit UI (localhost:8000)
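
The two router safeguards in the diagram, an LRU cache for query embeddings and a semaphore that caps concurrent LLM calls at 5, can be sketched with the standard library alone. Everything here is a stand-in: `embed_query()` and `call_llm()` are placeholders rather than the project's functions, and the cache size of 1024 is an assumption.

```python
import asyncio
from functools import lru_cache

MAX_CONCURRENT_LLM_CALLS = 5  # the cap stated in the diagram


@lru_cache(maxsize=1024)  # cache size is an assumption
def embed_query(query: str) -> tuple[float, ...]:
    """Placeholder embedding; stands in for a bge-small-en-v1.5 call."""
    return tuple(float(ord(c)) for c in query[:8])


async def call_llm(prompt: str, sem: asyncio.Semaphore, stats: dict) -> str:
    """Placeholder LLM call that records how many calls run at once."""
    async with sem:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)  # simulate network latency
        stats["active"] -= 1
        return f"answer to: {prompt}"


async def run_batch(prompts: list[str]) -> tuple[list[str], int]:
    """Run all prompts concurrently, never more than the cap at a time."""
    sem = asyncio.Semaphore(MAX_CONCURRENT_LLM_CALLS)
    stats = {"active": 0, "peak": 0}
    answers = await asyncio.gather(*(call_llm(p, sem, stats) for p in prompts))
    return list(answers), stats["peak"]
```

Returning a tuple from `embed_query()` keeps the cached value immutable, so one caller cannot accidentally modify the vector that later cache hits will receive.
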
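
# Clone the repository and set up a virtual environment
git clone https://github.com/aranajhonny/omnipod && cd omnipod
python3.13 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# Store the API key (replace the placeholder with your own key)
echo "OPENCODE_API_KEY=sk-your-key" > .env

# Start Qdrant, build the index, and launch the chat UI
docker run -d --name qdrant -p 6333:6333 qdrant/qdrant
python ingest.py --rebuild
chainlit run app.py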

| Metric | Value |
|---|---|
| Episodes indexed | 936 Lex Fridman |
| Chunks | 19,140 (512 chars, 128 overlap) |
| Embedding dim | 384 (bge-small-en-v1.5, MPS GPU) |
| Query embedding | ~100ms |
| Vector search | ~50ms (cosine, 19K points) |
| Full answer | ~2s on M1 Pro |
| Full ingest | ~8 min |
| Codebase | 1,138 lines Python, 9 files |
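
The chunking parameters in the metrics (512 characters with 128 overlap) imply a sliding window that advances 384 characters at a time. A minimal sketch, assuming plain character-based splitting; `chunk_text()` is a hypothetical helper, and the project's `ingest.py` may instead respect sentence or speaker boundaries.

```python
def chunk_text(text: str, size: int = 512, overlap: int = 128) -> list[str]:
    """Split text into fixed-size character windows that overlap."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap  # 384 new characters per chunk with the defaults
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks
```

The overlap means a sentence cut at one chunk boundary still appears whole in the neighboring chunk, at the cost of storing roughly a third more text than a non-overlapping split.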

No YouTube API key needed. Two sources:

- **lexfridman.com**: scrapes official transcript pages (requests + BeautifulSoup)
- **YouTube**: uses the free proxy at youtubetranscript.pro for auto-captions

cd lex_podcast
pip install requests beautifulsoup4
python run.py pipeline  # scrapes all 936 episodes

Output lands in data/transcripts/.

"What did Andrej Karpathy say about neural networks?"
"Compare views on AI safety across all guests"
"Write a short essay on human consciousness based on these episodes"
"Summarize what Andrew Huberman says about sleep"

- **Why bge-small-en-v1.5?** 384-dim embeddings are fast to search and good enough for conversational podcast text. Runs locally on MPS GPU.
- **Why Qdrant over Chroma?** Cosine search at 19K points in ~50ms. Filterable by guest metadata out of the box.
- **Why intent routing?** Factual, synthetic, and generative queries need fundamentally different retrieval and generation strategies. One prompt fits all fails at scale.
- **Why groundedness verification?** LLMs default to confident BS. verify_groundedness() forces the model to check its answer against the retrieved context before showing it to the user.
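
To make the retrieval rationale concrete, here is a brute-force sketch of cosine search with a guest filter. It is a stand-in for what Qdrant does with an index, not a use of the Qdrant client: the `search()` function, the `guest` field name, and the toy 3-dimensional vectors are all assumptions for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec, points, guest=None, top_k=3):
    """Rank points by cosine similarity, optionally restricted to one guest."""
    candidates = [p for p in points if guest is None or p["guest"] == guest]
    ranked = sorted(
        candidates,
        key=lambda p: cosine_similarity(query_vec, p["vector"]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy 3-d vectors stand in for the 384-d embeddings.
POINTS = [
    {"id": 1, "guest": "Andrej Karpathy", "vector": [1.0, 0.0, 0.0]},
    {"id": 2, "guest": "Andrew Huberman", "vector": [0.9, 0.1, 0.0]},
    {"id": 3, "guest": "Andrej Karpathy", "vector": [0.0, 1.0, 0.0]},
]
```

Filtering before scoring, as done here, guarantees that a question about one guest never surfaces a closely worded chunk from a different episode.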

License: MIT
