{"slug": "i-indexed-936-lex-fridman-episodes-into-a-rag-that-cites-its-sources", "title": "I indexed 936 Lex Fridman episodes into a RAG that cites its sources", "summary": "A developer built OmniPod, a RAG chatbot that indexes 936 Lex Fridman podcast episodes into 19,140 chunks and grounds every answer in verified transcripts, eliminating hallucinations. The system uses intent routing, bge-small-en-v1.5 embeddings, Qdrant vector search, and a groundedness verification step to provide cited answers for factual, synthetic, and generative queries.", "body_md": "**Chat with 936 podcast episodes. Every answer cites its source.**\n\nAsk \"What did Karpathy say about neural networks?\" — get an answer with the exact transcript chunk it came from. No hallucinations. No guessing.\n\nMost RAG chatbots hallucinate. You ask about a podcast, they invent quotes.\n\nOmniPod doesn't. Every response is **grounded** — verified against the actual transcript before it reaches you. If the source doesn't support the answer, it says so.\n\n**Three query types, one pipeline:**\n\n| Type | Example | Strategy |\n|---|---|---|\n| Factual | \"What did Huberman say about sleep?\" | Retrieve → Generate → Verify |\n| Synthetic | \"Compare AI safety views across guests\" | Map-Reduce → Deduplicate → Synthesize |\n| Generative | \"Write an essay on consciousness from these episodes\" | Plan → Draft → Ground |\n\n```\nYou ask a question\n        │\n        ▼\n  ┌─────────────┐\n  │   Router     │  classify_intent() — routes to the right handler\n  │  LRU cache   │  avoids re-embedding repeated queries\n  │  Semaphore   │  caps concurrent LLM calls at 5\n  └──────┬──────┘\n         │\n         ▼\n  ┌─────────────┐\n  │  Retrieval   │  bge-small-en-v1.5 (384d) → Qdrant cosine\n  │  19,140      │  chunks from 936 Lex Fridman episodes\n  │  chunks      │  Guest filtering via known-guests index\n  └──────┬──────┘\n         │\n         ▼\n  ┌─────────────┐\n  │  Generate +  │  DeepSeek V4 Flash via 
OpenCode API\n  │  Verify      │  verify_groundedness() — rejects ungrounded answers\n  └──────┬──────┘\n         │\n         ▼\n  Cited answer in Chainlit UI (localhost:8000)\n```\n\n```\ngit clone https://github.com/aranajhonny/omnipod && cd omnipod\npython3.13 -m venv .venv && source .venv/bin/activate\npip install -r requirements.txt\necho \"OPENCODE_API_KEY=sk-your-key\" > .env\ndocker run -d --name qdrant -p 6333:6333 qdrant/qdrant\npython ingest.py --rebuild\nchainlit run app.py\n# → http://localhost:8000\n```\n\n| Metric | Value |\n|---|---|\n| Episodes indexed | 936 Lex Fridman |\n| Chunks | 19,140 (512 chars, 128 overlap) |\n| Embedding dim | 384 (bge-small-en-v1.5, MPS GPU) |\n| Query embedding | ~100ms |\n| Vector search | ~50ms (cosine, 19K points) |\n| Full answer | ~2s on M1 Pro |\n| Full ingest | ~8 min |\n| Codebase | 1,138 lines Python, 9 files |\n\nNo YouTube API key needed. Two sources:\n\n- **lexfridman.com**: scrapes official transcript pages (requests + BeautifulSoup)\n- **YouTube**: uses the free proxy at `youtubetranscript.pro` for auto-captions\n\n```\ncd lex_podcast\npip install requests beautifulsoup4\npython run.py pipeline  # scrapes all 936 episodes\n```\n\nOutput lands in `data/transcripts/`.\n\n```\n\"What did Andrej Karpathy say about neural networks?\"\n\"Compare views on AI safety across all guests\"\n\"Write a short essay on human consciousness based on these episodes\"\n\"Summarize what Andrew Huberman says about sleep\"\n```\n\n**Why `bge-small-en-v1.5`?** 384-dim embeddings are fast to search and good enough for conversational podcast text. Runs locally on MPS GPU.\n\n**Why Qdrant over Chroma?** Cosine search at 19K points in ~50ms. Filterable by guest metadata out of the box.\n\n**Why intent routing?** Factual, synthetic, and generative queries need fundamentally different retrieval and generation strategies. 
A single one-size-fits-all prompt fails at scale.\n\n**Why groundedness verification?** LLMs default to confident BS. `verify_groundedness()` forces the model to check its answer against the retrieved context before showing it to the user.\n\nMIT", "url": "https://wpnews.pro/news/i-indexed-936-lex-fridman-episodes-into-a-rag-that-cites-its-sources", "canonical_source": "https://github.com/aranajhonny/omnipod", "published_at": "2026-06-15 04:24:28+00:00", "updated_at": "2026-06-15 04:41:42.618356+00:00", "lang": "en", "topics": ["artificial-intelligence", "large-language-models", "ai-tools", "ai-agents", "natural-language-processing"], "entities": ["Lex Fridman", "Andrej Karpathy", "Andrew Huberman", "DeepSeek", "Qdrant", "OpenCode", "Chainlit", "MIT"], "alternates": {"html": "https://wpnews.pro/news/i-indexed-936-lex-fridman-episodes-into-a-rag-that-cites-its-sources", "markdown": "https://wpnews.pro/news/i-indexed-936-lex-fridman-episodes-into-a-rag-that-cites-its-sources.md", "text": "https://wpnews.pro/news/i-indexed-936-lex-fridman-episodes-into-a-rag-that-cites-its-sources.txt", "jsonld": "https://wpnews.pro/news/i-indexed-936-lex-fridman-episodes-into-a-rag-that-cites-its-sources.jsonld"}}