{"slug": "i-built-a-local-first-movie-recommender-with-corrective-rag-cited-explanations", "title": "I built a local-first movie recommender with Corrective-RAG (cited explanations, hybrid retrieval, runs entirely on Ollama)", "summary": "A developer built a local-first movie recommendation system using a Corrective-RAG pipeline that runs entirely on Ollama. The system employs query expansion at ingestion time rather than query time, generating 3-5 pseudo-queries per movie to improve scalability. On an M3 Mac with 36GB RAM, the system achieves approximately 90-second query latency with llama3, dropping to 15-20 seconds with llama3.2:1b.", "body_md": "Hey, sharing a project I've been building for the last few months. It's a movie recommendation system that runs entirely on your laptop using Ollama, with a Corrective-RAG pipeline.\n\nWhy I built it: existing streaming platforms only know what you watched on them. Netflix can't see my Prime history, and none of them know about cinema watches. I wanted one system that learns from all of it.\n\nStack:\n\nThe interesting design choice was query expansion at INGEST time instead of query time. The enrichment LLM generates 3-5 pseudo-queries per movie and embeds them alongside the plot. Catalogues are bounded; user queries aren't, so paying the LLM cost once per movie scales better than once per query.\n\nLatency on M3 / 36GB / Ollama llama3: ~90s/query (filter_extract + explain dominate). llama3.2:1b drops to ~15-20s. Hosted models ~5-10s.\n\nCode + setup: github.com/meetgrewal7793-creator/personal-movie-recommender\n\nThe 7-stage architecture diagram is in the README. 
Feedback welcome, especially on the grader prompt calibration, which I had to relax for local-LLM defaults because llama3 graders over-flag results as weak.", "url": "https://wpnews.pro/news/i-built-a-local-first-movie-recommender-with-corrective-rag-cited-explanations", "canonical_source": "https://dev.to/a_aesthetic_dbd654c063b47/i-built-a-local-first-movie-recommender-with-corrective-rag-cited-explanations-hybrid-retrieval-1iog", "published_at": "2026-05-25 22:50:24+00:00", "updated_at": "2026-05-25 23:33:57.068311+00:00", "lang": "en", "topics": ["large-language-models", "generative-ai", "ai-products", "ai-tools", "ai-infrastructure"], "entities": ["Ollama", "Netflix", "Prime", "llama3", "llama3.2", "M3", "meetgrewal7793-creator"], "alternates": {"html": "https://wpnews.pro/news/i-built-a-local-first-movie-recommender-with-corrective-rag-cited-explanations", "markdown": "https://wpnews.pro/news/i-built-a-local-first-movie-recommender-with-corrective-rag-cited-explanations.md", "text": "https://wpnews.pro/news/i-built-a-local-first-movie-recommender-with-corrective-rag-cited-explanations.txt", "jsonld": "https://wpnews.pro/news/i-built-a-local-first-movie-recommender-with-corrective-rag-cited-explanations.jsonld"}}