# I built a local-first movie recommender with Corrective-RAG (cited explanations, hybrid retrieval, runs entirely on Ollama)

> Source: <https://dev.to/a_aesthetic_dbd654c063b47/i-built-a-local-first-movie-recommender-with-corrective-rag-cited-explanations-hybrid-retrieval-1iog>
> Published: 2026-05-25 22:50:24+00:00

Hey, sharing a project I've been building for the last few months. It's a movie recommendation system that runs entirely on your laptop using Ollama, with a Corrective-RAG pipeline.

Why I built it: existing streaming platforms only know what you watched on them. Netflix can't see my Prime history, and none of them know about cinema watches. I wanted one system that learns from all of it.

Stack:

The interesting design choice was doing query expansion at ingest time instead of query time. The enrichment LLM generates 3-5 pseudo-queries per movie, and these are embedded alongside the plot. Catalogues are bounded and user queries aren't, so paying the LLM cost once per movie scales better than paying it once per query.
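
Here is a minimal sketch of the ingest-time expansion idea, not the repo's actual code. All names (`ingest_movie`, `search`, `IndexedText`, `toy_embed`) are hypothetical, the LLM and embedding model are passed in as plain callables, and `toy_embed` is a hashing bag-of-words stand-in so the example runs without any model installed.

```python
import math
import zlib
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Assumption: both models are plain callables. In a real setup they would
# wrap calls to a local LLM and a local embedding model.
Embedder = Callable[[str], List[float]]
QueryGenerator = Callable[[str, str], List[str]]  # (title, plot) -> pseudo-queries


@dataclass
class IndexedText:
    movie_id: str
    kind: str  # "plot" or "pseudo_query"
    text: str
    vector: List[float]


def toy_embed(text: str, dim: int = 64) -> List[float]:
    """Stand-in embedder: hashed bag of words. Not a real embedding model."""
    vec = [0.0] * dim
    for word in text.lower().split():
        vec[zlib.crc32(word.encode("utf-8")) % dim] += 1.0
    return vec


def ingest_movie(movie_id: str, title: str, plot: str,
                 generate_queries: QueryGenerator, embed: Embedder,
                 max_queries: int = 5) -> List[IndexedText]:
    """Pay the LLM cost once per movie: embed the plot plus its pseudo-queries."""
    queries = [q.strip() for q in generate_queries(title, plot) if q.strip()]
    entries = [IndexedText(movie_id, "plot", plot, embed(plot))]
    for q in queries[:max_queries]:
        entries.append(IndexedText(movie_id, "pseudo_query", q, embed(q)))
    return entries


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(index: List[IndexedText], query: str, embed: Embedder,
           top_k: int = 3) -> List[Tuple[str, float]]:
    """Query time needs one embedding call and no LLM call.

    Each movie is scored by its best-matching entry (plot or pseudo-query).
    """
    query_vec = embed(query)
    best: Dict[str, float] = {}
    for entry in index:
        score = cosine(query_vec, entry.vector)
        if score > best.get(entry.movie_id, -1.0):
            best[entry.movie_id] = score
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
```

Taking the maximum over a movie's entries is one simple way to combine them. It means a user query only has to resemble one of the pseudo-queries to surface the movie, which is the point of generating them up front.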

Latency on an M3 with 36GB, running llama3 through Ollama: ~90s/query (filter_extract + explain dominate). llama3.2:1b drops that to ~15-20s. Hosted models take ~5-10s.

Code + setup: github.com/meetgrewal7793-creator/personal-movie-recommender

The 7-stage architecture diagram is in the README. Feedback welcome, especially on the grader prompt calibration, which I had to relax for local-LLM defaults because llama3 graders over-flag results as weak.
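
For readers unfamiliar with the grading step: in a Corrective-RAG loop, a grader judges the retrieved results, and a corrective action runs when too many look weak. The sketch below is an illustration under assumptions, not the project's grader. The post describes relaxing the grader prompt, whereas this example swaps in a numeric `max_weak_fraction` knob to show the same effect, and it uses query rewriting as the corrective action. Every function name here is hypothetical.

```python
from typing import Callable, List

# Assumption: the grader returns a label string such as "relevant" or "weak".
Grader = Callable[[str, str], str]         # (query, candidate) -> label
Retriever = Callable[[str], List[str]]     # query -> candidate texts
Rewriter = Callable[[str], str]            # query -> rewritten query


def needs_correction(query: str, candidates: List[str], grade: Grader,
                     max_weak_fraction: float = 0.5) -> bool:
    """True when the share of non-relevant candidates exceeds the tolerance."""
    if not candidates:
        return True
    weak = sum(
        1 for c in candidates
        if grade(query, c).strip().lower() != "relevant"
    )
    return weak / len(candidates) > max_weak_fraction


def retrieve_with_correction(query: str, retrieve: Retriever, rewrite: Rewriter,
                             grade: Grader, max_weak_fraction: float = 0.5,
                             max_retries: int = 1) -> List[str]:
    """Retrieve, grade, and retry with a rewritten query if results look weak."""
    current_query = query
    candidates = retrieve(current_query)
    for _ in range(max_retries):
        # Always grade against the user's original query, not the rewrite.
        if not needs_correction(query, candidates, grade, max_weak_fraction):
            break
        current_query = rewrite(current_query)
        candidates = retrieve(current_query)
    return candidates
```

With a grader that over-flags, a low `max_weak_fraction` triggers the corrective path on nearly every query, which costs extra retrieval and LLM calls. Raising it is a crude stand-in for the kind of relaxation described above.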
