You built a RAG pipeline. Works great in dev.
Six months later, your users complain: "The search results are garbage."
You haven't changed a line of code.
Here's what happened:
Your product evolved. New features, new docs, new support tickets. The data drifted, but your embedding index didn't.
Now you're serving a 400 GB FAISS index that was last rebuilt in January. Your chunks are stale. Your nearest-neighbor results point to deprecated docs. Your LLM is confidently answering from outdated context.
You need to fix this. Four engineers each propose a solution:
A) Scheduled full rebuild
Every Sunday, re-embed the entire corpus from scratch. Replace the index atomically. Slow (4h+ at scale), expensive, but never more than a week stale.
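The atomic replace is the part of option A worth getting right. A minimal sketch, assuming the index lives in a single file on a POSIX filesystem; `publish_index` and `live_path` are hypothetical names, not part of FAISS:

```python
import os
import tempfile


def publish_index(index_bytes: bytes, live_path: str) -> None:
    """Write the rebuilt index beside the live one, then swap it in."""
    directory = os.path.dirname(os.path.abspath(live_path))
    # The temp file must sit on the same filesystem as the live file,
    # otherwise the final rename is not atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(index_bytes)
            f.flush()
            os.fsync(f.fileno())  # make sure the bytes are on disk first
        # Anyone opening live_path gets the old file or the new one,
        # never a half-written index.
        os.replace(tmp_path, live_path)
    except BaseException:
        # Clean up the partial temp file if the write or swap failed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

At 400 GB you would not pass the index around as bytes. You would more likely have the index library write straight to the temporary path (for FAISS, `faiss.write_index`) and keep only the final `os.replace` step.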
B) Incremental upserts + soft delete
On every document change, re-embed only the affected chunks. Mark deleted chunks as tombstoned. Keep a version field on each vector. Index size grows over time; compact quarterly.
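Here is roughly what option B's bookkeeping looks like. A sketch only, assuming an in-memory dict stands in for the real vector store; `ChunkStore` and its methods are hypothetical names:

```python
from dataclasses import dataclass


@dataclass
class Entry:
    vector: list[float]
    version: int           # bumped on every re-embed of this chunk
    deleted: bool = False  # tombstone flag (soft delete)


class ChunkStore:
    def __init__(self) -> None:
        self.entries: dict[str, Entry] = {}

    def upsert(self, chunk_id: str, vector: list[float]) -> None:
        """Insert a new chunk or overwrite an existing one with a new version."""
        old = self.entries.get(chunk_id)
        version = old.version + 1 if old else 1
        self.entries[chunk_id] = Entry(vector, version)

    def soft_delete(self, chunk_id: str) -> None:
        """Tombstone the chunk; it stays in storage until compaction."""
        if chunk_id in self.entries:
            self.entries[chunk_id].deleted = True

    def live_ids(self) -> list[str]:
        """Chunk ids that search is allowed to return."""
        return [cid for cid, e in self.entries.items() if not e.deleted]

    def compact(self) -> int:
        """Physically drop tombstoned chunks; returns how many were removed."""
        before = len(self.entries)
        self.entries = {c: e for c, e in self.entries.items() if not e.deleted}
        return before - len(self.entries)
```

The gap between `len(entries)` and `len(live_ids())` is the growth that the quarterly compaction reclaims.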
C) Embedding version registry + hot swap
Track which embedding model version produced each vector. When the model changes (fine-tuned or upgraded), invalidate the mismatched vectors and rebuild only those. Two indexes run in parallel during migration. Route traffic by model version.
D) Approximate staleness detection
Run a nightly job that samples 1% of your corpus, re-embeds the current text of each sampled chunk, and measures cosine distance against its stored vector. If drift exceeds a threshold, trigger a full rebuild. Otherwise, skip it. Cheap monitoring, reactive rebuilds.
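Option D's nightly check can be sketched in a few lines. This assumes "drift" means the mean cosine distance over the sample; the 0.05 threshold, the `embed` callable, and `needs_rebuild` are all placeholders, not recommended values or a real API:

```python
import math
import random


def cosine_distance(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 1.0  # treat a zero vector as maximally drifted
    return 1.0 - dot / (na * nb)


def needs_rebuild(stored, texts, embed, sample_rate=0.01, threshold=0.05, seed=None):
    """Re-embed a random sample and compare against the stored vectors.

    stored: chunk id -> stored vector
    texts:  chunk id -> current text of that chunk
    embed:  callable text -> vector (your current embedding model)
    Returns (rebuild?, mean cosine distance over the sample).
    """
    ids = sorted(stored)
    if not ids:
        return False, 0.0
    rng = random.Random(seed)
    k = max(1, int(len(ids) * sample_rate))
    sample = rng.sample(ids, k)
    mean = sum(cosine_distance(stored[i], embed(texts[i])) for i in sample) / k
    return mean > threshold, mean
```

One limit of this check: sampling existing chunks catches edited documents and model changes, but it cannot see documents that were added or deleted since the last build.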
Real constraint: your corpus is 50M chunks. Full rebuild = 4 hours + ~$800 in embedding API cost. You deploy model updates every 6 weeks.
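Those numbers make the options easy to price. Back-of-envelope only, assuming embedding cost scales linearly with chunk count and a 52-week year (neither is stated above); it counts embedding spend and ignores compute and engineering time:

```python
REBUILD_COST_USD = 800           # from the constraint above
CORPUS_CHUNKS = 50_000_000       # from the constraint above
MODEL_UPDATE_INTERVAL_WEEKS = 6  # from the constraint above
SAMPLE_RATE = 0.01               # option D's 1% nightly sample
WEEKS_PER_YEAR = 52              # assumption
DAYS_PER_YEAR = 365              # assumption


def annual_costs() -> dict[str, float]:
    cost_per_chunk = REBUILD_COST_USD / CORPUS_CHUNKS
    return {
        # Option A: one full rebuild every week.
        "weekly_full_rebuild": REBUILD_COST_USD * WEEKS_PER_YEAR,
        # One full rebuild per model update (every 6 weeks).
        "rebuild_per_model_update": REBUILD_COST_USD * WEEKS_PER_YEAR
        / MODEL_UPDATE_INTERVAL_WEEKS,
        # Option D's monitoring alone: re-embed 1% of the corpus nightly.
        "nightly_sampling": CORPUS_CHUNKS * SAMPLE_RATE * cost_per_chunk
        * DAYS_PER_YEAR,
    }


if __name__ == "__main__":
    for name, usd in annual_costs().items():
        print(f"{name}: ${usd:,.0f}/year")
```

Under those assumptions, weekly rebuilds come to $41,600 a year, a rebuild per model update to roughly $6,900, and the nightly 1% sample to roughly $2,900 before any rebuild it triggers.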
Pick one (A, B, C, or D) and tell me why. Full breakdown in the comments.