I Built a Vector Search Engine from Scratch: Here's What I Learned

A developer built Vektr, a custom RAG engine implementing HNSW graphs from scratch, achieving recall@10 of 0.984, meaning that on average 98.4% of the 10 true nearest neighbors appear in the top 10 results. The implementation combines hybrid BM25 plus dense retrieval, HyDE query rewriting, and atomic index persistence. The project shows that approximate nearest neighbor search built from the ground up can reach recall comparable to production-grade vector databases.

Implementing HNSW (Hierarchical Navigable Small World) graphs, hybrid BM25 + dense retrieval, HyDE query rewriting, and atomic index persistence, and reaching recall@10 = 0.984.

When I started building Vektr, a RAG (Retrieval-Augmented Generation) engine, I had a choice: use an existing vector store such as Pinecone, Weaviate, or FAISS, or build my own. I chose to build my own. Not because existing solutions are bad (they're excellent), but because you don't truly understand a system until you've built it. This post is about what I learned building HNSW from scratch.

HNSW (Hierarchical Navigable Small World) is the algorithm behind many modern vector databases. It achieves roughly logarithmic search time with high recall by organizing vectors into a hierarchical graph. The key insight: approximate nearest neighbor search is fast enough, and "approximate" is closer to exact than you'd think. My implementation achieves recall@10 = 0.984, meaning that, averaged over queries, 98.4% of the 10 true nearest neighbors appear in the top 10 results.

Layer 2 (sparse): 1 ──────────── 5
                  │              │
Layer 1 (medium): 1 ── 3 ── 4 ── 5
                  │    │    │    │
Layer 0 (dense):  1─2─3─4─5─6─7─8─9

Each vector is inserted at layer 0. In the standard HNSW scheme its top layer is drawn at random as floor(-ln(u) · mL), with u uniform in (0, 1] and mL = 1/ln M, so it also appears in layer 1 with probability 1/M, in layer 2 with probability 1/M², and so on. This creates a highway network: you navigate quickly through sparse upper layers, then zoom in at the dense bottom layer.
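To make the layer assignment concrete, here is a minimal sketch of how a node's top layer can be sampled. It assumes the standard HNSW formulation, where 1/ln M acts as the scale factor mL and the level is floor(-ln(u) · mL). The class name `LevelSampler` and its methods are hypothetical, written for illustration only, and are not taken from Vektr's code.

```java
import java.util.Random;

// Hypothetical helper, not part of Vektr: samples the top layer for a new node.
final class LevelSampler {
    private final double mL;      // level scale factor, assumed to be 1 / ln(M)
    private final Random random;

    LevelSampler(int m, long seed) {
        if (m < 2) {
            throw new IllegalArgumentException("M must be at least 2");
        }
        this.mL = 1.0 / Math.log(m);
        this.random = new Random(seed);
    }

    /** Returns floor(-ln(u) * mL) for u uniform in (0, 1]. */
    int sampleLevel() {
        // nextDouble() is in [0, 1), so 1 - nextDouble() is in (0, 1] and ln(u) is finite.
        double u = 1.0 - random.nextDouble();
        return (int) Math.floor(-Math.log(u) * mL);
    }

    public static void main(String[] args) {
        int m = 16;
        int n = 1_000_000;
        LevelSampler sampler = new LevelSampler(m, 42L);

        // Count how many nodes have each top layer (layers above 5 share the last bucket).
        int[] counts = new int[6];
        for (int i = 0; i < n; i++) {
            int level = Math.min(sampler.sampleLevel(), counts.length - 1);
            counts[level]++;
        }

        for (int layer = 0; layer < counts.length; layer++) {
            System.out.printf("top layer %d: %.5f of nodes%n", layer, counts[layer] / (double) n);
        }
        System.out.printf("expected share reaching layer 1 or higher: %.5f%n", 1.0 / m);
    }
}
```

With M = 16, roughly one node in 16 should reach layer 1 and roughly one in 256 should reach layer 2, which is why the upper layers stay sparse enough to act as highways.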
public class HNSWIndex {
    private final int M;               // Max connections per node
    private final int efConstruction;  // Search width during construction
    private final int maxLayer;
    private final Map