RAG with Spring Boot — Embeddings and Vector Search Step by Step (2026)

A developer published a tutorial on building a Retrieval-Augmented Generation (RAG) pipeline using Spring Boot, embeddings, and vector search. The tutorial demonstrates how to ingest documents, split them into chunks, generate embeddings, store them in a PGVector vector store, and answer user questions by retrieving similar chunks and passing them to an LLM. The code is available on GitHub and extends the AI Developer Tutorials series.
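To make the "retrieve similar chunks" step concrete, here is a minimal in-memory sketch of top-K retrieval by cosine similarity. It is a toy stand-in for what a vector store does conceptually, not how PGVector or Spring AI implement search. The class `TopKSearchSketch`, the `Chunk` record, and the hand-written three-dimensional vectors are all hypothetical; real embeddings come from an embedding model and have far more dimensions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Toy in-memory retrieval; every name here is illustrative, not Spring AI API. */
class TopKSearchSketch {

    /** A stored chunk of text together with its embedding vector (hypothetical type). */
    record Chunk(String text, double[] vector) {}

    /** Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 if either vector is all zeros. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns up to k chunks ordered from most to least similar to the query vector. */
    static List<Chunk> topK(List<Chunk> store, double[] query, int k) {
        List<Chunk> sorted = new ArrayList<>(store);
        sorted.sort(Comparator.comparingDouble((Chunk c) -> cosine(c.vector(), query)).reversed());
        return sorted.subList(0, Math.min(k, sorted.size()));
    }

    public static void main(String[] args) {
        // Hand-written vectors standing in for real embeddings.
        List<Chunk> store = List.of(
            new Chunk("refund policy", new double[] {1.0, 0.0, 0.0}),
            new Chunk("shipping times", new double[] {0.0, 1.0, 0.0}),
            new Chunk("returns and refunds", new double[] {0.9, 0.1, 0.0}));
        double[] query = {1.0, 0.0, 0.0};
        for (Chunk c : topK(store, query, 2)) {
            System.out.println(c.text());
        }
    }
}
```

A linear scan like this costs one similarity computation per stored chunk on every query, which is why a real pipeline delegates the search to a vector store instead.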

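Republished from munonye.com. Full code on GitHub.

Learn how to build a RAG pipeline with Spring Boot that answers questions from your own documents. This post extends the AI Developer Tutorials series (https://www.munonye.com/ai-developer-tutorials/) and connects to M7-A, Spring AI REST basics (https://www.munonye.com/spring-ai-tutorial-first-rest-endpoint-openai-2026/).

The pipeline has two flows, one for ingestion and one for answering:

    Documents → chunk → embed → VectorStore
    User question → embed → top-K similar chunks → prompt → LLM → answer

Maven dependencies:

    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-pgvector-store-spring-boot-starter</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
    </dependency>

The ingestion service reads every file under classpath:docs/, splits it into 800-character chunks with a 100-character overlap, and adds the chunks to the vector store:

    import java.io.File;
    import java.io.IOException;
    import java.nio.file.Files;
    import java.util.ArrayList;
    import java.util.List;

    import org.springframework.ai.document.Document;
    import org.springframework.ai.vectorstore.VectorStore;
    import org.springframework.beans.factory.annotation.Value;
    import org.springframework.core.io.Resource;
    import org.springframework.stereotype.Service;

    @Service
    public class DocumentIngestionService {

        private final VectorStore vectorStore;
        private final Resource docsFolder;

        public DocumentIngestionService(VectorStore vectorStore,
                                        @Value("classpath:docs/") Resource docsFolder) {
            this.vectorStore = vectorStore;
            this.docsFolder = docsFolder;
        }

        public void ingestAll() throws IOException {
            // listFiles() returns java.io.File entries, so iterate over File, not Resource.
            // getFile() only works when the docs folder is on the file system (not inside a jar).
            for (File file : docsFolder.getFile().listFiles()) {
                String text = Files.readString(file.toPath());
                List<Document> chunks = split(text, 800, 100);
                vectorStore.add(chunks);
            }
        }

        // Fixed-size character windows; each window starts (size - overlap) after the previous one.
        private List<Document> split(String text, int size, int overlap) {
            List<Document> out = new ArrayList<>();
            for (int i = 0; i < text.length(); i += size - overlap) {
                out.add(new Document(text.substring(i, Math.min(i + size, text.length()))));
            }
            return out;
        }
    }

The question endpoint retrieves the five most similar chunks, joins them into a context block, and passes that context to the LLM as the system prompt:

    // Belongs in a @RestController with vectorStore and chatClient injected.
    // Assumes a pre-1.0 Spring AI milestone, matching the starter names above;
    // 1.0+ uses SearchRequest.builder().query(...).topK(5).build() and Document::getText.
    @PostMapping("/api/ask")
    public AnswerResponse ask(@RequestBody QuestionRequest req) {
        List<Document> similar = vectorStore.similaritySearch(
                SearchRequest.query(req.question()).withTopK(5));
        String context = similar.stream()
                .map(Document::getContent)
                .collect(Collectors.joining("\n---\n"));
        String answer = chatClient.prompt()
                .system("Answer only from the context below. Say 'I don't know' if not found.\n" + context)
                .user(req.question())
                .call()
                .content();
        return new AnswerResponse(answer);
    }

M8-B — Structured JSON from LLMs in Angular: https://www.munonye.com/angular-function-calling-structured-json-llms/

Full tutorial: RAG with Spring Boot — Embeddings and Vector Search Step by Step (2026): https://www.munonye.com/rag-spring-boot-embeddings-vector-search-step-by-step/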
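The fixed-size split with overlap used during ingestion can be tried in isolation. The following is a minimal sketch in plain Java, assuming character-based windows as in the tutorial's `split` helper. The class name `OverlapChunkerSketch`, the argument check, and the early exit are additions for illustration and are not part of the tutorial's code: the check rejects an overlap that is not smaller than the chunk size (the step would be zero or negative and the loop would never advance), and the early exit skips a trailing chunk that would only repeat the tail of the previous one.

```java
import java.util.ArrayList;
import java.util.List;

/** Standalone character-window chunker; the class name is hypothetical. */
class OverlapChunkerSketch {

    /** Splits text into windows of at most `size` chars, each sharing `overlap` chars with the previous one. */
    static List<String> split(String text, int size, int overlap) {
        if (size <= 0 || overlap < 0 || overlap >= size) {
            throw new IllegalArgumentException("need size > 0 and 0 <= overlap < size");
        }
        List<String> chunks = new ArrayList<>();
        int step = size - overlap; // distance between consecutive window starts
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + size, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // this window already reached the end of the text
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        // size 4, overlap 1: windows start at 0, 3, 6
        System.out.println(split("abcdefghij", 4, 1)); // prints [abcd, defg, ghij]
    }
}
```

With the tutorial's values of 800 and 100, consecutive chunks start 700 characters apart, so a sentence that straddles a boundary still appears whole in at least one chunk as long as it is no longer than the overlap.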