RAG vs Agentic AI: A Developer's Decision Tree (With Code Examples for Both)

A developer provides a decision tree and code examples to distinguish between RAG (Retrieval-Augmented Generation) and agentic AI architectures. RAG is recommended for answering questions from documents, while agents are suited for taking actions across multiple systems. The post includes working Python code for both approaches using LangChain and Anthropic's Claude.

Two different problems wearing similar clothes. Here's how to tell them apart in thirty seconds, with working code for both.

I see this confusion in almost every project kickoff: "We need RAG" when the actual requirement is agentic, or "we need an agent" when RAG would be simpler, cheaper, and faster to ship. Let's fix that with a decision tree you can actually use, plus working code for each path.

```
Does your system need to ANSWER QUESTIONS from documents?
├── YES, and that's the whole job → RAG
└── YES, but it also needs to TAKE ACTIONS across systems
    └── → Agent that uses RAG as a tool

Does your system need to TAKE ACTIONS across multiple systems?
├── YES, with no document retrieval needed → Plain Agent
└── YES, and it needs grounded knowledge from documents
    └── → Agent that uses RAG as a tool
```

The test question that resolves most confusion: "Does this system need to decide what to do, or does it need to find and synthesise information?" Finding and synthesising → RAG. Deciding and acting → agent.

RAG is the right architecture when your job is grounding LLM responses in a specific document set: answering questions, summarising content, finding relevant passages.

```python
# Legacy import paths; newer LangChain releases expose the splitter, embeddings,
# and vector store via langchain_text_splitters and langchain_community.
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain_anthropic import ChatAnthropic

# 1. Load and chunk documents
# `documents` is assumed to be a list of LangChain Document objects loaded earlier.
splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=100)
chunks = splitter.split_documents(documents)

# 2. Embed and store
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
vectorstore = Chroma.from_documents(chunks, embeddings)

# 3. Build the retrieval chain
llm = ChatAnthropic(model="claude-sonnet-4-5")
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever(search_kwargs={"k": 4}),
    return_source_documents=True,
)

# 4. Query
result = qa_chain({"query": "What is our refund policy for enterprise customers?"})
print(result["result"])
print(result["source_documents"])  # Always show sources
```

This is the whole job: retrieve relevant chunks, ground the LLM's answer in them, return a response with citations. No planning loop, no tool orchestration, no multi-step decision-making. If your use case stops here, building agent infrastructure on top of this is unnecessary complexity.

An agent is right when the job is taking actions (checking systems, executing operations, making decisions that span multiple steps) and there's no document knowledge base involved.

```python
import anthropic

client = anthropic.Anthropic()

# Tool definitions the model can choose from
tools = [
    {
        "name": "check_inventory",
        "description": "Check current stock level for a SKU",
        "input_schema": {
            "type": "object",
            "properties": {"sku": {"type": "string"}},
            "required": ["sku"],
        },
    },
    {
        "name": "create_purchase_order",
        "description": "Create a PO with a supplier",
        "input_schema": {
            "type": "object",
            "properties": {
                "supplier_id": {"type": "string"},
                "sku": {"type": "string"},
                "quantity": {"type": "integer"},
            },
            "required": ["supplier_id", "sku", "quantity"],
        },
    },
]

def run_inventory_agent(goal: str) -> str:
    messages = [{"role": "user", "content": goal}]
    for _ in range(6):  # hard cap on agent iterations
        response = client.messages.create(
            model="claude-sonnet-4-5",
            max_tokens=1500,
            tools=tools,
            messages=messages,
        )
        if response.stop_reason == "end_turn":
            return next(b.text for b in response.content if hasattr(b, "text"))
        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                # execute_inventory_tool is your own dispatcher; it must return a string
                result = execute_inventory_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": result,
                })
        messages.append({"role": "user", "content": tool_results})
    return "Reached max iterations."

run_inventory_agent(
    "Check stock for SKU-4471. If below 50 units, "
    "create a PO with our primary supplier for 200 units."
)
```

No documents involved.
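
The agent loop above calls `execute_inventory_tool` without defining it. Here is a minimal sketch of what that dispatcher could look like, assuming in-memory dictionaries in place of real inventory and purchasing systems. `STOCK`, `PURCHASE_ORDERS`, `HANDLERS`, the stock figure, and the PO id format are all hypothetical and only there for illustration.

```python
import json

# Hypothetical in-memory stand-ins for real inventory and purchasing systems.
STOCK = {"SKU-4471": 32}
PURCHASE_ORDERS: list[dict] = []


def check_inventory(sku: str) -> dict:
    """Return the current stock level for a SKU, or an error if it is unknown."""
    if sku not in STOCK:
        return {"error": f"Unknown SKU: {sku}"}
    return {"sku": sku, "units_in_stock": STOCK[sku]}


def create_purchase_order(supplier_id: str, sku: str, quantity: int) -> dict:
    """Record a purchase order and return it with a generated id."""
    order = {
        "po_id": f"PO-{len(PURCHASE_ORDERS) + 1:04d}",
        "supplier_id": supplier_id,
        "sku": sku,
        "quantity": quantity,
    }
    PURCHASE_ORDERS.append(order)
    return order


# Map tool names (as declared in `tools`) to the functions that implement them.
HANDLERS = {
    "check_inventory": check_inventory,
    "create_purchase_order": create_purchase_order,
}


def execute_inventory_tool(name: str, tool_input: dict) -> str:
    """Dispatch a tool call by name and return a JSON string for the tool_result block."""
    handler = HANDLERS.get(name)
    if handler is None:
        return json.dumps({"error": f"Unknown tool: {name}"})
    try:
        return json.dumps(handler(**tool_input))
    except TypeError as exc:  # missing or unexpected arguments from the model
        return json.dumps({"error": str(exc)})


print(execute_inventory_tool("check_inventory", {"sku": "SKU-4471"}))
```

Returning errors as JSON strings, rather than raising, lets the model see what went wrong and try again within the iteration cap.
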
The agent checks inventory, reasons about the threshold, and conditionally creates a purchase order. This is pure action orchestration.

This is where most real enterprise systems actually land: an agent that needs to take actions, and one of the things it needs to do along the way is look something up in a document knowledge base.

```python
import json

import anthropic

client = anthropic.Anthropic()

def rag_lookup(query: str) -> str:
    """RAG retrieval wrapped as a tool the agent can call."""
    result = qa_chain({"query": query})  # the RAG chain from Path 1 (the RAG-only example)
    return json.dumps({
        "answer": result["result"],
        "sources": [doc.metadata.get("source") for doc in result["source_documents"]],
    })

tools = [
    {
        "name": "search_policy_documents",
        "description": "Search company policy documents for relevant information",
        "input_schema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
    {
        "name": "issue_refund",
        "description": "Process a refund for a customer order",
        "input_schema": {
            "type": "object",
            "properties": {
                "order_id": {"type": "string"},
                "amount": {"type": "number"},
            },
            "required": ["order_id", "amount"],
        },
    },
]

def execute_tool(name: str, input_data: dict) -> str:
    if name == "search_policy_documents":
        return rag_lookup(input_data["query"])
    elif name == "issue_refund":
        # process_refund is your own payment integration; it must return a string
        return process_refund(input_data["order_id"], input_data["amount"])
    # Report unknown tools to the model instead of silently returning None
    return json.dumps({"error": f"Unknown tool: {name}"})

def run_refund_agent(customer_request: str) -> str:
    messages = [{"role": "user", "content": customer_request}]
    for _ in range(6):  # hard cap on agent iterations
        response = client.messages.create(
            model="claude-sonnet-4-5",
            max_tokens=1500,
            tools=tools,
            messages=messages,
        )
        if response.stop_reason == "end_turn":
            return next(b.text for b in response.content if hasattr(b, "text"))
        messages.append({"role": "assistant", "content": response.content})
        tool_results = [
            {
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": execute_tool(block.name, block.input),
            }
            for block in response.content
            if block.type == "tool_use"
        ]
        messages.append({"role": "user", "content": tool_results})
    return "Reached max iterations."

run_refund_agent(
    "Customer wants a refund on order 8821 for $340. "
    "Check our refund policy first to see if this qualifies."
)
```

The agent decides to call `search_policy_documents` to check eligibility before deciding whether to call `issue_refund`. The RAG system is doing exactly what it's good at, grounded retrieval, but it's a tool in service of the agent's broader decision-making, not the entire system.

RAG-only systems are cheaper to build and run. Single retrieval call, single generation call, predictable latency, easier to evaluate (you can measure retrieval precision and answer accuracy independently).

Agentic systems are more expensive and harder to debug. Multiple LLM calls per task, unpredictable latency (it depends on how many iterations the agent takes), harder to evaluate because failure can happen at the planning stage or the execution stage. They're also the only option when the task genuinely requires multi-step action across systems.

The mistake we see most often: teams building agentic infrastructure for what's fundamentally a question-answering problem, paying the complexity cost for capability they don't need.

The full RAG vs agentic AI comparison covers the cost modelling, latency benchmarks, and evaluation methodology differences in more depth.

Once you've picked your architecture, the next question is build vs buy: do you build this RAG pipeline or agent loop yourself, or do you use a managed platform? The answer depends on your timeline, your team's capacity, and how differentiated your specific use case actually is. We wrote the framework with cost models, time estimates, and decision criteria for exactly this question; it's worth reading before you commit engineering time to either path.

Published by Dextra Labs | AI Consulting & Enterprise Agent Development