Build a Multi-Agent Research Pipeline with CrewAI and Ollama

A three-agent CrewAI pipeline backed by a locally running Llama 3.1 model via Ollama autonomously produces structured, cited research reports without requiring an OpenAI key. The pipeline uses Researcher, Analyst, and Writer agents to gather facts, synthesize insights, and format the final document.

Mariana Souza (https://www.devclubhouse.com/u/mariana souza)

What You'll Build

A three-agent CrewAI pipeline that takes a research topic and produces a formatted Markdown report with sourced findings. A Researcher gathers facts via web search, an Analyst synthesizes them, and a Writer produces the final document. All inference runs locally through Ollama.

Prerequisites

- Python 3.10 or 3.11 (3.12 works; 3.9 does not)
- Ollama (https://ollama.com/download), 0.1.x or later, installed
- At least 8 GB of free RAM; 16 GB is comfortable for `llama3.1:8b`
- macOS or Linux. On Windows, use WSL2.
- A virtual environment tool (`venv`, `conda`, etc.)

Step 1: Get Ollama Running with Llama 3.1

If Ollama isn't already running as a background service, start it first:

```
# Only needed on Linux or if you didn't install the macOS .dmg app
ollama serve &
```

On macOS with the app installed, Ollama starts at login automatically. Then pull the model:

```
# Downloads ~4.7 GB (Q4 quantization)
ollama pull llama3.1:8b
```

Confirm it's listening before continuing:

```
curl http://localhost:11434/api/tags
```

You should get JSON listing your local models. If you get `Connection refused`, Ollama isn't running yet.
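The same readiness check can be scripted so that a missing server or an unpulled model fails fast instead of surfacing mid-run. The following is a minimal sketch using only the standard library. The helper names `parse_model_names` and `ollama_has_model` are hypothetical, and it assumes the `/api/tags` response is a JSON object with a top-level `models` list whose entries carry a `name` field.

```python
import json
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default address used in this tutorial


def parse_model_names(payload: dict) -> list[str]:
    """Extract model names from a decoded /api/tags response (assumed shape)."""
    return [m["name"] for m in payload.get("models", []) if "name" in m]


def ollama_has_model(model: str, base_url: str = OLLAMA_URL, timeout: float = 5.0) -> bool:
    """Return True if the server answers and lists `model`, False otherwise."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON body: treat as "not ready".
        return False
    return model in parse_model_names(payload)


if __name__ == "__main__":
    if ollama_has_model("llama3.1:8b"):
        print("Ollama is up and llama3.1:8b is available")
    else:
        print("Ollama is not reachable or llama3.1:8b has not been pulled")
```

Calling `ollama_has_model("llama3.1:8b")` at the top of the pipeline script and exiting early on `False` is one way to avoid a confusing failure several minutes into a run.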
Step 2: Install Python Dependencies

```
python3 -m venv .venv
source .venv/bin/activate
pip install "crewai>=0.28.0" "langchain-ollama>=0.1.0" "langchain-community>=0.2.0" duckduckgo-search
```

`langchain-ollama` is the standalone adapter split from `langchain-community` in LangChain 0.2.x. Prefer it over the older `langchain_community.chat_models.ChatOllama` import path. CrewAI requires Pydantic v2, so verify with `pip show pydantic` if you're in a shared environment.

Step 3: Configure the LLM and Search Tool

Create `research_pipeline.py`. The LLM setup is a single object you'll hand to every agent:

```python
from crewai import Agent, Task, Crew, Process
from langchain_ollama import ChatOllama
from langchain_community.tools import DuckDuckGoSearchRun

# base_url defaults to http://localhost:11434
# Only override if Ollama is on a different host or port
llm = ChatOllama(model="llama3.1:8b", temperature=0.1)

search_tool = DuckDuckGoSearchRun()
```

Low temperature keeps the researcher and analyst factual. If you want less-dry prose, create a second `ChatOllama` instance with `temperature=0.3` and pass it to the writer.

Step 4: Define the Three Agents

```python
researcher = Agent(
    role="Research Specialist",
    goal="Find accurate, recent information on the given topic and collect key facts with sources.",
    backstory=(
        "You are a meticulous researcher with a talent for locating credible sources "
        "and summarizing them without losing detail."
    ),
    llm=llm,
    tools=[search_tool],
    allow_delegation=False,
    verbose=True,
)

analyst = Agent(
    role="Data Analyst",
    goal="Identify patterns, gaps, and key insights from the research findings.",
    backstory=(
        "You specialize in transforming raw research into structured analysis, "
        "separating signal from noise."
    ),
    llm=llm,
    tools=[],
    allow_delegation=False,
    verbose=True,
)

writer = Agent(
    role="Technical Writer",
    goal="Produce a well-structured, cited research report suitable for a technical audience.",
    backstory=(
        "You write clear, authoritative reports. You cite sources precisely "
        "and never pad content with filler."
    ),
    llm=llm,
    tools=[],
    allow_delegation=False,
    verbose=True,
)
```

`allow_delegation=False` prevents agents from spontaneously reassigning work mid-run. In a sequential pipeline it mostly avoids confusion rather than wasted inference.

Step 5: Define Tasks with Explicit Context

```python
research_task = Task(
    description=(
        "Search for recent developments, key players, and real-world use cases for: {topic}. "
        "Collect at least five distinct facts and note the source URL for each."
    ),
    expected_output=(
        "A bullet-point list of findings, each with a source URL. "
        "Minimum five items, no speculation."
    ),
    agent=researcher,
)

analysis_task = Task(
    description=(
        "Review the research findings and identify: (1) the three most significant trends, "
        "(2) any contradictions or gaps, and (3) practical implications."
    ),
    expected_output=(
        "A structured analysis in three labelled sections: Trends, Gaps, Implications. "
        "Each section contains 2-3 concise paragraphs."
    ),
    agent=analyst,
    context=[research_task],  # analyst receives researcher's full output
)

writing_task = Task(
    description=(
        "Write a research report on {topic} using the findings and analysis provided. "
        "Include: Executive Summary, Findings, Analysis, and Conclusion. "
        "Cite sources inline."
    ),
    expected_output=(
        "A formatted Markdown report with H2 headings for each section, "
        "inline citations, and a References section at the end."
    ),
    agent=writer,
    context=[research_task, analysis_task],
)
```

The `context` list is the key wiring. It states explicitly which upstream outputs each task receives, so the writer sees both the raw findings and the analysis rather than relying on whatever the sequential process passes along by default.

Step 6: Assemble the Crew and Run

```python
crew = Crew(
    agents=[researcher, analyst, writer],
    tasks=[research_task, analysis_task, writing_task],
    process=Process.sequential,
    verbose=True,
)

if __name__ == "__main__":
    result = crew.kickoff(inputs={"topic": "post-quantum cryptography standardization"})
    print("\n=== FINAL REPORT ===\n")
    print(result)
```

`kickoff(inputs=...)` interpolates `{topic}` into the task descriptions at runtime.
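To make the effect of that interpolation concrete, here is a small sketch that fills the same `{topic}` placeholder with plain `str.format`. It is an illustration only: the `TEMPLATES` dictionary and the `interpolate` helper are hypothetical names, and CrewAI's own interpolation code may work differently.

```python
# Illustration only: mimics the effect of kickoff(inputs=...) with plain
# str.format. CrewAI's internal implementation may differ.
TEMPLATES = {
    "research": (
        "Search for recent developments, key players, and real-world "
        "use cases for: {topic}."
    ),
    "writing": "Write a research report on {topic} using the findings and analysis provided.",
}


def interpolate(templates: dict[str, str], inputs: dict[str, str]) -> dict[str, str]:
    """Return a copy of `templates` with every {placeholder} filled from `inputs`."""
    return {name: text.format(**inputs) for name, text in templates.items()}


if __name__ == "__main__":
    filled = interpolate(TEMPLATES, {"topic": "post-quantum cryptography standardization"})
    for name, text in filled.items():
        print(f"{name}: {text}")
```

Because the templates are left untouched and only the `inputs` dictionary changes, the same task definitions can be reused for any topic.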
Change the topic string without touching the pipeline definition.

```
python research_pipeline.py
```

Expect 5-15 minutes depending on hardware. Each agent's reasoning steps scroll by as it works.

Verify It Works

A successful run prints each agent's inner monologue, then `=== FINAL REPORT ===` followed by a Markdown document with H2 sections and a References block. If you see `Action: duckduckgo_search` lines in the researcher's output, tool use is working correctly.

Quick check: search the report for at least one `http` URL in the References section. If there are none, the researcher most likely completed without invoking the search tool; see "Researcher never calls the search tool" under Troubleshooting.

Troubleshooting

Agent loops without producing a Final Answer

Local models sometimes fail to emit the `Final Answer:` token in the ReAct format CrewAI expects. Add `max_iter=5` to the offending agent. If it persists, try `mistral:7b`: different models handle ReAct prompting differently, and some work significantly better than others out of the box.

Researcher never calls the search tool

The model generated an answer from its weights rather than invoking the tool. Make the task description more directive: add "You MUST use the search tool for every fact." It's a prompt engineering issue, not a code bug.

DuckDuckGo raises an exception or returns empty results

DuckDuckGo rate-limits aggressive scrapers. If you hit it repeatedly during testing, add `time.sleep(2)` between runs. For production workloads, swap in `SerperDevTool` (`from crewai_tools import SerperDevTool`; requires a free Serper API key), which is more reliable under repeated load.

Pydantic validation errors on Agent or Task construction

CrewAI requires Pydantic v2. Run `pip show pydantic` and confirm `2.x`. If something in your environment pins v1, create a fresh virtual environment rather than trying to coerce compatibility.

Next Steps

- Add `memory=True` to `Crew` with a local embeddings model to give agents persistent memory across runs.
- Swap `Process.sequential` for `Process.hierarchical` and pass `manager_llm=llm` to let CrewAI dynamically assign tasks.
- Explore `crewai_tools` for `WebsiteSearchTool`, `PDFSearchTool`, and `FileWriterTool` to write reports directly to disk.
- Profile throughput with `ollama ps` while the pipeline runs and compare `llama3.1:8b` against `mistral:7b` on your hardware for the best speed/quality tradeoff.

Mariana Souza (https://www.devclubhouse.com/u/mariana souza) · Senior Editor

Mariana covers the fast-moving world of machine learning and generative AI, with a particular focus on how these technologies are reshaping development workflows. When she isn't stress-testing the latest foundation models, she's usually at a local hackathon.