Build a Multi-Agent Research Pipeline with CrewAI and Ollama

wpnews.pro

Assemble a three-agent CrewAI crew backed by a locally running Llama 3.1 model to autonomously produce structured, cited research reports — no OpenAI key required.

Mariana Souza

What You'll Build

A three-agent CrewAI pipeline that takes a research topic and produces a formatted Markdown report with sourced findings. A Researcher gathers facts via web search, an Analyst synthesizes them, and a Writer produces the final document. All inference runs locally through Ollama.

Prerequisites

- Python 3.10 or 3.11 (3.12 works; 3.9 does not)
- Ollama 0.1.x or later installed
- At least 8 GB of free RAM; 16 GB is comfortable for llama3.1:8b
- macOS or Linux. On Windows, use WSL2.
- A virtual environment tool (venv, conda, etc.)

Step 1: Get Ollama Running with Llama 3.1

If Ollama isn't already running as a background service, start it first:

ollama serve &

On macOS with the app installed, Ollama starts at login automatically. Then pull the model:

ollama pull llama3.1:8b

Confirm it's listening before continuing:

curl http://localhost:11434/api/tags

You should get JSON listing your local models. If you get Connection refused, Ollama isn't running yet.
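The same check can be scripted so the pipeline fails early with a readable message. The sketch below uses only the standard library and assumes the /api/tags response is a JSON object with a models list whose entries carry a name field; check_ollama and the other helper names are hypothetical, not part of Ollama or CrewAI.

```python
import json
import urllib.error
import urllib.request


def model_available(tags: dict, name: str) -> bool:
    """Return True if `name` appears in a parsed /api/tags response."""
    # Assumed response shape: {"models": [{"name": "llama3.1:8b", ...}, ...]}
    return any(m.get("name") == name for m in tags.get("models", []))


def fetch_tags(base_url: str = "http://localhost:11434", timeout: float = 5.0) -> dict:
    """Fetch the local model list; raises URLError if Ollama is not reachable."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)


def check_ollama(model: str = "llama3.1:8b") -> str:
    """Hypothetical helper: report whether the server and the model are ready."""
    try:
        tags = fetch_tags()
    except (urllib.error.URLError, OSError) as exc:
        return f"Ollama not reachable: {exc}"
    if model_available(tags, model):
        return f"{model} is ready"
    return f"{model} not found; run: ollama pull {model}"
```

Calling print(check_ollama()) at the top of the script is one way to use it; nothing here contacts the network until check_ollama or fetch_tags is actually called.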

Step 2: Install Python Dependencies

python3 -m venv .venv
source .venv/bin/activate

pip install "crewai>=0.28.0" "langchain-ollama>=0.1.0" "langchain-community>=0.2.0" duckduckgo-search

langchain-ollama is the standalone adapter split out of langchain-community in LangChain 0.2.x. Prefer it over the older langchain_community.chat_models.ChatOllama import path. CrewAI requires Pydantic v2, so verify with pip show pydantic if you're in a shared environment.
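For a scripted version of the Pydantic check, the standard library can read the installed version directly. This is a minimal sketch; major_version and pydantic_is_v2 are hypothetical helper names, and the check treats any major version of 2 or later as acceptable, which is an assumption rather than a documented CrewAI rule.

```python
from importlib import metadata


def major_version(version: str) -> int:
    """Extract the leading integer from a version string such as '2.7.1'."""
    return int(version.split(".")[0])


def pydantic_is_v2() -> bool:
    """True if an installed pydantic reports major version 2 or later."""
    try:
        return major_version(metadata.version("pydantic")) >= 2
    except metadata.PackageNotFoundError:
        # pydantic is not installed in this environment at all
        return False
```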

Step 3: Configure the LLM and Search Tool

Create research_pipeline.py. The LLM setup is a single object you'll hand to every agent:

from crewai import Agent, Task, Crew, Process
from langchain_ollama import ChatOllama
from langchain_community.tools import DuckDuckGoSearchRun

llm = ChatOllama(model="llama3.1:8b", temperature=0.1)
search_tool = DuckDuckGoSearchRun()

Low temperature keeps the researcher and analyst factual. Because all three agents share this one llm object, give the writer its own ChatOllama instance with temperature=0.3 if you want less-dry prose.

Step 4: Define the Three Agents

researcher = Agent(
    role="Research Specialist",
    goal="Find accurate, recent information on the given topic and collect key facts with sources.",
    backstory=(
        "You are a meticulous researcher with a talent for locating credible sources "
        "and summarizing them without losing detail."
    ),
    llm=llm,
    tools=[search_tool],
    allow_delegation=False,
    verbose=True,
)

analyst = Agent(
    role="Data Analyst",
    goal="Identify patterns, gaps, and key insights from the research findings.",
    backstory=(
        "You specialize in transforming raw research into structured analysis, "
        "separating signal from noise."
    ),
    llm=llm,
    tools=[],
    allow_delegation=False,
    verbose=True,
)

writer = Agent(
    role="Technical Writer",
    goal="Produce a well-structured, cited research report suitable for a technical audience.",
    backstory=(
        "You write clear, authoritative reports. You cite sources precisely "
        "and never pad content with filler."
    ),
    llm=llm,
    tools=[],
    allow_delegation=False,
    verbose=True,
)

allow_delegation=False prevents agents from spontaneously reassigning work mid-run. In a sequential pipeline the main benefit is predictable behavior; any inference it saves is secondary.

Step 5: Define Tasks with Explicit Context

research_task = Task(
    description=(
        "Search for recent developments, key players, and real-world use cases for: {topic}. "
        "Collect at least five distinct facts and note the source URL for each."
    ),
    expected_output=(
        "A bullet-point list of findings, each with a source URL. "
        "Minimum five items, no speculation."
    ),
    agent=researcher,
)

analysis_task = Task(
    description=(
        "Review the research findings and identify: (1) the three most significant trends, "
        "(2) any contradictions or gaps, and (3) practical implications."
    ),
    expected_output=(
        "A structured analysis in three labelled sections: Trends, Gaps, Implications. "
        "Each section contains 2-3 concise paragraphs."
    ),
    agent=analyst,
    context=[research_task],  # analyst receives researcher's full output
)

writing_task = Task(
    description=(
        "Write a research report on {topic} using the findings and analysis provided. "
        "Include: Executive Summary, Findings, Analysis, and Conclusion. "
        "Cite sources inline."
    ),
    expected_output=(
        "A formatted Markdown report with H2 headings for each section, "
        "inline citations, and a References section at the end."
    ),
    agent=writer,
    context=[research_task, analysis_task],
)

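The context list is the key wiring. By default a sequential crew passes the preceding task's output forward, but how much earlier output reaches a later task has varied across CrewAI releases. Listing context explicitly guarantees that the analyst sees the research findings and that the writer sees both the findings and the analysis.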
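To make the wiring concrete, here is a toy model of explicit context passing in plain Python. It is not CrewAI's implementation: ToyTask, build_prompt, and run_sequential are hypothetical names, and the "agent" is just a function from prompt to text. It only illustrates the idea that each task's prompt is its own description plus the outputs of the tasks named in its context list.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ToyTask:
    name: str
    description: str
    context: list["ToyTask"] = field(default_factory=list)
    output: str = ""


def build_prompt(task: ToyTask) -> str:
    """Join the task description with the outputs of its context tasks."""
    upstream = [f"[{t.name}]\n{t.output}" for t in task.context]
    return "\n\n".join([task.description, *upstream])


def run_sequential(tasks: list[ToyTask], agent: Callable[[str], str]) -> str:
    """Run tasks in order so each context task has output before it is read."""
    for task in tasks:
        task.output = agent(build_prompt(task))
    return tasks[-1].output
```

In this model, a writing task declared with context=[research, analysis] receives both upstream outputs in its prompt, mirroring the writing_task definition above.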

Step 6: Assemble the Crew and Run

crew = Crew(
    agents=[researcher, analyst, writer],
    tasks=[research_task, analysis_task, writing_task],
    process=Process.sequential,
    verbose=True,
)

if __name__ == "__main__":
    result = crew.kickoff(inputs={"topic": "post-quantum cryptography standardization"})
    print("\n=== FINAL REPORT ===\n")
    print(result)

kickoff(inputs=...) interpolates {topic} into every task description at runtime. Change the topic string without touching the pipeline definition.
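The placeholder substitution behaves much like Python's str.format. The sketch below uses str.format as a stand-in, which is an assumption: CrewAI's own interpolation may treat edge cases such as stray braces differently. The second topic and the interpolate helper are hypothetical.

```python
def interpolate(template: str, inputs: dict[str, str]) -> str:
    """Fill {name} placeholders from `inputs` (stand-in for CrewAI's interpolation)."""
    return template.format(**inputs)


DESCRIPTION = "Write a research report on {topic} using the findings and analysis provided."

# Hypothetical batch of topics; in the real pipeline each one would be
# passed as crew.kickoff(inputs={"topic": topic}).
TOPICS = [
    "post-quantum cryptography standardization",
    "WebAssembly outside the browser",
]

PROMPTS = [interpolate(DESCRIPTION, {"topic": topic}) for topic in TOPICS]
```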

python research_pipeline.py

Expect 5-15 minutes depending on hardware. Each agent's reasoning steps scroll by as it works.

Verify It Works

A successful run prints each agent's inner monologue, then === FINAL REPORT === followed by a Markdown document with H2 sections and a References block. If you see Action: duckduckgo_search lines in the researcher's output, tool use is working correctly.

Quick check: search the report for at least one http URL in the References section. If there are none, the researcher probably completed without invoking the search tool; see "Researcher never calls the search tool" under Troubleshooting below.
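The quick check can be automated with a small regular-expression helper. This sketch assumes the report uses a Markdown heading named References, as the writing task requests; reference_urls is a hypothetical helper, and the URL pattern is deliberately loose.

```python
import re

# Loose URL pattern: stops at whitespace and common closing delimiters.
URL_RE = re.compile(r"https?://[^\s)\]>]+")
REFS_HEADING_RE = re.compile(r"^#{1,6}\s*References\s*$", re.IGNORECASE | re.MULTILINE)


def reference_urls(report: str) -> list[str]:
    """Return the URLs that appear after the last 'References' heading."""
    headings = list(REFS_HEADING_RE.finditer(report))
    if not headings:
        return []
    return URL_RE.findall(report[headings[-1].end():])
```

An empty list from reference_urls(str(result)) suggests the report carries no sourced references and the run is worth inspecting.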

Troubleshooting

Agent loops without producing a Final Answer

Local models sometimes fail to emit the Final Answer: marker in the ReAct format CrewAI expects. Add max_iter=5 to the offending agent. If it persists, try mistral:7b; different models handle ReAct prompting differently, and some work significantly better than others out of the box.
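The reason an iteration cap helps is easier to see in a toy loop. The sketch below is not CrewAI's agent executor; extract_final_answer and run_capped are hypothetical names, and the step function stands in for one model call. It shows a loop that stops either when the marker appears or when the cap is reached, so a model that never emits the marker cannot spin forever.

```python
from typing import Callable, Optional

FINAL_MARKER = "Final Answer:"


def extract_final_answer(text: str) -> Optional[str]:
    """Return the text after the marker, or None if the marker is missing."""
    _, found, answer = text.partition(FINAL_MARKER)
    return answer.strip() if found else None


def run_capped(step: Callable[[int], str], max_iter: int = 5) -> Optional[str]:
    """Call `step` until it yields a final answer or the cap is reached."""
    for i in range(max_iter):
        answer = extract_final_answer(step(i))
        if answer is not None:
            return answer
    return None  # cap reached without a final answer
```

What CrewAI does when the cap is hit may differ from returning None; the sketch covers only the stopping condition.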

Researcher never calls the search tool

The model generated an answer from its weights rather than invoking the tool. Make the task description more directive: add "You MUST use the search tool for every fact." It's a prompt engineering issue, not a code bug.

DuckDuckGo raises an exception or returns empty results

DuckDuckGo rate-limits aggressive scrapers. If you hit the limit repeatedly during testing, add time.sleep(2) between runs. For production workloads, swap in SerperDevTool (from crewai_tools import SerperDevTool), which requires a free Serper API key and is more reliable under repeated load.
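A fixed sleep works for manual testing; for scripted runs, a retry wrapper with growing delays is a common alternative. This is a minimal standard-library sketch, not something CrewAI or the search tool provides: with_retries is a hypothetical helper, and the 2-second base delay simply mirrors the value suggested above.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call`, sleeping base_delay * 2**n after the n-th failed attempt."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

One way to use it during testing would be with_retries(lambda: search_tool.run("some query")), assuming the tool raises an exception when it is rate-limited.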

Pydantic validation errors on Agent or Task construction

CrewAI requires Pydantic v2. Run pip show pydantic and confirm the version is 2.x. If something in your environment pins v1, create a fresh virtual environment rather than trying to coerce compatibility.

Next Steps

- Add memory=True to Crew with a local embeddings model to give agents persistent memory across runs.
- Swap Process.sequential for Process.hierarchical and pass manager_llm=llm to let CrewAI dynamically assign tasks.
- Explore crewai_tools for WebsiteSearchTool, PDFSearchTool, and FileWriterTool to write reports directly to disk.
- Watch loaded models and memory use with ollama ps while the pipeline runs, and compare llama3.1:8b against mistral:7b on your hardware for the best speed/quality tradeoff.
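Before reaching for a file-writing tool, the final report can also be saved from the script itself. The sketch below is a plain standard-library alternative, not the FileWriterTool API; save_report, slugify, and the reports directory are hypothetical choices.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import Optional


def slugify(topic: str) -> str:
    """Lowercase the topic and collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")


def save_report(
    report: str,
    topic: str,
    out_dir: Path = Path("reports"),
    now: Optional[datetime] = None,
) -> Path:
    """Write the report to out_dir/<topic-slug>-<timestamp>.md and return the path."""
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{slugify(topic)}-{stamp}.md"
    path.write_text(report, encoding="utf-8")
    return path
```

In research_pipeline.py this could be called as save_report(str(result), "post-quantum cryptography standardization") right after the final print.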

Mariana Souza · Senior Editor


Source: devclubhouse.com (original article)