LangGraph Multi-Agent Systems: From One Brain to Many

LangGraph introduces multi-agent systems to solve cognitive overload, sequential bottlenecks, and complexity management in AI applications. The supervisor-specialist pattern assigns a central agent to route tasks to specialized agents, each with focused tools, improving performance and maintainability.

Part 5 of the LangGraph Mental Model series.

What this article assumes: You understand the seven-module structure, can write a single-agent graph with tools and memory, and know how to pause and resume execution with interrupt. Everything here builds on that foundation. We go from "why would I even use multiple agents?" all the way to a full multi-agent research assistant.

Imagine you've built the single-agent assistant from Part 1. It works. Then your users start asking for more: "Can it also check emails, research topics, write reports, and manage tasks, all in one conversation?" So you add tools. More tools. More instructions in the system prompt. Suddenly your agent has 15 tools and a 2,000-word system prompt, and its performance quietly gets worse. The LLM gets confused about which tool to use when. It sometimes uses the email tool for research tasks and the research tool for email tasks.

This is the cognitive overload problem, and it's the primary reason multi-agent systems exist. The solution is the same one software engineers have used for decades: split the responsibility.

Multi-agent systems in LangGraph solve three specific problems:

- Cognitive overload: one LLM performing too many unrelated tasks at once. Split the work into specialized agents, each excellent at one thing.
- Sequential bottlenecks: tasks that could run at the same time but are forced to run one after another. Parallel agents cut latency.
- Complexity management: a 40-node graph is very hard to reason about. Breaking it into smaller subgraphs makes each piece understandable and testable independently.

Everything in this article addresses one of these three problems.

The supervisor pattern is the simplest multi-agent pattern and the right starting point. One supervisor agent receives the user's request and decides which specialist agent should handle it. The specialist does the work and returns results to the supervisor, which then decides what to do next.
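As a rough sketch of that loop, the snippet below swaps the LLM for a hypothetical rule-based pick_next function and the specialists for plain Python functions. None of these names come from LangGraph; they are assumptions made only to show the route, work, report-back cycle in isolation.

```python
from typing import Callable

# Hypothetical stand-ins for specialist agents: each reads the notes so far
# and returns one new note.
def researcher(notes: list[str]) -> str:
    return "research: found 3 sources"

def writer(notes: list[str]) -> str:
    return f"report: drafted from {len(notes)} note(s)"

SPECIALISTS: dict[str, Callable[[list[str]], str]] = {
    "researcher": researcher,
    "writer": writer,
}

def pick_next(notes: list[str]) -> str:
    """Stand-in for the supervisor LLM: research first, then write, then stop."""
    if not any(n.startswith("research:") for n in notes):
        return "researcher"
    if not any(n.startswith("report:") for n in notes):
        return "writer"
    return "FINISH"

def run_supervisor(max_steps: int = 10) -> list[str]:
    notes: list[str] = []
    # Hard cap on steps so a confused router cannot loop forever.
    for _ in range(max_steps):
        decision = pick_next(notes)
        if decision == "FINISH":
            break
        # The chosen specialist works, then control returns to the supervisor.
        notes.append(SPECIALISTS[decision](notes))
    return notes

notes = run_supervisor()
```

The step cap is a design choice worth carrying into a real graph: a supervisor that never says FINISH is the most common way this pattern runs away.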
Think of a law firm: a senior partner (the supervisor) talks to the client, understands the need, and assigns work to junior associates (the specialists), one for contracts, one for litigation, one for compliance. The junior associate does the deep work. The senior partner reviews and responds to the client.

    User Input
        ↓
    Supervisor Node               ← decides who handles what
      ↓            ↓
    Specialist A   Specialist B   ← do the actual work
      ↓            ↓
    Supervisor Node               ← reviews results, responds or routes again
        ↓
    Response

The key design decision here is next_agent: a field the supervisor writes to, which the router reads.

    # ── MODULE 2: STATE ──
    from typing import TypedDict, Annotated, Literal
    from langgraph.graph import MessagesState

    class SupervisorState(MessagesState):
        # The supervisor writes this field to indicate who should act next.
        # "researcher", "writer", "FINISH" are the possible values.
        next_agent: str

Each specialist has its own, focused set of tools, with no overlap.

    # ── MODULE 3: TOOLS ──
    from langchain_core.tools import tool
    from langchain_openai import ChatOpenAI

    # Researcher's tools
    @tool
    def search_web(query: str) -> str:
        """Search the web for factual information, news, or research."""
        return f"[Web results for: {query}]"

    @tool
    def search_academic(query: str) -> str:
        """Search academic papers and journals for scholarly sources."""
        return f"[Academic results for: {query}]"

    # Writer's tools
    @tool
    def format_as_report(content: str, title: str) -> str:
        """Format raw content into a structured report with headings."""
        return f"# {title}\n\n{content}"

    @tool
    def check_grammar(text: str) -> str:
        """Check and correct grammar in the provided text."""
        return f"[Grammar-checked]: {text}"

    # Each specialist gets only its own tools
    researcher_tools = [search_web, search_academic]
    writer_tools = [format_as_report, check_grammar]

    llm = ChatOpenAI(model="gpt-4o", temperature=0)
    researcher_llm = llm.bind_tools(researcher_tools)
    writer_llm = llm.bind_tools(writer_tools)

    # ── MODULE 4: NODES ──
    from langchain_core.messages import SystemMessage, HumanMessage
    from langgraph.prebuilt import ToolNode

    # ── Supervisor ──
    SUPERVISOR_PROMPT = """You are a supervisor coordinating a team of specialists.
    Given the conversation, decide who should act next.

    Your team:
    - researcher: finds information, searches web and academic sources
    - writer: formats content, writes reports, checks grammar
    - FINISH: use this when the task is complete and you have a final answer

    Respond with ONLY the name of the next agent: researcher, writer, or FINISH."""

    def supervisor_node(state: SupervisorState) -> dict:
        """Reads the conversation history and routes to the right specialist."""
        messages = [SystemMessage(content=SUPERVISOR_PROMPT)] + state["messages"]
        response = llm.invoke(messages)
        # The supervisor's plain text response IS the routing decision
        return {"next_agent": response.content.strip()}

    # ── Researcher Specialist ──
    def researcher_node(state: SupervisorState) -> dict:
        """Specialist that handles all research and information-gathering tasks."""
        messages = [SystemMessage(content="You are a research specialist. Use your search tools to find accurate, thorough information.")] + state["messages"]
        response = researcher_llm.invoke(messages)
        return {"messages": [response]}

    # ── Writer Specialist ──
    def writer_node(state: SupervisorState) -> dict:
        """Specialist that handles all writing, formatting, and editing tasks."""
        messages = [SystemMessage(content="You are a writing specialist. Use your tools to produce well-structured, polished output.")] + state["messages"]
        response = writer_llm.invoke(messages)
        return {"messages": [response]}

    # Tool execution nodes for each specialist
    researcher_tool_node = ToolNode(researcher_tools)
    writer_tool_node = ToolNode(writer_tools)

    # ── MODULE 5: ROUTING ──
    def route_after_supervisor(state: SupervisorState) -> Literal["researcher", "writer", "__end__"]:
        """Read the supervisor's decision and route accordingly."""
        decision = state["next_agent"]
        if decision in ("researcher", "writer"):
            return decision
        # "FINISH", or any unexpected reply from the LLM, ends the run
        return "__end__"

    def route_researcher(state: SupervisorState) -> Literal["researcher_tools", "supervisor"]:
        """After researcher acts: did it call a tool, or produce a final answer?"""
        last = state["messages"][-1]
        if hasattr(last, "tool_calls") and last.tool_calls:
            return "researcher_tools"
        return "supervisor"  # Research done - report back to supervisor

    def route_writer(state: SupervisorState) -> Literal["writer_tools", "supervisor"]:
        """After writer acts: did it call a tool, or produce a final answer?"""
        last = state["messages"][-1]
        if hasattr(last, "tool_calls") and last.tool_calls:
            return "writer_tools"
        return "supervisor"

    # ── MODULE 6: GRAPH ASSEMBLY ──
    from langgraph.graph import StateGraph, START, END
    from langgraph.checkpoint.memory import MemorySaver

    graph_builder = StateGraph(SupervisorState)

    # Register all nodes
    graph_builder.add_node("supervisor", supervisor_node)
    graph_builder.add_node("researcher", researcher_node)
    graph_builder.add_node("researcher_tools", researcher_tool_node)
    graph_builder.add_node("writer", writer_node)
    graph_builder.add_node("writer_tools", writer_tool_node)

    # Entry: always start at supervisor
    graph_builder.add_edge(START, "supervisor")

    # Supervisor routes to specialist or ends
    graph_builder.add_conditional_edges(
        "supervisor",
        route_after_supervisor,
        {"researcher": "researcher", "writer": "writer", "__end__": END},
    )

    # Researcher loop: tools → researcher → supervisor
    graph_builder.add_conditional_edges(
        "researcher",
        route_researcher,
        {"researcher_tools": "researcher_tools", "supervisor": "supervisor"},
    )
    graph_builder.add_edge("researcher_tools", "researcher")

    # Writer loop: tools → writer → supervisor
    graph_builder.add_conditional_edges(
        "writer",
        route_writer,
        {"writer_tools": "writer_tools", "supervisor": "supervisor"},
    )
    graph_builder.add_edge("writer_tools", "writer")

    graph = graph_builder.compile(checkpointer=MemorySaver())

The key structure here is the return-to-supervisor edge. After every specialist completes its work, it reports back to the supervisor. The supervisor then evaluates the full conversation and decides: assign to another specialist, loop the same specialist, or finish. This is the orchestration pattern at its most fundamental.

The supervisor pattern is sequential: one specialist works at a time. But many real tasks have independent sub-tasks that can run simultaneously. Parallelization fixes the latency problem.

Think of cooking a meal: you don't finish the salad, then start the pasta, then start the sauce. You prep the salad, put the pasta water on, and start the sauce, all at the same time. The meal finishes faster because independent tasks run in parallel.

In LangGraph, parallelism is created by a single node having multiple outgoing edges to different nodes. LangGraph detects this "fan-out" pattern and runs all the destination nodes concurrently in what it calls a superstep. The results are collected before any subsequent node runs ("fan-in").

When multiple nodes run in the same superstep and try to write to the same state field, LangGraph needs to know how to combine those writes. Without a reducer, it throws an error. With operator.add, it safely concatenates the results.

    # ── MODULE 2: STATE FOR PARALLEL PATTERN ──
    import operator
    from typing import Annotated, TypedDict

    class ResearchState(TypedDict):
        query: str
        # Without Annotated + operator.add, parallel writes to 'context' would CRASH.
        # With it, LangGraph concatenates results from all parallel nodes.
        context: Annotated[list[str], operator.add]
        final_answer: str

    # ── MODULE 4: PARALLEL WORKER NODES ──
    def web_search_node(state: ResearchState) -> dict:
        """Runs in parallel with academic_search_node."""
        result = f"[WEB] Results for: {state['query']}"
        return {"context": [result]}  # Returns a LIST - operator.add will concatenate

    def academic_search_node(state: ResearchState) -> dict:
        """Runs in parallel with web_search_node."""
        result = f"[ACADEMIC] Papers about: {state['query']}"
        return {"context": [result]}  # Also a LIST

    def synthesize_node(state: ResearchState) -> dict:
        """Runs AFTER both parallel nodes complete (fan-in point).
        state['context'] already contains results from BOTH parallel nodes."""
        combined = "\n".join(state["context"])
        prompt = [
            SystemMessage(content="Synthesize the following research into a clear answer."),
            HumanMessage(content=f"Research:\n{combined}\n\nQuery: {state['query']}"),
        ]
        response = llm.invoke(prompt)
        return {"final_answer": response.content}

    # ── MODULE 6: PARALLEL GRAPH ASSEMBLY ──
    from langgraph.graph import StateGraph, START, END

    graph_builder = StateGraph(ResearchState)
    graph_builder.add_node("web_search", web_search_node)
    graph_builder.add_node("academic_search", academic_search_node)
    graph_builder.add_node("synthesize", synthesize_node)

    # Fan-out: START branches to BOTH search nodes simultaneously
    graph_builder.add_edge(START, "web_search")
    graph_builder.add_edge(START, "academic_search")

    # Fan-in: BOTH search nodes must complete before synthesize runs
    graph_builder.add_edge("web_search", "synthesize")
    graph_builder.add_edge("academic_search", "synthesize")
    graph_builder.add_edge("synthesize", END)

    parallel_graph = graph_builder.compile()

The fan-out is just two edges from the same source (START). LangGraph automatically detects this and runs both destinations concurrently. The fan-in happens naturally: both search nodes finish in the same superstep, so synthesize runs once, after both have written their results. If the parallel branches had different lengths, passing a list of sources, as in add_edge(["web_search", "academic_search"], "synthesize"), makes the join wait explicitly for all of them.
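To make the reducer rule concrete, here is a simplified model of what happens when one superstep's writes are merged. It is a plain-Python sketch, not LangGraph's implementation: apply_writes and its arguments are hypothetical names, and the error type is a stand-in for whatever the library actually raises.

```python
import operator
from typing import Any, Callable

Reducer = Callable[[Any, Any], Any]

def apply_writes(state: dict, writes: list[dict], reducers: dict[str, Reducer]) -> dict:
    """Merge all updates produced in one superstep into a new state dict."""
    new_state = dict(state)
    written: set[str] = set()
    for update in writes:
        for key, value in update.items():
            if key in reducers:
                # A reducer combines the current value with the incoming one.
                new_state[key] = reducers[key](new_state[key], value)
            elif key in written:
                # Two writers, no reducer: there is no safe way to pick a winner.
                raise ValueError(f"{key!r} was written twice in one step without a reducer")
            else:
                new_state[key] = value
                written.add(key)
    return new_state

start = {"query": "llm agents", "context": []}
writes = [
    {"context": ["[WEB] Results for: llm agents"]},
    {"context": ["[ACADEMIC] Papers about: llm agents"]},
]
merged = apply_writes(start, writes, {"context": operator.add})
```

This is also why each worker returns a one-element list instead of a bare string: operator.add on two lists concatenates them, which is the merge the pattern relies on.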
Parallelization with flat nodes works well for simple cases. But when each parallel "branch" is itself complex, with multiple steps, its own tools, and its own logic, you need subgraphs. A subgraph is a fully compiled StateGraph that runs as a single node inside a parent graph.

Think of departments in a company. The CEO (the parent graph) delegates work to the Engineering department and the Marketing department. Each department has its own internal processes, meetings, and workflows. The CEO doesn't care about the internals; they just hand off work and receive deliverables.

Subgraphs communicate with their parent graph through shared state keys. Any key that appears in both the parent's state and the subgraph's state is automatically passed in when the subgraph starts and passed back out when the subgraph finishes.

    Parent State:     {topic, cleaned_data, report_A, report_B, processed_ids}
        ↓
    Subgraph A State: {cleaned_data, internal_step_1, report_A, processed_ids}
        ← reads 'cleaned_data' from parent
        → writes 'report_A' and 'processed_ids' back to parent
    Subgraph B State: {cleaned_data, internal_step_2, report_B, processed_ids}
        ← reads 'cleaned_data' from parent
        → writes 'report_B' and 'processed_ids' back to parent

Keys that are only in the subgraph's state (like internal_step_1) are private: the parent never sees them.
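The overlap rule can be checked mechanically. The sketch below compares two TypedDict schemas and reports which keys would be shared and which stay private. The class and helper names are hypothetical and mirror the diagram above; nothing here calls LangGraph.

```python
from typing import TypedDict

class ParentState(TypedDict):
    topic: str
    cleaned_data: str
    report_a: str
    report_b: str
    processed_ids: list[int]

class SubgraphAState(TypedDict):
    cleaned_data: str
    internal_step_1: str
    report_a: str
    processed_ids: list[int]

def shared_keys(parent: type, child: type) -> set[str]:
    """Keys present in both schemas: the parent/subgraph communication channel."""
    return set(parent.__annotations__) & set(child.__annotations__)

def private_keys(parent: type, child: type) -> set[str]:
    """Keys only the subgraph declares: invisible to the parent."""
    return set(child.__annotations__) - set(parent.__annotations__)

shared = shared_keys(ParentState, SubgraphAState)
private = private_keys(ParentState, SubgraphAState)
```

A check like this is handy in a unit test: an accidental name collision between a subgraph's scratch field and a parent field shows up as an unexpected entry in shared.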
    # ── MODULE 2: STATE SCHEMAS (Three Levels) ──
    import operator
    from typing import Annotated, TypedDict
    from langgraph.graph import StateGraph, START, END

    # Parent state - what the orchestrating graph tracks
    class ParentState(TypedDict):
        raw_input: str
        cleaned_data: str          # passed INTO both subgraphs
        failure_summary: str       # written back by subgraph A
        performance_report: str    # written back by subgraph B
        processed_ids: Annotated[list[int], operator.add]  # BOTH subgraphs write this

    # Subgraph A's internal state - includes its private fields + shared keys
    class FailureAnalysisState(TypedDict):
        cleaned_data: str          # shared input from parent
        failures: list[str]        # private intermediate state - parent never sees this
        failure_summary: str       # shared output to parent
        processed_ids: Annotated[list[int], operator.add]  # shared output (accumulated)

    # Subgraph A's output schema - filters which keys return to parent.
    # Without this, ALL keys return, which can cause conflicts.
    class FailureAnalysisOutput(TypedDict):
        failure_summary: str       # only these two keys go back to parent
        processed_ids: Annotated[list[int], operator.add]

    # Subgraph B's internal state
    class PerformanceAnalysisState(TypedDict):
        cleaned_data: str
        metrics: list[str]         # private
        performance_report: str    # shared output
        processed_ids: Annotated[list[int], operator.add]

    class PerformanceAnalysisOutput(TypedDict):
        performance_report: str
        processed_ids: Annotated[list[int], operator.add]

    # ── SUBGRAPH A: Failure Analysis ──
    def extract_failures(state: FailureAnalysisState) -> dict:
        failures = [f"Error in {state['cleaned_data']}: timeout"]
        return {"failures": failures, "processed_ids": [1]}

    def summarize_failures(state: FailureAnalysisState) -> dict:
        summary = f"Found {len(state['failures'])} failure(s): {state['failures']}"
        return {"failure_summary": summary}

    # Note: some newer LangGraph releases name this parameter output_schema instead of output.
    failure_builder = StateGraph(FailureAnalysisState, output=FailureAnalysisOutput)
    failure_builder.add_node("extract_failures", extract_failures)
    failure_builder.add_node("summarize_failures", summarize_failures)
    failure_builder.add_edge(START, "extract_failures")
    failure_builder.add_edge("extract_failures", "summarize_failures")
    failure_builder.add_edge("summarize_failures", END)
    failure_subgraph = failure_builder.compile()  # compiled → can be used as a node

    # ── SUBGRAPH B: Performance Analysis ──
    def collect_metrics(state: PerformanceAnalysisState) -> dict:
        metrics = ["Latency: 120ms", "Uptime: 99.8%"]
        return {"metrics": metrics, "processed_ids": [2]}

    def write_report(state: PerformanceAnalysisState) -> dict:
        report = "Performance Report:\n" + "\n".join(state["metrics"])
        return {"performance_report": report}

    performance_builder = StateGraph(PerformanceAnalysisState, output=PerformanceAnalysisOutput)
    performance_builder.add_node("collect_metrics", collect_metrics)
    performance_builder.add_node("write_report", write_report)
    performance_builder.add_edge(START, "collect_metrics")
    performance_builder.add_edge("collect_metrics", "write_report")
    performance_builder.add_edge("write_report", END)
    performance_subgraph = performance_builder.compile()

    # ── PARENT GRAPH ──
    def clean_data_node(state: ParentState) -> dict:
        return {"cleaned_data": state["raw_input"].strip().lower()}

    def combine_reports_node(state: ParentState) -> dict:
        combined = f"{state['failure_summary']}\n\n{state['performance_report']}"
        print(f"Final combined report:\n{combined}")
        return {}

    parent_builder = StateGraph(ParentState)
    parent_builder.add_node("clean_data", clean_data_node)
    parent_builder.add_node("failure_analysis", failure_subgraph)          # subgraph as node
    parent_builder.add_node("performance_analysis", performance_subgraph)  # subgraph as node
    parent_builder.add_node("combine_reports", combine_reports_node)

    parent_builder.add_edge(START, "clean_data")

    # Fan-out to both subgraphs (parallel execution)
    parent_builder.add_edge("clean_data", "failure_analysis")
    parent_builder.add_edge("clean_data", "performance_analysis")

    # Fan-in: combine only after both subgraphs complete
    parent_builder.add_edge("failure_analysis", "combine_reports")
    parent_builder.add_edge("performance_analysis", "combine_reports")
    parent_builder.add_edge("combine_reports", END)

    parent_graph = parent_builder.compile()

The output schema (FailureAnalysisOutput, PerformanceAnalysisOutput) is the piece most tutorials skip, and skipping it is a common source of state key conflicts. The output schema acts as a filter, controlling exactly which fields the subgraph exposes to the parent. Any field not in the output schema is treated as private internal state.

Parallelization and subgraphs cover the case where you know at design time how many parallel branches you'll have. But what if you don't? What if the user asks to research 5 topics today and 20 topics tomorrow? You can't hard-code 20 parallel branches. This is the map-reduce pattern, powered by LangGraph's Send API.

Think of a book publisher assigning chapters: one editor is given the manuscript. They split it into chapters and assign each chapter to a different copy editor simultaneously. However many chapters there are (five, twenty, two), that's how many copy editors get hired. When they're all done, the results are collected and assembled into the final book. The number of parallel workers is determined at runtime, not at design time.

The pattern rests on three pieces:

- Send(node_name, state), imported from langgraph.constants. Instead of routing to a fixed next node, Send creates a new instance of a node with a specific state payload. Return a list of Send objects from a routing function, and LangGraph launches all of them in parallel, each with its own independent state.
- operator.add on the collecting field: the "reduce" step. As each parallel worker finishes and returns its result, operator.add accumulates them into a growing list in the parent state.
- Worker state: each Send can include a custom state dict that doesn't have to match the parent graph's state. The worker node uses its own small local state, just the data it needs for its specific piece of work.
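Before the LangGraph version, the shape of the pattern can be seen in plain Python. This sketch uses a thread pool as a stand-in for Send: one payload per item, a worker that sees only its own payload, and operator.add as the reduce step. The worker and map_reduce names are hypothetical, and the uppercase transform is a placeholder for real work such as an LLM call.

```python
import operator
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def worker(payload: dict) -> dict:
    """Hypothetical worker: receives only its own small payload."""
    return {"results": [payload["subject"].upper()]}

def map_reduce(subjects: list[str]) -> list[str]:
    # One payload per item, decided at runtime (the analogue of one Send each).
    payloads = [{"subject": s} for s in subjects]
    if not payloads:
        return []
    with ThreadPoolExecutor(max_workers=len(payloads)) as pool:
        # pool.map returns results in input order, whichever worker finishes first.
        updates = list(pool.map(worker, payloads))
    # Reduce: concatenate every worker's one-element list into a single list.
    return reduce(operator.add, (u["results"] for u in updates), [])

collected = map_reduce(["cats", "dogs", "owls"])
```

The number of workers is never written down anywhere in the code; it falls out of len(subjects), which is the property the Send API provides inside a graph.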
    # ── MODULE 2: MAP-REDUCE STATE ──
    import operator
    from typing import Annotated, TypedDict
    # Newer LangGraph releases also expose Send from langgraph.types.
    from langgraph.constants import Send

    class OverallState(TypedDict):
        topic: str
        subjects: list[str]                        # populated by the map step
        jokes: Annotated[list[str], operator.add]  # accumulated by the reduce step
        best_joke: str

    class JokeState(TypedDict):
        """The private state each parallel worker gets.
        Notice: this does NOT need to match OverallState."""
        subject: str

    # ── MODULE 4: MAP-REDUCE NODES ──
    def generate_subjects(state: OverallState) -> dict:
        """MAP step 1: Expands the topic into a list of subjects to process."""
        prompt = [
            SystemMessage(content="Generate a list of 3 sub-topics related to the given topic. Return as comma-separated values only."),
            HumanMessage(content=state["topic"]),
        ]
        response = llm.invoke(prompt)
        subjects = [s.strip() for s in response.content.split(",")]
        return {"subjects": subjects}

    def generate_joke(state: JokeState) -> dict:
        """MAP step 2: Worker node. Each instance handles exactly ONE subject.
        Receives its own isolated JokeState - not the full OverallState."""
        prompt = [HumanMessage(content=f"Write a short, funny joke about: {state['subject']}")]
        response = llm.invoke(prompt)
        # Returns a LIST - operator.add in OverallState will accumulate these
        return {"jokes": [response.content]}

    def pick_best_joke(state: OverallState) -> dict:
        """REDUCE step: All jokes are now in state['jokes']. Pick the winner."""
        jokes_text = "\n".join(f"{i}. {j}" for i, j in enumerate(state["jokes"]))
        prompt = [
            SystemMessage(content="You are a comedy judge. Pick the funniest joke from the list below. Reply with ONLY the number."),
            HumanMessage(content=jokes_text),
        ]
        response = llm.invoke(prompt)
        # Assumes the judge replies with a bare, in-range integer
        winning_index = int(response.content.strip())
        return {"best_joke": state["jokes"][winning_index]}

    # ── MODULE 5: THE SEND ROUTER ──
    def continue_to_jokes(state: OverallState):
        """The routing function that LAUNCHES parallel workers.
        Returning a LIST of Send objects (not a string) is what triggers map-reduce.
        Each Send creates an independent, parallel execution of 'generate_joke'
        with its own state payload. The number of parallel workers is determined
        at runtime by how many subjects exist in state.
        """
        return [Send("generate_joke", {"subject": subject}) for subject in state["subjects"]]

    # ── MODULE 6: MAP-REDUCE GRAPH ASSEMBLY ──
    graph_builder = StateGraph(OverallState)
    graph_builder.add_node("generate_subjects", generate_subjects)
    graph_builder.add_node("generate_joke", generate_joke)
    graph_builder.add_node("pick_best_joke", pick_best_joke)

    graph_builder.add_edge(START, "generate_subjects")

    # This conditional edge uses the Send router - it fans out dynamically
    graph_builder.add_conditional_edges(
        "generate_subjects",
        continue_to_jokes,   # returns a LIST of Send objects
        ["generate_joke"],   # list of possible destination nodes (for graph validation)
    )

    # Fan-in: pick_best_joke runs after ALL parallel generate_joke instances complete
    graph_builder.add_edge("generate_joke", "pick_best_joke")
    graph_builder.add_edge("pick_best_joke", END)

    map_reduce_graph = graph_builder.compile()

The continue_to_jokes function is where the fan-out happens. Notice that it returns a list of Send objects, not a string. When LangGraph sees a list of Send objects from a routing function, it launches all of them in parallel in the next superstep. If state["subjects"] has 3 items, 3 parallel generate_joke nodes launch. If it has 20, 20 launch. The graph scales dynamically to whatever the runtime produces.

Now we combine everything: supervisor orchestration, HITL approval (Part 3), parallel subgraphs, and the Send API, all in one complete production-grade system. This is LangChain Academy's LangGraph capstone project, annotated and explained.

The system: A user provides a research topic. The system generates a team of AI analyst personas (with human approval to refine them), then runs each analyst as a parallel interview sub-agent. Each analyst interviews an AI "expert" using web search, producing a report section.
Finally, a writer node compiles all sections into a final polished report.

    User provides topic
        ↓
    create_analysts        → LLM generates N analyst personas
        ↓
    human_feedback         ← INTERRUPT: human approves or sends feedback
        ↓  ↑
    should_continue        → if feedback, loop back to create_analysts
        ↓
    initiate_research      → Send API: spawns N parallel interview subgraphs
        ↓↓↓                  (N parallel workers running simultaneously)
    interview subgraph × N → each analyst interviews an "expert" via web search
        ↓
    write_report           → LLM compiles all sections into a final report

    import operator
    from typing import Annotated, Literal, TypedDict
    from pydantic import BaseModel, Field
    from langgraph.graph import MessagesState

    # ── Analyst persona - a Pydantic model for structured LLM output ──
    class Analyst(BaseModel):
        name: str = Field(description="The analyst's full name")
        role: str = Field(description="Their professional role (e.g. 'Financial Analyst')")
        focus: str = Field(description="Their specific analytical focus area")

        @property
        def persona(self) -> str:
            return f"Name: {self.name}\nRole: {self.role}\nFocus: {self.focus}"

    class Perspectives(BaseModel):
        analysts: list[Analyst]

    # ── Outer graph state ──
    class ResearchGraphState(TypedDict):
        topic: str
        max_analysts: int
        human_analyst_feedback: str   # human writes here during HITL pause
        analysts: list[Analyst]
        sections: Annotated[list, operator.add]  # accumulated from all parallel interviews
        final_report: str

    # ── Interview subgraph state ──
    class InterviewState(MessagesState):
        # inherits: messages: Annotated[list[BaseMessage], add_messages]
        max_num_turns: int
        context: Annotated[list, operator.add]  # search results accumulate here
        analyst: Analyst
        interview: str   # formatted transcript
        sections: list   # completed section sent back to outer graph

    # ── ANALYST GENERATION NODE ──
    ANALYST_PROMPT = """You are generating a team of {max_analysts} research analysts for the topic: {topic}.
    {feedback_section}
    Each analyst should have a unique perspective and area of focus.
    Generate analysts with diverse viewpoints - financial, technical, social, etc."""

    def create_analysts(state: ResearchGraphState) -> dict:
        """Generates analyst personas using structured LLM output."""
        feedback = state.get("human_analyst_feedback", "")
        feedback_section = (
            f"User feedback to incorporate:\n{feedback}"
            if feedback
            else "No feedback yet - generate your best initial set."
        )
        prompt = ANALYST_PROMPT.format(
            max_analysts=state["max_analysts"],
            topic=state["topic"],
            feedback_section=feedback_section,
        )
        # with_structured_output: forces the LLM to return a Pydantic model
        # instead of free-form text - ensures clean, usable data
        structured_llm = llm.with_structured_output(Perspectives)
        result = structured_llm.invoke([SystemMessage(content=prompt)])
        return {"analysts": result.analysts}

    def human_feedback_node(state: ResearchGraphState) -> dict:
        """No-op node. Its only purpose is to be a pause point.
        The graph is compiled with interrupt_before=['human_feedback'].
        See Part 3 for the full HITL pattern."""
        pass  # Does nothing - the pause happens BEFORE this node via interrupt_before

    def should_continue_to_research(state: ResearchGraphState) -> Literal["create_analysts", "initiate_research"]:
        """After human feedback: loop back to refine, or proceed to interviews."""
        if state.get("human_analyst_feedback"):
            return "create_analysts"
        return "initiate_research"

Each analyst runs as an independent subgraph. The subgraph is a mini ReAct agent that interviews an "expert" (another LLM call that plays the expert role and searches the web).
python from langchain core.tools import toolfrom langgraph.prebuilt import ToolNode@tooldef web search tool query: str - str: """Search the web for information on a topic.""" return f" Search results for '{query}' " Replace with real search APIinterview tools = web search tool interview llm = llm.bind tools interview tools def generate question state: InterviewState - dict: """The analyst node: generates the next interview question.""" analyst = state "analyst" messages = SystemMessage content= f"You are {analyst.name}, a {analyst.role} focused on {analyst.focus}. " f"You are interviewing an expert about the research topic. " f"Ask insightful questions that align with your analytical focus. " f"When you have enough information, say 'Thank you, that is all I needed.'" + state "messages" response = interview llm.invoke messages return {"messages": response }def generate answer state: InterviewState - dict: """The expert node: answers the analyst's question, using web search.""" messages = SystemMessage content= "You are an expert being interviewed. Answer thoroughly and factually. " "Use the web search tool when you need current data." + state "messages" response = interview llm.invoke messages return {"messages": response }def save interview state: InterviewState - dict: """Formats the full Q&A exchange into a clean transcript string.""" transcript = for msg in state "messages" : if hasattr msg, "content" : role = "Analyst" if type msg . name == "HumanMessage" else "Expert" transcript.append f"{role}: {msg.content}" return {"interview": "\n\n".join transcript }def write section state: InterviewState - dict: """The final node: uses the interview transcript to write a report section.""" analyst = state "analyst" prompt = SystemMessage content= f"Based on the following interview conducted by {analyst.name} {analyst.role} , " f"write a structured report section focused on {analyst.focus}. " f"Be concise, factual, and cite specific points from the interview." 
, HumanMessage content=state "interview" response = llm.invoke prompt sections is the key that Send passes back to the outer graph return {"sections": response.content }def route interview state: InterviewState - Literal "generate answer", "save interview" : """Continue interviewing OR wrap up when analyst is satisfied or max turns hit.""" last message = state "messages" -1 if "thank you, that is all" in last message.content.lower or len state "messages" = state.get "max num turns", 6 2 : return "save interview" Check if analyst made a tool call web search if hasattr last message, "tool calls" and last message.tool calls: return "generate answer" return "generate answer" ── INTERVIEW SUBGRAPH ASSEMBLY ───────────────────────────────interview builder = StateGraph InterviewState interview tool node = ToolNode interview tools interview builder.add node "generate question", generate question interview builder.add node "generate answer", generate answer interview builder.add node "interview tools", interview tool node interview builder.add node "save interview", save interview interview builder.add node "write section", write section interview builder.add edge START, "generate question" interview builder.add conditional edges "generate question", route interview, {"generate answer": "generate answer", "save interview": "save interview"} interview builder.add conditional edges "generate answer", lambda s: "interview tools" if hasattr s "messages" -1 , "tool calls" and s "messages" -1 .tool calls else "generate question", {"interview tools": "interview tools", "generate question": "generate question"} interview builder.add edge "interview tools", "generate answer" interview builder.add edge "save interview", "write section" interview builder.add edge "write section", END interview graph = interview builder.compile ── SEND ROUTER: Spawn one interview per analyst ─────────────def initiate research state: ResearchGraphState : """Launches N parallel interview subgraphs via 
Send — one per analyst. This is the map-reduce Send pattern from Level 4, applied to subgraphs.""" return Send "interview", { "analyst": analyst, "messages": HumanMessage content=f"Research topic: {state 'topic' }" , "context": , "max num turns": 3, "interview": "", "sections": } for analyst in state "analysts" ── REPORT COMPILATION NODE ───────────────────────────────────def write report state: ResearchGraphState - dict: """Fan-in point: all sections have arrived from parallel interviews. The LLM compiles them into a final report.""" sections text = "\n\n---\n\n".join state "sections" prompt = SystemMessage content= "You are a senior editor. Compile the following report sections written by " "different analysts into one cohesive, well-structured final report. " "Add a proper introduction and conclusion. Do not repeat content." , HumanMessage content=f"Topic: {state 'topic' }\n\nSections:\n{sections text}" response = llm.invoke prompt return {"final report": response.content} ── OUTER GRAPH ASSEMBLY ─────────────────────────────────────outer builder = StateGraph ResearchGraphState outer builder.add node "create analysts", create analysts outer builder.add node "human feedback", human feedback node outer builder.add node "interview", interview graph subgraph as nodeouter builder.add node "write report", write report outer builder.add edge START, "create analysts" outer builder.add edge "create analysts", "human feedback" outer builder.add conditional edges "human feedback", should continue to research, {"create analysts": "create analysts", "initiate research": "initiate research"} initiate research is a ROUTING FUNCTION returns Send objects , not a nodeouter builder.add conditional edges "human feedback", initiate research, returns list of Send objects "interview" validation: lists possible destinations outer builder.add edge "interview", "write report" outer builder.add edge "write report", END research graph = outer builder.compile checkpointer=MemorySaver , 
    interrupt_before=["human_feedback"],  # HITL pause for analyst approval
)


if __name__ == "__main__":
    config = {"configurable": {"thread_id": "research-001"}}

    # Step 1: Generate analysts
    result = research_graph.invoke(
        {
            "topic": "The impact of AI on the future of software engineering",
            "max_analysts": 3,
            "human_analyst_feedback": "",
            "analysts": [],
            "sections": [],
            "final_report": "",
        },
        config=config,
    )

    # Step 2: HITL, graph paused before the human_feedback node
    snapshot = research_graph.get_state(config)
    analysts = snapshot.values["analysts"]
    print("Generated analysts:")
    for a in analysts:
        print(f"  - {a.name} ({a.role}): {a.focus}")

    feedback = input("\nProvide feedback to refine analysts (or press Enter to approve): ")
    if feedback.strip():
        # Human has feedback: inject it and let the supervisor regenerate
        research_graph.update_state(
            config, {"human_analyst_feedback": feedback}, as_node="human_feedback"
        )
    else:
        # Approved: clear feedback and proceed to interviews
        research_graph.update_state(
            config, {"human_analyst_feedback": None}, as_node="human_feedback"
        )

    # Step 3: Resume. N parallel interviews launch via Send, then the report is compiled
    print("\nRunning parallel interviews...")
    final_result = research_graph.invoke(None, config=config)
    print("\n" + "=" * 60)
    print("FINAL RESEARCH REPORT")
    print("=" * 60)
    print(final_result["final_report"])

Use this to decide which pattern fits your situation:

Single agent getting confused between tasks → Supervisor Pattern. Split work into specialized sub-agents. The supervisor routes; specialists execute.
Tasks that are independent and slow → Parallelization (Fan-out / Fan-in). Run them simultaneously. Add operator.add reducers for shared fields.
Parallel branches that are themselves complex → Subgraphs. Compile each complex branch as its own graph and embed it as a node. Use output schemas to control what returns to the parent.
Number of parallel workers determined at runtime → Send API (Map-Reduce). Generate a list of Send objects from a router.
Each Send spawns an independent worker with its own state payload.
A real production system → All of the above, combined. The research assistant at Level 5 uses every pattern together because a real task requires all of them.

This extends the keyword cards from Parts 1–3.

Multi-Agent Structure Keywords

supervisor node: the orchestrating node. Reads state, writes next_agent, never does specialist work itself.
specialist node: does one focused task with its own tools and system prompt. Always routes back to the supervisor when done.
next_agent: str: the standard state field for supervisor routing. The supervisor writes a name; the router reads it.

Parallelization Keywords

Fan-out: multiple add_edge calls from the same source node. LangGraph detects this and runs the targets concurrently.
Fan-in: multiple add_edge calls pointing to the same destination. The destination waits for all sources to complete.
Annotated[list, operator.add]: mandatory on any state field that multiple parallel nodes write to. Without it, parallel writes to the same key in one step raise an InvalidUpdateError.
operator.add: the most common reducer for parallel patterns. Concatenates lists from concurrent nodes.

Subgraph Keywords

StateGraph(InternalState, output=OutputSchema): the two-argument form of StateGraph. The second argument filters which keys are returned to the parent.
Output schema: a TypedDict with only the keys the subgraph should expose to the parent. Keys absent from this schema are private to the subgraph.
Overlapping keys: the communication channel between parent and subgraph. Any key in both state schemas is automatically shared.
subgraph.compile(): seals the subgraph into a callable. After this, pass it to parent.add_node("name", compiled_subgraph).
xray=1: argument to graph.get_graph(xray=1).draw_mermaid(). Visualizes internal subgraph structure in the parent graph diagram.

Map-Reduce / Send Keywords

Send(node_name, state_dict): from langgraph.constants (newer releases also export it from langgraph.types). Represents a single parallel worker invocation.
The payload does not need to match the parent graph's state.
[Send("node", {...}), Send("node", {...}), ...]: returning a list of Send objects from a routing function launches all of them in parallel.
["node_name"]: the third argument to add_conditional_edges when using Send. A list (not a dict) of valid destination node names, used for graph validation only.
.with_structured_output(PydanticModel): called on an LLM to force structured output that validates against a Pydantic schema. The standard pattern for supervisor decisions and analyst generation.

The jump from a single agent to a multi-agent system isn’t really a jump at all. It’s the same seven modules, applied multiple times and wired together. Every “agent” in a multi-agent system is just a graph, or a node, or a subgraph. The primitives don’t change. The skill is knowing how to compose them.

The progression in this article followed a deliberate staircase: one supervisor, then parallel flat nodes, then complex parallel subgraphs, then dynamic parallelism with Send, then the full combination. Each step added exactly one new concept. If each step felt comfortable, that's the design working: each level builds cleanly on the one before.

The research assistant at Level 5 is genuinely close to what you’d find in a production multi-agent codebase. It combines human approval loops, parallel specialist agents, structured LLM output, dynamic worker spawning, and a final synthesis step. You now have a mental model and a working template for all of it.

With all four parts complete, you have the full production scaffold: canonical structure (Part 0) + memory management (Part 1) + human-in-the-loop safety (Part 2) + multi-agent orchestration (Part 3). These four articles together cover the architecture behind the vast majority of real-world LangGraph applications.

For other parts of the series: Part 0, Part 1, Part 2, Part 3, Part 4.
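The reducer rule from the keyword cards (put Annotated[list, operator.add] on any field that parallel nodes write to) can be seen in isolation with a few lines of plain Python. This is only an illustrative sketch: it does not import LangGraph, and ResearchState and merge_updates are hypothetical names invented here, not LangGraph's actual merge logic. It assumes Python 3.9 or newer.

```python
import operator
from typing import Annotated, TypedDict, get_type_hints


class ResearchState(TypedDict):
    topic: str                               # no reducer: one write per step only
    sections: Annotated[list, operator.add]  # reducer: concurrent writes are concatenated


def merge_updates(schema, state, updates):
    """Hypothetical helper: fold several node updates from one step into state.

    A simplified stand-in for a reducer-aware merge, NOT LangGraph's implementation.
    """
    hints = get_type_hints(schema, include_extras=True)
    merged = dict(state)
    written = set()
    for update in updates:
        for key, value in update.items():
            # Annotated[...] types carry their extra arguments in __metadata__
            metadata = getattr(hints[key], "__metadata__", ())
            if metadata:
                reducer = metadata[0]
                merged[key] = reducer(merged[key], value)
            elif key in written:
                raise ValueError(f"{key!r} written twice in one step without a reducer")
            else:
                merged[key] = value
            written.add(key)
    return merged


state = {"topic": "AI and software engineering", "sections": []}
parallel_updates = [
    {"sections": ["Section from analyst A"]},
    {"sections": ["Section from analyst B"]},
]
result = merge_updates(ResearchState, state, parallel_updates)
print(result["sections"])
```

Both writes to sections survive because the reducer combines them. Sending two updates to topic in the same step makes this helper raise a ValueError, which mirrors why parallel interview workers return their output under a reducer-annotated key.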
LangGraph Multi-Agent Systems: From One Brain to Many https://pub.towardsai.net/langgraph-multi-agent-systems-from-one-brain-to-many-4c1773055693 was originally published in Towards AI https://pub.towardsai.net on Medium, where people are continuing the conversation by highlighting and responding to this story.