MCP for LangGraph Developers: From Basics to Production

The Model Context Protocol (MCP) solves the N×M integration problem by providing a universal standard for connecting AI applications to tools and data, allowing developers to build tools once as MCP servers and use them across any MCP-compatible application. This article introduces MCP from basics to production integration with LangGraph, using a hotel concierge analogy to explain the roles of host, client, and server.

Part 6 of the LangGraph Mental Model series: a ground-up introduction to the Model Context Protocol, building toward full integration with everything from Parts 1–5.

For other parts of the series: Part 0, Part 1, Part 2, Part 3, Part 4, Part 5 (https://medium.com/towards-artificial-intelligence/langgraph-multi-agent-systems-from-one-brain-to-many-4c1773055693).

What this article assumes: You're comfortable with the seven-module LangGraph structure and the idea of a @tool-decorated function (Part 1, Module 3). That's the only prerequisite. MCP is a new piece of infrastructure, not a new way of thinking about graphs, so we start from zero on MCP itself and only reconnect to LangGraph once the concept is solid.

In Part 1, you learned to write tools like this:

```python
from langchain_core.tools import tool


@tool
def search_web(query: str) -> str:
    """Search the web for current information."""
    return f"Search results for: {query}"
```

This works great, until it doesn't scale. Imagine you're building five different agents across five different projects, and every single one needs a "search the company database" tool. With the pattern above, you'd write that tool five separate times, in five separate codebases, in whatever language each project happens to use. If the database schema changes, you update five places. If you want to share that tool with a teammate building a different kind of AI app, maybe not even using LangGraph, they can't use your @tool function. It's tied to LangChain's Python ecosystem.

This is the N×M integration problem: N different tools (databases, APIs, file systems) need to connect to M different AI applications (your LangGraph agent, a teammate's custom agent, Claude Desktop, an IDE assistant). Without a shared standard, you end up writing N×M custom integrations. The Model Context Protocol (MCP) solves this by being a universal, open standard for connecting AI applications to tools and data.
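To put rough numbers on the N×M problem, here is a small back-of-the-envelope sketch. It assumes the simplest possible model: without a standard, every tool needs one custom integration per application, while with a shared protocol each tool is wrapped once as a server and each application implements one client. The function names are made up for illustration, and real projects will not split this cleanly.

```python
def integrations_without_standard(n_tools: int, m_apps: int) -> int:
    """Every tool needs its own glue code for every application."""
    return n_tools * m_apps


def integrations_with_standard(n_tools: int, m_apps: int) -> int:
    """Each tool is wrapped once (a server) and each app speaks the protocol once (a client)."""
    return n_tools + m_apps


# Compare the two counts for a few illustrative team sizes.
for n, m in [(5, 5), (10, 4), (50, 8)]:
    before = integrations_without_standard(n, m)
    after = integrations_with_standard(n, m)
    print(f"{n} tools x {m} apps: {before} custom integrations vs {after} protocol implementations")
```

The gap widens as either side grows, which is why the payoff of a shared standard is largest for tools that many applications want to use.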
Build a tool once, as an MCP server, and any MCP-compatible application (your LangGraph agent, Claude Desktop, Cursor, a teammate's custom agent) can use it immediately, with zero custom integration code. Think of it as a USB-C port for AI applications: one standard connector, any compatible device on either end.

This article takes you from "what even is MCP" to building and connecting a production-grade MCP server to a real LangGraph agent, in seven levels, each adding exactly one new idea.

Before any code, build the picture. Imagine a hotel concierge desk. The concierge (your AI application) doesn't personally know how to book a restaurant, hail a cab, or arrange theater tickets. Instead, the concierge has a list of trusted local services, partners who specialize in each task. When a guest asks for a dinner reservation, the concierge picks up a dedicated phone line to the restaurant booking service, makes the request in a standard format both sides understand, and relays the answer back to the guest.

MCP is that standard phone line and that standard request format. The AI application is the concierge. The MCP server is a specialized service (a restaurant booker, a cab company, a ticket office). And MCP itself is the shared language and connection protocol that lets any concierge talk to any service, without the concierge needing to learn each service's internal phone system from scratch.

Every MCP system has exactly three participants. Get comfortable with these three words; they are the foundation of everything else in this article.

Host: the AI application the end user actually interacts with. In our series so far, this is your LangGraph agent. Other real-world examples: Claude Desktop, Cursor, VS Code with an AI copilot. The host manages the conversation, decides when a tool is needed, and shows results to the user.

Client: lives inside the host. Each client maintains a dedicated, one-to-one connection to exactly one server.
If your host connects to three different MCP servers, it spins up three separate clients internally, one per server. This 1:1 mapping is a deliberate security boundary: a client for the file system server can't accidentally leak data to the database server.

Server: a lightweight, focused program that exposes specific capabilities. A server might wrap a database, a file system, a web search API, or your company's internal CRM. Critically, an MCP server doesn't know or care which host is calling it. The same server works for your LangGraph agent today and a completely different AI application tomorrow, unmodified.

Every message between a client and a server is a JSON-RPC 2.0 message, a simple, well-established format with exactly three message types:

Request: "please do this and tell me the result." Always includes a unique id so the response can be matched back to it.

```json
{"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {"name": "search_db", "arguments": {"query": "Q3 revenue"}}}
```

Response: the answer to a request. Contains either a result (success) or an error (failure), and the same id as the original request.

```json
{"jsonrpc": "2.0", "id": 1, "result": {"content": [{"type": "text", "text": "Q3 revenue was $4.2M"}]}}
```

Notification: a one-way message that expects no response. Used for things like "my list of available tools just changed."

You will almost never write raw JSON-RPC by hand; the SDKs handle this entirely. But understanding that this is the wire format underneath everything demystifies a lot of what follows. It's the same reason understanding HTTP underneath a REST API makes you a better API developer, even if you never write raw HTTP by hand.

An MCP server exposes its capabilities through exactly three types of primitives. The following table is the single most important conceptual one in this article: almost every design decision when building a server comes down to picking the right one of these three.
| Primitive | Purpose | Real-world analogy | Who decides to use it |
| --- | --- | --- | --- |
| Tool | Executes an action, may have side effects | A POST endpoint | The AI model decides |
| Resource | Provides read-only data | A GET endpoint | The application/user decides |
| Prompt | A reusable instruction template | A pre-written form letter | The user explicitly invokes it |

A tool is a function the LLM can decide to call during reasoning, exactly like the @tool-decorated functions from Part 1. The difference is where that function lives: instead of being defined inside your LangGraph codebase, it's defined inside an MCP server, and your LangGraph agent discovers and calls it over the protocol. Use a tool for anything that does something: runs a calculation, queries a database, sends an email, modifies a file.

A resource is data the application can pull in to give the model context, without the model needing to "decide" to call it the way it does with a tool. Resources are addressed by a URI, similar to how a web page is addressed by a URL. Use a resource for anything that's read-only context: a configuration file, a document, a list of available data sources the model might want to know about before deciding what to search for.

A prompt is a pre-written template that structures how a user or the host application kicks off a particular workflow. Unlike tools, prompts are typically user-invoked, not model-invoked; think of them as the MCP equivalent of slash commands. Use a prompt for common, repeatable workflows ("summarize this document," "review this code for security issues") where you want consistent, well-engineered instructions every time, rather than relying on the user to phrase the request well.

When you're not sure which primitive to use for something you're building, ask: does this change anything, or just provide information?

If it changes something (creates, updates, deletes, sends, executes) → Tool.

If it's read-only data the model might want as background → Resource.
If it's a reusable instruction template a user explicitly kicks off → Prompt.

Time to write real code. We'll use FastMCP, a widely used Python framework for building MCP servers. It turns the protocol's considerable complexity (JSON-RPC handling, schema generation, lifecycle management) into a handful of decorators, in exactly the same spirit as how @tool simplified LangChain tool creation in Part 1.

```
pip install fastmcp
```

```python
# math_server.py
from fastmcp import FastMCP

# Create the server; this object is the container for everything below
mcp = FastMCP(name="Math Server")


@mcp.tool
def add(a: int, b: int) -> int:
    """Add two numbers together."""
    return a + b


@mcp.tool
def multiply(a: int, b: int) -> int:
    """Multiply two numbers together."""
    return a * b


if __name__ == "__main__":
    mcp.run()  # Defaults to stdio transport
```

This is a complete, valid, runnable MCP server. Notice the parallels to Part 1's @tool pattern: FastMCP reads your function's type hints to generate the input schema, and reads your docstring to generate the description the LLM sees. Exactly the same rule from Part 1 applies here, with even higher stakes: a vague docstring means the model won't know when to call this tool, and now that vagueness affects every application that ever connects to this server, not just one LangGraph agent.

```python
# Adding to math_server.py
@mcp.resource("config://server-info")
def server_info() -> dict:
    """Provides metadata about this server's capabilities."""
    return {"version": "1.0", "operations": ["add", "multiply"]}
```

Resources are addressed by a URI you define (config://server-info here). When a client requests that URI, FastMCP calls your function and returns the result. The function only runs when requested. This is lazy evaluation, so you're not computing resource data the model never ends up using.
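To make the decorator behaviour described above less magical, here is a minimal standard-library sketch of the two ideas: deriving an input schema and description from type hints and a docstring, and registering a resource function without calling it until its URI is requested. `ToyServer`, its methods, and the type mapping are hypothetical names invented for this illustration; this is not how FastMCP is implemented internally, only the general shape of the technique.

```python
import inspect
from typing import Any, Callable, get_type_hints

# Hypothetical mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


class ToyServer:
    """Illustrative stand-in for a decorator-based server (not FastMCP itself)."""

    def __init__(self) -> None:
        self.tools: dict[str, dict[str, Any]] = {}
        self.resources: dict[str, Callable[[], Any]] = {}

    def tool(self, fn: Callable[..., Any]) -> Callable[..., Any]:
        """Register fn, building its schema from type hints and its description from the docstring."""
        hints = get_type_hints(fn)
        hints.pop("return", None)  # only the parameters belong in the input schema
        self.tools[fn.__name__] = {
            "description": inspect.getdoc(fn) or "",
            "inputSchema": {
                "type": "object",
                "properties": {
                    name: {"type": _JSON_TYPES.get(tp, "string")}
                    for name, tp in hints.items()
                },
                "required": list(hints),
            },
        }
        return fn

    def resource(self, uri: str) -> Callable[[Callable[[], Any]], Callable[[], Any]]:
        """Register a function under a URI without running it."""
        def register(fn: Callable[[], Any]) -> Callable[[], Any]:
            self.resources[uri] = fn  # stored, not called: evaluation is deferred
            return fn
        return register

    def read_resource(self, uri: str) -> Any:
        """Run the registered function only now, when a client asks for the URI."""
        return self.resources[uri]()


server = ToyServer()
calls: list[str] = []  # records when the resource function actually runs


@server.tool
def add(a: int, b: int) -> int:
    """Add two numbers together."""
    return a + b


@server.resource("config://server-info")
def server_info() -> dict:
    calls.append("server_info")
    return {"version": "1.0", "operations": ["add"]}


print(server.tools["add"])                          # schema and description derived from the function
print(calls)                                        # [] because nothing has requested the resource yet
print(server.read_resource("config://server-info"))
print(calls)                                        # ['server_info']
```

The point of the sketch is the division of labour: the function author writes ordinary Python, and the registry turns signatures and docstrings into protocol-facing metadata.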
Sometimes you want a resource whose content depends on an argument, like looking up one specific record rather than a fixed configuration blob.

```python
@mcp.resource("history://{operation_id}")
def get_calculation(operation_id: str) -> dict:
    """Retrieve a past calculation by its ID."""
    # In a real server, this would look up a database record
    return {"id": operation_id, "result": "..."}
```

The {operation_id} placeholder in the URI makes this a resource template. When a client requests history://abc123, FastMCP extracts abc123 and passes it as the function argument automatically.

Prompts follow the same decorator pattern:

```python
@mcp.prompt
def explain_calculation(operation: str, result: str) -> str:
    """Generate a request to explain a calculation in plain English."""
    return (
        f"Explain how the operation '{operation}' produced the result "
        f"'{result}', in simple terms a beginner could follow."
    )
```

To run the server:

```
python math_server.py
```

Before connecting this to any real AI application, you should verify it works using the MCP Inspector, a browser-based debugging tool that ships with the ecosystem and lets you list tools, call them manually, and inspect resources, all without needing an LLM in the loop.

```
fastmcp dev math_server.py
```

This opens a local web interface (typically http://127.0.0.1:6274) where you can click through your tools, fill in test arguments, and see exactly what JSON-RPC messages flow back and forth. Always test here first. Debugging a broken tool through an LLM's tool-calling behavior is much harder than debugging it directly in the Inspector.

So far, our server has run with the default transport. Now we need to understand the choice, because it directly determines how your LangGraph agent will connect to it.

Think of transport as the physical wire, separate from the language spoken over it. Two people can speak English over a telephone line or over a radio: same language, different wire. Similarly, MCP servers speak the same JSON-RPC 2.0 "language" regardless of transport, but how the bytes travel differs.
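The "same language, different wire" idea can be sketched with nothing but the standard library. The example below takes one JSON-RPC request and frames it two ways: as a single newline-terminated line, which is how messages travel over a subprocess's stdin/stdout, and as the body of an HTTP POST. The helper names and the endpoint URL are hypothetical, and the header choices reflect my reading of the MCP transport specification rather than anything an SDK requires you to write by hand.

```python
import json

# One JSON-RPC request, independent of any transport.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "add", "arguments": {"a": 3, "b": 5}},
}


def frame_for_stdio(message: dict) -> bytes:
    """Serialize as one line of JSON, newline-terminated, ready to write to a pipe."""
    # json.dumps escapes newlines inside strings, so the line delimiter stays unambiguous.
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def frame_for_http(message: dict, endpoint: str) -> dict:
    """Describe the HTTP POST that would carry the same message (nothing is sent)."""
    return {
        "method": "POST",
        "url": endpoint,
        "headers": {
            "Content-Type": "application/json",
            # The server may answer with plain JSON or with a Server-Sent Events stream.
            "Accept": "application/json, text/event-stream",
        },
        "body": json.dumps(message).encode("utf-8"),
    }


stdio_bytes = frame_for_stdio(request)
http_request = frame_for_http(request, "http://localhost:8000/mcp")  # hypothetical endpoint

# The payload is identical after decoding; only the envelope around it differs.
assert json.loads(stdio_bytes) == json.loads(http_request["body"]) == request
print(stdio_bytes)
print(http_request["headers"])
```

In practice the SDKs on both ends do this framing for you; the sketch only shows why a server's tools, resources, and prompts do not change when you switch transports.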
```python
if __name__ == "__main__":
    mcp.run()  # stdio is the default
```

With stdio (standard input/output), the host application launches your server as a subprocess and communicates by writing to its stdin and reading from its stdout. There's no network involved at all; it's process-to-process communication on the same machine.

Use stdio when: your server runs locally alongside the host (a file system tool, a local script runner, a personal math utility). This is the simplest transport to set up and the most common for development and personal-use tools.

Important constraint: when using stdio, your server's stdout must be reserved exclusively for MCP protocol messages. If you print debug output to stdout, you will corrupt the protocol stream. Always log to stderr instead.

```python
if __name__ == "__main__":
    mcp.run(transport="streamable-http", host="0.0.0.0", port=8000)
```

With Streamable HTTP, your server runs as a standalone web service. Clients connect over the network using HTTP POST requests, with optional Server-Sent Events for streaming responses back. This is what makes a server genuinely remote: it can run on a different machine, in the cloud, behind a load balancer, serving many different host applications simultaneously.

Use Streamable HTTP when: your server needs to be shared across a team, deployed centrally, or accessed by host applications running on different machines (including a LangGraph agent running on a server, talking to a database-access MCP server running elsewhere).

stdio: local tool, single user, simple setup, development and personal use.

Streamable HTTP: shared tool, multiple users or applications, production deployment, needs authentication.

This exact choice is what you'll configure when connecting from LangGraph in the next level: every server in your config will declare itself as one or the other.

This is where everything from this article meets everything from Parts 1–5.
The bridge is a package called langchain-mcp-adapters, maintained by the LangChain team specifically to make MCP tools usable inside LangGraph agents. langchain-mcp-adapters does for MCP what the @tool decorator did for plain Python functions in Part 1: it converts something external (an MCP tool) into a BaseTool object that slots directly into a LangGraph ToolNode, with zero special-casing in your graph logic. Once converted, your agent doesn't know or care that a tool came from an MCP server rather than being written in-process; it's just a tool.

```
pip install langchain-mcp-adapters langgraph "langchain[openai]"
```

The central object is MultiServerMCPClient. It can connect to multiple MCP servers at once, mixing transports freely.

```python
# MODULE 1 (extended): MCP CLIENT CONFIGURATION
from langchain_mcp_adapters.client import MultiServerMCPClient

client = MultiServerMCPClient(
    {
        # A local server, launched as a subprocess via stdio
        "math": {
            "command": "python",
            "args": ["/absolute/path/to/math_server.py"],
            "transport": "stdio",
        },
        # A remote server, already running, connected over HTTP
        # (some releases of langchain-mcp-adapters spell this "streamable_http")
        "company_data": {
            "url": "http://localhost:8000/mcp",
            "transport": "http",
        },
    }
)
```

Each entry in this dictionary is one server. The key ("math", "company_data") is just a label you choose; it's how langchain-mcp-adapters keeps tools from different servers organized internally. Notice this directly mirrors the transport decision from Level 4: stdio servers get a command + args, HTTP servers get a url.

```python
# This single call discovers every tool, on every configured server,
# and converts each one into a LangChain-compatible BaseTool.
# (await must run inside an async function.)
tools = await client.get_tools()
print([t.name for t in tools])
# ['add', 'multiply', 'search_company_db', ...]
```

This is the entire integration.
client.get_tools() performs the MCP tools/list request against every server in your config, takes the returned schemas, and wraps each one into a tool object that is indistinguishable, from your graph's perspective, from a hand-written @tool function.

From here, every module looks exactly like Part 1. This is the entire point of the integration: MCP changes where your tools come from, not how your graph is built.

```python
# ============================================================
# LANGGRAPH + MCP AGENT: COMPLETE TEMPLATE
# Extends: Part 1 core structure
# ============================================================

# MODULE 1: IMPORTS & CONFIGURATION
import asyncio

from langchain_mcp_adapters.client import MultiServerMCPClient
from langchain.chat_models import init_chat_model
from langchain_core.messages import HumanMessage
from langgraph.graph import StateGraph, MessagesState, START, END
from langgraph.prebuilt import ToolNode, tools_condition
from langgraph.checkpoint.memory import MemorySaver

llm = init_chat_model("openai:gpt-4o")

client = MultiServerMCPClient(
    {
        "math": {
            "command": "python",
            "args": ["./math_server.py"],
            "transport": "stdio",
        },
        "company_data": {
            "url": "http://localhost:8000/mcp",
            "transport": "http",
        },
    }
)


# MODULE 2: STATE
class State(MessagesState):
    pass


# MODULE 4: NODES
# (Module 3, Tools, is now sourced from MCP rather than defined here)
def make_agent_node(llm_with_tools):
    """Factory pattern: tools are loaded asynchronously, so the node
    closure is built after the async tool-loading step completes."""

    def agent_node(state: State) -> dict:
        response = llm_with_tools.invoke(state["messages"])
        return {"messages": [response]}

    return agent_node


# MODULE 6: GRAPH ASSEMBLY (inside an async function)
async def build_graph():
    tools = await client.get_tools()
    llm_with_tools = llm.bind_tools(tools)

    graph_builder = StateGraph(State)
    graph_builder.add_node("agent", make_agent_node(llm_with_tools))
    graph_builder.add_node("tools", ToolNode(tools))
    graph_builder.add_edge(START, "agent")
    graph_builder.add_conditional_edges("agent", tools_condition)
    graph_builder.add_edge("tools", "agent")
    return graph_builder.compile(checkpointer=MemorySaver())


# MODULE 7: ENTRYPOINT
async def main():
    graph = await build_graph()
    config = {"configurable": {"thread_id": "session-001"}}
    response = await graph.ainvoke(
        {"messages": [HumanMessage(content="What's 3 + 5 times 12?")]},
        config=config,
    )
    print(response["messages"][-1].content)


if __name__ == "__main__":
    asyncio.run(main())
```

This is the one genuine adjustment MCP introduces to the patterns from earlier parts: MCP communication is inherently asynchronous, because every tool call is, under the hood, a network or subprocess round-trip, even for the stdio transport. This means client.get_tools() must be awaited, your graph-building step typically lives inside an async def, and you invoke the compiled graph with ainvoke instead of invoke. Everything else (StateGraph, MessagesState, tools_condition, MemorySaver) is identical to every prior article in this series.

Notice that Module 5 (Routing) effectively disappears from this template: we use tools_condition, the pre-built router from Part 1, instead of writing a custom one. This isn't a new concept; it's the same tools_condition from the very first canonical template, doing exactly the same job: check if the last message has tool calls, route to "tools" if so, otherwise END. MCP tools and hand-written tools are indistinguishable to this router, because they're both just BaseTool objects by the time your graph sees them.

A few things separate a working demo from a system you'd actually deploy. These are the most commonly hit issues in real MCP + LangGraph systems.

MultiServerMCPClient is stateless by default: every tool call opens a fresh connection, executes, and tears down.
For a single request that's fine, but if your application serves many requests, opening a subprocess or HTTP connection on every single tool call adds real latency. In production, open a session once when your process starts and reuse it for the lifetime of the application, rather than reconnecting per request. Since langchain-mcp-adapters 0.1.0 the client object itself can no longer be used as a context manager; you hold a connection open through an explicit session instead.

```python
from langchain_mcp_adapters.tools import load_mcp_tools

# Hold the connection open across the application's lifetime,
# rather than opening/closing per request
async with client.session("math") as session:
    tools = await load_mcp_tools(session)
    # build_graph_with is a hypothetical variant of build_graph()
    # that accepts already-loaded tools
    graph = await build_graph_with(tools)
    # ... serve many requests using this same graph ...
```

The stdio transport was designed for local, single-user, same-machine scenarios, like a desktop app launching a helper subprocess. If you're deploying your LangGraph agent as a web service handling requests from many users, stdio servers become a liability: you'd be spawning subprocesses per request, with no natural way to share state or scale horizontally. For any server-side deployment, prefer Streamable HTTP, and genuinely ask whether you need a separate MCP server process at all. Sometimes a plain @tool function calling an internal library directly is simpler and faster than going through the protocol.

MCP tools are still just tools by the time they reach your graph, which means every pattern from Part 3 applies unchanged. If an MCP server exposes something sensitive (sending emails, modifying a database, spending money), wrap it with the same interrupt-based review pattern from Part 3's review_tool_call node. MCP gives you discovery and standardization; it does not give you safety by default. That's still your job, using the patterns you already know.

This is a structural fact worth internalizing: MCP servers run as separate processes. They cannot reach into your LangGraph state, your checkpointer, or any in-memory Python objects in your graph.
If an MCP tool call needs to be personalized (using a stored user preference, for example), that information has to be passed explicitly as a tool argument, or handled through the more advanced interceptor pattern (a middleware hook in langchain-mcp-adapters that lets you inspect and modify a tool call before it's sent, using your graph's runtime context). For most agents, simply passing the needed values as arguments is sufficient and far simpler.

To show this genuinely composes with everything earlier in the series, here's how MCP slots into the supervisor pattern from Part 4: multiple specialist agents, each backed by its own dedicated MCP server.

```python
# Specialist agents, each pulling tools from a DIFFERENT MCP server
async def build_multi_agent_system():
    client = MultiServerMCPClient(
        {
            "research": {"url": "http://localhost:8001/mcp", "transport": "http"},
            "code": {"url": "http://localhost:8002/mcp", "transport": "http"},
        }
    )
    research_tools = await client.get_tools(server_name="research")
    code_tools = await client.get_tools(server_name="code")

    researcher_llm = llm.bind_tools(research_tools)
    coder_llm = llm.bind_tools(code_tools)

    # From here, wire researcher_node, coder_node, and a supervisor node
    # exactly as shown in Part 4; the rest of the supervisor pattern is
    # completely unchanged.
    ...
```

The takeaway: MCP doesn't introduce a competing architecture to multi-agent systems. A specialist agent backed by an MCP server is structurally identical to a specialist agent backed by hand-written tools: the supervisor still routes the same way, and the specialists still report back the same way. MCP simply changes where the tool definitions live and how many applications can reuse them.

This extends the keyword cards from Parts 1–5 with MCP-specific terms.

MCP Architecture Keywords
- Host: the AI application the user interacts with (your LangGraph agent).
- Client: lives inside the host, one per connected server, manages a 1:1 session.
- Server: the standalone program exposing tools, resources, and prompts. Doesn't know which host is connecting.
- JSON-RPC 2.0: the message format underneath every MCP exchange: Requests, Responses, Notifications.

The Three Primitives
- Tool: an executable action; the model decides when to call it. Built with @mcp.tool.
- Resource: read-only data, addressed by URI, application/user-controlled. Built with @mcp.resource("uri://...").
- Prompt: a reusable instruction template, user-invoked. Built with @mcp.prompt.

FastMCP Server Keywords
- FastMCP(name="..."): creates the server instance, the container for all tools/resources/prompts.
- @mcp.tool: decorator that exposes a Python function as a callable tool; reads type hints for schema, docstring for description.
- @mcp.resource("scheme://{param}"): decorator for read-only data; {param} placeholders create a resource template.
- @mcp.prompt: decorator for a reusable prompt template.
- mcp.run(): starts the server. Defaults to stdio; pass transport="streamable-http" for networked deployment.
- fastmcp dev server.py: launches the MCP Inspector for manual testing without an LLM.

Transport Keywords
- stdio: subprocess-based, local, same-machine. The host launches and manages the server process. Never write debug output to stdout.
- streamable-http: networked, remote-capable, supports many concurrent host connections. The production default for shared servers.

LangGraph Integration Keywords
- MultiServerMCPClient({...}): from langchain_mcp_adapters.client. Connects to one or more MCP servers, mixing transports freely in one config dict.
- await client.get_tools(): discovers and converts every tool from every configured server into LangChain-compatible BaseTool objects.
- tools_condition: the pre-built LangGraph router from Part 1; works identically whether tools are hand-written or MCP-sourced.
- Everything is async: client.get_tools(), graph building, and graph.ainvoke() all require await, because MCP calls are network/subprocess round-trips even over stdio.

When should you reach for MCP at all? A quick decision guide:

- You're building a single tool for a single agent in a single codebase → You don't need MCP. A plain @tool function from Part 1 is simpler and has zero protocol overhead.
- You're building a tool that should be reusable across multiple agents, multiple projects, or shared with teammates using different frameworks → Build it as an MCP server. This is the core value proposition.
- You want to use one of the hundreds of pre-built community MCP servers (GitHub, Slack, Postgres, Google Drive, Stripe, and more) instead of writing integration code yourself → Connect via MultiServerMCPClient and skip writing that integration entirely. This is often the single biggest time-saver MCP offers.
- Your tool needs to run on a different machine from your agent, or be shared by multiple host applications at once → Use Streamable HTTP transport, deployed as a standalone service.
- Your tool only ever runs locally, alongside one agent, for development or personal use → stdio transport is sufficient, and simpler to set up.

The instinct when learning a new protocol is to wonder whether it replaces what you already know. It doesn't. MCP doesn't replace LangGraph's tools, nodes, or graphs; it changes where those tools come from and how many places can reuse them. Once an MCP tool reaches your ToolNode, it behaves exactly like the hand-written tools from Part 1, gets approved by the same interrupt patterns from Part 3, and slots into the same supervisor architectures from Part 4.

The progression in this article followed the same staircase as every other part of this series: concept and roles (Level 1), the three primitives (Level 2), a working server (Level 3), the transport choice (Level 4), the LangGraph bridge (Level 5), production hardening (Level 6), and composition with multi-agent systems (Level 7).
Each level added exactly one idea on top of a stable foundation.

You now have six articles' worth of production scaffold: canonical structure (Part 1), memory management (Part 2), human-in-the-loop safety (Part 3), multi-agent orchestration (Part 4), real-world knowledge via RAG (Part 5), and standardized, shareable tooling via MCP (Part 6). Between them, these cover the overwhelming majority of what a serious, production-grade LangGraph application needs, and MCP is what lets the tooling half of that scale beyond any single codebase.

MCP for LangGraph Developers: From Basics to Production (https://pub.towardsai.net/mcp-for-langgraph-developers-from-basics-to-production-12ff52df3d3c) was originally published in Towards AI (https://pub.towardsai.net) on Medium.