MCP for LangGraph Developers: From Basics to Production


Part 6 of the LangGraph Mental Model series: a ground-up introduction to the Model Context Protocol, building toward full integration with everything from Parts 1–5

For other parts of the series: Part 0, Part 1, Part 2, Part 3, Part 4, Part 5.

What this article assumes: You’re comfortable with the seven-module LangGraph structure and the idea of a @tool-decorated function (Part 1, Module 3). That's the only prerequisite. MCP is a new piece of infrastructure, not a new way of thinking about graphs, so we start from zero on MCP itself, and only reconnect to LangGraph once the concept is solid.

In Part 1, you learned to write tools like this:

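from langchain_core.tools import tool

@tool
def search_web(query: str) -> str:
    """Search the web for current information."""
    # Placeholder body: a real tool would call a search API here
    return f"Search results for: {query}"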

This works great, until it doesn’t scale. Imagine you’re building five different agents across five different projects, and every single one needs a “search the company database” tool. With the pattern above, you’d write that tool five separate times, in five separate codebases, in whatever language each project happens to use. If the database schema changes, you update five places. If you want to share that tool with a teammate building a different kind of AI app, maybe not even using LangGraph, they can’t use your @tool function. It's tied to LangChain's Python ecosystem.

This is the N×M integration problem: N different tools (databases, APIs, file systems) need to connect to M different AI applications (your LangGraph agent, a teammate’s custom agent, Claude Desktop, an IDE assistant). Without a shared standard, you end up writing N×M custom integrations.

The Model Context Protocol (MCP) solves this by being a universal, open standard for connecting AI applications to tools and data. Build a tool once, as an MCP server, and any MCP-compatible application (your LangGraph agent, Claude Desktop, Cursor, a teammate’s custom agent) can use it without tool-specific integration code. Think of it as a USB-C port for AI applications: one standard connector, any compatible device on either end.

This article takes you from “what even is MCP” to building and connecting a production-grade MCP server to a real LangGraph agent, in seven levels, each adding exactly one new idea.

Before any code, build the picture. Imagine a hotel concierge desk. The concierge (your AI application) doesn’t personally know how to book a restaurant, hail a cab, or arrange theater tickets. Instead, the concierge has a list of trusted local services, partners who specialize in each task. When a guest asks for a dinner reservation, the concierge picks up a dedicated phone line to the restaurant booking service, makes the request in a standard format both sides understand, and relays the answer back to the guest.

MCP is that standard phone line and that standard request format. The AI application is the concierge. The MCP server is a specialized service (a restaurant booker, a cab company, a ticket office). And MCP itself is the shared language and connection protocol that lets any concierge talk to any service, without the concierge needing to learn each service’s internal phone system from scratch.

Every MCP system has exactly three participants. Get comfortable with these three words; they are the foundation of everything else in this article.

Host: the AI application the end user actually interacts with. In our series so far, this is your LangGraph agent. (Other real-world examples: Claude Desktop, Cursor, VS Code with an AI copilot.) The host manages the conversation, decides when a tool is needed, and shows results to the user.

Client: lives inside the host. Each client maintains a dedicated, one-to-one connection to exactly one server. If your host connects to three different MCP servers, it spins up three separate clients internally, one per server. This 1:1 mapping is a deliberate security boundary: a client for the file system server can’t accidentally leak data to the database server.

Server: a lightweight, focused program that exposes specific capabilities. A server might wrap a database, a file system, a web search API, or your company’s internal CRM. Critically, an MCP server doesn’t know or care which host is calling it. The same server works for your LangGraph agent today and a completely different AI application tomorrow, unmodified.

Every message between a client and a server is a JSON-RPC 2.0 message — a simple, well-established format with exactly three message types:

Request: “please do this and tell me the result.” Always includes a unique id so the response can be matched back to it.

{"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {"name": "search_db", "arguments": {"query": "Q3 revenue"}}}

Response: the answer to a request. Contains either a result (success) or an error (failure), and the same id as the original request.

{"jsonrpc": "2.0", "id": 1, "result": {"content": [{"type": "text", "text": "Q3 revenue was $4.2M"}]}}

Notification: a one-way message that expects no response and therefore carries no id. Used for things like “my list of available tools just changed.”
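To make the three message types concrete, here is a minimal standard-library sketch that builds one of each and tells them apart by shape. The helper names (make_request, make_result, make_notification, classify) are hypothetical and exist only for illustration; real SDKs generate ids and messages for you.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # hypothetical id generator; SDKs manage ids internally


def make_request(method: str, params: dict) -> dict:
    """A request carries an id so the response can be matched to it."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}


def make_result(request: dict, result: dict) -> dict:
    """A successful response echoes the id of the request it answers."""
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}


def make_notification(method: str, params: Optional[dict] = None) -> dict:
    """A notification has a method but no id, so nothing can reply to it."""
    message = {"jsonrpc": "2.0", "method": method}
    if params is not None:
        message["params"] = params
    return message


def classify(message: dict) -> str:
    """Distinguish the three types purely by which keys are present."""
    if "method" in message:
        return "request" if "id" in message else "notification"
    return "response"


request = make_request(
    "tools/call", {"name": "search_db", "arguments": {"query": "Q3 revenue"}}
)
response = make_result(
    request, {"content": [{"type": "text", "text": "Q3 revenue was $4.2M"}]}
)
notification = make_notification("notifications/tools/list_changed")

wire = json.dumps(request)  # what would actually travel over the transport
```

The only thing linking a response to its request is the shared id, which is why a notification, having no id, can never be answered.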

You will almost never write raw JSON-RPC by hand; the SDKs handle this entirely. But understanding that this is the wire format underneath everything demystifies a lot of what follows. It’s the same reason understanding HTTP underneath a REST API makes you a better API developer, even if you never write raw HTTP by hand.

An MCP server exposes its capabilities through exactly three types of primitives. This is the single most important conceptual table in this article — almost every design decision when building a server comes down to picking the right one of these three.

Primitive | Purpose | Real-world analogy | Who decides to use it
Tool | Executes an action, may have side effects | A POST endpoint | The AI model decides
Resource | Provides read-only data | A GET endpoint | The application/user decides
Prompt | A reusable instruction template | A pre-written form letter | The user explicitly invokes it

A tool is a function the LLM can decide to call during reasoning, exactly like the @tool-decorated functions from Part 1. The difference is where that function lives: instead of being defined inside your LangGraph codebase, it's defined inside an MCP server, and your LangGraph agent discovers and calls it over the protocol.

Use a tool for anything that does something: runs a calculation, queries a database, sends an email, modifies a file.

A resource is data the application can pull in to give the model context, without the model needing to “decide” to call it the way it does with a tool. Resources are addressed by a URI, similar to how a web page is addressed by a URL.

Use a resource for anything that’s read-only context: a configuration file, a document, a list of available data sources the model might want to know about before deciding what to search for.

A prompt is a pre-written template that structures how a user (or the host application) kicks off a particular workflow. Unlike tools, prompts are typically user-invoked, not model-invoked — think of them as the MCP equivalent of slash commands.

Use a prompt for common, repeatable workflows (“summarize this document,” “review this code for security issues”) where you want consistent, well-engineered instructions every time, rather than relying on the user to phrase the request well.

When you’re not sure which primitive to use for something you’re building, ask: Does this change anything, or just provide information? If it changes something (creates, updates, deletes, sends, executes) → Tool. If it’s read-only data the model might want as background → Resource. If it’s a reusable instruction template a user explicitly kicks off → Prompt.

Time to write real code. We’ll use FastMCP, a widely used Python framework for building MCP servers. It turns the protocol’s considerable complexity (JSON-RPC handling, schema generation, lifecycle management) into a handful of decorators, in exactly the same spirit as how @tool simplified LangChain tool creation in Part 1.

pip install fastmcp
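A minimal sketch of a math_server.py in this style follows. The tool names (add, divide) and the server name are assumptions chosen for illustration. The tool bodies are kept as plain functions and registered through a direct mcp.tool(fn) call, which FastMCP 2.x accepts as an alternative to the decorator form; that keeps the logic unit-testable without starting a server.

```python
def add(a: float, b: float) -> float:
    """Add two numbers and return the sum. Use this for any addition request."""
    return a + b


def divide(a: float, b: float) -> float:
    """Divide a by b and return the quotient. Fails if b is zero."""
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b


def build_server():
    """Create the server and expose the plain functions above as MCP tools."""
    # Imported lazily so the functions above can be tested without fastmcp installed.
    from fastmcp import FastMCP

    mcp = FastMCP(name="math-server")  # hypothetical server name
    # Type hints become the input schema; docstrings become the descriptions.
    mcp.tool(add)
    mcp.tool(divide)
    return mcp
```

In the actual file you would typically assign mcp = build_server() at module level and call mcp.run() under the usual main guard, so that both python math_server.py and the Inspector can find the server object.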

This is a complete, valid, runnable MCP server. Notice the parallels to Part 1’s @tool pattern: FastMCP reads your function's type hints to generate the input schema, and reads your docstring to generate the description the LLM sees. Exactly the same rule from Part 1 applies here with even higher stakes: a vague docstring means the model won't know when to call this tool, and now that vagueness affects every application that ever connects to this server, not just one LangGraph agent.
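Resources follow the same registration idea. The sketch below assumes a server object with FastMCP's resource(uri) decorator factory; the returned fields (name, version, operations) are hypothetical sample data.

```python
def get_server_info() -> dict:
    """Static metadata about this server, served as read-only context."""
    # Hypothetical payload; a real server would report its own details.
    return {"name": "math-server", "version": "1.0.0", "operations": ["add", "divide"]}


def register_resources(mcp) -> None:
    """Attach the function to a fixed URI on an existing FastMCP server."""
    # Equivalent to decorating get_server_info with @mcp.resource("config://server-info").
    # Registration stores the function; it is not called until a client reads the URI.
    mcp.resource("config://server-info")(get_server_info)
```

Because registration only records the function, nothing is computed at startup; the body runs when a client actually reads config://server-info.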

Resources are addressed by a URI you define (config://server-info here). When a client requests that URI, FastMCP calls your function and returns the result. The function only runs when requested — this is lazy evaluation, so you're not computing resource data the model never ends up using.

Sometimes you want a resource whose content depends on an argument — like looking up one specific record rather than a fixed configuration blob.

@mcp.resource("history://{operation_id}")
def get_calculation(operation_id: str) -> dict:
    """Retrieve a past calculation by its ID."""
    # In a real server, this would look up a database record
    return {"id": operation_id, "result": "..."}

The {operation_id} placeholder in the URI makes this a resource template. When a client requests history://abc123, FastMCP extracts abc123 and passes it as the function argument automatically.
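The extraction step can be pictured with a small standard-library sketch. This is not FastMCP's implementation, only an illustration of the idea; compile_template and match_uri are hypothetical helper names.

```python
import re
from typing import Optional


def compile_template(template: str) -> "re.Pattern[str]":
    """Turn 'history://{operation_id}' into a regex with one named group per placeholder."""
    # re.split with a capture group alternates literal text and placeholder names.
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)          # literal text, matched exactly
        else:
            pattern += f"(?P<{part}>[^/]+)"     # placeholder, one path segment
    return re.compile(f"^{pattern}$")


def match_uri(template: str, uri: str) -> Optional[dict]:
    """Return the extracted arguments, or None if the URI does not fit the template."""
    match = compile_template(template).match(uri)
    return match.groupdict() if match else None


params = match_uri("history://{operation_id}", "history://abc123")
```

The resulting dictionary is what gets passed to the handler as keyword arguments, which is why the placeholder name has to match the function parameter name.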

@mcp.prompt
def explain_calculation(operation: str, result: str) -> str:
    """Generate a request to explain a calculation in plain English."""
    return f"Explain how the operation '{operation}' produced the result '{result}', in simple terms a beginner could follow."

python math_server.py

Before connecting this to any real AI application, you should verify it works using the MCP Inspector — a browser-based debugging tool that ships with the ecosystem and lets you list tools, call them manually, and inspect resources, all without needing an LLM in the loop.

fastmcp dev math_server.py

This opens a local web interface (typically http://127.0.0.1:6274) where you can click through your tools, fill in test arguments, and see exactly what JSON-RPC messages flow back and forth. Always test here first. Debugging a broken tool through an LLM's tool-calling behavior is much harder than debugging it directly in the Inspector.

So far, our server has run with the default transport. Now we need to understand the choice, because it directly determines how your LangGraph agent will connect to it.

Think of transport as the physical wire, separate from the language spoken over it. Two people can speak English over a telephone line or over a radio: same language, different wire. Similarly, MCP servers speak the same JSON-RPC 2.0 “language” regardless of transport, but how the bytes travel differs.

if __name__ == "__main__":
    mcp.run()  # stdio is the default

With stdio (standard input/output), the host application launches your server as a subprocess and communicates by writing to its stdin and reading from its stdout. There’s no network involved at all; it’s process-to-process communication on the same machine.

Use stdio when: your server runs locally alongside the host (a file system tool, a local script runner, a personal math utility). This is the simplest transport to set up and the most common for development and personal-use tools.

Important constraint: when using stdio, your server’s stdout must be reserved exclusively for MCP protocol messages. If you print() debug output to stdout, you will corrupt the protocol stream. Always log to stderr instead.
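One way to honor that constraint is to route all diagnostics through a logger bound to stderr. A minimal sketch, where the logger name math_server is an assumption:

```python
import logging
import sys


def configure_logging(level: int = logging.INFO) -> logging.Logger:
    """Return a logger that writes only to stderr, leaving stdout for the protocol."""
    logger = logging.getLogger("math_server")  # hypothetical logger name
    logger.setLevel(level)
    if not logger.handlers:  # avoid stacking handlers if called twice
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    # Stop records from bubbling to a root handler that might target stdout.
    logger.propagate = False
    return logger


log = configure_logging()
log.info("server starting")                     # safe: goes to stderr
print("quick debug note", file=sys.stderr)      # safe: explicit stderr
```

A bare print("...") without file=sys.stderr is the call to avoid, since its output would be interleaved with JSON-RPC messages on stdout.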

if __name__ == "__main__":
    mcp.run(transport="streamable-http", host="0.0.0.0", port=8000)

With Streamable HTTP, your server runs as a standalone web service. Clients connect over the network using HTTP POST requests, with optional Server-Sent Events for streaming responses back. This is what makes a server genuinely remote: it can run on a different machine, in the cloud, behind a load balancer, serving many different host applications simultaneously.

Use Streamable HTTP when: your server needs to be shared across a team, deployed centrally, or accessed by host applications running on different machines (including a LangGraph agent running on a server, talking to a database-access MCP server running elsewhere).

stdio: local tool, single user, simple setup, development and personal use.
Streamable HTTP: shared tool, multiple users or applications, production deployment, needs authentication.

This exact choice is what you’ll configure when connecting from LangGraph in the next level — every server in your config will declare itself as one or the other.

This is where everything from this article meets everything from Parts 1–5. The bridge is a package called langchain-mcp-adapters, maintained by the LangChain team specifically to make MCP tools usable inside LangGraph agents.

langchain-mcp-adapters does for MCP what the @tool decorator did for plain Python functions in Part 1: it converts something external (an MCP tool) into a BaseTool object that slots directly into a LangGraph ToolNode, with zero special-casing in your graph logic. Once converted, your agent doesn't know or care that a tool came from an MCP server rather than being written in-process; it's just a tool.

pip install langchain-mcp-adapters langgraph "langchain[openai]"

The central object is MultiServerMCPClient — it can connect to multiple MCP servers at once, mixing transports freely.
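A configuration along these lines is a reasonable sketch. The URL, port, and script path are assumptions, and note that the adapter spells the HTTP transport with an underscore ("streamable_http"), unlike the hyphenated name passed to mcp.run() on the server side.

```python
import sys

SERVER_CONFIG = {
    # Local server: the client launches it as a subprocess and talks over stdin/stdout.
    "math": {
        "transport": "stdio",
        "command": sys.executable,       # the current Python interpreter
        "args": ["math_server.py"],      # assumed path to the server script
    },
    # Remote server: already running elsewhere, reached over HTTP.
    "company_data": {
        "transport": "streamable_http",
        "url": "http://localhost:8000/mcp",  # hypothetical endpoint
    },
}


def make_client(config: dict = SERVER_CONFIG):
    """Build the multi-server client from the config dictionary."""
    # Imported lazily so the config can be inspected without the package installed.
    from langchain_mcp_adapters.client import MultiServerMCPClient

    return MultiServerMCPClient(config)
```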

Each entry in this dictionary is one server. The key ("math", "company_data") is just a label you choose — it's how langchain-mcp-adapters keeps tools from different servers organized internally. Notice this directly mirrors the transport decision from Level 4: stdio servers get a command + args, HTTP servers get a url.

This is the entire integration. client.get_tools() performs the MCP tools/list request against every server in your config, takes the returned schemas, and wraps each one into a tool object indistinguishable, from your graph's perspective, from a hand-written @tool function.

From here, every module looks exactly like Part 1. This is the entire point of the integration: MCP changes where your tools come from, not how your graph is built.

This is the one genuine adjustment MCP introduces to the patterns from earlier parts: MCP communication is inherently asynchronous, because every tool call is, under the hood, a network or subprocess round-trip, even for the stdio transport. This means client.get_tools() must be awaited, your graph-building step typically lives inside an async def, and you invoke the compiled graph with ainvoke instead of invoke. Everything else (StateGraph, MessagesState, tools_condition, MemorySaver) is identical to every prior article in this series.
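Put together, an async version of the canonical template might look like the sketch below. The model identifier, the single-server config, and the thread id are assumptions; running it also requires an OpenAI API key and the packages installed above.

```python
import sys

# Assumed config: just the local math server, launched over stdio.
SERVER_CONFIG = {
    "math": {"transport": "stdio", "command": sys.executable, "args": ["math_server.py"]},
}


async def build_graph(server_config: dict):
    """Discover MCP tools, then assemble the same agent loop used in Part 1."""
    # Lazy imports keep this module importable without the optional packages.
    from langchain.chat_models import init_chat_model
    from langchain_mcp_adapters.client import MultiServerMCPClient
    from langgraph.checkpoint.memory import MemorySaver
    from langgraph.graph import START, MessagesState, StateGraph
    from langgraph.prebuilt import ToolNode, tools_condition

    client = MultiServerMCPClient(server_config)
    tools = await client.get_tools()  # one tools/list round-trip per server

    # Hypothetical model choice; any chat model with tool calling would do.
    model = init_chat_model("openai:gpt-4o-mini").bind_tools(tools)

    async def call_model(state: MessagesState):
        response = await model.ainvoke(state["messages"])
        return {"messages": [response]}

    builder = StateGraph(MessagesState)
    builder.add_node("agent", call_model)
    builder.add_node("tools", ToolNode(tools))
    builder.add_edge(START, "agent")
    builder.add_conditional_edges("agent", tools_condition)  # "tools" or END
    builder.add_edge("tools", "agent")
    return builder.compile(checkpointer=MemorySaver())


async def main():
    graph = await build_graph(SERVER_CONFIG)
    result = await graph.ainvoke(
        {"messages": [("user", "What is 12 divided by 4?")]},
        config={"configurable": {"thread_id": "demo"}},  # required by the checkpointer
    )
    print(result["messages"][-1].content)
```

You would start it with asyncio.run(main()) from a script; inside an already-async framework, await main() directly instead.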

Notice Module 5 (Routing) effectively disappears from this template — we use tools_condition, the pre-built router from Part 1, instead of writing a custom one. This isn't a new concept; it's the same tools_condition from the very first canonical template, doing exactly the same job: check if the last message has tool calls, route to "tools" if so, otherwise END. MCP tools and hand-written tools are indistinguishable to this router, because they're both just BaseTool objects by the time your graph sees them.

A few things separate a working demo from a system you’d actually deploy. These are the most commonly hit issues in real MCP + LangGraph systems.

MultiServerMCPClient is stateless by default: every tool call opens a fresh connection, executes, and tears down. For a single request that's fine, but if your application serves many requests, opening a subprocess or HTTP connection on every single tool call adds real latency. In production, open the client once when your process starts and reuse it for the lifetime of the application, rather than recreating it per-request.
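The "load once, reuse everywhere" part can be sketched with a small async cache. ToolCache is a hypothetical helper, not part of langchain-mcp-adapters; in a real application the loader you pass in would be something like client.get_tools.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Optional


class ToolCache:
    """Runs an async loader at most once and hands the result to every caller."""

    def __init__(self, loader: Callable[[], Awaitable[List[Any]]]):
        self._loader = loader
        self._tools: Optional[List[Any]] = None
        self._lock = asyncio.Lock()

    async def get(self) -> List[Any]:
        if self._tools is None:
            async with self._lock:
                # Re-check inside the lock: another task may have loaded meanwhile.
                if self._tools is None:
                    self._tools = await self._loader()
        return self._tools


async def fake_loader() -> List[str]:
    """Stand-in for client.get_tools(); pretends to do a slow discovery call."""
    await asyncio.sleep(0)
    return ["add", "divide"]
```

Create the cache once at startup (for example in your web framework's lifespan hook) and call await cache.get() from each request handler; concurrent first requests will wait on the lock rather than each triggering their own discovery.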

The stdio transport was designed for local, single-user, same-machine scenarios, like a desktop app launching a helper subprocess. If you’re deploying your LangGraph agent as a web service handling requests from many users, stdio servers become a liability: you’d be spawning subprocesses per request, with no natural way to share state or scale horizontally. For any server-side deployment, prefer Streamable HTTP, and genuinely ask whether you need a separate MCP server process at all. Sometimes a plain @tool function calling an internal library directly is simpler and faster than going through the protocol.

MCP tools are still just tools by the time they reach your graph, which means every pattern from Part 3 applies unchanged. If an MCP server exposes something sensitive (sending emails, modifying a database, spending money), wrap it with the same interrupt() review pattern from Part 3's review_tool_call node. MCP gives you discovery and standardization; it does not give you safety by default. That's still your job, using the patterns you already know.
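As a rough sketch of that idea (not a reproduction of Part 3's node), the function below pauses for approval only when a flagged tool is about to run. The tool names in SENSITIVE_TOOLS and the "approve" reply convention are assumptions, and wiring the node into the graph's edges is left to the Part 3 pattern.

```python
from typing import Iterable, List

# Hypothetical names of MCP tools that should never run unreviewed.
SENSITIVE_TOOLS = {"send_email", "delete_record"}


def calls_needing_review(tool_calls: Iterable[dict], sensitive=SENSITIVE_TOOLS) -> List[dict]:
    """Pick out the pending tool calls whose names are on the sensitive list."""
    return [call for call in tool_calls if call["name"] in sensitive]


def review_tool_call(state: dict) -> dict:
    """Graph node: interrupt for a human decision before sensitive tools execute."""
    # Lazy imports so the filtering logic above is testable on its own.
    from langchain_core.messages import ToolMessage
    from langgraph.types import interrupt

    last_message = state["messages"][-1]
    flagged = calls_needing_review(last_message.tool_calls)
    if not flagged:
        return {"messages": []}  # nothing sensitive: no change to state

    decision = interrupt({"question": "Approve these tool calls?", "tool_calls": flagged})
    if decision == "approve":
        return {"messages": []}

    # On rejection, answer every pending call so the model sees a result for each.
    return {
        "messages": [
            ToolMessage(content="Rejected by reviewer", tool_call_id=call["id"])
            for call in last_message.tool_calls
        ]
    }
```

Whether a tool came from an MCP server or a local @tool function makes no difference here: the check only looks at the tool call's name.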

This is a structural fact worth internalizing: MCP servers run as separate processes. They cannot reach into your LangGraph state, your checkpointer, or any in-memory Python objects in your graph. If an MCP tool call needs to be personalized (using a stored user preference, for example), that information has to be passed explicitly as a tool argument, or handled through the more advanced interceptor pattern (a middleware hook in langchain-mcp-adapters that lets you inspect and modify a tool call before it's sent, using your graph's runtime context). For most agents, simply passing the needed values as arguments is sufficient and far simpler.

To show this genuinely composes with everything earlier in the series, here’s how MCP slots into the supervisor pattern from Part 4: multiple specialist agents, each backed by its own dedicated MCP server.
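A sketch of the specialist half of that setup follows, under several assumptions: the agent names, server entries, and model identifier are hypothetical, and the specialists are built with LangGraph's prebuilt create_react_agent rather than hand-written graphs. The supervisor wiring itself follows Part 4 and is not repeated here.

```python
# One MCP server config per specialist; names and endpoints are assumptions.
SPECIALIST_SERVERS = {
    "math_agent": {
        "math": {"transport": "stdio", "command": "python", "args": ["math_server.py"]},
    },
    "data_agent": {
        "company_data": {"transport": "streamable_http", "url": "http://localhost:8000/mcp"},
    },
}


async def build_specialists(model_name: str = "openai:gpt-4o-mini") -> dict:
    """Return one tool-using agent per specialist, each limited to its own server's tools."""
    # Lazy imports keep the config inspectable without the optional packages.
    from langchain_mcp_adapters.client import MultiServerMCPClient
    from langgraph.prebuilt import create_react_agent

    agents = {}
    for agent_name, servers in SPECIALIST_SERVERS.items():
        # A separate client per specialist means each agent sees only its own tools.
        tools = await MultiServerMCPClient(servers).get_tools()
        agents[agent_name] = create_react_agent(model_name, tools, name=agent_name)
    return agents
```

Giving each specialist its own client keeps tool lists small and focused, which tends to make the model's tool selection more reliable than handing every agent every tool.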

The takeaway: MCP doesn’t introduce a competing architecture to multi-agent systems. A specialist agent backed by an MCP server is structurally identical to a specialist agent backed by hand-written tools: the supervisor still routes the same way, the specialists still report back the same way. MCP simply changes where the tool definitions live and how many applications can reuse them.

This extends the keyword cards from Parts 1–5 with MCP-specific terms.

MCP Architecture Keywords
Host: the AI application the user interacts with (your LangGraph agent).
Client: lives inside the host, one per connected server, manages a 1:1 session.
Server: the standalone program exposing tools, resources, and prompts. Doesn't know which host is connecting.
JSON-RPC 2.0: the message format underneath every MCP exchange: Requests, Responses, Notifications.

The Three Primitives
Tool: an executable action, model decides when to call it. Built with @mcp.tool.
Resource: read-only data, addressed by URI, application/user-controlled. Built with @mcp.resource("uri://...").
Prompt: a reusable instruction template, user-invoked. Built with @mcp.prompt.

FastMCP Server Keywords
FastMCP(name="..."): creates the server instance, the container for all tools/resources/prompts.
@mcp.tool: decorator that exposes a Python function as a callable tool; reads type hints for schema, docstring for description.
@mcp.resource("scheme://{param}"): decorator for read-only data; {param} placeholders create a resource template.
@mcp.prompt: decorator for a reusable prompt template.
mcp.run(): starts the server. Defaults to stdio; pass transport="streamable-http" for networked deployment.
fastmcp dev server.py: launches the MCP Inspector for manual testing without an LLM.

Transport Keywords
stdio: subprocess-based, local, same-machine. The host launches and manages the server process. Never write debug output to stdout.
streamable-http: networked, remote-capable, supports many concurrent host connections. The production default for shared servers.

LangGraph Integration Keywords
MultiServerMCPClient({...}): from langchain_mcp_adapters.client. Connects to one or more MCP servers, mixing transports freely in one config dict.
await client.get_tools(): discovers and converts every tool from every configured server into LangChain-compatible BaseTool objects.
tools_condition: (from Part 1) the pre-built LangGraph router; works identically whether tools are hand-written or MCP-sourced.
Everything is async: client.get_tools(), graph building, and graph.ainvoke() all require await, because MCP calls are network/subprocess round-trips even over stdio.

You’re building a single tool for a single agent in a single codebase → You don’t need MCP. A plain @tool function from Part 1 is simpler and has zero protocol overhead.

You’re building a tool that should be reusable across multiple agents, multiple projects, or shared with teammates using different frameworks → Build it as an MCP server. This is the core value proposition.

You want to use one of the hundreds of pre-built community MCP servers (GitHub, Slack, Postgres, Google Drive, Stripe, and more) instead of writing integration code yourself → Connect via MultiServerMCPClient and skip writing that integration entirely. This is often the single biggest time-saver MCP offers.

Your tool needs to run on a different machine from your agent, or be shared by multiple host applications at once → Use Streamable HTTP transport, deployed as a standalone service.

Your tool only ever runs locally, alongside one agent, for development or personal use → stdio transport is sufficient, and simpler to set up.

The instinct when learning a new protocol is to wonder whether it replaces what you already know. It doesn’t. MCP doesn’t replace LangGraph’s tools, nodes, or graphs — it replaces where those tools come from and how many places can reuse them. Once an MCP tool reaches your ToolNode, it behaves exactly like the hand-written tools from Part 1, gets approved by the same interrupt() patterns from Part 3, and slots into the same supervisor architectures from Part 4.

The progression in this article followed the same staircase as every other part of this series: concept and roles (Level 1), the three primitives (Level 2), a working server (Level 3), the transport choice (Level 4), the LangGraph bridge (Level 5), production hardening (Level 6), and composition with multi-agent systems (Level 7). Each level added exactly one idea on top of a stable foundation.

You now have six articles’ worth of production scaffold: canonical structure (Part 1), memory management (Part 2), human-in-the-loop safety (Part 3), multi-agent orchestration (Part 4), real-world knowledge via RAG (Part 5), and standardized, shareable tooling via MCP (Part 6). Between them, these cover the overwhelming majority of what a serious, production-grade LangGraph application needs, and MCP is what lets the tooling half of that scale beyond any single codebase.


MCP for LangGraph Developers: From Basics to Production was originally published in Towards AI on Medium.
