Your team's LLM-powered application talks to a search index through one custom integration, a code repository through another, a Postgres database through a chain of LangChain tools, and a file system through raw Python I/O calls. Every new data source means writing a new integration. Every integration uses a different authentication model and returns data in a different shape. The LLM application is tightly coupled to every backend it touches, and swapping one out requires changing the application code directly.
The Model Context Protocol (MCP) exists to replace this bespoke plumbing with a single, standardized interface. Think of it as a USB-C port for LLM applications: one connector shape, one protocol, and any compatible server can plug into any compatible client without custom wiring.
LLM-powered tools have exploded in capability over the past two years, but the integration story has not kept up. Each AI application (IDE assistant, chat client, agent framework) historically built its own connectors for databases, APIs, document stores, and code repositories. There was no shared contract. If you wanted to use a specific code search tool with two different AI assistants, you needed two separate integrations.
MCP borrows its design philosophy from the Language Server Protocol (LSP), which standardized how code editors talk to language analyzers. Before LSP, each editor had its own plugin for each language. After LSP, one language server worked with every editor. MCP aims to do the same for AI tools and the data sources they need.
The protocol is an open standard, originally created at Anthropic and published under the MIT license. The specification reached stable at version 2025-11-25, and the Python SDK (`mcp` on PyPI) is at 1.27.2 as of May 2026. A 2.0.0 alpha was published in June 2026 with an updated transport layer.
MCP uses JSON-RPC 2.0 as its message format. A client (the AI application) connects to a server (a service that provides context) over one of three transport types: stdio, SSE (Server-Sent Events), or Streamable HTTP.
Here is the conceptual architecture:
```mermaid
flowchart LR
    subgraph Client["Client (AI App)"]
        A["Host<br/>IDE / Chat / Agent"]
        B["MCP Client<br/>Protocol handler"]
    end
    subgraph Server["MCP Server"]
        C["MCP Server<br/>Protocol handler"]
        D["Resources<br/>context data"]
        E["Tools<br/>executable functions"]
        F["Prompts<br/>templated workflows"]
    end
    A <--> B
    B <-->|JSON-RPC 2.0<br/>stdio / SSE / HTTP| C
    C --> D
    C --> E
    C --> F
```
Every MCP session begins with a capability negotiation handshake. The client announces what features it supports (sampling, roots, elicitation). The server announces what features it offers (resources, tools, prompts). Both sides agree on a feature set before any data exchange happens.
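To make the handshake concrete, the sketch below builds the two messages as plain Python dictionaries. The `initialize` method name and the capability keys follow the MCP specification; the client and server names, the version strings, and the `offered_features` helper are illustrative assumptions and not part of any SDK.

```python
import json

# Client -> server: the first request of every session.
initialize_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2025-11-25",
        # Features the client can provide to the server.
        "capabilities": {"sampling": {}, "roots": {"listChanged": True}, "elicitation": {}},
        "clientInfo": {"name": "example-client", "version": "0.1.0"},  # hypothetical
    },
}

# Server -> client: the reply reuses the request id and lists what it offers.
initialize_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "protocolVersion": "2025-11-25",
        "capabilities": {"resources": {}, "tools": {"listChanged": True}, "prompts": {}},
        "serverInfo": {"name": "example-server", "version": "0.1.0"},  # hypothetical
    },
}


def offered_features(message: dict) -> list:
    """Return the sorted capability names declared in an initialize message."""
    body = message.get("params") or message.get("result") or {}
    return sorted(body.get("capabilities", {}))


print(json.dumps(initialize_request, indent=2))
print("client offers:", offered_features(initialize_request))
print("server offers:", offered_features(initialize_response))
```

After receiving the response, the client sends a `notifications/initialized` notification, and only then does normal traffic begin.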
Servers offer three main categories of functionality:
Resources expose data to the LLM. Think of them as GET endpoints in a REST API. A resource has a URI and returns content in a structured format. Example: `file:///logs/2026-06-01.txt` returns the content of that log file. Resources are how the LLM loads context.
Tools are functions the LLM can invoke. Think of them as POST endpoints. A tool has a name, a description, and an input schema (JSON Schema). The LLM can call a tool to execute code, query a database, or trigger an external action. Unlike resources, which the host application typically loads as context, tools are invoked by the model on demand.
Prompts are reusable templates for LLM interactions. A prompt defines a message template with parameter slots. The client can populate the template and present the result to the user as a pre-built interaction.
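On the wire, each of the three primitives maps to its own JSON-RPC method. Here is a minimal sketch, assuming a server with a tool named `get_weather`, a resource at `city://tokyo`, and a prompt named `travel_planning` (all illustrative names). The `make_request` helper is hypothetical; the method names `resources/read`, `tools/call`, and `prompts/get` come from the MCP specification.

```python
import itertools
import json

_ids = itertools.count(1)  # each request in a session needs a unique id


def make_request(method: str, params: dict) -> dict:
    """Wrap a method and its params in a JSON-RPC 2.0 request envelope."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}


# Resource: read data by URI (comparable to a GET).
read_resource = make_request("resources/read", {"uri": "city://tokyo"})

# Tool: invoke a named function with arguments matching its input schema.
call_tool = make_request(
    "tools/call",
    {"name": "get_weather", "arguments": {"city": "Tokyo", "units": "celsius"}},
)

# Prompt: fetch a template with its parameter slots filled in.
get_prompt = make_request(
    "prompts/get",
    {"name": "travel_planning", "arguments": {"city": "Tokyo"}},
)

for message in (read_resource, call_tool, get_prompt):
    print(json.dumps(message))
```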
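Clients can also offer features to servers: sampling (the server asks the client's LLM to generate a completion), roots (the client tells the server which directories or URIs it may work within), and elicitation (the server asks the user for additional input through the client).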
The `mcp` package (v1.27.2) provides a high-level API called FastMCP that makes building a server straightforward. Here is a complete server that exposes a weather tool, a city-information resource, and a travel-planning prompt:
```python
from mcp.server.fastmcp import FastMCP

# Create the server; the string is its display name.
mcp = FastMCP("Weather Demo")


@mcp.tool()
def get_weather(city: str, units: str = "celsius") -> str:
    """Get the current weather for a city."""
    # Hard-coded demo response; a real tool would query a weather service.
    return f"Weather in {city}: 22 degrees {units}, partly cloudy"


@mcp.resource("city://{name}")
def city_info(name: str) -> str:
    """Get information about a city."""
    cities = {
        "dubai": "Dubai, UAE. Population: 3.6M. Timezone: UTC+4.",
        "london": "London, UK. Population: 8.9M. Timezone: UTC+0.",
        "tokyo": "Tokyo, Japan. Population: 14M. Timezone: UTC+9.",
    }
    # Look up case-insensitively and fall back to a readable message.
    return cities.get(name.lower(), f"City '{name}' not found.")


@mcp.prompt()
def travel_planning(city: str) -> str:
    """Generate a travel planning prompt for a destination."""
    return (
        f"You are a travel assistant helping someone plan a trip to {city}. "
        "Provide practical advice on weather, transportation, and attractions."
    )


if __name__ == "__main__":
    mcp.run()  # stdio transport by default
```
Install it and run:
```bash
pip install "mcp[cli]"
python weather_server.py
```
The server starts on stdio by default. For HTTP transport, change the last line:
```python
mcp.run(transport="streamable-http")
```
The official MCP Inspector is a browser-based tool for testing servers:
```bash
npx -y @modelcontextprotocol/inspector
```
Point it at your server endpoint (or stdio command) and you can browse resources, invoke tools, and inspect messages without writing a client.
| Feature | MCP | Custom API / REST | LangChain Tools | OpenAI function calling |
|---|---|---|---|---|
| Standardized protocol | Yes | No | No (framework-specific) | No (API-specific) |
| Primitive types | Resources, Tools, Prompts | Endpoints only | Tools only | Functions only |
| Transport options | stdio, SSE, Streamable HTTP | HTTP only | In-process only | HTTP only |
| Bidirectional | Yes (sampling, roots) | Request-response only | Request-response only | Request-response only |
| Auth model | OAuth 2.1 (spec), pluggable | Custom per API | Custom per integration | API key |
| Client independence | Any MCP client | One client per API | LangChain only | OpenAI only |
The main differentiator is client independence. A server written for MCP works with any MCP-compatible client: Claude Code, Claude Desktop, the Continue.dev VS Code extension, or a custom agent framework. Custom APIs and framework-specific tools lock you into one ecosystem.
Thinking tools are free. Tools execute arbitrary code on your server. Every tool invocation consumes compute and may have side effects. The LLM cannot distinguish between a cheap operation (reading a config file) and an expensive one (running a 100-row batch query). Set usage limits or implement a permission layer for destructive operations.
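One way to act on this is to wrap tool functions in a small guard before registering them. The sketch below is plain Python with no MCP dependency. `guarded`, `ToolBudgetExceeded`, the `max_calls` limit, and the `confirm` callback are all hypothetical names chosen for illustration, and a real deployment would probably track budgets per user or session, not per process.

```python
import functools
from typing import Callable


class ToolBudgetExceeded(RuntimeError):
    """Raised when a tool has been called more often than its budget allows."""


def guarded(max_calls: int, destructive: bool = False,
            confirm: Callable[[str], bool] = lambda name: False):
    """Limit how often a tool runs and require confirmation if destructive."""
    def decorator(func):
        calls = 0

        @functools.wraps(func)  # keep the original name and docstring
        def wrapper(*args, **kwargs):
            nonlocal calls
            if calls >= max_calls:
                raise ToolBudgetExceeded(
                    f"{func.__name__}: limit of {max_calls} calls reached"
                )
            if destructive and not confirm(func.__name__):
                raise PermissionError(f"{func.__name__}: confirmation denied")
            calls += 1
            return func(*args, **kwargs)
        return wrapper
    return decorator


@guarded(max_calls=2)
def read_config(key: str) -> str:
    """Cheap, read-only operation with a small call budget."""
    return f"value-of-{key}"


@guarded(max_calls=10, destructive=True)  # the default confirm() always refuses
def drop_table(name: str) -> str:
    """Destructive operation that needs explicit confirmation."""
    return f"dropped {name}"


print(read_config("timeout"))
print(read_config("retries"))
```

A third call to `read_config` raises `ToolBudgetExceeded`, and `drop_table` raises `PermissionError` until a `confirm` callback that returns `True` is supplied.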
Resource URIs must be meaningful. A resource URI is not just a label; it is the identifier the LLM uses to request data. Using opaque URIs (`resource://abc123`) makes it impossible for the LLM to discover resources. Use hierarchical, descriptive URIs that hint at the content structure, like `docs://project/api/reference` or `db://customers/orders?status=pending`.
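To illustrate why structure helps, a descriptive URI can be split into parts that a server could route on, using only the standard library. The `describe` helper is a hypothetical name, and the URIs are the examples from the paragraph above.

```python
from urllib.parse import parse_qs, urlsplit


def describe(uri: str) -> dict:
    """Split a resource URI into the pieces a server could route on."""
    parts = urlsplit(uri)
    # Treat the authority plus each non-empty path component as one hierarchy.
    segments = [parts.netloc] + [s for s in parts.path.split("/") if s]
    return {
        "scheme": parts.scheme,
        "segments": segments,
        "filters": parse_qs(parts.query),
    }


print(describe("docs://project/api/reference"))
print(describe("db://customers/orders?status=pending"))
print(describe("resource://abc123"))
```

The first two URIs yield a scheme, a path hierarchy, and (for the database example) a filter. The opaque one yields a single segment that carries no hint about what it contains.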
Forgetting the capability handshake. If you add a new tool to an existing server and your client does not re-negotiate capabilities, the client will not know the tool exists. The capability exchange happens at connection time. Restart both sides after changing what a server offers.
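Where restarting is impractical, the MCP specification also lets a server that declares the `listChanged` flag send a `notifications/tools/list_changed` message when its tool set changes mid-session. Here is a minimal sketch of a client-side cache that reacts to it. `ToolCache` and `fetch_tools` are hypothetical names, and the lambda stands in for a real `tools/list` round trip.

```python
from typing import Callable, List, Optional


class ToolCache:
    """Cache a server's tool list and refresh it when told it changed."""

    def __init__(self, fetch_tools: Callable[[], List[str]]):
        self._fetch_tools = fetch_tools          # stands in for tools/list
        self._tools: Optional[List[str]] = None  # None means "not fetched yet"

    def tools(self) -> List[str]:
        if self._tools is None:                  # first use or invalidated
            self._tools = self._fetch_tools()
        return self._tools

    def handle_notification(self, message: dict) -> None:
        if message.get("method") == "notifications/tools/list_changed":
            self._tools = None                   # force a re-fetch on next use


server_tools = ["get_weather"]                   # simulated server state
cache = ToolCache(lambda: list(server_tools))

print(cache.tools())                             # initial fetch
server_tools.append("get_forecast")              # server gains a tool
print(cache.tools())                             # still the stale list
cache.handle_notification(
    {"jsonrpc": "2.0", "method": "notifications/tools/list_changed"}
)
print(cache.tools())                             # refreshed list
```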
Overloading a server. An MCP server that exposes 50 tools and 200 resources becomes as hard to navigate as a REST API with 50 endpoints. Group related functionality into separate servers and let the client connect to multiple servers. Claude Desktop and other hosts already support multi-server setups.
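As a rough illustration of the multi-server approach, the sketch below builds a host configuration with one entry per small server. The `mcpServers` layout with `command` and `args` mirrors the format Claude Desktop reads from its configuration file; the server names and script paths are hypothetical placeholders.

```python
import json

# One focused server per concern instead of a single 50-tool server.
# Names and script paths are placeholders for illustration only.
servers = {
    "code-search": ["python", "code_search_server.py"],
    "docs-index": ["python", "docs_index_server.py"],
    "db-gateway": ["python", "db_gateway_server.py"],
}

# Split each launch line into the executable and its arguments.
config = {
    "mcpServers": {
        name: {"command": command, "args": args}
        for name, (command, *args) in servers.items()
    }
}

print(json.dumps(config, indent=2))
```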
Assuming tools are always available to the LLM. Tool invocation requires user consent in most host applications, where the user must approve each tool call. Design your tools so that a single invocation does something meaningful, because multi-step approval flows create a poor user experience.
MCP is the wrong choice if:
- … `requests` directly.

The `mcp` package (v1.27.2) provides FastMCP, a decorator-based API for building servers in a few lines of code.

Use the MCP Inspector (`npx @modelcontextprotocol/inspector`) to test servers without writing a client.

Next post: building a multi-server MCP setup that connects a code search service, a documentation index, and a database gateway into a single AI assistant, with practical trade-offs on transport selection and auth.