# Tutorial: Python MCP Internal API > LLM

> Source: <https://www.devclubhouse.com/a/ship-an-mcp-server-in-python-that-exposes-your-internal-api-to-llms>
> Published: 2026-06-19 02:14:10+00:00

# Ship an MCP Server in Python That Exposes Your Internal API to LLMs

Wrap a corporate REST API in three typed tools using FastMCP, inspect them locally, and connect them to Claude Desktop—without ever exposing credentials to the model.

[Mariana Souza](https://www.devclubhouse.com/u/mariana_souza)

## What You'll Build

A Python MCP server using `FastMCP` that wraps a corporate REST API as three structured tools: `search_customers`, `get_order`, and `create_support_ticket`. Any MCP-compatible client (Claude Desktop, Cursor, custom agents) can call your API through typed, schema-described tool calls, without the model ever seeing credentials or constructing raw URLs.

## Prerequisites

- Python 3.10+ (the `mcp` SDK requires it; built-in generic types like `list[dict]` are available from 3.9)
- `pip` or `uv` for package management
- Node.js 18+, because `mcp dev` invokes `npx @modelcontextprotocol/inspector` under the hood
- Latest Claude Desktop (for end-to-end testing; optional if using only the inspector)
- A REST API with a bearer token; a mock URL works fine to follow along
- Comfort with `async`/`await` in Python

## 1. Set Up the Project

```
mkdir mcp-internal-api && cd mcp-internal-api
python -m venv .venv
source .venv/bin/activate        # Windows: .venv\Scripts\activate
pip install "mcp[cli]" httpx python-dotenv
```

`mcp[cli]` installs the `mcp` CLI used for local inspection. `httpx` handles async HTTP to your backend.

Create `.env` for local credentials, and **add it to `.gitignore` now**:

```
API_BASE_URL=https://api.corp.example.com
API_KEY=sk-your-real-token-here
```

## 2. Write the Server

Create `server.py`:

``` python
import os
import httpx
from dotenv import load_dotenv
from mcp.server.fastmcp import FastMCP

load_dotenv()

mcp = FastMCP("internal-api")

_BASE = os.environ["API_BASE_URL"]
_KEY  = os.environ["API_KEY"]

def _auth_headers() -> dict[str, str]:
    return {"Authorization": f"Bearer {_KEY}", "Accept": "application/json"}

@mcp.tool()
async def search_customers(query: str, limit: int = 10) -> list[dict]:
    """Search customers by name or email. Returns a list of customer records."""
    async with httpx.AsyncClient() as client:
        r = await client.get(
            f"{_BASE}/customers",
            headers=_auth_headers(),
            params={"q": query, "limit": limit},
            timeout=10.0,
        )
        r.raise_for_status()
        return r.json()

@mcp.tool()
async def get_order(order_id: str) -> dict:
    """Fetch a single order by its ID."""
    async with httpx.AsyncClient() as client:
        r = await client.get(
            f"{_BASE}/orders/{order_id}",
            headers=_auth_headers(),
            timeout=10.0,
        )
        r.raise_for_status()
        return r.json()

@mcp.tool()
async def create_support_ticket(
    customer_id: str,
    subject: str,
    body: str,
    priority: str = "normal",
) -> dict:
    """Open a support ticket for a customer.

    Args:
        customer_id: The customer's UUID.
        subject: One-line summary (max 120 chars).
        body: Full description of the issue.
        priority: 'low', 'normal', or 'high'.
    """
    if priority not in {"low", "normal", "high"}:
        raise ValueError(f"priority must be low/normal/high, got '{priority}'")

    async with httpx.AsyncClient() as client:
        r = await client.post(
            f"{_BASE}/tickets",
            headers=_auth_headers(),
            json={
                "customer_id": customer_id,
                "subject": subject,
                "body": body,
                "priority": priority,
            },
            timeout=10.0,
        )
        r.raise_for_status()
        return r.json()

if __name__ == "__main__":
    mcp.run()
```

**Why each decision matters:**

| Detail | Reason |
|---|---|
| Type annotations | `FastMCP` auto-generates JSON Schema from them — the LLM receives exact parameter types, not free-form text |
| Docstrings | Become the tool description the model reads before calling; write them like an API spec |
| `raise_for_status()` + `ValueError` | Exceptions surface to the LLM as structured tool errors rather than crashing the server process |
| Credentials in env vars | Never passed as tool arguments, never echoed in responses, never in source control |
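
To make the first row concrete, here is a rough, standard-library-only sketch of how a JSON Schema can be derived from a function signature. It illustrates the idea only and is not how `FastMCP` is implemented; `sketch_input_schema` and its tiny type mapping are assumptions made for this example, and the real generator handles many more types.

```python
import inspect
from typing import Any, get_type_hints

# Deliberately tiny mapping; real schema generators cover far more types.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def sketch_input_schema(func) -> dict[str, Any]:
    """Build a rough JSON Schema for a function's parameters (illustration only)."""
    hints = get_type_hints(func)
    properties: dict[str, Any] = {}
    required: list[str] = []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to a generic "object".
        prop: dict[str, Any] = {"type": _JSON_TYPES.get(hints.get(name), "object")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # No default means the caller must supply it.
        else:
            prop["default"] = param.default
        properties[name] = prop
    return {"type": "object", "properties": properties, "required": required}


async def search_customers(query: str, limit: int = 10) -> list[dict]:
    """Search customers by name or email."""
    return []


print(sketch_input_schema(search_customers))
```

The point of the sketch is that `query: str` with no default becomes a required string, while `limit: int = 10` becomes an optional integer with a default, which is the kind of information the model receives before calling a tool.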

`mcp.run()` defaults to **stdio transport**, which is what Claude Desktop and most local clients expect: the client spawns your server as a subprocess and talks JSON-RPC over stdin/stdout.
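
For a feel of what travels over that pipe, the sketch below builds the kind of newline-delimited JSON-RPC 2.0 message a client writes to the server's stdin to invoke a tool. `frame_tool_call` is a hypothetical helper and `ord_123` is a made-up order ID; a real client also performs an `initialize` handshake first, which is omitted here.

```python
import json


def frame_tool_call(request_id: int, tool_name: str, arguments: dict) -> bytes:
    """Encode a JSON-RPC 2.0 `tools/call` request as one newline-terminated line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # json.dumps escapes newlines inside string values, so the only raw
    # newline in the output is the terminator appended here.
    return (json.dumps(message) + "\n").encode("utf-8")


line = frame_tool_call(1, "get_order", {"order_id": "ord_123"})
print(line.decode("utf-8"), end="")
```

Because stdout carries protocol messages in this transport, stray `print()` calls inside a stdio server can corrupt the stream; send diagnostics to stderr instead.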

## 3. Inspect Locally with `mcp dev`

Before touching any LLM, validate the wiring in a browser UI:

```
mcp dev server.py
```

This starts your server and opens the MCP Inspector (the URL is printed in your terminal). Navigate to **Tools**, where you'll see all three tools with auto-generated input forms matching your Python signatures. Call `search_customers` with `query = "alice"` and confirm a JSON response or a typed upstream error.

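**Tip:** Set `API_BASE_URL=https://httpbin.org` temporarily to exercise the async/auth plumbing without a live internal API (use a placeholder `API_KEY` while pointing at a third-party host, since the bearer token is sent with every request). You'll get a 404 back, which correctly surfaces as an `httpx.HTTPStatusError` tool error.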
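
The sketch below shows, in plain `asyncio`, the general pattern behind "an exception becomes a tool error": the caller catches the exception and returns a result flagged with `isError` instead of letting the process die. This is an illustration under assumptions, not `FastMCP`'s actual code; `call_tool_safely` and `failing_tool` are hypothetical names, and the result shape loosely follows the MCP tool-result fields `isError` and `content`.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def call_tool_safely(tool: Callable[..., Awaitable[Any]], **arguments: Any) -> dict[str, Any]:
    """Run a tool and convert any exception into an MCP-style error result."""
    try:
        result = await tool(**arguments)
    except Exception as exc:  # Broad on purpose: the server must survive tool bugs.
        return {
            "isError": True,
            "content": [{"type": "text", "text": f"{type(exc).__name__}: {exc}"}],
        }
    return {"isError": False, "content": [{"type": "text", "text": str(result)}]}


async def failing_tool(order_id: str) -> dict:
    # Stand-in for a backend call that fails, for example with an HTTP 404.
    raise LookupError(f"order {order_id} not found")


outcome = asyncio.run(call_tool_safely(failing_tool, order_id="ord_123"))
print(outcome)
```

The model sees the error text and can decide to retry with different arguments or report the failure, which is why clear exception messages are worth the effort.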

## 4. Wire to Claude Desktop

Locate the config file:

| OS | Path |
|---|---|
| macOS | `~/Library/Application Support/Claude/claude_desktop_config.json` |
| Windows | `%APPDATA%\Claude\claude_desktop_config.json` |

Add your server entry. **Use absolute paths** — Claude Desktop spawns a clean, non-login shell that won't activate your virtualenv:


```
{
  "mcpServers": {
    "internal-api": {
      "command": "/absolute/path/to/.venv/bin/python",
      "args": ["/absolute/path/to/server.py"],
      "env": {
        "API_BASE_URL": "https://api.corp.example.com",
        "API_KEY": "sk-your-real-token-here"
      }
    }
  }
}
```
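
Hand-typing absolute paths is error-prone, so one option is to generate the entry. The following is a sketch that assumes the project layout from step 1 (a `.venv` directory next to `server.py`); `build_server_entry` is a hypothetical helper, and the snippet only prints JSON for you to paste rather than editing the Claude Desktop config file itself.

```python
import json
import sys
from pathlib import Path


def build_server_entry(project_dir: Path, base_url: str, api_key: str) -> dict:
    """Return an `mcpServers` entry with absolute interpreter and script paths."""
    root = project_dir.resolve()
    # Virtualenv layout differs by platform: Scripts\python.exe vs bin/python.
    if sys.platform == "win32":
        interpreter = root / ".venv" / "Scripts" / "python.exe"
    else:
        interpreter = root / ".venv" / "bin" / "python"
    return {
        "command": str(interpreter),
        "args": [str(root / "server.py")],
        "env": {"API_BASE_URL": base_url, "API_KEY": api_key},
    }


# Run from the project directory; the token shown is the placeholder from above.
entry = build_server_entry(Path("."), "https://api.corp.example.com", "sk-your-real-token-here")
print(json.dumps({"mcpServers": {"internal-api": entry}}, indent=2))
```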

Restart Claude Desktop, then in a new conversation:

"Search for customers named 'smith', then open a high-priority support ticket for the first result explaining their order is delayed."

Claude will call `search_customers`, inspect the output, then call `create_support_ticket`. Tool calls appear inline in the UI with their arguments and responses visible.

## Verify It Works

**Inspector:** After `mcp dev server.py`, the Tools tab lists all three tools with correct schemas and no import errors in the terminal.

**Claude Desktop:** Open **Settings → Developer**. Your server appears as `internal-api` with a green connected indicator. If it shows an error state, restart Claude Desktop after editing the config.

**Schema sanity-check** — confirm FastMCP generated correct schemas without starting a full client:

``` bash
python -c "
import os; os.environ['API_BASE_URL']='http://x'; os.environ['API_KEY']='x'
import asyncio
import server
async def main():
    for tool in await server.mcp.list_tools():
        print(tool.name, tool.inputSchema)
asyncio.run(main())
"
```

You should see each tool name alongside its JSON Schema `inputSchema` dict.

## Troubleshooting

- **`ModuleNotFoundError: No module named 'mcp'`**: Claude Desktop uses a clean shell, so your virtualenv isn't activated. Confirm `"command"` points to the venv interpreter (`/path/to/.venv/bin/python`), not the system `python`.
- **Tools don't appear in Claude Desktop**: Run `mcp dev server.py` first; import errors or missing env vars appear there immediately. Also check `~/Library/Logs/Claude/` on macOS, where Claude Desktop writes a per-server log file named after your server key (`internal-api`).
- **`KeyError: 'API_BASE_URL'`**: Claude Desktop does not inherit variables exported in your shell, and `load_dotenv()` may not locate your `.env` when the server is launched from there. Set all required keys explicitly in the `env` block of `claude_desktop_config.json`.
- **`httpx.ReadTimeout`**: Your backend is slow. Raise the timeout, for example `timeout=30.0`. For long-running operations, note that a FastMCP tool returns a single result per call; rather than streaming with `yield`, consider reporting progress through the tool's `Context` (`await ctx.report_progress(...)`) or splitting the work into a start-job tool and a poll-status tool.

## Next Steps

- **Resources:** Expose read-only context (OpenAPI specs, internal wikis) via `@mcp.resource()` so the LLM can pull reference material without consuming tool-call budget.
- **HTTP/SSE transport:** For multi-user or remote deployments, replace `mcp.run()` with `mcp.run(transport="sse")` and mount it behind a secured reverse proxy; validate per-request tokens in middleware rather than relying on a static env var.
- **Rate limiting:** Gate each outbound request with a rate limiter such as a token bucket (`aiolimiter` is an async-native option) to prevent an agentic loop from flooding your upstream API.
- **Richer schemas:** Replace `dict` return types with Pydantic models; `FastMCP` generates detailed JSON Schema from them, giving the model better guidance on what fields to expect and use.
- **Spec & SDK:** [modelcontextprotocol.io/docs](https://modelcontextprotocol.io/docs) and the [Python SDK on GitHub](https://github.com/modelcontextprotocol/python-sdk).
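
As a rough illustration of the rate-limiting idea, here is a standard-library token bucket. It is a stand-in written for this example so the snippet runs without extra dependencies; it is not `aiolimiter`'s implementation, and the `TokenBucket` class with its `rate` and `capacity` parameters is an assumption of this sketch.

```python
import asyncio
import time


class TokenBucket:
    """Allow about `rate` acquisitions per second, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int) -> None:
        self.rate = rate
        self.capacity = capacity
        self._tokens = float(capacity)
        self._updated = time.monotonic()
        self._lock = asyncio.Lock()

    async def acquire(self) -> None:
        # Holding the lock while waiting serializes callers in arrival order.
        async with self._lock:
            while True:
                now = time.monotonic()
                # Refill in proportion to elapsed time, capped at capacity.
                self._tokens = min(self.capacity, self._tokens + (now - self._updated) * self.rate)
                self._updated = now
                if self._tokens >= 1:
                    self._tokens -= 1
                    return
                # Sleep roughly long enough for one token to accumulate.
                await asyncio.sleep((1 - self._tokens) / self.rate)


async def demo() -> float:
    bucket = TokenBucket(rate=20, capacity=2)
    start = time.monotonic()
    for _ in range(4):
        await bucket.acquire()  # In the server, call this before each HTTP request.
    return time.monotonic() - start


elapsed = asyncio.run(demo())
print(f"4 acquisitions took {elapsed:.2f}s")
```

With a burst capacity of 2 and a rate of 20 per second, the first two acquisitions pass immediately and the next two each wait for a refill. In the server, a single module-level bucket shared by all three tools would cap total traffic to the upstream API.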

[Mariana Souza](https://www.devclubhouse.com/u/mariana_souza) · Senior Editor

Mariana covers the fast-moving world of machine learning and generative AI, with a particular focus on how these technologies are reshaping development workflows. When she isn't stress-testing the latest foundation models, she's usually at a local hackathon.
