{"slug": "how-to-build-and-serve-mcp-servers-without-effort", "title": "How to build and serve MCP servers without effort", "summary": "Flama, a Python web framework, now offers native support for building Model Context Protocol (MCP) servers, enabling AI agents to call functions, read data, and use prompt templates through simple decorators on Python functions. The framework implements the stateless 2026-07-28 revision of MCP, allowing developers to expose tools, resources, and prompts to any MCP-capable client without per-client state, simplifying horizontal scaling.", "body_md": "Building an MCP Server with Flama\n\nServing a model is only half the story. The other half is giving AI agents access to your world: the functions they can\ncall, the data they can read, and the prompt templates they can reuse. The **Model Context Protocol** (MCP) is the open\nstandard for exactly that, and Flama provides native, first-class support for building MCP servers with nothing more than\na few decorators on plain Python functions.\n\nIn this post, we walk through building a complete MCP server with Flama. We will expose tools, resources, and prompts to any MCP-capable client, and we will explore the advanced extensions for background tasks, interactive input, and embedded user interfaces. By the end, you will have a running server that any AI assistant can discover and call.\n\nBefore we dive into the details, we recommend keeping the following resources at hand:\n\n- Official Flama documentation: [Flama documentation](https://flama.dev/docs/)\n- Model Context Protocol page: [MCP docs](https://flama.dev/docs/generative-ai/model-context-protocol/)\n- Flama GitHub repository: [Flama on GitHub](https://github.com/vortico/flama)\n\nWhat is MCP?\n\nThe **Model Context Protocol** is an open standard that lets AI applications connect to external capabilities through a\nuniform interface. 
An MCP server advertises three kinds of capability:\n\n- **Tools**: functions the model can invoke.\n- **Resources**: data the model can read, addressed by URI.\n- **Prompts**: reusable prompt templates with arguments.\n\nClients (AI assistants, agent frameworks, IDEs) discover these capabilities and call them over\n[JSON-RPC](https://www.jsonrpc.org/), a lightweight remote-procedure-call protocol that exchanges JSON messages.\n\nFlama implements the **stateless 2026-07-28** revision of the protocol. Rather than negotiating a session through an\n`initialize` handshake, every request is self-contained, carrying its protocol version and capabilities in a `_meta`\nobject and its routing data in `Mcp-Method` / `Mcp-Name` headers. This makes MCP servers trivial to scale horizontally,\nsince no per-client state is held between calls.\n\n**Why does this matter?**\n\n- **Interoperability**: Any MCP-capable client can use your tools without bespoke integration code.\n- **Reuse**: The same Python functions that power your API can be exposed to AI agents with a single decorator.\n- **Type safety**: Flama derives each tool's input and output JSON Schema from the handler's type hints, so clients receive accurate, self-contained contracts.\n\nSetting up the project\n\nAll examples in this post assume Flama has been installed with the pydantic extras via [uv](https://docs.astral.sh/uv/):\n\n```\nuv pip install \"flama[pydantic]\"\n```\n\nAlternatively, you can run any command without a prior install by using `uvx --from \"flama[pydantic]\" flama ...`, but\nfor brevity we assume Flama is already installed throughout.\n\nRegistering an MCP server\n\nAn MCP server in Flama is a named registry that you mount on your application at a specific URL path. 
The `add_server` method both creates the server and mounts it, so a single application can host several independent servers:\n\n``` python\nimport flama\nfrom flama import Flama\n\napp = Flama(\n    openapi={\n        \"info\": {\n            \"title\": \"MCP Server API\",\n            \"version\": \"1.0.0\",\n            \"description\": \"A Model Context Protocol server built with Flama 🔥\",\n        },\n    },\n)\n\napp.mcp.add_server(\"/mcp/tools/\", \"tools\", version=\"2.0.0\", instructions=\"Flama demo MCP tools server\")\n```\n\nThis registers a server named `tools`, reachable at `/mcp/tools/`. The `version` parameter declares the server's\nsemantic version, and `instructions` provides a human-readable description that clients can display. With the server in\nplace, you populate it by name: every tool, resource, and prompt decorator takes an `mcp` argument identifying which\nserver the capability belongs to.\n\nExposing tools\n\nA **tool** is a function the model can invoke. Declare one with the `tool` decorator, pointing it at the target server\nthrough the `mcp` argument. Flama infers the tool's input and output schema from the handler's type hints:\n\n``` python\n@app.mcp.tool(\"add\", description=\"Add two integers\", mcp=\"tools\")\ndef add(a: int, b: int) -> int:\n    return a + b\n```\n\nTools may be synchronous or asynchronous. When you omit the name, the function's own name is used; when you omit the\ndescription, its docstring is used instead. The parameters and return annotation become the tool's `inputSchema` and\n`outputSchema`, advertised to clients verbatim.\n\nHere is an asynchronous tool that returns a string:\n\n```\n@app.mcp.tool(\"greet\", description=\"Greet someone by name\", mcp=\"tools\")\nasync def greet(name: str) -> str:\n    return f\"Hello, {name}!\"\n```\n\nLet us verify the tool works. 
Start the application:\n\n```\nflama run app:app\n```\n\nAnd call it with `curl`:\n\n```\ncurl -s -X POST http://127.0.0.1:8000/mcp/tools/ \\\n  -H 'Content-Type: application/json' \\\n  -H 'Mcp-Method: tools/call' \\\n  -H 'Mcp-Name: add' \\\n  -H 'MCP-Protocol-Version: 2026-07-28' \\\n  -d '{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"tools/call\",\"params\":{\"name\":\"add\",\"arguments\":{\"a\":2,\"b\":3}}}'\n```\n\nThe server responds with a JSON-RPC result:\n\n```\n{\n  \"jsonrpc\": \"2.0\",\n  \"id\": 1,\n  \"result\": {\n    \"content\": [{\"type\": \"text\", \"text\": \"5\"}],\n    \"structuredContent\": 5\n  }\n}\n```\n\nThe `structuredContent` field carries the typed return value, while `content` provides a text representation for clients\nthat prefer unstructured output.\n\nExposing resources\n\nA **resource** is readable data addressed by a URI. The `resource` decorator registers one on the named server:\n\n``` python\nimport json\n\n\n@app.mcp.resource(\"config://app\", name=\"config\", description=\"Application configuration\",\n                  mime_type=\"application/json\", mcp=\"tools\")\ndef config():\n    return json.dumps({\"debug\": True, \"name\": \"flama-mcp\"})\n```\n\nResources are listed and read by their URI, so a client fetches the configuration above by requesting `config://app`.\nThe MIME type tells the client how to interpret the content.\n\nTo read the resource:\n\n```\ncurl -s -X POST http://127.0.0.1:8000/mcp/tools/ \\\n  -H 'Content-Type: application/json' \\\n  -H 'Mcp-Method: resources/read' \\\n  -H 'Mcp-Name: config://app' \\\n  -H 'MCP-Protocol-Version: 2026-07-28' \\\n  -d '{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"resources/read\",\"params\":{\"uri\":\"config://app\"}}'\n{\n  \"jsonrpc\": \"2.0\",\n  \"id\": 1,\n  \"result\": {\n    \"contents\": [\n      {\n        \"uri\": \"config://app\",\n        \"mimeType\": \"application/json\",\n        \"text\": \"{\\\"debug\\\": true, \\\"name\\\": \\\"flama-mcp\\\"}\"\n      }\n    ]\n  }\n}\n```\n\nExposing prompts\n\nA 
**prompt** is a named, reusable prompt template. The `prompt` decorator registers one on the named server, deriving\nits arguments from the handler's parameters:\n\n```\n@app.mcp.prompt(\"summarise\", description=\"Summarise the given text\", mcp=\"tools\")\ndef summarise(text: str):\n    return f\"Summarise the following:\\n\\n{text}\"\n```\n\nPrompts are listed by name and rendered with arguments supplied by the client. Here `text` becomes the single required\nargument. To get the rendered prompt:\n\n```\ncurl -s -X POST http://127.0.0.1:8000/mcp/tools/ \\\n  -H 'Content-Type: application/json' \\\n  -H 'Mcp-Method: prompts/get' \\\n  -H 'Mcp-Name: summarise' \\\n  -H 'MCP-Protocol-Version: 2026-07-28' \\\n  -d '{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"prompts/get\",\"params\":{\"name\":\"summarise\",\"arguments\":{\"text\":\"Flama is great\"}}}'\n{\n  \"jsonrpc\": \"2.0\",\n  \"id\": 1,\n  \"result\": {\n    \"messages\": [\n      {\n        \"role\": \"user\",\n        \"content\": {\"type\": \"text\", \"text\": \"Summarise the following:\\n\\nFlama is great\"}\n      }\n    ]\n  }\n}\n```\n\nAdvanced extensions\n\nThe `2026-07-28` protocol defines optional extensions, all supported natively by Flama. A server advertises the\nextensions it uses in its discovery capabilities, so clients negotiate them per request.\n\nBackground tasks\n\nLong-running tools can run as background **Tasks** rather than blocking the call. Pass `task=True` and the server\nreturns a task handle the client can poll:\n\n```\n@app.mcp.tool(\"square\", task=True, description=\"Square a number as a background task\", mcp=\"tools\")\nasync def square(x: int) -> int:\n    return x * x\n```\n\nWhen a client calls `square`, the server may return the result directly for fast operations, or issue a task token for\ntruly long-running computations that the client can poll until completion.\n\nElicitation\n\nA tool can pause mid-call to **elicit** further input from the user. 
The handler declares a parameter annotated with\n`Elicitation` to read the answers gathered so far, and returns `Elicit.require(...)` to request more:\n\n``` python\nfrom flama.mcp.data_structures import Elicit, Elicitation\n\n\n@app.mcp.tool(\"confirm\", description=\"Confirm an action through an elicitation round-trip\", mcp=\"tools\")\ndef confirm(elicitation: Elicitation) -> str:\n    if \"confirm\" not in elicitation:\n        return Elicit.require(\"Are you sure?\", {\"type\": \"boolean\"}, name=\"confirm\")\n    return f\"confirmed={elicitation['confirm']}\"\n```\n\nThe `elicitation` parameter is supplied by the server and excluded from the tool's input schema, so it never appears as\na tool argument the client must fill. Because the protocol is stateless, the answers gathered so far are round-tripped\nthrough an opaque continuation token the client echoes back on the retry.\n\nWhen the client calls `confirm` without prior answers, it receives a response with `resultType: \"inputRequired\"` and a\nschema describing what the server needs. 
The client collects that input from the user and retries, this time carrying\nthe gathered answers.\n\nMCP Apps\n\nA tool can declare a prefetchable user-interface template (an **MCP App**) that hosts render alongside its result.\nRegister the template with `app_template` and point the tool at it with `ui_template`:\n\n```\n@app.mcp.app_template(\"ui://widget\", name=\"widget\", description=\"A small UI widget\", mcp=\"tools\")\ndef widget():\n    return \"<html><body><h1>Flama widget</h1></body></html>\"\n\n\n@app.mcp.tool(\"with_ui\", description=\"A tool that declares a prefetchable UI template\",\n              ui_template=\"ui://widget\", mcp=\"tools\")\ndef with_ui() -> str:\n    return \"rendered\"\n```\n\nClients that support MCP Apps can prefetch the template and render it alongside the tool's result, providing a richer interactive experience.\n\nMultiple servers in one application\n\nA single Flama application can host as many MCP servers as you need, each under its own path. This is useful for separating concerns or versioning different sets of capabilities:\n\n```\napp.mcp.add_server(\"/mcp/tools/\", \"tools\", version=\"2.0.0\", instructions=\"Flama demo MCP tools server\")\napp.mcp.add_server(\"/mcp/math/\", \"math\", version=\"2.0.0\")\n```\n\nEach server is independent. Tools, resources, and prompts are bound to their server by the `mcp` argument:\n\n```\n@app.mcp.tool(\"multiply\", description=\"Multiply two integers\", mcp=\"math\")\ndef multiply(a: int, b: int) -> int:\n    return a * b\n```\n\nA `tools/list` request to `/mcp/tools/` returns only the tools registered on the `tools` server, while a request to\n`/mcp/math/` returns only `multiply`. Clients discover each server independently.\n\nThe complete application\n\nPutting it all together, here is the full application. 
It registers two MCP servers on a single Flama app and populates them with tools (sync, async, background task, elicitation, UI template), a resource, and a prompt:\n\n``` python\nimport json\n\nimport flama\nfrom flama import Flama\nfrom flama.mcp.data_structures import Elicit, Elicitation\n\napp = Flama(\n    openapi={\n        \"info\": {\n            \"title\": \"MCP Server API\",\n            \"version\": \"1.0.0\",\n            \"description\": \"A Model Context Protocol server built with Flama 🔥\",\n        },\n    },\n)\n\napp.mcp.add_server(\"/mcp/tools/\", \"tools\", version=\"2.0.0\", instructions=\"Flama demo MCP tools server\")\napp.mcp.add_server(\"/mcp/math/\", \"math\", version=\"2.0.0\")\n\n\n@app.mcp.tool(\"add\", description=\"Add two integers\", mcp=\"tools\")\ndef add(a: int, b: int) -> int:\n    return a + b\n\n\n@app.mcp.tool(\"greet\", description=\"Greet someone by name\", mcp=\"tools\")\nasync def greet(name: str) -> str:\n    return f\"Hello, {name}!\"\n\n\n@app.mcp.tool(\"square\", task=True, description=\"Square a number as a background task\", mcp=\"tools\")\nasync def square(x: int) -> int:\n    return x * x\n\n\n@app.mcp.tool(\"confirm\", description=\"Confirm an action through an elicitation round-trip\", mcp=\"tools\")\ndef confirm(elicitation: Elicitation) -> str:\n    if \"confirm\" not in elicitation:\n        return Elicit.require(\"Are you sure?\", {\"type\": \"boolean\"}, name=\"confirm\")\n    return f\"confirmed={elicitation['confirm']}\"\n\n\n@app.mcp.resource(\"config://app\", name=\"config\", description=\"Application configuration\",\n                  mime_type=\"application/json\", mcp=\"tools\")\ndef config():\n    return json.dumps({\"debug\": True, \"name\": \"flama-mcp\"})\n\n\n@app.mcp.prompt(\"summarise\", description=\"Summarise the given text\", mcp=\"tools\")\ndef summarise(text: str):\n    return f\"Summarise the following:\\n\\n{text}\"\n\n\n@app.mcp.app_template(\"ui://widget\", name=\"widget\", description=\"A small UI widget\", mcp=\"tools\")\ndef widget():\n    return \"<html><body><h1>Flama widget</h1></body></html>\"\n\n\n@app.mcp.tool(\"with_ui\", description=\"A tool that declares a prefetchable UI template\",\n              ui_template=\"ui://widget\", mcp=\"tools\")\ndef with_ui() -> str:\n    return \"rendered\"\n\n\n@app.mcp.tool(\"multiply\", description=\"Multiply two integers\", mcp=\"math\")\ndef multiply(a: int, b: int) -> int:\n    return a * b\n\n\nif __name__ == \"__main__\":\n    flama.run(flama_app=app, server_host=\"0.0.0.0\", server_port=8000)\n```\n\nSave this as `app.py` and run it:\n\n```\npython app.py\n```\n\nThe server starts on port 8000 with both MCP endpoints ready.\n\nTesting with curl\n\nOnce the application is running, you can exercise every capability from the command line.\n\n**List available tools** on the `tools` server:\n\n```\ncurl -s -X POST http://127.0.0.1:8000/mcp/tools/ \\\n  -H 'Content-Type: application/json' \\\n  -H 'Mcp-Method: tools/list' \\\n  -H 'MCP-Protocol-Version: 2026-07-28' \\\n  -d '{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"tools/list\",\"params\":{}}'\n```\n\nThe response lists five tools (`add`, `confirm`, `greet`, `square`, `with_ui`), each with its full input and output\nschema derived from the Python type hints.\n\n**Call a tool** on the `math` server:\n\n```\ncurl -s -X POST http://127.0.0.1:8000/mcp/math/ \\\n  -H 'Content-Type: application/json' \\\n  -H 'Mcp-Method: tools/call' \\\n  -H 'Mcp-Name: multiply' \\\n  -H 'MCP-Protocol-Version: 2026-07-28' \\\n  -d '{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"tools/call\",\"params\":{\"name\":\"multiply\",\"arguments\":{\"a\":4,\"b\":5}}}'\n{\n  \"jsonrpc\": \"2.0\",\n  \"id\": 1,\n  \"result\": {\n    \"content\": [{\"type\": \"text\", \"text\": \"20\"}],\n    \"structuredContent\": 20\n  }\n}\n```\n\n**Read a resource:**\n\n```\ncurl -s -X POST http://127.0.0.1:8000/mcp/tools/ \\\n  -H 'Content-Type: application/json' \\\n  -H 'Mcp-Method: resources/read' \\\n  -H 'Mcp-Name: config://app' \\\n  -H 'MCP-Protocol-Version: 2026-07-28' \\\n  -d 
'{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"resources/read\",\"params\":{\"uri\":\"config://app\"}}'\n```\n\n**Get a rendered prompt:**\n\n```\ncurl -s -X POST http://127.0.0.1:8000/mcp/tools/ \\\n  -H 'Content-Type: application/json' \\\n  -H 'Mcp-Method: prompts/get' \\\n  -H 'Mcp-Name: summarise' \\\n  -H 'MCP-Protocol-Version: 2026-07-28' \\\n  -d '{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"prompts/get\",\"params\":{\"name\":\"summarise\",\"arguments\":{\"text\":\"Flama is great\"}}}'\n```\n\nEvery request follows the same pattern: a `POST` to the server's path, with `Mcp-Method` identifying the operation,\n`Mcp-Name` identifying the target, and `MCP-Protocol-Version` declaring the protocol revision.\n\nConclusions\n\nFlama makes the journey from \"I have Python functions\" to \"AI agents can discover and call them\" as short as possible. The MCP support requires no configuration files, no code generation, and no external tooling. You write plain Python functions, decorate them, and the framework handles the rest:\n\n- `add_server`: Mount a named MCP server at any path.\n- `@tool`: Expose a function as an invocable tool with full schema inference.\n- `@resource`: Expose data at a URI for clients to read.\n- `@prompt`: Expose a reusable prompt template with typed arguments.\n- **Extensions**: Background tasks, elicitation, and MCP Apps for richer interactions.\n\nBecause the protocol is stateless, your servers scale horizontally without sticky sessions. Because the schema is derived from type hints, clients receive accurate contracts without manual specification. 
And because multiple servers can live in a single application, you can organise capabilities by domain, version, or access level.\n\nIn upcoming posts, we will explore how to combine MCP servers with LLM serving to build fully autonomous agent architectures where the model and its tools live in the same application.\n\nSupport our work\n\nIf you find Flama useful for building robust Machine Learning and Generative AI APIs, we'd be thrilled if you showed\nyour support by giving us a ⭐ on [GitHub](https://github.com/vortico/flama). Your stars are the best fuel for our\ndevelopment efforts!\n\nYou can also stay updated with the latest news and development threads by following us on\n[𝕏](https://x.com/VorticoTech).\n\nAbout the authors\n\n[Vortico](https://vortico.tech/): We specialize in software development, helping businesses enhance and expand their\nAI and technology capabilities.", "url": "https://wpnews.pro/news/how-to-build-and-serve-mcp-servers-without-effort", "canonical_source": "https://flama.dev/blog/building_an_mcp_server_with_flama/", "published_at": "2026-06-25 10:56:50+00:00", "updated_at": "2026-06-25 11:14:36.438834+00:00", "lang": "en", "topics": ["ai-tools", "developer-tools", "ai-agents", "large-language-models", "ai-infrastructure"], "entities": ["Flama", "Model Context Protocol", "MCP", "JSON-RPC", "uv"], "alternates": {"html": "https://wpnews.pro/news/how-to-build-and-serve-mcp-servers-without-effort", "markdown": "https://wpnews.pro/news/how-to-build-and-serve-mcp-servers-without-effort.md", "text": "https://wpnews.pro/news/how-to-build-and-serve-mcp-servers-without-effort.txt", "jsonld": "https://wpnews.pro/news/how-to-build-and-serve-mcp-servers-without-effort.jsonld"}}