# Building an AI Chat Agent with MCP, Spring AI

> Source: <https://dev.to/ykpraveen/building-an-ai-chat-agent-with-mcp-spring-ai-f0n>
> Published: 2026-06-24 09:41:20+00:00

Model Context Protocol (MCP) is an open standard for connecting AI apps to tools and data sources. A useful way to think about it is as a USB-C port for AI: one standard interface that lets different models plug into different capabilities without custom glue code for every integration.

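In this project, we combine MCP, Spring AI, and Google Gemini to build a chat app that can answer weather questions using real tools instead of hallucinating. The system has three parts: an MCP tool server (Spring Boot, port `7170`), an AI agent (Spring Boot, port `7171`) that talks to Gemini, and a browser frontend on port `3000`.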

The result is a small but realistic architecture you can extend into a production assistant.

```
User (Browser:3000)
    | POST /api/chat
    v
AI Agent (Spring:7171) -- MCP / Streamable HTTP --> MCP Server (Spring:7170)
    |                                               |
    | Google Gemini                                 | Bright Sky API (weather)
    |                                               | OpenStreetMap Nominatim (geocoding)
    v                                               v
Chat response                                    Tool execution
```

The full source code is available on [GitHub](https://github.com/ykpraveen/mcp-spring-sample).

The tool server is a Spring Boot application that exposes MCP tools through Spring AI's annotation scanner. It runs on port `7170` and uses Streamable HTTP for transport.

```
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-mcp-server-webmvc</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>
```

With Spring AI, a tool is just a Spring bean method annotated with `@McpTool`:

```
@Component
public class WeatherTool {

    private final WeatherToolService weatherToolService;

    public WeatherTool(WeatherToolService weatherToolService) {
        this.weatherToolService = weatherToolService;
    }

    @McpTool(name = "get_current_weather",
             description = "Get current weather by dwd_station_id or by lat/lon")
    public Map<String, Object> getCurrentWeather(
            @McpToolParam(description = "DWD station ID", required = false)
            String dwd_station_id,
            @McpToolParam(description = "Latitude", required = false) Double lat,
            @McpToolParam(description = "Longitude", required = false) Double lon
    ) {
        return weatherToolService.getWeather(dwd_station_id, lat, lon);
    }
}
```

Spring turns that method into an MCP tool definition and publishes the parameter metadata as part of the schema. That means the model can discover the tool, understand its inputs, and decide when to call it.

The project also includes a geocoding tool that resolves city names to coordinates:

```
@McpTool(name = "geocode_city",
         description = "Convert a city name to latitude and longitude using OpenStreetMap Nominatim")
public Map<String, Object> geocodeCity(
        @McpToolParam(description = "City name (e.g., 'Berlin', 'New York')", required = true)
        String cityName
) { ... }
```

The tools delegate the real work to services that handle validation, caching, and external API calls:

```
@Service
public class WeatherToolService {

    public Map<String, Object> getWeather(String dwdStationId, Double lat, Double lon) {
        // Validate the request
        // Check the cache
        // Call Bright Sky if needed
        // Return a structured response
    }
}
```

The key design choices are straightforward. In particular, tool responses are structured and carry `success`, `error_code`, and `error_message` fields.
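To make the service outline and the `success` / `error_code` / `error_message` fields more concrete, here is a minimal sketch. It is not the project's actual implementation: the class name `WeatherLookupSketch`, the `INVALID_ARGUMENT` code, the `data` field, the cache key format, and the `fetcher` function standing in for the Bright Sky call are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: validate, check a cache, call a fetcher, return a structured map.
final class WeatherLookupSketch {

    private final Map<String, Map<String, Object>> cache = new ConcurrentHashMap<>();
    // Stand-in for the real Bright Sky HTTP call (assumption, not the project's code).
    private final Function<String, Map<String, Object>> fetcher;

    WeatherLookupSketch(Function<String, Map<String, Object>> fetcher) {
        this.fetcher = fetcher;
    }

    Map<String, Object> getWeather(String dwdStationId, Double lat, Double lon) {
        boolean hasStation = dwdStationId != null && !dwdStationId.isBlank();
        boolean hasCoords = lat != null && lon != null;
        // Validate the request: a station ID or a full lat/lon pair is required.
        if (!hasStation && !hasCoords) {
            return failure("INVALID_ARGUMENT", "Provide dwd_station_id or both lat and lon");
        }
        if (!hasStation && (lat < -90 || lat > 90 || lon < -180 || lon > 180)) {
            return failure("INVALID_ARGUMENT", "lat/lon out of range");
        }
        // Check the cache; the fetcher only runs on a miss.
        String key = hasStation ? "station:" + dwdStationId : "coords:" + lat + "," + lon;
        Map<String, Object> data = cache.computeIfAbsent(key, fetcher);
        Map<String, Object> response = new LinkedHashMap<>();
        response.put("success", true);
        response.put("data", data);
        return response;
    }

    private static Map<String, Object> failure(String code, String message) {
        Map<String, Object> response = new LinkedHashMap<>();
        response.put("success", false);
        response.put("error_code", code);
        response.put("error_message", message);
        return response;
    }
}
```

Returning a failure map rather than throwing gives the model a readable error it can relay or act on, which is one plausible reason for this response shape.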

```
server:
  port: 7170

spring:
  ai:
    mcp:
      server:
        name: spring-sample-mcp-server
        version: 1.0.0
        protocol: STREAMABLE
        type: SYNC
        annotation-scanner:
          enabled: true

mcp:
  security:
    api-key: ${MCP_API_KEY:}
```

The `STREAMABLE` protocol gives the agent a lightweight MCP transport, and the shared API key keeps the demo simple without adding full auth infrastructure.

The MCP server and agent share an `MCP_API_KEY`. The agent adds it automatically as an `X-API-Key` header, and the server validates it on inbound MCP requests.

That is enough for local development and a sample project. For anything public-facing, move to Spring Security, OAuth2 or JWT, rate limiting, and a gateway in front of the MCP endpoint.
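The server-side check of the `X-API-Key` header could be sketched as below. The article does not show the project's actual filter, so the class and method names are hypothetical. Treating an empty configured key as "check disabled" is also an assumption, based only on the empty default in `${MCP_API_KEY:}`.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

// Hypothetical helper: compares the X-API-Key header value with the configured key.
final class ApiKeyCheckSketch {

    private ApiKeyCheckSketch() {}

    static boolean isAuthorized(String configuredKey, String headerValue) {
        // Assumption: a blank configured key means the check is switched off.
        if (configuredKey == null || configuredKey.isBlank()) {
            return true;
        }
        if (headerValue == null) {
            return false;
        }
        // MessageDigest.isEqual does not stop at the first mismatching byte,
        // which makes timing-based guessing of the key harder.
        return MessageDigest.isEqual(
                configuredKey.getBytes(StandardCharsets.UTF_8),
                headerValue.getBytes(StandardCharsets.UTF_8));
    }
}
```

In a Spring application, a helper like this would typically be called from a servlet filter or interceptor that rejects the request with `401` when it returns `false`.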

The agent is responsible for deciding when to use tools, calling Gemini, and keeping the conversation stateful.

```
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-google-genai</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-mcp-client</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>
```

The agent injects the shared API key through a custom HTTP request customizer:

```
@Configuration
public class AgentConfiguration {

    @Bean
    McpClientCustomizer<HttpClientStreamableHttpTransport.Builder>
    streamableHttpTransportCustomizer(AgentProperties properties) {
        McpSyncHttpClientRequestCustomizer requestCustomizer =
                (builder, method, uri, body, context) -> {
                    if (StringUtils.hasText(properties.getMcpApiKey())) {
                        builder.header("X-API-Key", properties.getMcpApiKey());
                    }
                };
        return (name, builder) -> builder.httpRequestCustomizer(requestCustomizer);
    }
}
```

The agent keeps a small in-memory conversation history, checks whether the user message looks like a tool request, and then routes the prompt through either a plain Gemini client or a tool-enabled client.

```
public String reply(String sessionId, String userMessage) {
    List<ConversationTurn> history = memoryStore.history(sessionId);
    String prompt = buildPrompt(history, userMessage);
    boolean toolRequest = shouldUseTools(userMessage);
    ChatClient client = toolRequest ? toolEnabledClient() : plainChatClient;
    String answer = invokeModel(client, prompt);
    memoryStore.appendTurn(sessionId, userMessage, answer);
    return answer;
}
```

The lazy initialization is deliberate: the agent can start even if the MCP server is down, and it only initializes MCP clients when a tool request actually arrives.
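One way to get that lazy behavior is a small memoizing holder, sketched here. `LazySketch` is a hypothetical name and not part of Spring AI or the sample project. Whether the real agent retries after a failed initialization is an assumption.

```java
import java.util.function.Supplier;

// Hypothetical helper: builds a value on first use and caches it only on success.
final class LazySketch<T> implements Supplier<T> {

    private final Supplier<T> factory;
    private volatile T value;

    LazySketch(Supplier<T> factory) {
        this.factory = factory;
    }

    @Override
    public T get() {
        T current = value;
        if (current == null) {
            synchronized (this) {
                current = value;
                if (current == null) {
                    // If the factory throws (for example, the MCP server is down),
                    // nothing is cached, so the next tool request tries again.
                    current = factory.get();
                    value = current;
                }
            }
        }
        return current;
    }
}
```

Wrapping the tool-enabled client's construction in a holder like this keeps startup independent of the MCP server, while plain chat requests never touch it.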

The tool trigger is intentionally simple:

```
private static boolean shouldUseTools(String userMessage) {
    String normalized = userMessage.toLowerCase(Locale.ROOT);
    for (String keyword : TOOL_KEYWORDS) {
        if (normalized.contains(keyword)) {
            return true;
        }
    }
    return false;
}
```

That heuristic is enough for a demo and easy to explain. In a larger system, you could replace it with a router model or intent classifier.
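The article does not list the contents of `TOOL_KEYWORDS`, so the following self-contained variant uses a made-up keyword list purely to show how the check behaves, including where it misfires.

```java
import java.util.List;
import java.util.Locale;

final class ToolRouterSketch {

    // Hypothetical keywords; the project's real TOOL_KEYWORDS are not shown in the article.
    private static final List<String> TOOL_KEYWORDS =
            List.of("weather", "temperature", "forecast", "rain");

    private ToolRouterSketch() {}

    static boolean shouldUseTools(String userMessage) {
        if (userMessage == null) {
            return false;
        }
        // Lowercase with a fixed locale so matching does not depend on the JVM default.
        String normalized = userMessage.toLowerCase(Locale.ROOT);
        return TOOL_KEYWORDS.stream().anyMatch(normalized::contains);
    }
}
```

With this list, "What's the WEATHER in Berlin?" routes to the tool-enabled client and "Tell me a joke" does not. Substring matching also produces false positives: "I love brainstorming" matches because "brainstorming" contains "rain", which is the kind of error a router model or intent classifier would avoid.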

The model call runs on a virtual thread with a configurable timeout so the request does not hang forever if Gemini is slow or unreachable:

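```java
private String invokeModel(ChatClient client, String prompt) {
    // One virtual thread per call; the executor is torn down in the finally block.
    var executor = Executors.newVirtualThreadPerTaskExecutor();
    try {
        var future = executor.submit(() ->
                client.prompt().user(prompt).call().content());
        return future.get(timeoutSeconds, TimeUnit.SECONDS);
    } catch (TimeoutException ex) {
        throw new ResponseStatusException(HttpStatus.GATEWAY_TIMEOUT, ...);
    } catch (InterruptedException ex) {
        // future.get also throws these checked exceptions; this mapping is an assumption.
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted while waiting for the model", ex);
    } catch (ExecutionException ex) {
        throw new IllegalStateException("Model call failed", ex.getCause());
    } finally {
        executor.shutdownNow();
    }
}
```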

Conversation history lives in an in-memory LRU store with a small per-session turn window. That keeps follow-up questions like "What about tomorrow?" grounded in the earlier exchange without introducing a database too early.

The agent configuration sets the model to `gemini-3.5-flash`, the memory limit to 20 turns per session, and the session cap to 500.
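An LRU store with a per-session turn window can be built on `LinkedHashMap` in access order. The sketch below is an assumption about how such a store might look, not the project's code: `MemoryStoreSketch` and `Turn` are hypothetical names, and only the `history` and `appendTurn` method names come from the `reply` snippet above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MemoryStoreSketch {

    // Hypothetical turn type holding one user message and the model's answer.
    record Turn(String userMessage, String answer) {}

    private final int maxTurns;
    private final Map<String, Deque<Turn>> sessions;

    MemoryStoreSketch(int maxSessions, int maxTurns) {
        this.maxTurns = maxTurns;
        // accessOrder = true keeps the least recently used session first,
        // and removeEldestEntry evicts it once the cap is exceeded.
        this.sessions = new LinkedHashMap<String, Deque<Turn>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Deque<Turn>> eldest) {
                return size() > maxSessions;
            }
        };
    }

    synchronized List<Turn> history(String sessionId) {
        Deque<Turn> turns = sessions.get(sessionId);
        // Return a copy so callers cannot mutate the stored window.
        return turns == null ? List.of() : new ArrayList<>(turns);
    }

    synchronized void appendTurn(String sessionId, String userMessage, String answer) {
        Deque<Turn> turns = sessions.computeIfAbsent(sessionId, id -> new ArrayDeque<>());
        turns.addLast(new Turn(userMessage, answer));
        // Drop the oldest turns once the per-session window is full.
        while (turns.size() > maxTurns) {
            turns.removeFirst();
        }
    }
}
```

With the values mentioned above, the store would be created as `new MemoryStoreSketch(500, 20)`. Both methods are synchronized because an access-ordered `LinkedHashMap` is structurally modified even by reads.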

The frontend is a Vite app with a simple chat window, minimal state, and no component library.

```
const [messages, setMessages] = useState([]);
const [loading, setLoading] = useState(false);

const sendMessage = async (text) => {
    setMessages(prev => [...prev, { role: 'user', content: text }]);
    setLoading(true);
    try {
        const response = await fetch('/api/chat', {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            // sessionId is assumed to be defined elsewhere in the component
            body: JSON.stringify({ sessionId, message: text })
        });
        const data = await response.json();
        setMessages(prev => [...prev, {
            role: 'assistant',
            content: data.reply || 'No response'
        }]);
    } finally {
        // Clear the loading flag even if the request or JSON parsing fails
        setLoading(false);
    }
};
```

The Vite dev server proxies `/api/*` to the agent and strips the `/api` prefix before forwarding:

```
proxy: {
  '/api': {
    target: 'http://localhost:7171',
    changeOrigin: true,
    rewrite: (path) => path.replace(/^\/api/, '')
  }
}
```

The UI is intentionally plain: a purple gradient, responsive layout, and a smooth message list are enough to make the app feel complete without distracting from the architecture.

```
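# Set these in each terminal that runs a Spring app
export GEMINI_API_KEY=your_gemini_api_key
export MCP_API_KEY=a_shared_secret

# Terminal 1: MCP server (paths assume you start from the repository root)
cd mcp-server-spring
mvn spring-boot:run

# Terminal 2: agent
cd mcp-spring-agent
mvn spring-boot:run

# Terminal 3: frontend
cd mcp-ui
npm install
npm run dev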
```

If the user asks, "What's the weather in Berlin?" the flow looks like this: `geocode_city("Berlin")` is called first to get coordinates, and those coordinates are then passed to `get_current_weather(lat=52.52, lon=13.41)`.

**MCP separates the model from the tools.** The agent knows what tools exist and how to call them, but not how those tools are implemented. That makes the system easier to evolve.

**The same server can serve different models.** Gemini is just the model in this demo. The MCP server itself can work with any compatible client.

**Lazy initialization keeps the app resilient.** The agent can boot even if the MCP server is temporarily unavailable, and tool support only activates when it is actually needed.

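This sample is a solid starting point. Natural next steps include the upgrades noted earlier: proper authentication and rate limiting in front of the MCP endpoint, and a router model or intent classifier in place of the keyword heuristic.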

*Have you built anything with MCP and Spring AI? I'd love to hear how you approached it.*
