[ARTICLE · art-47504] src=dev.to pub= topic=large-language-models verified=true sentiment=positive

Solon 4.0 ChatModel: A Practical Guide to Building LLM-Powered Applications

Solon 4.0 introduces ChatModel, a unified LLM client that abstracts away boilerplate code for integrating large language models into Java applications. The API supports multiple model providers including OpenAI, Ollama, Gemini, Anthropic, and DashScope through a dialect pattern, and offers both synchronous and streaming chat capabilities with a builder-oriented API.

5 min read · 1 view · published Jul 4, 2026

If you've ever tried integrating a large language model (LLM) into a Java application, you've probably written a lot of boilerplate: HTTP clients, JSON parsing, streaming handling, session management. Solon 4.0's ChatModel abstracts all of that away with a clean, builder-oriented API.

In this guide, I'll walk through building real, working AI features using ChatModel, from a simple chat call to a streaming chatbot with conversation memory.

ChatModel (package org.noear.solon.ai.chat) is a unified LLM client in Solon's AI ecosystem. Instead of writing raw HTTP calls for different model providers, you use a single API that supports both synchronous calls and streaming calls (Flux<ChatResponse>).

The best part? It uses a dialect pattern: you point it at any compatible LLM endpoint, and it adapts to that provider's request and response format.
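To make the dialect idea concrete, here is a minimal sketch in plain Java. It is not Solon's implementation: Dialect, DialectSketch, matches, and authHeaders are hypothetical names chosen for illustration. The only provider facts it relies on are the publicly documented auth conventions (a Bearer token for OpenAI-compatible APIs, x-api-key plus anthropic-version for Anthropic, and no auth for a local Ollama server).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a dialect pattern; these types are NOT Solon's API.
interface Dialect {
    boolean matches(String provider);               // does this dialect handle the provider id?
    Map<String, String> authHeaders(String apiKey); // provider-specific auth headers
}

final class DialectSketch {
    // OpenAI-compatible endpoints expect a Bearer token; also used as the default.
    static final Dialect OPENAI = new Dialect() {
        public boolean matches(String provider) {
            return provider == null || "openai".equals(provider);
        }
        public Map<String, String> authHeaders(String apiKey) {
            Map<String, String> headers = new LinkedHashMap<>();
            headers.put("Authorization", "Bearer " + apiKey);
            return headers;
        }
    };

    // Anthropic uses its own key header plus a required API version header.
    static final Dialect ANTHROPIC = new Dialect() {
        public boolean matches(String provider) {
            return "anthropic".equals(provider);
        }
        public Map<String, String> authHeaders(String apiKey) {
            Map<String, String> headers = new LinkedHashMap<>();
            headers.put("x-api-key", apiKey);
            headers.put("anthropic-version", "2023-06-01");
            return headers;
        }
    };

    // A local Ollama server needs no authentication by default.
    static final Dialect OLLAMA = new Dialect() {
        public boolean matches(String provider) {
            return "ollama".equals(provider);
        }
        public Map<String, String> authHeaders(String apiKey) {
            return new LinkedHashMap<>();
        }
    };

    static final List<Dialect> DIALECTS = List.of(ANTHROPIC, OLLAMA, OPENAI);

    // Return the first dialect that claims the provider id.
    static Dialect select(String provider) {
        for (Dialect dialect : DIALECTS) {
            if (dialect.matches(provider)) {
                return dialect;
            }
        }
        throw new IllegalArgumentException("Unknown provider: " + provider);
    }

    public static void main(String[] args) {
        System.out.println(select("anthropic").authHeaders("my-key").keySet());
    }
}
```

The calling code only ever talks to the Dialect interface, which is why switching providers can come down to changing a provider string and a URL.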

Add the dependency to your pom.xml (no parent POM needed; Solon works standalone):

<dependency>
    <groupId>org.noear</groupId>
    <artifactId>solon-ai</artifactId>
    <version>${solon.version}</version>
</dependency>

This pulls in all built-in dialects (OpenAI, Ollama, Gemini, Anthropic, DashScope).

solon.ai.chat:
  demo:
    apiUrl: "http://127.0.0.1:11434/api/chat"   # Full URL, not baseUrl
    provider: "ollama"                           # Dialect identifier
    model: "llama3.2"                            # Model name
    headers:
      x-demo: "demo1"

Then create a @Bean to get a ready-to-use ChatModel:

import org.noear.solon.ai.chat.ChatConfig;
import org.noear.solon.ai.chat.ChatModel;
import org.noear.solon.annotation.Bean;
import org.noear.solon.annotation.Configuration;
import org.noear.solon.annotation.Inject;

@Configuration
public class AiConfig {
    @Bean
    public ChatModel chatModel(@Inject("${solon.ai.chat.demo}") ChatConfig config) {
        return ChatModel.of(config).build();
    }
}

Prefer code over config? Use the builder directly:

import java.time.Duration;

@Bean
public ChatModel chatModel() {
    return ChatModel.of("http://127.0.0.1:11434/api/chat")
            .standard("ollama")      // or .provider("ollama") pre-4.0
            .model("llama3.2")
            .timeout(Duration.ofSeconds(60))
            .build();
}

The standard (or provider) field selects the dialect:

| Standard | Example apiUrl | Models |
|---|---|---|
| openai (default) | https://api.openai.com/v1/chat/completions | GPT, DeepSeek, Qwen, GLM, Kimi, etc. |
| ollama | http://127.0.0.1:11434/api/chat | Any local Ollama model |
| anthropic | https://api.anthropic.com/v1/messages | Claude |
| gemini | https://generativelanguage.googleapis.com/v1beta/models/... | Gemini |
| dashscope | Aliyun DashScope endpoint | Qwen (DashScope native) |

The most basic use case is to send a prompt and get the full response:

import java.io.IOException;

import org.noear.solon.ai.chat.ChatModel;
import org.noear.solon.ai.chat.ChatResponse;
import org.noear.solon.annotation.Inject;
import org.noear.solon.annotation.Component;

@Component
public class ChatService {
    @Inject
    ChatModel chatModel;

    public String ask(String question) throws IOException {
        ChatResponse resp = chatModel.prompt(question).call();
        return resp.getMessage().getContent();
    }
}

That's it. Three lines of business code.

For chatbots and assistants, streaming is essential. ChatModel returns a Reactor Flux<ChatResponse>:

import java.io.IOException;

import org.noear.solon.ai.chat.ChatResponse;
import reactor.core.publisher.Flux;

public Flux<String> askStream(String question) throws IOException {
    return chatModel.prompt(question)
            .stream()
            .filter(ChatResponse::hasContent)       // skip empty chunks
            .map(resp -> resp.getMessage().getContent());
}

You can then subscribe, or, if you're using Solon Web Reactive, return the Flux directly to an SSE endpoint:

import java.io.IOException;

import org.noear.solon.ai.chat.ChatResponse;
import org.noear.solon.web.sse.SseEvent;
import org.noear.solon.annotation.Mapping;
import reactor.core.publisher.Flux;

@Mapping("/chat/stream")
public Flux<SseEvent> chatStream(String prompt) throws IOException {
    return chatModel.prompt(prompt)
            .stream()
            .filter(ChatResponse::hasContent)
            .map(resp -> new SseEvent()
                    .data(resp.getMessage().getContent()));
}

The streaming protocol is either standard SSE (text/event-stream) or newline-delimited JSON (x-ndjson), depending on the provider.
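The difference between the two wire formats is mostly framing. As a rough sketch in plain Java (not Solon's internal parser; StreamFramingSketch and its methods are hypothetical names), an SSE stream carries each chunk on a data: line, while NDJSON puts one JSON object on each line. The sketch simplifies SSE by treating every data: line as one payload and by skipping the OpenAI-style [DONE] end marker.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper showing how the two stream framings differ.
final class StreamFramingSketch {
    // SSE: keep "data:" lines, ignore comments (": ...") and blank separators.
    // Simplification: multi-line data fields are not merged into one event.
    static List<String> ssePayloads(String body) {
        List<String> payloads = new ArrayList<>();
        for (String line : body.split("\r?\n")) {
            if (!line.startsWith("data:")) {
                continue;
            }
            String data = line.substring("data:".length()).trim();
            if (data.isEmpty() || data.equals("[DONE]")) {
                continue; // "[DONE]" is the end marker used by OpenAI-style streams
            }
            payloads.add(data);
        }
        return payloads;
    }

    // NDJSON: every non-blank line is one complete JSON document.
    static List<String> ndjsonPayloads(String body) {
        List<String> payloads = new ArrayList<>();
        for (String line : body.split("\r?\n")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                payloads.add(trimmed);
            }
        }
        return payloads;
    }

    public static void main(String[] args) {
        System.out.println(ssePayloads("data: {\"a\":1}\n\ndata: [DONE]\n\n"));
        System.out.println(ndjsonPayloads("{\"a\":1}\n{\"a\":2}\n"));
    }
}
```

Either way, the result is a sequence of JSON chunks, which is why both formats can sit behind the same Flux<ChatResponse>.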

LLMs are stateless. To maintain conversation context, you need to pass history with each request. ChatSession handles this automatically.

import org.noear.solon.ai.chat.ChatSession;
import org.noear.solon.ai.chat.session.InMemoryChatSession;

ChatSession session = InMemoryChatSession.builder()
        .sessionId("user-123")
        .maxMessages(10)     // keep only the 10 most recent messages
        .build();

// First turn
ChatResponse resp1 = chatModel.prompt("Hello!")
        .session(session)
        .call();

// Second turn: the model sees the earlier context
ChatResponse resp2 = chatModel.prompt("What did I just say?")
        .session(session)
        .call();
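
Conceptually, a session is a bounded message log that gets replayed on every request. The following is a simplified illustration in plain Java, not the actual code of InMemoryChatSession. WindowedHistory and its methods are hypothetical names, and the eviction policy shown (drop the oldest message first) is an assumption.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sliding-window history; NOT Solon's ChatSession implementation.
final class WindowedHistory {
    private final int maxMessages;
    private final Deque<String[]> messages = new ArrayDeque<>(); // each entry: {role, content}

    WindowedHistory(int maxMessages) {
        if (maxMessages < 1) {
            throw new IllegalArgumentException("maxMessages must be >= 1");
        }
        this.maxMessages = maxMessages;
    }

    // Append a message, then evict from the front until the window fits.
    synchronized void add(String role, String content) {
        messages.addLast(new String[] {role, content});
        while (messages.size() > maxMessages) {
            messages.removeFirst();
        }
    }

    // What would be sent with the next request, oldest message first.
    synchronized List<String> snapshot() {
        List<String> out = new ArrayList<>();
        for (String[] message : messages) {
            out.add(message[0] + ": " + message[1]);
        }
        return out;
    }

    public static void main(String[] args) {
        WindowedHistory history = new WindowedHistory(2);
        history.add("user", "Hello!");
        history.add("assistant", "Hi there.");
        history.add("user", "What did I just say?");
        System.out.println(history.snapshot());
    }
}
```

A small window keeps requests cheap, but anything that falls out of it is forgotten by the model, so the limit is a trade-off between cost and recall.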

In a real web app, you'll want one session per user. Here's a controller that does exactly that:

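import org.noear.solon.ai.chat.ChatModel;
import org.noear.solon.ai.chat.ChatResponse;
import org.noear.solon.ai.chat.ChatSession;
import org.noear.solon.ai.chat.session.InMemoryChatSession;
import org.noear.solon.annotation.Controller;
import org.noear.solon.annotation.Inject;
import org.noear.solon.annotation.Mapping;
import org.noear.solon.web.sse.SseEvent;
import reactor.core.publisher.Flux;
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;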

@Controller
public class ChatController {
    @Inject
    ChatModel chatModel;

    final Map<String, ChatSession> sessionMap = new ConcurrentHashMap<>();

    @Mapping("/chat")
    public Flux<SseEvent> chat(String sessionId, String prompt) throws IOException {
        ChatSession session = sessionMap.computeIfAbsent(sessionId,
                k -> InMemoryChatSession.builder().sessionId(k).build());

        return chatModel.prompt(prompt)
                .session(session)
                .options(o -> o.systemPrompt("You are a helpful and friendly assistant."))
                .stream()
                .filter(ChatResponse::hasContent)
                .map(resp -> new SseEvent().data(resp.getMessage().getContent()));
    }
}

| Implementation | Storage | Use Case |
|---|---|---|
| InMemoryChatSession | Local Map | Dev, single-node |
| FileChatSession | File system | CLI tools, desktop apps |
| RedisChatSession | Redis | Production, distributed |

Control model behavior per-request with ChatOptions:

chatModel.prompt("Write a poem about Java")
        .options(o -> o
            .temperature(0.8)
            .max_tokens(500)
            .top_p(0.9)
            .systemPrompt("You are a creative poet."))
        .call();

Common options include:

| Method | Description |
|---|---|
| temperature(val) | Sampling temperature (0.0–2.0) |
| max_tokens(val) | Max output tokens |
| top_p(val) | Nucleus sampling |
| top_k(val) | Top-K sampling |
| frequency_penalty(val) | Reduce repetition |
| presence_penalty(val) | Encourage new topics |
| tool_choice(val) | Force tool use: none, auto, required, or tool name |
| systemPrompt(val) | System message for this request |
| role(val) | Agent role (v3.9.1+) |
| instruction(val) | Agent instruction (v3.9.1+) |
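
To see what temperature and top_p do, here is a toy sketch of the usual textbook definitions: a temperature-scaled softmax followed by nucleus filtering. It is illustrative only. Providers apply these settings server-side and may differ in detail, and SamplingSketch with its methods is a hypothetical name, not part of Solon.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Toy model of two sampling options; not how any specific provider implements them.
final class SamplingSketch {
    // Softmax over logits / temperature. Lower temperature sharpens the distribution.
    static double[] softmax(double[] logits, double temperature) {
        if (temperature <= 0) {
            throw new IllegalArgumentException("temperature must be > 0");
        }
        double max = Arrays.stream(logits).max().getAsDouble(); // subtract max for numerical stability
        double[] probs = new double[logits.length];
        double sum = 0;
        for (int i = 0; i < logits.length; i++) {
            probs[i] = Math.exp((logits[i] - max) / temperature);
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) {
            probs[i] /= sum;
        }
        return probs;
    }

    // Indices of the smallest set of most likely tokens whose cumulative probability reaches topP.
    static List<Integer> nucleus(double[] probs, double topP) {
        Integer[] order = new Integer[probs.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> -probs[i])); // most likely first
        List<Integer> kept = new ArrayList<>();
        double cumulative = 0;
        for (int index : order) {
            kept.add(index);
            cumulative += probs[index];
            if (cumulative >= topP) {
                break;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        double[] logits = {2.0, 1.0, 0.0};
        System.out.println(Arrays.toString(softmax(logits, 1.0)));
        System.out.println(Arrays.toString(softmax(logits, 0.5)));
        System.out.println(nucleus(new double[] {0.5, 0.3, 0.2}, 0.7));
    }
}
```

In this model, lowering the temperature concentrates probability on the top token, and lowering top_p shrinks the set of tokens the sampler may choose from.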

Sometimes you need more than a simple string. Use Prompt and ChatMessage:

import org.noear.solon.ai.chat.Prompt;
import org.noear.solon.ai.chat.message.ChatMessage;

Prompt prompt = Prompt.of(
    ChatMessage.ofSystem("You translate English to French."),
    ChatMessage.ofUser("Hello, how are you?"),
    ChatMessage.ofAssistant("Bonjour, comment allez-vous?"),
    ChatMessage.ofUser("What is your name?")
);

ChatResponse resp = chatModel.prompt(prompt).call();

Let's build a simple knowledge-aware chatbot, the kind of RAG-lite pattern you see in real projects. This example uses ChatMessage.ofUserAugment() to inject context into the prompt:

import org.noear.solon.ai.chat.ChatModel;
import org.noear.solon.ai.chat.ChatResponse;
import org.noear.solon.ai.chat.message.ChatMessage;
import org.noear.solon.annotation.Component;
import org.noear.solon.annotation.Inject;

@Component
public class KnowledgeChatbot {
    @Inject
    ChatModel chatModel;

    public String answer(String question, String referenceContext) throws Exception {
        // Augment the user message with reference context
        ChatMessage augmented = ChatMessage.ofUserAugment(question, referenceContext);

        ChatResponse resp = chatModel.prompt(augmented)
                .options(o -> o
                    .temperature(0.3)
                    .systemPrompt("You are a knowledgeable assistant. Answer based on the provided references."))
                .call();

        return resp.getMessage().getContent();
    }
}

This pattern (augment user input with context, then call the model) is the foundation of RAG (Retrieval-Augmented Generation) in Solon AI.
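
As a rough picture of what augmenting a message means, the sketch below assembles a prompt from a question and a list of reference snippets in plain Java. The text layout that ChatMessage.ofUserAugment() actually produces is not shown in this article, so the template here is an assumption, and AugmentSketch is a hypothetical helper.

```java
import java.util.List;

// Hypothetical prompt assembly; the template is an assumption, not Solon's format.
final class AugmentSketch {
    static String augment(String question, List<String> references) {
        if (references.isEmpty()) {
            return question; // nothing to add, send the question unchanged
        }
        StringBuilder sb = new StringBuilder("Answer using only the references below.\n\nReferences:\n");
        for (int i = 0; i < references.size(); i++) {
            // Number each snippet so the model can point back to it.
            sb.append('[').append(i + 1).append("] ").append(references.get(i)).append('\n');
        }
        return sb.append("\nQuestion: ").append(question).toString();
    }

    public static void main(String[] args) {
        System.out.println(augment("What is Solon?", List.of("Solon is a Java framework.")));
    }
}
```

In a fuller RAG setup, the references list would come from a retrieval step such as a vector or keyword search rather than being passed in by hand.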

ChatModel is just the entry point. Solon AI also offers:

- @ToolMapping methods the LLM can invoke
- ReActAgent and TeamAgent for multi-step reasoning

For the full documentation, check out the official Solon AI guide:

👉 https://solon.noear.org/article/918 (Model construction)

👉 https://solon.noear.org/article/920 (API reference)

Have you tried integrating LLMs in Java? What's your biggest pain point? Let me know in the comments; I might cover it in the next post.
