{"slug": "i-built-a-production-oriented-multi-provider-ai-chatbot-in-rust-here-s-how", "title": "I Built a Production-Oriented Multi-Provider AI Chatbot in Rust — Here's How", "summary": "A developer built a production-oriented multi-provider AI chatbot backend in Rust, unifying Claude, OpenAI, and Ollama behind a single interface. The project features a clean five-module architecture with a `ChatClient` that dispatches provider-specific API calls, using `Arc<Mutex<...>>` for safe shared conversation state across async handlers. The backend includes a Web UI, CLI mode, and Docker support, with the key insight being separate code paths for Anthropic's native API format versus the OpenAI-compatible schema used by OpenAI and Ollama.", "body_md": "Most AI chatbot tutorials reach for Python. FastAPI, LangChain, a quick `requests.post`, and you're done in 20 minutes. And that's fine for prototyping. But when I wanted to build something I'd actually put behind a real API (something with proper async concurrency, typed errors, and zero GC pauses), I reached for Rust instead.\n\nThis is a writeup of **chatbot**, a production-oriented Rust backend that unifies Claude, OpenAI, and Ollama behind a single interface, with a Web UI, CLI mode, and Docker support baked in.\n\nIt's a fair question. LLM API calls are network-bound, so why does the backend language even matter?\n\nA few reasons:\n\nFor LLM apps specifically: yes, 95% of your wall-clock time is waiting for the model to respond. But the other 5% (routing, state management, provider selection, connection handling) is all yours to control. 
Rust makes that part bulletproof.\n\nThe server listens on `http://localhost:8080` by default, or runs in CLI mode.\n\n```\nUser (Browser or CLI)\n        │\n        ▼\n  Axum HTTP Server (web.rs)\n        │\n        ├──▶ Conversation State (Arc<Mutex<Vec<Message>>>)\n        │\n        └──▶ Runtime Config (config.rs)\n                    │\n                    ▼\n            ChatClient (client.rs)\n                    │\n           ┌────────┼────────┐\n           ▼        ▼        ▼\n        Claude   OpenAI   Ollama\n         API      API      API\n```\n\nThe project has a clean five-module layout in `src/`:\n\n```\nsrc/\n├── main.rs          # Startup routing + CLI loop\n├── config.rs        # Provider enum + env/runtime config\n├── client.rs        # Provider-specific HTTP clients\n├── conversation.rs  # In-memory chat state model\n└── web.rs           # Axum routes, connect flow, chat API\n```\n\nEach module has exactly one responsibility. No god objects, no tangled imports.\n\nThe heart of the project is `client.rs`. 
Instead of sprinkling provider-specific logic everywhere, all outbound AI calls go through a single `ChatClient` that dispatches based on the active provider.\n\nThe key insight: **Claude uses the Anthropic native API format, while OpenAI and Ollama both speak the OpenAI-compatible schema.** Separating these two code paths keeps the provider logic honest: you're not faking compatibility where there isn't any.\n\n```\n// Simplified concept from client.rs\npub enum Provider {\n    Claude,\n    OpenAI,\n    Ollama,\n}\n\npub struct ChatClient {\n    pub provider: Provider,\n    pub model: String,\n    pub base_url: String,\n    pub api_key: Option<String>,\n    pub max_tokens: u32,\n    pub system_prompt: String,\n    pub http: reqwest::Client,\n}\n```\n\nWhen you send a message, the client picks the right HTTP contract:\n\n```rust\npub async fn send(&self, messages: &[Message]) -> Result<String> {\n    match self.provider {\n        Provider::Claude => self.send_claude(messages).await,\n        Provider::OpenAI | Provider::Ollama => self.send_openai_compat(messages).await,\n    }\n}\n```\n\nThis means adding a new provider (Gemini, Cohere, etc.) in the future is a matter of adding one enum variant, one match arm, and one method; the rest of the application stays untouched.\n\nMulti-turn chat requires persistent message history. In Rust's async model, sharing state across request handlers requires explicit synchronization. The project does this with `Arc<Mutex<...>>`, Rust's standard pattern for shared mutable state:\n\n```\n// conversation.rs - shared across all handlers\npub type SharedConversation = Arc<Mutex<Vec<Message>>>;\n\n#[derive(Clone, Debug, Serialize, Deserialize)]\npub struct Message {\n    pub role: String,    // \"user\" or \"assistant\"\n    pub content: String,\n}\n```\n\nThe `Arc` makes the conversation cloneable across Axum handlers (each handler runs in its own async task), and the `Mutex` ensures only one handler touches the history at a time. 
No data races, and the compiler enforces it: the history cannot be touched without taking the lock.\n\nThe app launches in Web UI mode by default, but also supports a terminal workflow:\n\n```\n# Default: serves Web UI at http://localhost:8080\ncargo run\n\n# CLI mode: interactive terminal chat\ncargo run -- cli\n\n# Explicit web mode on a custom port\nPORT=3000 cargo run -- web\n```\n\nThis is useful in different contexts: the Web UI for demos and sharing, the CLI for scripting and piping into other tools.\n\nDownload the [prebuilt Windows executable (v1.0.1)](https://github.com/MihirMohapatra/chatbot/releases/download/v1.0.1/chatbot.exe) and run it, or build from source:\n\n```\n# Windows: run the prebuilt executable\n.\\chatbot.exe\n# Opens http://localhost:8080\n\n# Or build from source:\n# 1. Install Rust\ncurl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh\n\n# 2. Clone and configure\ngit clone https://github.com/MihirMohapatra/chatbot.git\ncd chatbot\ncp .env.example .env\n\n# 3. Run\ncargo run\n```\n\nSet your provider in `.env`, using one of the following configurations:\n\n```\n# Option 1: Claude\nPROVIDER=claude\nANTHROPIC_API_KEY=sk-ant-...\n\n# Option 2: OpenAI\nPROVIDER=openai\nOPENAI_API_KEY=sk-...\n\n# Option 3: Ollama (pull a model first: ollama pull llama3.2)\nPROVIDER=ollama\nMODEL=llama3.2\n\n# Option 4: OpenRouter, via the OpenAI-compatible path\nPROVIDER=openai\nOPENAI_API_KEY=sk-or-...\nBASE_URL=https://openrouter.ai/api\nMODEL=anthropic/claude-sonnet-4\n```\n\nAll environment variables:\n\n| Variable | Default | Description |\n|---|---|---|\n| `PROVIDER` | `claude` | `claude`, `openai`, `ollama` |\n| `ANTHROPIC_API_KEY` | (none) | Required for Claude |\n| `OPENAI_API_KEY` | (none) | Required for OpenAI |\n| `MODEL` | provider default | Override the model name |\n| `BASE_URL` | provider default | Override the API endpoint |\n| `MAX_TOKENS` | `1024` | Response token cap |\n| `SYSTEM_PROMPT` | built-in | Custom assistant behavior |\n\n```\ndocker build -t chatbot .\n# Bind mounts need an absolute host path; a bare .env would be read as a volume name\ndocker run -it --rm -v \"$(pwd)/.env:/data/.env\" chatbot\n```\n\nFor Ollama with host networking:\n\n```\ndocker run -it --rm --network host -v \"$(pwd)/.env:/data/.env\" chatbot\n```\n\nFor backend workloads like this (concurrent HTTP + JSON + state management), Rust typically outperforms 
Python across the metrics that matter in production:\n\n| Metric | Rust | Python | Why It Matters |\n|---|---|---|---|\n| Throughput (req/sec) | Higher | Lower | More concurrent users per instance |\n| P95/P99 latency | Lower under load | Higher under load | More stable response times |\n| Memory per worker | Lower | Higher | Better infra cost and density |\n| CPU efficiency | Higher | Lower | More headroom before scaling out |\n\nNote: For LLM apps, model/API network time dominates total latency. But Rust still wins on concurrency behavior, memory footprint, and server efficiency, which directly impact cost and reliability at scale.\n\n`tracing` crate + OpenTelemetry integration\n\nThe project is open source and MIT licensed:\n\n👉 [github.com/MihirMohapatra/chatbot](https://github.com/MihirMohapatra/chatbot)\n\nIf you're exploring Rust for backend systems, or building something that needs to talk to multiple AI providers without writing boilerplate for each one, this is a good starting point. 
Issues and PRs welcome.\n\n*Built with Rust 1.80+, Tokio, Axum, reqwest, serde, and anyhow.*", "url": "https://wpnews.pro/news/i-built-a-production-oriented-multi-provider-ai-chatbot-in-rust-here-s-how", "canonical_source": "https://dev.to/mihir_mohapatra/i-built-a-production-oriented-multi-provider-ai-chatbot-in-rust-heres-how-1i44", "published_at": "2026-05-31 05:49:53+00:00", "updated_at": "2026-05-31 06:11:44.827155+00:00", "lang": "en", "topics": ["large-language-models", "ai-infrastructure", "ai-tools", "ai-products", "artificial-intelligence"], "entities": ["Rust", "Claude", "OpenAI", "Ollama", "Axum", "Docker", "FastAPI", "LangChain"], "alternates": {"html": "https://wpnews.pro/news/i-built-a-production-oriented-multi-provider-ai-chatbot-in-rust-here-s-how", "markdown": "https://wpnews.pro/news/i-built-a-production-oriented-multi-provider-ai-chatbot-in-rust-here-s-how.md", "text": "https://wpnews.pro/news/i-built-a-production-oriented-multi-provider-ai-chatbot-in-rust-here-s-how.txt", "jsonld": "https://wpnews.pro/news/i-built-a-production-oriented-multi-provider-ai-chatbot-in-rust-here-s-how.jsonld"}}