How to Build a Self-Hosted AI Gateway With LiteLLM and Open WebUI To build a self-hosted AI gateway using LiteLLM and Open WebUI to simplify management of multiple AI providers. By placing LiteLLM as a single gateway between user interfaces and various AI services (OpenAI, Anthropic, Groq, Ollama), the system allows providers to be swapped or added without rewriting application code. The open-source implementation can be deployed in under 30 minutes using Docker Compose, with configuration files for routing, authentication, and provider credentials. If you've ever self-hosted AI tools, you know how quickly things get messy. One app talks to OpenAI. Another uses Anthropic. You spin up Ollama locally and now there's a third endpoint to manage. Authentication is different everywhere. Switching models means rewriting integration code. And before long, you're spending more time maintaining glue code than actually building anything. I ran into this exact problem — so I built a cleaner setup. The idea is simple: put a single gateway in front of every provider, so the rest of your stack only ever talks to one API. I open-sourced the full working implementation here: Clone it. Run docker compose up . You'll have a working AI gateway in under 30 minutes. What This Stack Does - One API → OpenAI, Anthropic, Groq, and Ollama all behind a single OpenAI-compatible endpoint - One frontend → Open WebUI as the unified chat interface - Secure remote access → Cloudflare Tunnel, no exposed ports, no open firewall rules - Easy to maintain → providers can change underneath without touching your apps The full setup takes roughly 20–45 minutes depending on Docker image downloads and whether you already have local Ollama models installed. The Stack Nothing exotic here — just well-composed tools: | Component | Role | |---|---| LiteLLM | Gateway / routing layer | Open WebUI | Chat frontend | PostgreSQL | State + metadata | Docker Compose | Orchestration | Cloudflare Tunnel | Secure remote exposure | Architecture User ↓ Open WebUI ↓ LiteLLM gateway ↓ OpenAI / Anthropic / Groq / Ollama The key insight: your apps and frontend only talk to LiteLLM. Providers become interchangeable underneath. Add a new model, swap a provider, change routing — nothing else needs to know. Who This Is For - Developers experimenting with local AI infrastructure - Teams consolidating multiple providers behind one API layer - Engineers building internal AI tooling - Anyone tired of maintaining separate provider integrations If you've ever thought "there has to be a simpler way to manage all these AI endpoints" — this is that simpler way. Why This Stack Exists Most self-hosted AI environments become hard to manage surprisingly fast. One application talks directly to OpenAI. Another uses Anthropic separately. Local Ollama models need their own endpoints. Authentication is inconsistent, and model switching slowly turns into infrastructure sprawl. By placing LiteLLM in front of every provider, the rest of your system only needs to understand one interface. Providers can change, local models can be added, routing logic can evolve — without rewriting frontend or application logic every time. Prerequisites Before starting containers, make sure you have: - Docker Desktop - Docker Compose curl cloudflared - Ollama optional, for local models Quick verification: docker --version docker compose version cloudflared --version curl --version If using local Ollama models: ollama list If installed models appear, local inference is ready. Repository Structure The repo is intentionally lightweight: ├── Docker-compose.yml ├── litellm-config.yml └── .env Each file has a distinct job: Docker Compose orchestrates services, LiteLLM config handles routing and model aliases, .env stores secrets and runtime configuration. Setting Up Environment Variables Your .env file is where provider credentials live. Create or update it in the project root: LITELLM MASTER KEY=sk-very-strong-key OPENAI API KEY=... ANTHROPIC API KEY=... GROQ API KEY=... OLLAMA CLOUD API BASE=https://