Want a ChatGPT-like experience that runs entirely on your own GPU? No monthly fees, no data leaving your machine, and it works offline. Here's how to set it up in 15 minutes.
## What You'll Build
- A full ChatGPT-style web UI running locally
- Your choice of open-source LLM (Qwen3 14B or Llama 3.1 8B)
- Multiple user accounts for your LAN
- 100% private - nothing leaves your network
## Prerequisites
- A GPU with 12GB+ VRAM (RTX 3060 12GB works great)
- Docker + Docker Compose installed
- NVIDIA Container Toolkit for GPU passthrough (Linux) or WSL2 (Windows)
## Setup
Create a `docker-compose.yml` file:
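
The compose file isn't shown here, so what follows is a minimal sketch rather than the author's exact file. It assumes the stack is Ollama serving the model with Open WebUI as the front end. The post names neither, though the `qwen3:14b` tag and port 3000 fit that pairing. The service and volume names are illustrative choices.

```yaml
# Sketch only: assumes Ollama + Open WebUI; adjust names and tags to taste.
services:
  ollama:
    image: ollama/ollama
    volumes:
      - ollama:/root/.ollama          # downloaded models persist here
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia          # requires the NVIDIA Container Toolkit
              count: all
              capabilities: [gpu]
    restart: unless-stopped

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"                   # UI served on http://localhost:3000
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434   # reach Ollama over the compose network
    volumes:
      - open-webui:/app/backend/data  # accounts and chat history persist here
    depends_on:
      - ollama
    restart: unless-stopped

volumes:
  ollama:
  open-webui:
```

Under this sketch, once the stack is up, `docker compose exec ollama ollama pull qwen3:14b` downloads the model so it shows up in the dropdown. The service name `ollama` comes from the sketch, not from the post.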
## Run It
Start the stack with `docker compose up -d` from the directory containing the file. Then open [http://localhost:3000](http://localhost:3000), create your admin account, pick `qwen3:14b` from the dropdown, and start chatting.
## What Makes It Great
- $0/month vs $20/month for ChatGPT Plus
- Full privacy - conversations stay on your machine
- Works offline - no internet connection needed after setup
- Multi-user - share with family or your team on the same LAN
- Model switching - swap between different models mid-conversation
## Performance
On an RTX 3060 12GB with Qwen3 14B (Q4): ~20-25 tok/s, smooth for chat. For 8GB cards, use Llama 3.1 8B instead.
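
A rough way to see why those card sizes pair with those models: weight memory is about the parameter count times the bits per weight, divided by 8. This back-of-the-envelope sketch assumes exactly 4 bits per weight and round counts of 14B and 8B parameters. Real Q4 files carry some extra bits per weight for scaling data, and the KV cache and runtime need room on top.

$$
\begin{aligned}
\text{weights (bytes)} &\approx N_{\text{params}} \times \frac{b}{8} \\
\text{14B at } b = 4: &\quad 14 \times 10^{9} \times \tfrac{4}{8} = 7 \times 10^{9}\ \text{bytes} \approx 7\ \text{GB} \\
\text{8B at } b = 4: &\quad 8 \times 10^{9} \times \tfrac{4}{8} = 4 \times 10^{9}\ \text{bytes} \approx 4\ \text{GB}
\end{aligned}
$$

Under these assumptions the 14B model leaves a few GB of headroom on a 12GB card but almost none on an 8GB card, which is why the 8B model is the better fit there.
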
*Originally published on everylocalai.com*