How to Set Up a Local AI Coding Assistant in VS Code – Free & Private

A developer has published a guide for setting up a local AI coding assistant in VS Code using Continue and Ollama, achieving tab autocomplete and code chat entirely on-device. The setup calls for a GPU with 24GB+ VRAM to run the Qwen2.5-Coder 14B model, but smaller GPUs can use Qwen2.5-Coder 7B. The guide emphasizes zero cost, privacy, offline capability, and model flexibility.

Want a Cursor/Copilot-style coding assistant that runs entirely on your machine? Your code never leaves your computer and there's no subscription fee. Here's how to set it up with VS Code, Continue, and Ollama.

What You'll Build

- Tab autocomplete like Copilot that suggests code as you type
- Chat with your codebase: ask questions, generate functions, write tests
- 100% local: zero data sent to any cloud service

Prerequisites

- A GPU with 24GB+ VRAM (RTX 3090/4090 or better)
- For smaller GPUs (8-12GB), use Qwen2.5-Coder 7B instead
- Ollama installed (see ollama.com)
- VS Code (free from code.visualstudio.com)

Step 1: Pull the Model

Open a terminal and use ollama pull to download a coding-focused model: Qwen2.5-Coder 14B, or the 7B variant on a smaller GPU. This takes a few minutes depending on your internet connection. The 14B model is roughly 8-9GB at Q4 quantization.

Step 2: Install Continue

In VS Code:

- Open Extensions (Ctrl+Shift+X)
- Search for "Continue"
- Click Install
- Reload VS Code when prompted

Step 3: Configure

Create or edit ~/.continue/config.yaml so that Continue uses the Ollama model you pulled for chat, edits, and autocomplete.

Step 4: Use It

- Autocomplete: Start typing. Continue suggests completions in gray. Press Tab to accept.
- Chat: Press Ctrl+L (or Cmd+L on Mac) to open the chat panel. Ask questions about your code.
- Edit: Select code and press Ctrl+Shift+L to send it to the chat, then ask for changes.
- Inline: Highlight code, press Ctrl+I, and describe what you want changed.
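As a concrete reference for Steps 1 and 3, here is a minimal shell sketch. Several details are assumptions rather than anything stated in the guide: the Ollama tags `qwen2.5-coder:14b` and `qwen2.5-coder:7b`, the 23000 MiB cutoff used to tell a 24GB card from a smaller one, the helper names (`pick_model`, `pull_model`, `write_config`), and the config layout, which follows Continue's `config.yaml` format with a single model serving chat, edit, and autocomplete.

```shell
#!/bin/sh
# Sketch of Steps 1 and 3. Assumptions (not from the original guide): the
# Ollama tags qwen2.5-coder:14b / qwen2.5-coder:7b, the 23000 MiB cutoff,
# and the config layout, which follows Continue's config.yaml format.

# Map total VRAM in MiB to a model tag, following the guide's 24GB vs 8-12GB split.
# The cutoff sits just under 24GB because some 24GB cards report slightly less.
pick_model() {
  if [ "$1" -ge 23000 ]; then
    echo "qwen2.5-coder:14b"
  else
    echo "qwen2.5-coder:7b"
  fi
}

# Step 1: pull the chosen model if the Ollama CLI is on PATH.
pull_model() {
  if command -v ollama >/dev/null 2>&1; then
    ollama pull "$1"
  else
    echo "ollama not found; install it from ollama.com first" >&2
    return 1
  fi
}

# Step 3: write a minimal Continue config into directory $1 for model tag $2.
# This overwrites any config.yaml already in that directory.
write_config() {
  mkdir -p "$1" || return 1
  cat > "$1/config.yaml" <<EOF
name: Local Assistant
version: 1.0.0
schema: v1
models:
  - name: Local Qwen2.5-Coder
    provider: ollama
    model: $2
    roles:
      - chat
      - edit
      - autocomplete
EOF
}

# Example for a 24GB card (uncomment to run against your real setup):
# MODEL="$(pick_model 24576)"
# pull_model "$MODEL" && write_config "$HOME/.continue" "$MODEL"
```

If you already have a Continue config, back it up first, since `write_config` replaces the file. A common variation is to give autocomplete its own smaller model entry so suggestions arrive faster; that would mean a second item under `models` with only the `autocomplete` role.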

Performance Notes

| GPU | Model | Speed | Quality |
| --- | --- | --- | --- |
| RTX 3090 24GB | Qwen2.5-Coder 14B | 25-35 tok/s | Excellent |
| RTX 4090 24GB | Qwen2.5-Coder 14B | 40-50 tok/s | Excellent |
| RTX 3060 12GB | Qwen2.5-Coder 7B | 30-40 tok/s | Good |
| RTX 4060 8GB | Qwen2.5-Coder 7B Q4 | 20-30 tok/s | Good |

Why Go Local?

- $0/month vs $20/seat for Copilot or Cursor
- Privacy: your proprietary code never touches a third-party server
- Offline: works without internet
- Model choice: swap models anytime, no vendor lock-in

Originally published on everylocalai.com: https://everylocalai.com/stack/local-cursor