{"slug": "run-claude-code-locally-for-free-mlx-serve-on-apple-silicon", "title": "Run Claude Code locally for free: mlx-serve on Apple Silicon", "summary": "A developer released mlx-serve, a native Zig server for MLX-format language models on Apple Silicon, enabling local, free, and private use of AI coding assistants like Claude Code. The server exposes OpenAI, Anthropic, and Ollama-compatible APIs from a single binary, achieving 35% faster decode than LM Studio on Gemma 4 E4B 4-bit. It requires no Python, conda, or Docker, and can be installed via Homebrew.", "body_md": "Claude Code is the best AI coding assistant available right now. But it calls the Anthropic API by default, which adds up fast on long sessions.\n\nWhat if you could run it entirely locally - free, private, and on hardware you already own?\n\n**mlx-serve** makes this possible on any Apple Silicon Mac.\n\nmlx-serve is a native Zig server for MLX-format language models on Apple Silicon. It exposes OpenAI-compatible, Anthropic-compatible, and Ollama-compatible HTTP APIs - all on a single port, from a single binary.\n\n```\nbrew install mlx-serve\n```\n\nThat's it. No Python. No conda. No Docker.\n\nClaude Code looks for `ANTHROPIC_BASE_URL` and `ANTHROPIC_API_KEY` in your environment. mlx-serve implements the full Anthropic Messages API, so you just point Claude Code at it:\n\n```\nexport ANTHROPIC_BASE_URL=http://localhost:8080\nexport ANTHROPIC_API_KEY=local\nexport ANTHROPIC_DEFAULT_MODEL=mlx-serve\nmlx-serve --model ~/.mlx-serve/models/mlx-community/gemma-4-e4b-it-4bit --serve\n```\n\nThen launch Claude Code as normal. Streaming, tool calls, thinking blocks - all work.\n\nFull setup guide: [https://mlxserve.com/claude-code-local/](https://mlxserve.com/claude-code-local/)\n\nOn Apple Silicon, mlx-serve achieves 35%+ faster decode than LM Studio on Gemma 4 E4B 4-bit. 
The server is written in Zig with no Python runtime overhead.\n\n`/api/chat`, `/api/generate`, `/api/embed` endpoints - works with Raycast, Open WebUI, Obsidian", "url": "https://wpnews.pro/news/run-claude-code-locally-for-free-mlx-serve-on-apple-silicon", "canonical_source": "https://dev.to/ddalcu/run-claude-code-locally-for-free-mlx-serve-on-apple-silicon-1m8l", "published_at": "2026-07-03 23:26:39+00:00", "updated_at": "2026-07-04 00:19:06.528175+00:00", "lang": "en", "topics": ["developer-tools", "large-language-models", "ai-infrastructure", "ai-products"], "entities": ["mlx-serve", "Claude Code", "Anthropic", "Apple Silicon", "Zig", "MLX", "Gemma 4", "LM Studio"], "alternates": {"html": "https://wpnews.pro/news/run-claude-code-locally-for-free-mlx-serve-on-apple-silicon", "markdown": "https://wpnews.pro/news/run-claude-code-locally-for-free-mlx-serve-on-apple-silicon.md", "text": "https://wpnews.pro/news/run-claude-code-locally-for-free-mlx-serve-on-apple-silicon.txt", "jsonld": "https://wpnews.pro/news/run-claude-code-locally-for-free-mlx-serve-on-apple-silicon.jsonld"}}