{"slug": "show-hn-smart-model-routing-directly-in-claude-codex-and-cursor", "title": "Show HN: Smart model routing directly in Claude, Codex and Cursor", "summary": "Weave released a smart model routing proxy that works with Claude, Codex, and Cursor, ranking #1 on the RouterArena leaderboard. The open-source tool uses an on-box embedder to select the best model per request across Anthropic, OpenAI, Gemini, and open-source providers, with support for streaming, tools, and vision. It is available as a hosted service via npx or self-hosted with Docker and Postgres.", "body_md": "**One endpoint. Every model. Always the right one.**\n\nA drop-in proxy for Anthropic, OpenAI, and Gemini that picks the best model\nfor *every* request, using a tiny on-box embedder, not a vibes-based prompt.\n\n**🥇 #1 on the RouterArena leaderboard** (Acc-Cost Arena[1](#user-content-fn-2-7b8ef23446551177800f80285e622e2b): **76.09**).\n\n*Built by Weave: The #1 engineering intelligence platform,\nloved by Robinhood, PostHog, Reducto, and hundreds of others.*\n\nPoint Claude Code, Codex, Cursor, or your own app at `localhost:8080`. The router:\n\n- 🎯 **Routes per request.** A cluster scorer derived from [Avengers-Pro](https://arxiv.org/abs/2508.12631) picks the right model from your enabled providers, every turn.[2](#user-content-fn-1-7b8ef23446551177800f80285e622e2b)\n- 🔌 **Speaks everyone's API.** Anthropic Messages, OpenAI Chat Completions, Gemini native. Streaming, tools, vision, the works.\n- 🧠 **Knows OSS too.** DeepSeek, Kimi, GLM, Qwen, Llama, Mistral via OpenRouter (or any OpenAI-compatible endpoint).\n- 🔒 **BYOK by default.** Provider keys stay on your box, encrypted at rest.\n- 📊 **Observable.** OTLP traces out of the box. 
See them in the Weave dashboard ([http://localhost:8080/ui/dashboard](http://localhost:8080/ui/dashboard)) or drop in Honeycomb, Datadog, Grafana, whatever.\n\nThe fastest way: point Claude Code, Codex, or opencode at the **hosted**\nWeave Router with one command. No clone, no Docker, no Postgres.\n\n```\nnpx @workweave/router\n```\n\nThat's it. The installer asks which tool (Claude Code, Codex, or opencode), walks you through scope (user vs. project), grabs a router key, and wires the right config file. Other flavors:\n\n```\nnpx @workweave/router --claude              # skip the picker, Claude Code\nnpx @workweave/router --codex               # skip the picker, OpenAI Codex CLI\nnpx @workweave/router --opencode            # skip the picker, opencode\nnpx @workweave/router --scope project       # per-repo, commits settings.json (or .codex/ / opencode.json)\nnpx @workweave/router --local               # self-hosted localhost:8080\nnpx @workweave/router --base-url https://router.acme.internal\nnpx @workweave/router@0.1.0                 # pin a version\n```\n\nRequires Node ≥ 18 (Claude Code and opencode paths also need `jq`). Full\nflag reference: [install/npm/README.md](/workweave/router/blob/main/install/npm/README.md).\n\nIf you want the router (and dashboard) running on your own box:\n\n```\n# 1. Drop a provider key in. OpenRouter is the recommended baseline.\necho \"OPENROUTER_API_KEY=sk-or-v1-...\" >> .env.local\n\n# 2. 
Boot Postgres + router on :8080 and seed an rk_ key.\nmake full-setup\n```\n\nThe router is up at [http://localhost:8080](http://localhost:8080), the dashboard at\n[http://localhost:8080/ui/](http://localhost:8080/ui/) (password: `admin`), and your `rk_...` key\nprints in the logs.\n\n```\n# Call it like Anthropic\ncurl -sS http://localhost:8080/v1/messages \\\n  -H \"Authorization: Bearer rk_...\" \\\n  -d '{\"model\":\"claude-sonnet-4-5\",\"max_tokens\":256,\n       \"messages\":[{\"role\":\"user\",\"content\":\"hi\"}]}'\n\n# ...or like OpenAI\ncurl -sS http://localhost:8080/v1/chat/completions \\\n  -H \"Authorization: Bearer rk_...\" \\\n  -d '{\"model\":\"gpt-4o-mini\",\n       \"messages\":[{\"role\":\"user\",\"content\":\"hi\"}]}'\n\n# Peek at the routing decision without proxying\ncurl -sS http://localhost:8080/v1/route -H \"Authorization: Bearer rk_...\" -d '...'\n```\n\n**Claude Code.** Run `make install-cc` to point Claude Code at the local\nself-hosted router (it's also invoked automatically at the end of\n`make full-setup`). For the hosted router, use the `npx @workweave/router` command above.\n\n**Codex** (OpenAI CLI). `npx @workweave/router --codex` patches\n`~/.codex/config.toml` (or `<repo>/.codex/config.toml` with `--scope project`)\nwith a managed `[model_providers.weave]` block and sets `model_provider = \"weave\"`.\nCodex's existing `OPENAI_API_KEY` flows through to api.openai.com for the\nplan-based passthrough; the router key rides in an `X-Weave-Router-Key` HTTP\nheader. Re-install and `--uninstall --codex` rewrite/remove only the managed\nblock, leaving the rest of your Codex config untouched.\n\n**opencode.** `npx @workweave/router --opencode` merges a `provider.weave`\nentry into `~/.config/opencode/opencode.json` (or `<repo>/opencode.json` with `--scope project`). 
It uses opencode's bundled `@ai-sdk/anthropic` provider pointed at the router's `/v1` endpoint. The router speaks the\nAnthropic Messages API natively, so opencode works unmodified. The router\nkey and identity headers ride alongside the provider config; re-install\nrewrites only the managed block and `--uninstall --opencode` strips it.\n\n**Cursor** *(early beta, performance may not be the best).* Settings →\nModels → *Override OpenAI Base URL* → `http://localhost:8080/v1`, then paste\n`rk_...` as the API key.\n\n**Switching on/off.** After installing, `npx @workweave/router off --claude`\n(or `--codex` / `--opencode`) routes that client straight to its provider\nagain without discarding the router config; `on` flips it back, and `status`\nreports which way it's pointing. Claude Code also gets `/router-off`,\n`/router-on`, and `/router-status` slash commands. Cursor toggles via the same\nSettings → Models override above. See [install/README.md](/workweave/router/blob/main/install/README.md#switching-on-and-off).\n\nTwo keys, don't mix them up:\n\n- `sk-or-...` / `sk-ant-...` / `sk-...` = your upstream provider key. Lives in `.env.local`.\n- `rk_...` = your router key. 
Clients send this as a Bearer token.\n\n| Endpoint | Format |\n|---|---|\n| `POST /v1/messages` | Anthropic Messages, routed |\n| `POST /v1/chat/completions` | OpenAI Chat Completions, routed |\n| `POST /v1beta/models/:action` | Gemini `generateContent`, routed |\n| `POST /v1/route` | Returns the decision, no upstream call |\n| `GET /v1/models` · `POST /v1/messages/count_tokens` | Anthropic passthrough |\n| `GET /health` · `GET /validate` | Liveness + key check |\n\n- 📐 **Configuration reference**: every env var, BYOK encryption, OTel knobs, cluster routing.\n- 🛠️ **Contributing**: layering rules, hot-reload dev, migrations, tests, the whole engineering loop.\n- 🏗️ **Architecture**: package layout, import contracts, recipes for adding endpoints / providers / strategies.\n\n- Token-aware rate limiting (Redis sliding window per installation)\n- Sub-installations for tenant hierarchies\n- Speculative dispatch + hedging for tail latency\n\n## Footnotes\n\n1. Lu, Y., Liu, R., Yuan, J., Cui, X., Zhang, S., Liu, H., & Xing, J. *RouterArena: An Open Platform for Comprehensive Comparison of LLM Routers.* arXiv:2510.00202, 2025. [https://arxiv.org/abs/2510.00202](https://arxiv.org/abs/2510.00202) [↩](#user-content-fnref-2-7b8ef23446551177800f80285e622e2b)\n2. Zhang, Y. et al. *Beyond GPT-5: Making LLMs Cheaper and Better via Performance–Efficiency Optimized Routing* (Avengers-Pro). 
arXiv:2508.12631, 2025. [https://arxiv.org/abs/2508.12631](https://arxiv.org/abs/2508.12631) [↩](#user-content-fnref-1-7b8ef23446551177800f80285e622e2b)", "url": "https://wpnews.pro/news/show-hn-smart-model-routing-directly-in-claude-codex-and-cursor", "canonical_source": "https://github.com/workweave/router", "published_at": "2026-06-26 16:40:11+00:00", "updated_at": "2026-06-26 17:03:20.323689+00:00", "lang": "en", "topics": ["ai-tools", "ai-infrastructure", "developer-tools", "large-language-models", "machine-learning"], "entities": ["Weave", "Anthropic", "OpenAI", "Gemini", "OpenRouter", "Robinhood", "PostHog", "Reducto"], "alternates": {"html": "https://wpnews.pro/news/show-hn-smart-model-routing-directly-in-claude-codex-and-cursor", "markdown": "https://wpnews.pro/news/show-hn-smart-model-routing-directly-in-claude-codex-and-cursor.md", "text": "https://wpnews.pro/news/show-hn-smart-model-routing-directly-in-claude-codex-and-cursor.txt", "jsonld": "https://wpnews.pro/news/show-hn-smart-model-routing-directly-in-claude-codex-and-cursor.jsonld"}}