One endpoint. Every model. Always the right one.
A drop-in proxy for Anthropic, OpenAI, and Gemini that picks the best model for every request, using a tiny on-box embedder rather than a vibes-based prompt.
🥇 #1 on the RouterArena leaderboard [1] (Acc-Cost Arena: 176.09).
Built by Weave, the #1 engineering intelligence platform, loved by Robinhood, PostHog, Reducto, and hundreds of others.
Point Claude Code, Codex, Cursor, or your own app at `localhost:8080`. The router:
- 🎯 Routes per request. A cluster scorer derived from Avengers-Pro [2] picks the right model from your enabled providers, every turn.
- 🔌 Speaks everyone's API. Anthropic Messages, OpenAI Chat Completions, Gemini native. Streaming, tools, vision, the works.
- 🧠 Knows OSS too. DeepSeek, Kimi, GLM, Qwen, Llama, Mistral via OpenRouter (or any OpenAI-compatible endpoint).
- 🔒 BYOK by default. Provider keys stay on your box, encrypted at rest.
- 📊 Observable. OTLP traces out of the box. View them in the Weave dashboard (http://localhost:8080/ui/dashboard) or drop in Honeycomb, Datadog, Grafana, whatever.
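To make the "cluster scorer" idea concrete, here is a minimal sketch of cluster-based routing. It is not the router's actual scorer: the centroids, model names, per-cluster statistics, and the `alpha` trade-off weight are all hypothetical, and it uses only the single nearest cluster for simplicity. The prompt embedding is assumed to come from some embedder that is not shown here.

```python
import math

# Hypothetical cluster centroids in a toy 2-D embedding space.
CENTROIDS = {
    "coding": (1.0, 0.0),
    "chat": (0.0, 1.0),
}

# Hypothetical per-cluster stats: model -> (accuracy in [0, 1], normalized cost in [0, 1]).
CLUSTER_STATS = {
    "coding": {"big-model": (0.90, 1.00), "small-model": (0.40, 0.10)},
    "chat": {"big-model": (0.92, 1.00), "small-model": (0.90, 0.10)},
}


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route(embedding, enabled_models, alpha=0.7):
    """Pick the nearest cluster, then the enabled model with the best accuracy/cost score."""
    cluster = max(CENTROIDS, key=lambda name: cosine(embedding, CENTROIDS[name]))
    stats = CLUSTER_STATS[cluster]

    def score(model):
        accuracy, cost = stats[model]
        # Higher alpha favors accuracy; lower alpha favors cheap models.
        return alpha * accuracy + (1 - alpha) * (1 - cost)

    candidates = [m for m in enabled_models if m in stats]
    if not candidates:
        raise ValueError("no enabled model has stats for this cluster")
    return cluster, max(candidates, key=score)
```

With these toy numbers, a coding-like embedding goes to the larger model while a chat-like embedding goes to the cheaper one, because the accuracy gap in the chat cluster is too small to justify the cost.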
The fastest way: point Claude Code, Codex, or opencode at the hosted Weave Router with one command. No clone, no Docker, no Postgres.
```sh
npx @workweave/router
```
That's it. The installer asks which tool (Claude Code, Codex, or opencode), walks you through scope (user vs. project), grabs a router key, and wires the right config file. Other flavors:
```sh
npx @workweave/router --claude          # skip the picker, Claude Code
npx @workweave/router --codex           # skip the picker, OpenAI Codex CLI
npx @workweave/router --opencode        # skip the picker, opencode
npx @workweave/router --scope project   # per-repo, commits settings.json (or .codex/ / opencode.json)
npx @workweave/router --local           # self-hosted localhost:8080
npx @workweave/router --base-url https://router.acme.internal
npx @workweave/router@0.1.0             # pin a version
```
Requires Node ≥ 18 (the Claude Code and opencode paths also need `jq`). Full flag reference: install/npm/README.md.
If you want the router (and dashboard) running on your own box:
```sh
echo "OPENROUTER_API_KEY=sk-or-v1-..." >> .env.local
make full-setup
```
The router is up at http://localhost:8080, the dashboard at http://localhost:8080/ui/ (password: `admin`), and your `rk_...` key prints in the logs.
```sh
# Anthropic Messages format
curl -sS http://localhost:8080/v1/messages \
  -H "Authorization: Bearer rk_..." \
  -d '{"model":"claude-sonnet-4-5","max_tokens":256,
       "messages":[{"role":"user","content":"hi"}]}'

# OpenAI Chat Completions format
curl -sS http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer rk_..." \
  -d '{"model":"gpt-4o-mini",
       "messages":[{"role":"user","content":"hi"}]}'

# Routing decision only, no upstream call
curl -sS http://localhost:8080/v1/route -H "Authorization: Bearer rk_..." -d '...'
```
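The first curl call above can also be made from Python's standard library. This is a minimal sketch, assuming the self-hosted router at `localhost:8080`; the helper names `build_messages_request` and `send` are illustrative, the `Content-Type` header is an addition that the curl example does not send, and `ROUTER_KEY` is a placeholder for the key printed in your logs.

```python
import json
import urllib.request

ROUTER_URL = "http://localhost:8080"  # self-hosted default from the setup above
ROUTER_KEY = "rk_..."                 # placeholder: use the key printed in the logs


def build_messages_request(prompt, model="claude-sonnet-4-5", max_tokens=256):
    """Build (but do not send) a POST to the Anthropic-format endpoint."""
    body = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{ROUTER_URL}/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {ROUTER_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `send(build_messages_request("hi"))` performs the request. Building and sending are kept separate so the payload can be inspected without a running router.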
Claude Code. Run `make install-cc` to point Claude Code at the local self-hosted router (it's also invoked automatically at the end of `make full-setup`). For the hosted router, use `npx @workweave/router` above.
Codex (OpenAI CLI). `npx @workweave/router --codex` patches `~/.codex/config.toml` (or `<repo>/.codex/config.toml` with `--scope project`) with a managed `[model_providers.weave]` block and sets `model_provider = "weave"`. Codex's existing `OPENAI_API_KEY` flows through to api.openai.com for the plan-based passthrough; the router key rides in an `X-Weave-Router-Key` HTTP header. Re-installing rewrites only the managed block and `--uninstall --codex` removes it, leaving the rest of your Codex config untouched.
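One common way to get "rewrite only the managed block" behavior is to wrap the block in marker comments and replace whatever sits between them. The sketch below illustrates that general technique. It is not the installer's code, and the marker strings and function names are hypothetical.

```python
import re

# Hypothetical marker comments; the real installer's markers are not documented here.
BEGIN = "# >>> weave-router managed block >>>"
END = "# <<< weave-router managed block <<<"
_BLOCK_RE = re.compile(re.escape(BEGIN) + r".*?" + re.escape(END) + r"\n?", re.DOTALL)


def remove_managed_block(config_text):
    """Strip the managed block, leaving every other line untouched."""
    return _BLOCK_RE.sub("", config_text)


def upsert_managed_block(config_text, block_body):
    """Replace the managed block if present, otherwise append it."""
    block = f"{BEGIN}\n{block_body.strip()}\n{END}\n"
    if _BLOCK_RE.search(config_text):
        # A function replacement avoids backslash escapes being interpreted.
        return _BLOCK_RE.sub(lambda _match: block, config_text, count=1)
    separator = "" if config_text.endswith("\n") or not config_text else "\n"
    return config_text + separator + block
```

Running `upsert_managed_block` twice with the same body yields the same text, which is what makes re-installs safe. A real TOML-aware implementation would also need to place top-level keys before any table headers, which this text-level sketch does not attempt.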
opencode. `npx @workweave/router --opencode` merges a `provider.weave` entry into `~/.config/opencode/opencode.json` (or `<repo>/opencode.json` with `--scope project`). It uses opencode's bundled `@ai-sdk/anthropic` provider pointed at the router's `/v1` endpoint. The router speaks the Anthropic Messages API natively, so opencode works unmodified. The router key and identity headers ride alongside the provider config; re-installing rewrites only the managed block and `--uninstall --opencode` strips it.
Cursor (early beta; performance may not be the best). In Settings → Models, set Override OpenAI Base URL to `http://localhost:8080/v1` and paste `rk_...` as the API key.
Switching on/off. After installing, `npx @workweave/router off --claude` (or `--codex` / `--opencode`) routes that client straight to its provider again without discarding the router config; `on` flips it back, and `status` reports which way it's pointing. Claude Code also gets `/router-off`, `/router-on`, and `/router-status` slash commands. Cursor toggles via the same Settings → Models override above. See install/README.md.
Two keys, don't mix them up:
- `sk-or-...` / `sk-ant-...` / `sk-...` = your upstream provider key. Lives in `.env.local`.
- `rk_...` = your router key. Clients send this as a Bearer token.
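If your own client code handles both kinds of key, a prefix check can catch a mix-up before a provider key is sent to the router. This is a small sketch based only on the prefixes listed above; the function names are illustrative and the router itself may validate keys differently.

```python
def classify_key(key):
    """Return 'router', 'upstream', or 'unknown' based on the key prefix."""
    if key.startswith("rk_"):
        return "router"
    if key.startswith(("sk-or-", "sk-ant-", "sk-")):
        return "upstream"
    return "unknown"


def require_router_key(key):
    """Raise early if something other than a router key is about to be used as the Bearer token."""
    kind = classify_key(key)
    if kind != "router":
        raise ValueError(f"expected a router key (rk_...), got key type: {kind}")
    return key
```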
| Endpoint | Format |
|---|---|
| `POST /v1/messages` | Anthropic Messages, routed |
| `POST /v1/chat/completions` | OpenAI Chat Completions, routed |
| `POST /v1beta/models/:action` | Gemini `generateContent`, routed |
| `POST /v1/route` | Returns the decision, no upstream call |
| `GET /v1/models` · `POST /v1/messages/count_tokens` | Anthropic passthrough |
| `GET /health` · `GET /validate` | Liveness + key check |
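A script that starts the router and then sends traffic can poll `GET /health` first. The sketch below assumes the endpoint answers with HTTP 200 once the router is live, which the table does not state explicitly. The helper names are illustrative, and the `probe` and `sleep` parameters exist only so the loop can be exercised without a running server.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=2.0):
    """Return the HTTP status for a GET, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
    except (urllib.error.URLError, OSError):
        return None


def wait_until_healthy(base_url, attempts=10, delay=1.0, probe=http_status, sleep=time.sleep):
    """Poll GET /health until it answers 200 or the attempts run out."""
    for attempt in range(attempts):
        if probe(f"{base_url}/health") == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

For example, `wait_until_healthy("http://localhost:8080")` returns `True` as soon as the router responds, or `False` after roughly ten seconds of failed probes.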
- 📐 Configuration reference: every env var, BYOK encryption, OTel knobs, cluster routing.
- 🛠️ Contributing: layering rules, hot-reload dev, migrations, tests, the whole engineering loop.
- 🏗️ Architecture: package layout, import contracts, recipes for adding endpoints / providers / strategies.

- Token-aware rate limiting (Redis sliding window per installation)
- Sub-installations for tenant hierarchies
- Speculative dispatch + hedging for tail latency

Footnotes
1. Lu, Y., Liu, R., Yuan, J., Cui, X., Zhang, S., Liu, H., & Xing, J. *RouterArena: An Open Platform for Comprehensive Comparison of LLM Routers.* arXiv:2510.00202, 2025. https://arxiv.org/abs/2510.00202
2. Zhang, Y. et al. *Beyond GPT-5: Making LLMs Cheaper and Better via Performance–Efficiency Optimized Routing* (Avengers-Pro). arXiv:2508.12631, 2025. https://arxiv.org/abs/2508.12631