# Show HN: Smart model routing directly in Claude, Codex and Cursor

> Source: <https://github.com/workweave/router>
> Published: 2026-06-26 16:40:11+00:00

**One endpoint. Every model. Always the right one.**

A drop-in proxy for Anthropic, OpenAI, and Gemini that picks the best model
for *every* request: using a tiny on-box embedder, not a vibes-based prompt.

**🥇 #1 on the RouterArena leaderboard**

— Acc-Cost Arena

[1](#user-content-fn-2-7b8ef23446551177800f80285e622e2b)**76.09**.

*Built by Weave: The #1 engineering intelligence platform,
loved by Robinhood, PostHog, Reducto, and hundreds of others.*

Point Claude Code, Codex, Cursor, or your own app at `localhost:8080`

. The router:

- 🎯
**Routes per request.** A cluster scorer derived from[Avengers-Pro](https://arxiv.org/abs/2508.12631)picks the right model from your enabled providers, every turn.[2](#user-content-fn-1-7b8ef23446551177800f80285e622e2b) - 🔌
**Speaks everyone's API.** Anthropic Messages, OpenAI Chat Completions, Gemini native. Streaming, tools, vision, the works. - 🧠
**Knows OSS too.** DeepSeek, Kimi, GLM, Qwen, Llama, Mistral via OpenRouter (or any OpenAI-compatible endpoint). - 🔒
**BYOK by default.** Provider keys stay on your box, encrypted at rest. - 📊
**Observable.** OTLP traces out of the box. See your dashboard in the Weave dashboard ([http://localhost:8080/ui/dashboard](http://localhost:8080/ui/dashboard)) or drop in Honeycomb, Datadog, Grafana, whatever.

The fastest way: point Claude Code, Codex, or opencode at the **hosted**
Weave Router with one command. No clone, no Docker, no Postgres.

```
npx @workweave/router
```

That's it. The installer asks which tool (Claude Code, Codex, or opencode), walks you through scope (user vs. project), grabs a router key, and wires the right config file. Other flavors:

```
npx @workweave/router --claude              # skip the picker, Claude Code
npx @workweave/router --codex               # skip the picker, OpenAI Codex CLI
npx @workweave/router --opencode            # skip the picker, opencode
npx @workweave/router --scope project       # per-repo, commits settings.json (or .codex/ / opencode.json)
npx @workweave/router --local               # self-hosted localhost:8080
npx @workweave/router --base-url https://router.acme.internal
npx @workweave/router@0.1.0                 # pin a version
```

Requires Node ≥ 18 (Claude Code and opencode paths also need `jq`

). Full
flag reference: [install/npm/README.md](/workweave/router/blob/main/install/npm/README.md).

If you want the router (and dashboard) running on your own box:

```
# 1. Drop a provider key in. OpenRouter is the recommended baseline.
echo "OPENROUTER_API_KEY=sk-or-v1-..." >> .env.local

# 2. Boot Postgres + router on :8080 and seed an rk_ key.
make full-setup
```

The router is up at [http://localhost:8080](http://localhost:8080), the dashboard at
[http://localhost:8080/ui/](http://localhost:8080/ui/) (password: `admin`

), and your `rk_...`

key
prints in the logs.

```
# Call it like Anthropic
curl -sS http://localhost:8080/v1/messages \
  -H "Authorization: Bearer rk_..." \
  -d '{"model":"claude-sonnet-4-5","max_tokens":256,
       "messages":[{"role":"user","content":"hi"}]}'

# ...or like OpenAI
curl -sS http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer rk_..." \
  -d '{"model":"gpt-4o-mini",
       "messages":[{"role":"user","content":"hi"}]}'

# Peek at the routing decision without proxying
curl -sS http://localhost:8080/v1/route -H "Authorization: Bearer rk_..." -d '...'
```

**Claude Code.** Run `make install-cc`

to wire Claude Code at the local
self-hosted router (it's also invoked automatically at the end of
`make full-setup`

). For the hosted router, use `npx @workweave/router`

above.

**Codex** (OpenAI CLI). `npx @workweave/router --codex`

patches
`~/.codex/config.toml`

(or `<repo>/.codex/config.toml`

with `--scope project`

)
with a managed `[model_providers.weave]`

block and sets `model_provider = "weave"`

.
Codex's existing `OPENAI_API_KEY`

flows through to api.openai.com for the
plan-based passthrough; the router key rides in an `X-Weave-Router-Key`

HTTP
header. Re-install and `--uninstall --codex`

rewrite/remove only the managed
block, leaving the rest of your Codex config untouched.

**opencode.** `npx @workweave/router --opencode`

merges a `provider.weave`

entry into `~/.config/opencode/opencode.json`

(or `<repo>/opencode.json`

with `--scope project`

). It uses opencode's bundled `@ai-sdk/anthropic`

provider pointed at the router's `/v1`

endpoint — the router speaks the
Anthropic Messages API natively, so opencode works unmodified. The router
key and identity headers ride alongside the provider config; re-install
rewrites only the managed block and `--uninstall --opencode`

strips it.

**Cursor** *(early beta, performance may not be the best).* Settings →
Models → *Override OpenAI Base URL* → `http://localhost:8080/v1`

, paste
`rk_...`

as the API key.

**Switching on/off.** After installing, `npx @workweave/router off --claude`

(or `--codex`

/ `--opencode`

) routes that client straight to its provider
again without discarding the router config; `on`

flips it back, and `status`

reports which way it's pointing. Claude Code also gets `/router-off`

,
`/router-on`

, and `/router-status`

slash commands. Cursor toggles via the same
Settings → Models override above. See [install/README.md](/workweave/router/blob/main/install/README.md#switching-on-and-off).

Two keys, don't mix them up:

`sk-or-...`

/`sk-ant-...`

/`sk-...`

= yourupstreamprovider key. Lives in`.env.local`

.`rk_...`

= yourrouterkey. Clients send this as a Bearer token.

| Endpoint | Format |
|---|---|
`POST /v1/messages` |
Anthropic Messages, routed |
`POST /v1/chat/completions` |
OpenAI Chat Completions, routed |
`POST /v1beta/models/:action` |
Gemini `generateContent` , routed |
`POST /v1/route` |
Returns the decision, no upstream call |
`GET /v1/models` · `POST /v1/messages/count_tokens` |
Anthropic passthrough |
`GET /health` · `GET /validate` |
liveness + key check |

- 📐
: every env var, BYOK encryption, OTel knobs, cluster routing.**Configuration reference** - 🛠️
: layering rules, hot-reload dev, migrations, tests, the whole engineering loop.**Contributing** - 🏗️
: package layout, import contracts, recipes for adding endpoints / providers / strategies.**Architecture**

- Token-aware rate limiting (Redis sliding window per installation)
- Sub-installations for tenant hierarchies
- Speculative dispatch + hedging for tail latency

## Footnotes

-
Lu, Y., Liu, R., Yuan, J., Cui, X., Zhang, S., Liu, H., & Xing, J.

*RouterArena: An Open Platform for Comprehensive Comparison of LLM Routers.*arXiv:2510.00202, 2025.[https://arxiv.org/abs/2510.00202](https://arxiv.org/abs/2510.00202)[↩](#user-content-fnref-2-7b8ef23446551177800f80285e622e2b) -
Zhang, Y. et al.

*Beyond GPT-5: Making LLMs Cheaper and Better via Performance–Efficiency Optimized Routing*(Avengers-Pro). arXiv:2508.12631, 2025.[https://arxiv.org/abs/2508.12631](https://arxiv.org/abs/2508.12631)[↩](#user-content-fnref-1-7b8ef23446551177800f80285e622e2b)
