Show HN: Use Kimi and OpenAI Subscriptions in Claude Code


`claude-code-proxy` lets you use Claude Code with your ChatGPT Plus/Pro subscription or your Kimi Code (kimi.com) account.


I feel Claude Code is still the best harness around, despite occasional frustrations caused by updates. However, Anthropic keeps tightening the usage limits, while OpenAI is still much more generous.

If you want to use OpenAI plans, your best options seem to be OpenCode and Codex. I tried OpenCode, but the UX has many rough edges, and skills in particular feel like a second-class feature. Fortunately it's open source, so I ended up forking it and applying some patches, but I would much rather not maintain a fork.

Homebrew (macOS and Linux):

brew install raine/claude-code-proxy/claude-code-proxy

Install script (macOS and Linux):

curl -fsSL https://raw.githubusercontent.com/raine/claude-code-proxy/main/scripts/install.sh | bash

Manual: download a prebuilt binary for your platform from the releases page. Windows artifacts are published as `claude-code-proxy-windows-amd64.zip` and `claude-code-proxy-windows-arm64.zip`; extract the `.exe` somewhere on your `PATH`.

The proxy supports two upstream providers. Pick one and run its login flow; the proxy refuses to serve traffic for a provider until a token is stored.

Codex (ChatGPT Plus/Pro):

claude-code-proxy codex auth login     # browser OAuth (PKCE)
claude-code-proxy codex auth device    # device-code flow

Sign in with your ChatGPT Plus/Pro account, not an OpenAI API account.

Kimi (kimi.com Kimi Code):

claude-code-proxy kimi auth login      # device-code flow (prints URL + code)

Sign in with your kimi.com account. The verification URL is displayed; open it in any browser, confirm the code, and the CLI polls until done.

On macOS, credentials go to Keychain. On Windows they are written under `%APPDATA%\claude-code-proxy\<provider>\auth.json`; on Linux they are written under `${XDG_CONFIG_HOME:-$HOME/.config}/claude-code-proxy/<provider>/auth.json` (mode 0600 where supported).
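The storage rules above can be sketched as a small lookup. This is only an illustration, not the proxy's actual code: `authLocation` and the `keychain:` prefix are hypothetical names, and the environment is passed in as a plain object so the function stays pure.

```typescript
type Platform = "darwin" | "win32" | "linux";
type Env = Record<string, string | undefined>;

// Hypothetical helper: where credentials for a provider would be stored.
// The "keychain:" prefix is just notation for "macOS Keychain service name".
function authLocation(provider: "codex" | "kimi", platform: Platform, env: Env): string {
  if (platform === "darwin") {
    return `keychain:claude-code-proxy.${provider}`;
  }
  if (platform === "win32") {
    const appData = env["APPDATA"] ?? "";
    return `${appData}\\claude-code-proxy\\${provider}\\auth.json`;
  }
  // Mirrors ${XDG_CONFIG_HOME:-$HOME/.config}: fall back when unset or empty.
  const base = env["XDG_CONFIG_HOME"] || `${env["HOME"] ?? ""}/.config`;
  return `${base}/claude-code-proxy/${provider}/auth.json`;
}
```

For example, `authLocation("kimi", "linux", { HOME: "/home/u" })` yields `/home/u/.config/claude-code-proxy/kimi/auth.json`.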

Verify:

claude-code-proxy codex auth status
claude-code-proxy kimi auth status

Start the proxy:

claude-code-proxy serve                # listens on 127.0.0.1:18765
PORT=11435 claude-code-proxy serve     # change the listen port

Binds to `127.0.0.1` only. One `serve` process handles all providers; the upstream for each request is chosen from `ANTHROPIC_MODEL`.

`ANTHROPIC_MODEL` selects the provider:

- `gpt-5.5`, `gpt-5.4`, `gpt-5.3-codex`, `gpt-5.3-codex-spark`, `gpt-5.4-mini`, `gpt-5.2` → `codex`
- `kimi-for-coding`, `kimi-k2.6`, `k2.6` → `kimi`

An unknown model returns a 400 listing the supported ids. There is no implicit default provider.
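A minimal sketch of this routing, assuming the model id arrives as one of the bare ids listed above (suffix handling is left out). `resolveProvider`, `unknownModelError`, and the error body shape are hypothetical; only the id-to-provider mapping and the 400 status come from the text.

```typescript
type Provider = "codex" | "kimi";

// Model ids and their providers, as listed in this README.
const MODEL_TO_PROVIDER = new Map<string, Provider>([
  ["gpt-5.5", "codex"],
  ["gpt-5.4", "codex"],
  ["gpt-5.3-codex", "codex"],
  ["gpt-5.3-codex-spark", "codex"],
  ["gpt-5.4-mini", "codex"],
  ["gpt-5.2", "codex"],
  ["kimi-for-coding", "kimi"],
  ["kimi-k2.6", "kimi"],
  ["k2.6", "kimi"],
]);

// Returns undefined when no provider claims the model (no implicit default).
function resolveProvider(model: string): Provider | undefined {
  return MODEL_TO_PROVIDER.get(model);
}

// Hypothetical error shape: a 400 whose message lists the supported ids.
function unknownModelError(model: string): { status: number; message: string } {
  const supported = Array.from(MODEL_TO_PROVIDER.keys()).join(", ");
  return { status: 400, message: `Unknown model "${model}". Supported ids: ${supported}` };
}
```

A caller would check `resolveProvider(model)` first and respond with `unknownModelError(model)` when it returns `undefined`.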

Claude Code also issues background requests (session title generation, token counts) against its built-in "small/fast" haiku model id. Those requests would fail with a 400 because no provider claims that id, so set `ANTHROPIC_SMALL_FAST_MODEL` to a concrete id too (the same value as `ANTHROPIC_MODEL` is usually fine):

ANTHROPIC_BASE_URL=http://localhost:18765 \
ANTHROPIC_AUTH_TOKEN=unused \
ANTHROPIC_MODEL=gpt-5.4[1m] \
ANTHROPIC_SMALL_FAST_MODEL=gpt-5.4-mini[1m] \
CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 \
CLAUDE_CODE_DISABLE_NONSTREAMING_FALLBACK=1 \
  claude

ANTHROPIC_BASE_URL=http://localhost:18765 \
ANTHROPIC_AUTH_TOKEN=unused \
ANTHROPIC_MODEL=kimi-for-coding[1m] \
ANTHROPIC_SMALL_FAST_MODEL=kimi-for-coding[1m] \
CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 \
CLAUDE_CODE_DISABLE_NONSTREAMING_FALLBACK=1 \
  claude

`CLAUDE_CODE_DISABLE_NONSTREAMING_FALLBACK=1` is recommended because the proxy always talks to upstream providers with streaming requests, even when it accumulates a non-streaming Anthropic response for Claude Code. Disabling Claude Code's streaming-to-non-streaming fallback avoids retrying a partially completed stream in a way that can duplicate tool calls.

Or set it persistently in `~/.claude/settings.json`:

{
  "env": {
    "ANTHROPIC_BASE_URL": "http://127.0.0.1:18765",
    "ANTHROPIC_AUTH_TOKEN": "unused",
    "ANTHROPIC_MODEL": "gpt-5.4[1m]",
    "ANTHROPIC_SMALL_FAST_MODEL": "gpt-5.4-mini[1m]",
    "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1",
    "CLAUDE_CODE_DISABLE_NONSTREAMING_FALLBACK": "1"
  }
}

Claude Code decides auto-compaction based on the model's context window. For unknown models (like the ones the proxy uses) it defaults to 200K tokens, which is smaller than what the upstream models actually support (GPT-5.4: 400K+, Kimi: 256K). This causes auto-compact to fire earlier than necessary.

The `[1m]` suffix on the model name (shown in the examples above) is a Claude Code convention that tells it to use a 1M-token context window instead. This raises the auto-compact threshold without disabling it entirely.

If you'd rather disable auto-compact completely, set `DISABLE_AUTO_COMPACT=1` in your env or `~/.claude/settings.json`. Manual `/compact` still works, but you risk hitting real upstream limits before Claude Code can compact for you.

If you still have an Anthropic subscription you want to fall back to, you can put a small wrapper in front of `claude` that only injects the proxy env vars when a flag file exists, plus a toggle script to flip the flag. Leave `~/.claude/settings.json` free of proxy env vars so direct-to-Anthropic remains the default.

`~/.local/bin/claude` (ahead of the real `claude` on `PATH`):

#!/bin/bash

if [ -f "$HOME/.claude/claude-code-proxy-enabled" ]; then
    export ANTHROPIC_BASE_URL="http://localhost:18765"
    export ANTHROPIC_AUTH_TOKEN="unused"
    export ANTHROPIC_MODEL="gpt-5.4[1m]"
    export ANTHROPIC_SMALL_FAST_MODEL="gpt-5.4-mini[1m]"
    export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC="1"
    export CLAUDE_CODE_DISABLE_NONSTREAMING_FALLBACK="1"
fi

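# Hand off to the real claude binary. This path must not resolve to this
# wrapper itself, or the script will exec itself in a loop.
exec "$HOME/.local/bin/claude" "$@"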

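The exec target must be the real `claude` binary, not the wrapper; if both resolve to the same path, the wrapper execs itself in a loop. Adjust the exec path to wherever the real binary lives on your system (e.g. `$(bun pm bin -g)/claude`, `$HOME/.claude/local/claude`).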

`claude-proxy-toggle` (anywhere on your `PATH`):

#!/bin/bash
set -euo pipefail

flag="$HOME/.claude/claude-code-proxy-enabled"

if [ -f "$flag" ]; then
    rm "$flag"
    echo "proxy: off"
else
    mkdir -p "$(dirname "$flag")"
    touch "$flag"
    echo "proxy: on"
fi

Run `claude-proxy-toggle` to flip between routing through the proxy (Codex / Kimi) and talking to Anthropic directly. New or continued `claude` sessions pick up the change immediately; already-running sessions keep whatever they started with.

Codex upstream: `https://chatgpt.com/backend-api/codex/responses` (Responses API).

Set `ANTHROPIC_MODEL` to a model your ChatGPT subscription is allowed to use. Append `-fast` to a Codex model name to request Codex fast mode for that request without restarting the proxy. For example, `gpt-5.4-fast[1m]` is sent upstream as model `gpt-5.4` with `service_tier: "priority"`. An explicit `codex.serviceTier` / `CCP_CODEX_SERVICE_TIER` override still takes precedence.
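The `-fast` handling can be sketched as a small parser. This is an illustration under assumptions, not the proxy's code: `parseCodexModel` and `normalizeTier` are hypothetical names, and the sketch assumes the `[1m]` suffix is stripped at the same point (the text only says the upstream model ends up as `gpt-5.4`).

```typescript
type ServiceTier = "priority" | "flex";
type TierSetting = "fast" | "priority" | "flex";

interface CodexTarget {
  model: string;
  serviceTier?: ServiceTier;
}

// `fast` is an alias that is sent upstream as `priority`.
function normalizeTier(tier: TierSetting): ServiceTier {
  return tier === "fast" ? "priority" : tier;
}

// Hypothetical helper: split a requested model id into the upstream model
// and an optional service tier. An explicit override wins over the suffix.
function parseCodexModel(requested: string, override?: TierSetting): CodexTarget {
  // Assumption: the [1m] context-window suffix never goes upstream.
  let model = requested.replace(/\[1m\]$/, "");
  let tier: ServiceTier | undefined;
  if (model.endsWith("-fast")) {
    model = model.slice(0, -"-fast".length);
    tier = "priority";
  }
  if (override !== undefined) {
    tier = normalizeTier(override);
  }
  return tier === undefined ? { model } : { model, serviceTier: tier };
}
```

With this sketch, `parseCodexModel("gpt-5.4-fast[1m]")` gives `{ model: "gpt-5.4", serviceTier: "priority" }`, while passing `"flex"` as the override replaces the tier implied by the suffix.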

Reasoning effort: Claude Code's `output_config.effort` value (the one you see in the UI as `◐ medium · /effort`) is forwarded as Codex `reasoning.effort` (`low` / `medium` / `high` / `xhigh`). Claude Code's `max` value is sent upstream as `xhigh`. An explicit `codex.effort` / `CCP_CODEX_EFFORT` override still takes precedence and can also force `none`.
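The effort mapping described above might look like the following. `toCodexEffort` is a hypothetical name, and returning `undefined` for a missing or unrecognized value (so that no effort is sent) is an assumption of this sketch.

```typescript
type CodexEffort = "none" | "low" | "medium" | "high" | "xhigh";

// Claude Code effort values and the Codex value each one maps to.
const EFFORT_MAP = new Map<string, CodexEffort>([
  ["low", "low"],
  ["medium", "medium"],
  ["high", "high"],
  ["xhigh", "xhigh"],
  ["max", "xhigh"], // Claude Code's `max` is sent upstream as `xhigh`
]);

// Hypothetical helper: an explicit override always wins; otherwise map the
// value Claude Code sent. Unknown or missing values yield undefined.
function toCodexEffort(requested: string | undefined, override?: CodexEffort): CodexEffort | undefined {
  if (override !== undefined) return override;
  if (requested === undefined) return undefined;
  return EFFORT_MAP.get(requested);
}
```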

Confirmed working on Plus: `gpt-5.4`, `gpt-5.3-codex`

Also verified: `gpt-5.2`, `gpt-5.4-mini`

If the resolved model isn't supported by your account, upstream returns a 400 like "The 'gpt-4.1' model is not supported when using Codex with a ChatGPT account." The proxy surfaces that verbatim.

Auth:

Command	What it does
`codex auth login`	Browser OAuth (PKCE) via `auth.openai.com`
`codex auth device`	Device-code OAuth for headless machines
`codex auth status`	Show account ID + token expiry
`codex auth logout`	Delete stored credentials

Kimi upstream: `https://api.kimi.com/coding/v1/chat/completions` (OpenAI-style chat-completions).

Only one wire model is exposed: `kimi-for-coding` (its display name in kimi-cli is Kimi-k2.6, 256k context, supports reasoning + image input + video input). `kimi-k2.6` and `k2.6` are accepted as aliases for the same wire id.

Reasoning effort: Claude Code's `output_config.effort` value (the one you see in the UI as `◐ medium · /effort`) is forwarded as Kimi's `reasoning_effort` (`low` / `medium` / `high`). Thinking blocks from the upstream model are forwarded to Claude Code and rendered as thinking content. If Claude Code disables thinking, the proxy drops both `reasoning_effort` and the `thinking: {type: "enabled"}` flag before forwarding.
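A rough sketch of that rule follows. The request shapes and `kimiReasoningFields` are hypothetical, and two behaviors are assumptions rather than documented facts: that the `thinking` flag is set whenever thinking is not disabled, and that effort values outside `low` / `medium` / `high` are simply omitted.

```typescript
// Simplified view of the fields this sketch reads from the Anthropic request.
interface AnthropicLikeRequest {
  thinking?: { type: "enabled" | "disabled" };
  output_config?: { effort?: string };
}

// Reasoning-related fields that would be added to the Kimi request body.
interface KimiReasoningFields {
  reasoning_effort?: string;
  thinking?: { type: "enabled" };
}

function kimiReasoningFields(req: AnthropicLikeRequest): KimiReasoningFields {
  // Thinking disabled: drop both reasoning_effort and the thinking flag.
  if (req.thinking?.type === "disabled") return {};

  // Assumption: otherwise the thinking flag is always sent.
  const fields: KimiReasoningFields = { thinking: { type: "enabled" } };
  const effort = req.output_config?.effort;
  if (effort === "low" || effort === "medium" || effort === "high") {
    fields.reasoning_effort = effort;
  }
  return fields;
}
```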

Auth:

Command	What it does
`kimi auth login`	Device-code OAuth via `auth.kimi.com`
`kimi auth status`	Show user ID + token expiry
`kimi auth logout`	Delete stored credentials

sequenceDiagram
    autonumber
    participant CC as Claude Code
    participant P as claude-code-proxy
    participant AUTH as OAuth host<br/>(auth.openai.com or<br/>auth.kimi.com)
    participant U as Upstream API<br/>(chatgpt.com/codex or<br/>api.kimi.com)

    Note over P,AUTH: One-time: PKCE / device OAuth<br/>tokens cached locally for reuse

    CC->>P: POST /v1/messages (Anthropic shape, stream: true)

    alt access token expiring
        P->>AUTH: POST /oauth/token (refresh_token)
        AUTH-->>P: new access (+ rotated refresh)
    end

    P->>P: translate request<br/>• strip Anthropic-only fields<br/>• system blocks → instructions / system message<br/>• tool_use / tool_result ↔ provider-specific shapes<br/>• prompt_cache_key = session id
    P->>U: POST upstream<br/>Bearer + provider-specific headers
    U-->>P: provider SSE<br/>(Codex: output_item.*, output_text.delta, …)<br/>(Kimi: chat.completion.chunk, reasoning_content, …)
    P->>P: reducer: typed events<br/>(thinking / text / tool start/delta/stop, finish)
    P-->>CC: Anthropic SSE<br/>(message_start, content_block_*, message_delta, message_stop)
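
The last two steps of the diagram (typed events in, Anthropic SSE out) can be illustrated with a simplified, text-only sketch. `ProxyEvent`, `toAnthropicSse`, and the placeholder message id are hypothetical, and the payloads are trimmed: real Anthropic events carry more fields (for example usage counts), and thinking and tool events are omitted here.

```typescript
// Hypothetical typed events produced by the reducer (text-only subset).
type ProxyEvent =
  | { kind: "text"; delta: string }
  | { kind: "finish"; stopReason: string };

// Format one server-sent event frame.
function sse(event: string, data: object): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

function toAnthropicSse(model: string, events: ProxyEvent[]): string[] {
  const out: string[] = [];
  out.push(sse("message_start", {
    type: "message_start",
    message: { id: "msg_sketch", type: "message", role: "assistant", model, content: [] },
  }));
  let blockOpen = false;
  for (const ev of events) {
    if (ev.kind === "text") {
      if (!blockOpen) {
        // Open the text block lazily, on the first delta.
        out.push(sse("content_block_start", {
          type: "content_block_start", index: 0, content_block: { type: "text", text: "" },
        }));
        blockOpen = true;
      }
      out.push(sse("content_block_delta", {
        type: "content_block_delta", index: 0, delta: { type: "text_delta", text: ev.delta },
      }));
    } else {
      if (blockOpen) {
        out.push(sse("content_block_stop", { type: "content_block_stop", index: 0 }));
        blockOpen = false;
      }
      out.push(sse("message_delta", { type: "message_delta", delta: { stop_reason: ev.stopReason } }));
      out.push(sse("message_stop", { type: "message_stop" }));
    }
  }
  return out;
}
```

A single text delta followed by a finish event produces six frames, in the order `message_start`, `content_block_start`, `content_block_delta`, `content_block_stop`, `message_delta`, `message_stop`.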

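Command	Description
`serve`	Start the HTTP proxy (listen port set by `PORT`)
`codex auth login` / `device` / `status` / `logout`	Manage stored Codex (ChatGPT) credentials
`kimi auth login` / `status` / `logout`	Manage stored Kimi credentials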

`serve` starts the HTTP proxy and blocks. It binds to `127.0.0.1` only and logs to the platform state directory (rotated at 20 MiB). Set `CCP_LOG_STDERR=1` to mirror log lines to stderr while running.

claude-code-proxy serve
PORT=11435 claude-code-proxy serve
CCP_LOG_STDERR=1 claude-code-proxy serve

Prints the supported model → provider mapping on startup. One `serve` process dispatches to any provider based on the `model` field in each request. Requests whose model isn't registered with any provider are rejected with HTTP 400 listing the supported ids.

`codex auth login` runs the PKCE browser flow against `auth.openai.com` using the Codex CLI's client ID. It prints a URL, opens a local callback listener on port 1455, waits for the browser to redirect back, and stores the resulting access / refresh tokens in Keychain on macOS or locally on other platforms. The process exits automatically once the tokens are saved.

claude-code-proxy codex auth login

Sign in with your ChatGPT Plus/Pro account, not an OpenAI API account. The token file includes the extracted `chatgpt_account_id` so the proxy can set the `ChatGPT-Account-Id` header on every upstream call.

`codex auth device` runs the same OAuth flow, but for headless machines. It prints a short user code and a URL; you enter the code from any browser on any other device, and the CLI polls `auth.openai.com` until you authorize, then stores the token.

claude-code-proxy codex auth device

Useful over SSH, inside a container, or on any host that can't open a browser.

`codex auth status` shows whether credentials are stored, the account ID, and how long until the access token expires. It exits non-zero if no auth is present.

claude-code-proxy codex auth status

Example output:

Account: 79342a5e-57b7-44ea-bfdc-a83ba070dad6
Expires: 2026-04-28T16:46:04.827Z (in 863946s)
Storage: macOS Keychain

The proxy refreshes the access token 5 minutes before expiry with a single-flight guard, so concurrent requests never trigger stampedes of refresh calls.
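A single-flight guard of this kind can be sketched by caching the in-flight refresh promise so that concurrent callers share it. This is a minimal illustration, not the proxy's implementation; `TokenManager`, `needsRefresh`, and the `Tokens` shape are hypothetical names.

```typescript
interface Tokens {
  access: string;
  expiresAt: number; // milliseconds since the Unix epoch
}

// Refresh once the token is within 5 minutes of expiry.
const REFRESH_MARGIN_MS = 5 * 60 * 1000;

function needsRefresh(tokens: Tokens, now: number): boolean {
  return tokens.expiresAt - now <= REFRESH_MARGIN_MS;
}

class TokenManager {
  private tokens: Tokens;
  private refresh: () => Promise<Tokens>;
  private inFlight: Promise<Tokens> | undefined = undefined;

  constructor(initial: Tokens, refresh: () => Promise<Tokens>) {
    this.tokens = initial;
    this.refresh = refresh;
  }

  // Not declared async on purpose: concurrent callers must receive the very
  // same promise object while a refresh is running.
  getTokens(now: number = Date.now()): Promise<Tokens> {
    if (!needsRefresh(this.tokens, now)) return Promise.resolve(this.tokens);
    if (this.inFlight === undefined) {
      this.inFlight = this.refresh().then(
        (fresh) => { this.tokens = fresh; this.inFlight = undefined; return fresh; },
        (err) => { this.inFlight = undefined; throw err; },
      );
    }
    return this.inFlight;
  }
}
```

Because the guard is cleared on both success and failure, a failed refresh does not wedge later requests; the next caller simply starts a new one.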

`codex auth logout` removes stored auth credentials. On macOS this deletes the Keychain entry. No server call is made; the refresh token is simply discarded locally.

claude-code-proxy codex auth logout

Run `codex auth login` again to re-authenticate.

`kimi auth login` runs a device-code OAuth flow (RFC 8628) against `auth.kimi.com` using the kimi-cli client ID. It prints a verification URL and a short user code; open the URL in any browser, confirm the code, and the CLI polls until the tokens are issued. Tokens are stored in Keychain on macOS or a mode-0600 file elsewhere.

claude-code-proxy kimi auth login

Sign in with your kimi.com account. The access token has a ~15 minute lifetime; the proxy refreshes it 5 minutes before expiry with a single-flight guard and persists the rotated refresh token.

A persistent device ID is generated on first login next to the Kimi auth file and reused forever — it's bound into the issued JWT, so rotating it would invalidate your token.

claude-code-proxy kimi auth status

Shows the user ID extracted from the token, expiry time, scope, and storage backend. Non-zero exit if no auth is present.

claude-code-proxy kimi auth logout

Removes stored auth credentials (Keychain entry on macOS, file elsewhere). Run `kimi auth login` again to re-authenticate.

The proxy speaks enough of the Anthropic API for Claude Code:

- `POST /v1/messages`: the main turn endpoint (streaming and non-streaming)
- `POST /v1/messages?beta=true`: same (Claude Code always sends `?beta=true`)
- `POST /v1/messages/count_tokens`: local token count via `gpt-tokenizer` (o200k_base); used by Claude Code's compaction logic
- `GET /healthz`: liveness check

Settings can come from either environment variables or a `config.json` file. Precedence per setting: env var > config file > built-in default. The config file is optional; env-var-only setups continue to work unchanged.
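The precedence rule can be sketched for a single setting, the listen port. `pick`, `parsePort`, and `resolvePort` are hypothetical helpers, and treating an unparsable `PORT` value as "unset" is an assumption of this sketch rather than documented behavior.

```typescript
type Env = Record<string, string | undefined>;

interface FileConfig {
  port?: number;
}

const DEFAULT_PORT = 18765;

// env var > config file > built-in default
function pick<T>(fromEnv: T | undefined, fromFile: T | undefined, fallback: T): T {
  if (fromEnv !== undefined) return fromEnv;
  if (fromFile !== undefined) return fromFile;
  return fallback;
}

// Assumption: an empty or invalid PORT value is treated as unset.
function parsePort(raw: string | undefined): number | undefined {
  if (raw === undefined || raw === "") return undefined;
  const n = Number(raw);
  return Number.isInteger(n) && n > 0 && n < 65536 ? n : undefined;
}

function resolvePort(env: Env, file: FileConfig): number {
  return pick(parsePort(env["PORT"]), file.port, DEFAULT_PORT);
}
```

So `resolvePort({ PORT: "11435" }, { port: 9000 })` returns 11435, and with neither source set the result is the default 18765.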

The file lives at `~/.config/claude-code-proxy/config.json` on macOS (deliberately not `~/Library`), at `%APPDATA%\claude-code-proxy\config.json` on Windows, and at `${XDG_CONFIG_HOME:-$HOME/.config}/claude-code-proxy/config.json` on Linux.

{
  "port": 18765,
  "aliasProvider": "codex",
  "codex": {
    "originator": "claude-code-proxy",
    "userAgent": "claude-code-proxy/dev",
    "model": "gpt-5.4",
    "effort": "medium",
    "serviceTier": "fast",
    "baseUrl": "https://chatgpt.com/backend-api/codex/responses"
  },
  "kimi": {
    "userAgent": "KimiCLI/1.37.0",
    "oauthHost": "https://auth.kimi.com",
    "baseUrl": "https://api.kimi.com/coding/v1"
  },
  "log": {
    "stderr": false,
    "verbose": false
  }
}

Variable	Config key	Default	Description
`PORT`	`port`	`18765`	Proxy listen port
`XDG_STATE_HOME`	-	`~/.local/state`	Linux/macOS base dir for `proxy.log`
`CCP_LOG_STDERR`	`log.stderr`	unset	Also mirror log lines to stderr
`CCP_LOG_VERBOSE`	`log.verbose`	unset	Log full request/response bodies + every SSE event
`CCP_ALIAS_PROVIDER`	`aliasProvider`	`codex`	Route Anthropic-style aliases (`haiku`, `sonnet`, `opus`, `claude-*`) through `codex` or `kimi`
`CCP_KIMI_OAUTH_HOST`	`kimi.oauthHost`	`https://auth.kimi.com`	Override Kimi's OAuth host (debugging only)
`CCP_KIMI_BASE_URL`	`kimi.baseUrl`	`https://api.kimi.com/coding/v1`	Override Kimi's API base URL
`CCP_CODEX_MODEL`	`codex.model`	unset	Force all Codex requests to this model (`gpt-5.2`, `gpt-5.3-codex`, `gpt-5.3-codex-spark`, `gpt-5.4`, `gpt-5.4-mini`, `gpt-5.5`)
`CCP_CODEX_EFFORT`	`codex.effort`	unset	Force all Codex requests to this reasoning effort (`none`, `low`, `medium`, `high`, `xhigh`)
`CCP_CODEX_SERVICE_TIER`	`codex.serviceTier`	unset	Force all Codex requests to this service tier (`fast`/`priority`, `flex`; `fast` is sent upstream as `priority`)
`CCP_CODEX_BASE_URL`	`codex.baseUrl`	`https://chatgpt.com/backend-api/codex/responses`	Override the Codex Responses endpoint
`CCP_CODEX_ORIGINATOR`	`codex.originator`	`claude-code-proxy`	Override the `originator` header sent to Codex
`CCP_CODEX_USER_AGENT`	`codex.userAgent`	`claude-code-proxy/<version>`	Override the `User-Agent` header sent to Codex
`CCP_KIMI_USER_AGENT`	`kimi.userAgent`	`KimiCLI/1.37.0`	Override the `User-Agent` header sent to Kimi
`CCP_ORIGINATOR`	-	`claude-code-proxy`	Fallback for `CCP_CODEX_ORIGINATOR`
`CCP_USER_AGENT`	-	unset	Fallback for `CCP_CODEX_USER_AGENT` and `CCP_KIMI_USER_AGENT`

A malformed `config.json` is reported on stderr and ignored; defaults are used in its place. Invalid types for individual keys are warned and skipped without affecting other keys.

- `proxy.log`: JSON-lines log, rotated at 20 MiB. It lives at `$XDG_STATE_HOME/claude-code-proxy/proxy.log` on macOS/Linux and at `%LOCALAPPDATA%\claude-code-proxy\proxy.log` on Windows (falling back to `%USERPROFILE%\AppData\Local`). Secrets (`authorization`, `access`, `refresh`, `id_token`, `ChatGPT-Account-Id`, …) are redacted before write.
- `config.json`: optional configuration file (see table above). It lives at `~/.config/claude-code-proxy/config.json` on macOS, `${XDG_CONFIG_HOME:-$HOME/.config}/claude-code-proxy/config.json` on Linux, and `%APPDATA%\claude-code-proxy\config.json` on Windows.
- Codex tokens: macOS uses Keychain under service `claude-code-proxy.codex`. Linux uses `${XDG_CONFIG_HOME:-$HOME/.config}/claude-code-proxy/codex/auth.json`. Windows uses `%APPDATA%\claude-code-proxy\codex\auth.json`.
- Kimi tokens: macOS uses Keychain under service `claude-code-proxy.kimi`. Linux uses `${XDG_CONFIG_HOME:-$HOME/.config}/claude-code-proxy/kimi/auth.json`. Windows uses `%APPDATA%\claude-code-proxy\kimi\auth.json`.
- Kimi device ID: persistent UUID bound into the Kimi JWT at login. Linux uses `${XDG_CONFIG_HOME:-$HOME/.config}/claude-code-proxy/kimi/device_id`; Windows uses `%APPDATA%\claude-code-proxy\kimi\device_id`. Reused for the lifetime of the install.
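
The log redaction mentioned for `proxy.log` could work roughly as follows. This is a sketch under assumptions: `redact` and the `[redacted]` marker are hypothetical, matching is done case-insensitively on key names, and only the keys named above are covered (the source's list is longer).

```typescript
// Keys named in this README; compared in lower case.
const SECRET_KEYS = new Set(["authorization", "access", "refresh", "id_token", "chatgpt-account-id"]);

// Hypothetical helper: return a deep copy with secret values masked.
function redact(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => redact(item));
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = SECRET_KEYS.has(key.toLowerCase()) ? "[redacted]" : redact(inner);
    }
    return out;
  }
  return value; // strings, numbers, booleans, null, undefined pass through
}
```

The input object is never mutated, so the same request data can still be forwarded upstream after a redacted copy has been logged.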

- Terms of service: using the Codex or Kimi backends from a non-official client is a gray area. Use at your own risk.
- Rate limits: shared across all clients of your upstream account. Codex's `codex.rate_limits.limit_reached` and Kimi's HTTP 429 are both surfaced as HTTP 429 with `retry-after`.
- Codex (image inputs in tool results): Responses API `function_call_output` only takes a string, so image blocks nested inside `tool_result` are replaced with a `[image omitted: <media_type>]` placeholder. Top-level user-message images pass through.
- Kimi (image inputs in tool results): pass through as `image_url` parts (Kimi accepts them in `role:"tool"` content).
- Codex (reasoning blocks): not forwarded to Claude Code (dropped), even if the upstream model produced them.
- Kimi (reasoning blocks): forwarded as Anthropic `thinking` content blocks and rendered by Claude Code. Disable by setting `thinking: {"type":"disabled"}` in your Anthropic request.
- Session title generation: Claude Code's parallel title-gen request is forwarded upstream like any other structured-output request. This costs a handful of tokens per session rather than being stubbed.
- Codex (`output_config.format`): translated to Responses API `text.format` (json_schema with `strict: true`); other Anthropic-specific `output_config` fields are dropped.
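
The Codex image limitation above amounts to flattening a `tool_result` into a single string. A minimal sketch, assuming a simplified block shape and newline joining (both are assumptions; `toFunctionCallOutput` is a hypothetical name):

```typescript
// Simplified tool_result content blocks (only the fields this sketch needs).
type ToolResultBlock =
  | { type: "text"; text: string }
  | { type: "image"; source: { media_type: string } };

// function_call_output only takes a string, so images become placeholders.
function toFunctionCallOutput(blocks: ToolResultBlock[]): string {
  return blocks
    .map((block) =>
      block.type === "text" ? block.text : `[image omitted: ${block.source.media_type}]`,
    )
    .join("\n");
}
```

For a text block `ok` followed by a PNG image, the result is `ok` on one line and `[image omitted: image/png]` on the next.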

bunx tsc --noEmit                          # typecheck
bun src/cli.ts serve                       # run locally (routes all providers)
tail -f ~/.local/state/claude-code-proxy/proxy.log | jq .

Install a compiled dev build globally: compile the current working tree to a binary and place it on your `PATH` without linking:

mkdir -p ~/.local/bin
bun build ./src/cli.ts --compile --outfile ~/.local/bin/claude-code-proxy

- claude-history: search Claude Code conversation history from the terminal
- git-surgeon: non-interactive hunk-level git staging for AI agents
- workmux: manage parallel AI coding tasks in separate git worktrees with tmux
- consult-llm: consult other AI models from your agent workflow
