Source: dev.to · topic: developer-tools

Building ccglass: the architecture of a local LLM reverse proxy

A developer built ccglass, an open-source local reverse proxy that captures LLM API traffic from coding agent CLIs and displays a real-time dashboard of prompts, costs, and cache hit rates. The proxy intercepts local loopback traffic by overriding the API base URL to plain HTTP, avoiding TLS interception issues. It also includes an MCP server that allows Claude Code to query its own request history from within the chat.

Published Jun 17, 2026 · 2 min read

ccglass is a local reverse proxy that captures LLM API traffic from coding agent CLIs (Claude Code, Codex, DeepSeek, Kimi, etc.) and shows you a real-time dashboard of prompts, costs, and cache hit rates.

It's open source. It's 5,000 lines of Node. It's MIT licensed.

GitHub: https://github.com/jianshuo/ccglass

The hardest part wasn't building a proxy. It was making it work with coding agent CLIs that deliberately bypass HTTP_PROXY.

Every native CLI (Claude Code is Node, Codex is Node, DeepSeek's CLI is Go, etc.) opens HTTPS sockets directly. They don't honor HTTP_PROXY env vars. So the standard "man-in-the-middle" pattern (mitmproxy, Charles) doesn't apply: these tools need a CA cert to intercept HTTPS, but the CLI isn't going to trust your CA.

The trick: intercept the local loopback hop, not the wire.

The CLI's API base URL is https://api.anthropic.com. We override it to http://127.0.0.1:8123. Now the local hop is plain HTTP: no cert, no interception, no TLS. The CLI's HTTP client sends its request to http://127.0.0.1:8123, which our proxy receives, logs, and forwards to the real https://api.anthropic.com.
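A minimal sketch of that override step, assuming the launcher copies the parent environment, points the base-URL variables at the local proxy, and spawns the CLI with inherited stdio. The variable names `ANTHROPIC_BASE_URL` and `OPENAI_BASE_URL` and the helpers `buildChildEnv` and `launch` are illustrative assumptions; the article only says "*_BASE_URL", and some providers may expect a path suffix on the URL.

```javascript
const { spawn } = require('node:child_process');

// Assumed variable names; the article only refers to "*_BASE_URL".
const BASE_URL_VARS = ['ANTHROPIC_BASE_URL', 'OPENAI_BASE_URL'];

// Return a copy of the parent environment with every base-URL variable
// pointed at the local proxy. The parent environment is not mutated.
function buildChildEnv(parentEnv, proxyUrl, vars = BASE_URL_VARS) {
  const env = { ...parentEnv };
  for (const name of vars) env[name] = proxyUrl;
  return env;
}

// Spawn the CLI as a child process that talks to the proxy instead of
// the real API. stdio is inherited so the CLI keeps its interactive TTY.
function launch(command, args, proxyUrl = 'http://127.0.0.1:8123') {
  const child = spawn(command, args, {
    stdio: 'inherit',
    env: buildChildEnv(process.env, proxyUrl),
  });
  // Mirror the child's exit status so shell scripts see the CLI's result.
  child.on('exit', (code, signal) => {
    process.exitCode = code ?? (signal ? 1 : 0);
  });
  return child;
}
```

Under these assumptions, `launch('claude', [])` would start the CLI with its API traffic pointed at the loopback address.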

┌─────────────┐   plain HTTP    ┌─────────────┐    HTTPS    ┌─────────────┐
│  Claude     │ ──────────────▶ │  ccglass    │ ──────────▶ │ Anthropic   │
│  Code CLI   │  127.0.0.1:8123 │  proxy      │             │ API         │
└─────────────┘                 └─────────────┘             └─────────────┘
                                       │
                                       │ log + dashboard
                                       ▼
                                ┌─────────────┐
                                │  Browser    │
                                │  UI :8123   │
                                └─────────────┘
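The forwarding hop in the diagram can be sketched with Node's built-in `http` and `https` modules. This is a simplified illustration, not ccglass's actual code: `upstreamOptions`, `createProxy`, and the `onRequest` callback are hypothetical names, and the sketch skips details such as hop-by-hop header filtering.

```javascript
const http = require('node:http');
const https = require('node:https');

// The real API host named in the article.
const UPSTREAM = new URL('https://api.anthropic.com');

// Build the options for the upstream request. The Host header is rewritten
// to the upstream host; otherwise it would still say 127.0.0.1:8123.
function upstreamOptions(method, path, headers, upstream = UPSTREAM) {
  return {
    protocol: upstream.protocol,
    hostname: upstream.hostname,
    port: upstream.port || (upstream.protocol === 'https:' ? 443 : 80),
    method,
    path,
    headers: { ...headers, host: upstream.host },
  };
}

// Plain-HTTP server on the loopback side, HTTPS client on the upstream side.
function createProxy(upstream = UPSTREAM, onRequest = () => {}) {
  return http.createServer((req, res) => {
    onRequest({ method: req.method, url: req.url, at: Date.now() });
    const client = upstream.protocol === 'https:' ? https : http;
    const options = upstreamOptions(req.method, req.url, req.headers, upstream);
    const out = client.request(options, (up) => {
      // Relay status, headers, and body back to the CLI as they arrive.
      res.writeHead(up.statusCode, up.headers);
      up.pipe(res);
    });
    out.on('error', (err) => {
      if (!res.headersSent) res.writeHead(502, { 'content-type': 'text/plain' });
      res.end(`upstream error: ${err.message}`);
    });
    req.pipe(out); // stream the request body through without buffering
  });
}
```

Calling `createProxy().listen(8123, '127.0.0.1')` would bind the server to the loopback address only, so nothing outside the machine can reach it.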

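3 components, one of which sets the *_BASE_URL env vars and spawns the CLI as a child process.

The trickiest part: LLM APIs use Server-Sent Events (SSE) for streaming. The CLI expects an openai-sse or anthropic-sse stream. We need to forward that stream to the CLI unchanged while capturing a copy of each chunk for the log.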

In Node, this is pipeline() with a Transform stream that hashes each chunk and writes it to a side channel. The CLI gets the original stream unchanged.
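A minimal sketch of such a tap, assuming SHA-256 as the hash and a plain callback as the side channel (the article names neither). `createTap` and `onChunk` are hypothetical names.

```javascript
const { Transform } = require('node:stream');
const { createHash } = require('node:crypto');

// Pass-through Transform: every chunk is reported to `onChunk` with its
// hash and size, then pushed downstream byte-for-byte unchanged.
function createTap(onChunk) {
  return new Transform({
    transform(chunk, _encoding, callback) {
      try {
        onChunk({
          sha256: createHash('sha256').update(chunk).digest('hex'),
          bytes: chunk.length,
          chunk,
        });
      } catch {
        // A failure in the side channel must never break the live stream.
      }
      callback(null, chunk);
    },
  });
}
```

In a proxy it would sit in the middle of a `pipeline(upstreamResponse, createTap(record), clientResponse, onDone)` call, where `record` and `onDone` are whatever logging and cleanup callbacks the proxy uses. Because the tap never buffers, SSE events reach the CLI with no added latency beyond the hash.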

Each provider has a different pricing model. Cache hits, prompt caching, and batch APIs all change the math.

I extracted pricing into a JSON file (data/pricing.json) keyed by provider:model and updated monthly. The cost is computed during the response stream, so you see it accumulating in real time on the dashboard.
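A sketch of how a lookup keyed by provider:model might turn token counts into a cost. The table layout, the field names (`input`, `cacheRead`, `output`), the per-million-token unit, and every number below are placeholder assumptions, not the contents of data/pricing.json and not real prices.

```javascript
// Placeholder table in the assumed shape: USD per million tokens.
// The key and the figures are invented for illustration only.
const PRICING = {
  'example-provider:example-model': { input: 3, cacheRead: 0.3, output: 15 },
};

// Compute the cost of one request from its token usage.
// Returns null when the model is missing from the table, so the
// dashboard can show "unknown" instead of a misleading zero.
function computeCost(pricing, provider, model, usage) {
  const rates = pricing[`${provider}:${model}`];
  if (!rates) return null;
  const { inputTokens = 0, cacheReadTokens = 0, outputTokens = 0 } = usage;
  return (
    (inputTokens * rates.input +
      cacheReadTokens * rates.cacheRead +
      outputTokens * rates.output) / 1e6
  );
}
```

Calling `computeCost` again each time the stream reports updated usage gives the running total that a live dashboard can display. Billing cached tokens at their own rate is what makes the cache hit rate visible as money.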

The wild feature: ccglass has its own MCP (Model Context Protocol) server. When Claude Code starts, it can call our MCP tools. One of them is get_recent_requests: Claude can query its own request history from inside the chat.

User: what did I prompt you with 3 turns ago?
Claude: [calls ccglass MCP get_recent_requests]
Claude: You prompted me with "refactor the user service to use the new repository pattern".

It's recursive and weird. I love it.
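The data side of a tool like get_recent_requests can be sketched without any MCP library: a bounded in-memory log plus a handler that returns the result in the MCP tool-result shape (a `content` array of typed parts). `RequestLog`, `getRecentRequests`, the entry fields, and the limits are hypothetical, and registering the handler with an MCP server is left out.

```javascript
// Bounded in-memory log of proxied requests (hypothetical entry shape).
class RequestLog {
  constructor(capacity = 200) {
    this.capacity = capacity;
    this.items = [];
  }

  add(entry) {
    this.items.push(entry);
    // Drop the oldest entry once the log is full.
    if (this.items.length > this.capacity) this.items.shift();
  }

  // Most recent entries first.
  recent(limit = 10) {
    return this.items.slice(-limit).reverse();
  }
}

// Handler body for a tool named get_recent_requests. The limit is
// clamped so a model cannot pull the whole history into its context.
function getRecentRequests(log, { limit = 10 } = {}) {
  const rows = log.recent(Math.max(1, Math.min(limit, 50)));
  return { content: [{ type: 'text', text: JSON.stringify(rows, null, 2) }] };
}
```

The proxy would call `log.add(...)` for each captured request, and the MCP server would call `getRecentRequests(log, args)` when the model invokes the tool.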

npm i -g ccglass
ccglass claude

Open the dashboard. Run a few prompts. The first time you see your own cache hit rate, you'll get it.
