Building ccglass: the architecture of a local LLM reverse proxy


[ARTICLE · art-30414] src=dev.to ↗ pub=2026-06-17T02:14Z topic=developer-tools verified=true sentiment=· neutral

A developer built ccglass, an open-source local reverse proxy that captures LLM API traffic from coding agent CLIs and displays a real-time dashboard of prompts, costs, and cache hit rates. The proxy intercepts local loopback traffic by overriding the API base URL to plain HTTP, avoiding TLS interception issues. It also includes an MCP server that allows Claude Code to query its own request history from within the chat.

2 min read · 25 views · published Jun 17, 2026

ccglass is a local reverse proxy that captures LLM API traffic from coding agent CLIs (Claude Code, Codex, DeepSeek, Kimi, etc.) and shows you a real-time dashboard of prompts, costs, and cache hit rates.

It's open source. It's 5,000 lines of Node. It's MIT licensed.

GitHub: https://github.com/jianshuo/ccglass

The hardest part wasn't building a proxy. It was making it work with coding agent CLIs that deliberately bypass HTTP_PROXY.

Every native CLI (Claude Code is Node, Codex is Node, DeepSeek's CLI is Go, etc.) opens HTTPS sockets directly. They don't honor HTTP_PROXY env vars. So the standard "man-in-the-middle" pattern (mitmproxy, Charles) doesn't apply: tools like these need a CA cert to intercept HTTPS, and the CLI isn't going to trust your CA.

The trick: intercept the local loopback hop, not the wire.

The CLI's API base URL is https://api.anthropic.com. We override it to http://127.0.0.1:8123. Now the local hop is plain HTTP: no cert, no interception, no TLS. The CLI sends its request to http://127.0.0.1:8123, which our proxy receives, logs, and forwards over HTTPS to the real https://api.anthropic.com.
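A minimal sketch of the override step, assuming the CLI reads its base URL from an environment variable. The helper names `withProxyEnv` and `launch` are hypothetical, and this is not ccglass's actual code:

```javascript
const { spawn } = require("node:child_process");

// Return a copy of `env` with one base-URL variable pointed at the local
// proxy. The caller's environment object is left untouched.
function withProxyEnv(env, baseUrlVar, proxyUrl) {
  return { ...env, [baseUrlVar]: proxyUrl };
}

// Spawn the CLI as a child process that shares this terminal's stdio.
// `command` and `args` are whatever the user would normally type.
function launch(command, args, env) {
  return spawn(command, args, { env, stdio: "inherit" });
}
```

Something like `launch("claude", [], withProxyEnv(process.env, "ANTHROPIC_BASE_URL", "http://127.0.0.1:8123"))` would then start the CLI against the proxy. The variable name here is an assumption about what the article's *_BASE_URL pattern expands to for Claude Code.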

┌─────────────┐   plain HTTP    ┌─────────────┐    HTTPS    ┌─────────────┐
│  Claude     │ ──────────────▶ │  ccglass    │ ──────────▶ │ Anthropic   │
│  Code CLI   │  127.0.0.1:8123 │  proxy      │             │ API         │
└─────────────┘                 └─────────────┘             └─────────────┘
                                       │
                                       │ log + dashboard
                                       ▼
                                ┌─────────────┐
                                │  Browser    │
                                │  UI :8123   │
                                └─────────────┘
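The forwarding hop in the diagram can be sketched with Node's built-in http and https modules. This is an illustration under assumptions, not the project's implementation: `upstreamOptions`, `createProxy`, and the `onRequest` logging hook are hypothetical names.

```javascript
const http = require("node:http");
const https = require("node:https");

// Build the options for the outgoing HTTPS request. `upstream` is the real
// API origin, for example "https://api.anthropic.com".
function upstreamOptions(req, upstream) {
  const base = new URL(upstream);
  return {
    hostname: base.hostname,
    port: base.port ? Number(base.port) : 443,
    path: req.url,
    method: req.method,
    // Keep the CLI's headers (auth included) but fix Host for the upstream.
    headers: { ...req.headers, host: base.host },
  };
}

// Create, but do not start, a proxy server. `onRequest` is a hypothetical
// hook that receives one log record per incoming request.
function createProxy(upstream, onRequest = () => {}) {
  return http.createServer((req, res) => {
    onRequest({ method: req.method, url: req.url, at: Date.now() });
    const out = https.request(upstreamOptions(req, upstream), (upRes) => {
      res.writeHead(upRes.statusCode, upRes.headers);
      upRes.pipe(res); // relay the response body as it arrives
    });
    out.on("error", (err) => {
      if (!res.headersSent) res.writeHead(502, { "content-type": "text/plain" });
      res.end(`upstream error: ${err.message}`);
    });
    req.pipe(out); // relay the request body
  });
}
```

Calling `createProxy("https://api.anthropic.com", console.log).listen(8123, "127.0.0.1")` would bind it to the loopback address only, so nothing off the machine can reach it.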

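ccglass has 3 components. One of them sets the *_BASE_URL env vars and spawns the CLI as a child process.

The trickiest part: LLM APIs use Server-Sent Events (SSE) for streaming. The CLI expects an openai-sse or anthropic-sse stream. We need to hand that stream to the CLI untouched while recording it on the side.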

In Node, this is pipeline() with a Transform stream that hashes each chunk and writes it to a side channel. The CLI gets the original stream unchanged.
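One way to read that description, as a sketch: a pass-through Transform that reports a per-chunk SHA-256 to a callback and forwards the bytes as-is. The names `describeChunk` and `createTap` and the record shape are assumptions, and the article does not say whether the hash is per chunk or cumulative.

```javascript
const { Transform } = require("node:stream");
const { createHash } = require("node:crypto");

// Summarize one chunk for the log: byte length, SHA-256, and decoded text.
function describeChunk(chunk) {
  return {
    bytes: chunk.length,
    sha256: createHash("sha256").update(chunk).digest("hex"),
    text: chunk.toString("utf8"),
  };
}

// Pass-through Transform: each chunk is reported to `onChunk` (the side
// channel) and then pushed downstream byte-for-byte.
function createTap(onChunk) {
  return new Transform({
    transform(chunk, _encoding, callback) {
      onChunk(describeChunk(chunk));
      callback(null, chunk);
    },
  });
}
```

Wired up, this would look like `pipeline(upstreamResponse, createTap(record), clientResponse, onDone)` using `pipeline` from node:stream. Chunk boundaries don't line up with SSE event boundaries, so a real logger would need to buffer and split on blank lines before parsing events.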

Each provider has a different pricing model. Cache hits, prompt caching, batch API, all change the math.

I extracted pricing into a JSON file (data/pricing.json) keyed by provider:model and updated monthly. The cost is computed during the response stream so you see cost accumulating in real time on the dashboard.
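To make the lookup concrete, here is a sketch of a cost function over a table keyed by provider:model. The field names, the per-million-token unit, and the prices are all invented for illustration. They are not the real schema of data/pricing.json or any provider's actual rates.

```javascript
// Hypothetical pricing table, in USD per million tokens. The numbers are
// made up and do not reflect real provider prices.
const pricing = {
  "example:model-a": { input: 3, output: 15, cacheRead: 1, cacheWrite: 4 },
};

// Compute the cost of one request from its token usage so far.
// Returns null when the provider:model key is not in the table.
function costUsd(table, provider, model, usage) {
  const price = table[`${provider}:${model}`];
  if (!price) return null;
  const { input = 0, output = 0, cacheRead = 0, cacheWrite = 0 } = usage;
  const total =
    input * price.input +
    output * price.output +
    cacheRead * price.cacheRead +
    cacheWrite * price.cacheWrite;
  return total / 1_000_000;
}
```

Calling `costUsd` again each time the stream reports updated token counts would give the running total that the dashboard shows. Cached reads are priced separately from fresh input, which is why the cache hit rate moves the bill.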

The wild feature: ccglass has its own MCP (Model Context Protocol) server. When Claude Code starts, it can call our MCP tools. One of them is get_recent_requests: Claude can query its own request history from inside the chat.

User: what did I prompt you with 3 turns ago?
Claude: [calls ccglass MCP get_recent_requests]
Claude: You prompted me with "refactor the user service to use the new repository pattern".

It's recursive and weird. I love it.
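The core of a tool like get_recent_requests could be as small as the sketch below. Only the tool name comes from the article. The in-memory log, the entry fields, and the helper names are assumptions, and registering the handler with an MCP server library is left out.

```javascript
// Return up to `limit` of the most recent log entries, newest first.
// `log` is assumed to be an array in arrival order.
function getRecentRequests(log, limit = 5) {
  const n = Math.max(0, Math.min(limit, log.length));
  return log.slice(log.length - n).reverse();
}

// Wrap entries in the text-content shape that an MCP tool call returns.
function toToolResult(entries) {
  return {
    content: [{ type: "text", text: JSON.stringify(entries, null, 2) }],
  };
}
```

With a log of captured prompts, `toToolResult(getRecentRequests(log, 3))` would hand the model its last three requests as JSON text.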

npm i -g ccglass
ccglass claude

Open the dashboard. Run a few prompts. The first time you see your own cache hit rate, you'll get it.


Read original on dev.to → dev.to/houleixx/building-ccglass-the-architectur…
