# LLM Proxy Projects with Granular API Key Access Control: A Comprehensive Survey

> Source: <https://deepresearch.ninja/2026/06/LLM-Proxy-Projects-with-Granular-API-Key-Access-Control-A-Comprehensive-Survey/>
> Published: 2026-06-10 00:00:00+00:00


A research report examining 38+ open-source and managed LLM proxy/gateway projects that provide API key management and granular access control — from LiteLLM and Bifrost to TensorZero and TrueFoundry, comparing their architectures, features, and trade-offs.

## Executive Summary

This report identifies and analyzes **38+ open-source and managed LLM proxy/gateway projects** that provide API key management and granular access control capabilities. The market has matured rapidly from simple protocol-translating proxies into comprehensive governance platforms. The leading project, LiteLLM (~40K GitHub stars), offers the most complete feature set, with virtual keys scoped per user, team, and budget, but it is limited by Python’s GIL under high concurrency and suffered a supply-chain security breach in March 2026. Newer projects like Bifrost (~5.6K stars, Go-based) claim 54× lower P99 latency and deliver enterprise governance features in their open-source tier without paywalling RBAC or SSO.

A significant expansion from the prior survey includes several notable new entrants: **TensorZero** (~11.4K stars, Rust-based, <1ms P99 latency) brings an ML-optimized gateway with tag-based rate limits; **TrueFoundry** offers enterprise-grade virtual accounts, RBAC, policy-as-code (Cedar/OPA), and SOC 2/HIPAA/ITAR compliance with on-prem deployment; **RelayPlane** (npm-native, Node.js) provides a local-first cost-intelligence proxy with MCP support and zero network-hop overhead; **OpenZiti LLM Gateway** (Go, Apache 2.0) delivers virtual API keys with model-level glob restrictions and zero-trust networking via zrok overlay. Additionally, managed-only platforms like **Kilo Gateway** (500+ models, BYOK), **nexos.ai** (founded by Nord Security creators, $35M funding), and **Braintrust Gateway** (organization-scoped and project-scoped API keys with AES-GCM caching) are tracked separately from self-hostable projects.

The landscape spans five architectural categories: Python libraries (LiteLLM, LM-Proxy), Go binaries (Bifrost, Instawork llm-proxy, VoidLLM, OpenZiti), TypeScript services (Portkey, Helicone, LLM Gateway, OmniRoute), Rust gateways (TensorZero, Helicone’s Rust rewrite), and enterprise API-management platforms (WSO2, built on Envoy; Kong). A significant trend is the emergence of MCP (Model Context Protocol) gateway capabilities as a new governance frontier: Bifrost, Portkey, VoidLLM, WSO2, RelayPlane, and TrueFoundry all now offer MCP tool management with access control.

Critically, the market is bifurcating between **self-hostable open-source proxies** (LiteLLM, Bifrost, TensorZero, VoidLLM, etc.) and **managed SaaS-only platforms** (OpenRouter, Kilo Gateway, nexos.ai, Cloudflare AI Gateway). This distinction has direct implications for data sovereignty, compliance, and operational burden — a theme explored throughout this report.

### Key Findings on API Key Granularity

The deepest analysis in this report examines *how granular* access control actually is across projects:

- **Per-model scoping**: LiteLLM, Bifrost, TensorZero, OpenZiti, LLM Security Gateway, and TrueFoundry all support restricting keys to specific models or model patterns.
- **IP allowlisting**: Only Portkey (geography/IP inbound rules) and lazy-llm-proxy (per-key IP allowlists) provide this natively. Most projects rely on network-level controls (VPC, firewall).
- **Webhook-based validation**: LM-Proxy (Nayjest) supports an external HTTP service or a Python function for custom key validation.
- **Multi-tenancy isolation**: Bifrost (hierarchical org CRUD), TrueFoundry (Virtual Accounts, per-provider RBAC), and LiteLLM (team_id with PostgreSQL-backed logical separation) lead in multi-tenant depth. Physical/infra-level isolation requires self-hosting on-prem/VPC (TrueFoundry, WSO2).

## 1. Background and Context

### 1.1 What is an LLM Proxy?

An LLM proxy (also called an AI gateway, LLM router, or LLM middleware) sits between client applications and LLM inference providers. Its core functions are:

- **Protocol normalization**: Translating different provider API formats (OpenAI’s `/v1/chat/completions`, Anthropic’s `/v1/messages`, Google’s Vertex AI, AWS Bedrock’s SigV4-signed requests) into a single unified interface
- **Access control**: Managing which clients/teams/users can access which models and with what limits
- **Cost governance**: Tracking token usage, enforcing budgets, and providing cost attribution
- **Reliability**: Automatic failover, retry, load balancing, and circuit breaking
- **Observability**: Logging, metrics, tracing
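To make protocol normalization concrete, here is a minimal sketch of one translation direction: an OpenAI-style chat request body converted to an Anthropic-style messages body. It relies on two facts about the Anthropic Messages API (the system prompt is a top-level `system` field, and `max_tokens` is required). The function name and the default of 1024 tokens are illustrative assumptions, and the sketch handles plain-text messages only (no tools, images, streaming, or model-name mapping).

```python
def openai_chat_to_anthropic(body: dict, default_max_tokens: int = 1024) -> dict:
    """Translate an OpenAI-style chat request into an Anthropic-style one.

    Simplified sketch: text-only messages, model name passed through unchanged.
    """
    system_parts = []
    messages = []
    for msg in body["messages"]:
        if msg["role"] == "system":
            # Anthropic takes the system prompt as a top-level field.
            system_parts.append(msg["content"])
        else:
            messages.append({"role": msg["role"], "content": msg["content"]})

    out = {
        "model": body["model"],
        "messages": messages,
        # max_tokens is optional for OpenAI but required by Anthropic.
        "max_tokens": body.get("max_tokens", default_max_tokens),
    }
    if system_parts:
        out["system"] = "\n\n".join(system_parts)
    return out
```

A real gateway also has to translate the response back, which is where most of the per-provider complexity tends to live.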

### 1.2 Why API Key Granularity Matters

In production environments, a single shared provider API key (e.g., one OpenAI `sk-...` key used across all services) creates several problems:

- **No cost attribution**: You cannot determine which team or service drove spending
- **No isolation**: A runaway script in one service burns the entire budget
- **No audit trail**: You cannot attribute a specific request to a user or application
- **Security risk**: If one key is compromised, all services are exposed
- **No rate management**: You cannot enforce per-service or per-user limits

Virtual API keys solve this by providing a layer of indirection — the proxy accepts client-specific keys and maps them to upstream provider credentials internally.
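The indirection can be sketched in a few lines. This is a minimal illustration, not any surveyed project's implementation: the `KeyStore` class, the `vk-` prefix, and the placeholder upstream secret are all hypothetical.

```python
import secrets
from dataclasses import dataclass


@dataclass
class VirtualKey:
    owner: str      # team or service the key was issued to
    provider: str   # upstream provider this key may reach


class KeyStore:
    """Maps client-facing virtual keys to upstream provider credentials."""

    def __init__(self, provider_secrets: dict) -> None:
        self._provider_secrets = provider_secrets  # real keys stay inside the proxy
        self._virtual = {}

    def issue(self, owner: str, provider: str) -> str:
        token = "vk-" + secrets.token_urlsafe(24)  # hypothetical key format
        self._virtual[token] = VirtualKey(owner, provider)
        return token

    def revoke(self, token: str) -> None:
        self._virtual.pop(token, None)

    def resolve(self, token: str):
        """Return (owner, upstream_secret) or raise PermissionError."""
        vk = self._virtual.get(token)
        if vk is None:
            raise PermissionError("unknown or revoked virtual key")
        return vk.owner, self._provider_secrets[vk.provider]


store = KeyStore({"openai": "sk-upstream-placeholder"})
billing_key = store.issue("billing-service", "openai")
owner, upstream = store.resolve(billing_key)
```

Because every request resolves to an `owner`, cost attribution, audit logging, and per-client revocation all follow from the same lookup, and the upstream secret never has to be distributed to clients.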

### 1.3 Market Context: The LiteLLM Breach

In March 2026, two significant events shook confidence in the LLM proxy ecosystem:

- A **supply-chain attack** via compromised PyPI versions (1.82.7, 1.82.8) of LiteLLM that deployed a credential-stealing payload through a poisoned `trivy-action` GitHub Action [1]
- The revelation that LiteLLM’s SOC 2 certification was based on **fabricated compliance reports** from Delve, a YC-backed startup later exposed for producing 500+ structurally identical audit reports [2]

These events accelerated evaluation of alternatives with better security postures (compiled languages, auditable builds) and verified compliance credentials.

## 2. Detailed Project Analysis

### 2.1 LiteLLM (BerriAI) — The Incumbent

| Attribute | Value |
|---|---|
| GitHub | github.com/BerriAI/litellm |
| Stars | ~40,000 |
| Language | Python |
| License | MIT |
| Self-hosted | Yes (Docker, PyPI) |
| Providers | 100+ |

**API Key / Access Control Features:**

- **Virtual keys**: Create verification tokens that act as client-facing API keys. Each key can be scoped to a specific user or team
- **Budget management**: Per-key personal budgets, per-team shared budgets, and per-team-member individual limits within a team’s shared budget [3]
- **Rate limiting**: Configurable RPM (requests per minute) and TPM (tokens per minute) per key
- **Admin UI**: Web dashboard for managing models, keys, teams, and budgets
- **Team management**: Keys can be assigned to teams with `team_id`, enabling hierarchical budget structures
- **Cost tracking**: Automatic mapping of model-specific token pricing; cost data exposed at key, user, and team level [4]
- **Guardrails**: Input/output content filtering (enterprise tier)
- **Load balancing**: Distribute across multiple deployments of the same model

**Limitations:**

- Python GIL limits concurrency under high traffic
- PostgreSQL-backed logging degrades after ~1M records
- Enterprise features (SSO, RBAC, team budgets) gated behind paid license
- Supply-chain vulnerability in March 2026

### 2.2 Bifrost (Maxim AI) — The High-Performance Challenger

| Attribute | Value |
|---|---|
| GitHub | github.com/maximhq/bifrost |
| Stars | ~5,600 |
| Language | Go (75%), TypeScript (17%) |
| License | Apache 2.0 |
| Self-hosted | Yes (Docker, npx, Helm) |
| Providers | 23+ |

**API Key / Access Control Features:**

- **Virtual Keys**: Create separate keys for different applications with independent budgets, rate limits, and access controls [5]
- **Hierarchical Budgets**: Four-tier hierarchy (Customer → Team → Virtual Key → Provider Config). Per-key rate limits, model restrictions, and spend caps enforced at the proxy layer
- **SSO Integration**: Google and GitHub SSO available in open-source version (not paywalled)
- **RBAC**: Role-based access control for admin, team, and user roles
- **Vault Support**: HashiCorp Vault integration for secure API key management
- **OIDC User Provisioning**: OAuth 2.0 / OIDC login with background directory sync for teams, roles, and business units [6]
- **Per-provider budgets**: Budgets scoped per virtual-key top level and per provider, wired from model configs
- **Business unit CRUD**: Full create/read/update/delete for organizational units
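One way to read a tiered budget hierarchy is that a request is admitted only if every level still has room, and the spend is then charged at every level. The sketch below illustrates that idea under stated assumptions; the `Budget` class, `try_charge` function, and dollar figures are hypothetical and are not taken from Bifrost's code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Budget:
    name: str
    limit_usd: float
    spent_usd: float = 0.0

    def remaining(self) -> float:
        return self.limit_usd - self.spent_usd


def try_charge(chain: List[Budget], cost_usd: float) -> Optional[str]:
    """Charge every level, or return the name of the first exhausted level."""
    for level in chain:
        if level.remaining() < cost_usd:
            return level.name  # reject; nothing has been charged yet
    for level in chain:
        level.spent_usd += cost_usd
    return None


# Outermost to innermost, mirroring Customer -> Team -> Virtual Key -> Provider Config.
chain = [
    Budget("customer", 1000.0),
    Budget("team", 200.0),
    Budget("virtual_key", 50.0),
    Budget("provider_config", 20.0),
]
first = try_charge(chain, 15.0)   # fits at every level
second = try_charge(chain, 10.0)  # provider_config has only 5.0 left
```

Checking all levels before charging any of them keeps the counters consistent when a request is rejected partway down the chain.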

**Performance:**

- 11µs overhead at 5,000 RPS (vs. LiteLLM’s ~8ms P95 at 1,000 RPS)
- 54× faster at P99 latency

### 2.3 Portkey AI Gateway

| Attribute | Value |
|---|---|
| GitHub | github.com/Portkey-AI/gateway |
| Stars | ~12,000 |
| Language | TypeScript (96%) |
| License | MIT (open-source gateway) |
| Self-hosted | Yes (Docker, Node.js, Cloudflare Workers) |
| Providers | 1,600+ |

**API Key / Access Control Features:**

- **Virtual Keys**: On-the-fly virtual key generation for secure key management
- **Role-Based Access Control (RBAC)**: Granular access control for users, workspaces, and API keys [7]
- **Secure Key Vault**: Store LLM provider keys in Portkey’s vault; manage access with virtual keys
- **Secret References**: Reference secrets stored in AWS Secrets Manager, Azure Key Vault, or HashiCorp Vault; the gateway fetches credentials at runtime without storing them [8]
- **MCP Gateway**: Centralized control plane for MCP servers with authentication, access control (per-team and per-user), identity forwarding (email, team, roles), and observability
- **Access Control & Inbound Rules**: Control which IPs and geographies can connect to deployments
- **PII Redaction**: Automatically remove sensitive data from requests

**Enterprise Features:**

- SOC2, HIPAA, GDPR, CCPA compliance
- Professional support with feature prioritization

### 2.4 Helicone

| Attribute | Value |
|---|---|
| GitHub | github.com/Helicone/helicone |
| Stars | ~5,800 |
| Language | TypeScript (91%) |
| License | Apache 2.0 |
| Self-hosted | Yes (Docker, Helm) |
| Providers | 100+ |

**API Key / Access Control Features:**

- **AI Gateway**: Single endpoint (`https://ai-gateway.helicone.ai`) accepting a Helicone API key, routing to 100+ models
- **Unified API Key**: One key provides access across all providers
- **Self-hosted deployment**: Docker Compose and Helm charts available
- **Observability-first**: Request logging, cost tracking, latency monitoring; the proxy is primarily an observability layer with light gateway features

**Limitations:**

- Lighter on routing and governance compared to full-featured gateways
- Self-hosting noted as “not recommended” for manual deployment
- Enterprise features (compliance, governance) remain thinner than enterprise-focused alternatives

### 2.5 VoidLLM

| Attribute | Value |
|---|---|
| GitHub | github.com/voidmind-io/voidllm |
| Stars | ~104 |
| Language | Go (80%) |
| License | BSL 1.1 (source-available) |
| Self-hosted | Yes (Docker, Helm, binary) |
| Providers | OpenAI, Anthropic, Azure, Ollama, vLLM, custom |

**API Key / Access Control Features:**

- **Virtual Keys**: Organization-wide access control with org/team/user scoping and RBAC [9]
- **RBAC Hierarchy**: Org → Team → User → Key hierarchy with 4 roles
- **Rate Limits**: Per-key, per-team, per-org request limits (RPM/RPD), most-restrictive-wins across levels
- **Token Budgets**: Daily/monthly token budgets with real-time enforcement
- **Usage Tracking**: Tokens, cost, duration, TTFT (time-to-first-token) per request
- **Model Aliases**: Clients call `default` and the proxy routes anywhere, decoupling client code from the provider
- **MCP Gateway**: Proxy external MCP servers with access control and session management; Code Mode (WASM-sandboxed JS) for multi-tool orchestration
- **Zero-Knowledge Architecture**: By design, never stores or logs prompt/response content. Only metadata (who, what model, how many tokens) is tracked
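The "most-restrictive-wins" rule for rate limits reduces to taking the minimum over whichever levels have a limit configured. A minimal sketch, assuming `None` means "no limit set at this level" (the function name and that convention are illustrative, not VoidLLM's API):

```python
from typing import Optional


def effective_limit(*limits: Optional[int]) -> Optional[int]:
    """Return the tightest configured limit; None means no level sets one."""
    configured = [limit for limit in limits if limit is not None]
    return min(configured) if configured else None


# Org allows 1000 RPM, team allows 300 RPM, the key itself sets nothing.
rpm = effective_limit(1000, 300, None)
```

Under this rule a key-level limit can only tighten what the team and org allow; it can never widen it.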

**Pricing Model:**

- Free tier: Core features
- Pro ($49/mo): Cost reports, usage export, extended retention
- Enterprise ($149/mo): SSO/OIDC, per-org SSO, auto-provisioning, audit logs, OpenTelemetry

### 2.6 OmniRoute

| Attribute | Value |
|---|---|
| GitHub | github.com/diegosouzapw/OmniRoute |
| Stars | ~5,700 |
| Language | TypeScript (100%) |
| License | MIT |
| Self-hosted | Yes (Docker, npm) |
| Providers | 160+ |

**API Key / Access Control Features:**

- **Dedicated API Key Manager**: `/dashboard/api-manager` page for managing API keys with create, delete, and permissions management [10]
- **Encryption at Rest**: Credentials encrypted with AES-256-GCM
- **Authentication Methods**: OAuth, API Key, or Web Cookie
- **Free Providers**: 50+ providers with free tiers aggregated
- **Multi-modal APIs**: Text, image, audio support

**Key Differentiator:**

- Completely free and open-source with no cloud dependency — “No OmniRoute cloud sits in the request path”

### 2.7 LLM-API-Key-Proxy (Mirrowel)

| Attribute | Value |
|---|---|
| GitHub | github.com/Mirrowel/LLM-API-Key-Proxy |
| Stars | ~507 |
| Language | Python (100%) |
| License | MIT (proxy) + LGPL-3.0 (resilience library) |
| Self-hosted | Yes (Docker, binary, source) |
| Providers | Via LiteLLM fallback |

**API Key / Access Control Features:**

- **Single PROXY_API_KEY**: One API key for all clients; configured via environment variable or TUI
- **Multi-provider Key Rotation**: Automatic rotation across multiple provider keys with intelligent cooldowns
- **Usage Tracking**: Per-provider usage statistics persisted to disk
- **Quota Viewer**: Alpha feature for viewing quota windows and fair-cycle status
- **Credential Management**: Interactive TUI for managing API keys and OAuth credentials
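Key rotation with cooldowns can be illustrated with a round-robin selector that skips any key recently reported as rate-limited. This is a simplified sketch of the general technique, not the API of the project's `rotator_library`; the `KeyRotator` class and its method names are hypothetical.

```python
import time


class KeyRotator:
    """Round-robin over provider keys, skipping keys that are cooling down."""

    def __init__(self, keys, cooldown_s=60.0, clock=time.monotonic):
        self._keys = list(keys)
        self._cooldown_s = cooldown_s
        self._clock = clock  # injectable so the behavior can be tested
        self._blocked_until = {k: 0.0 for k in self._keys}
        self._next = 0

    def acquire(self):
        """Return the next usable key, or None if all are cooling down."""
        now = self._clock()
        for _ in range(len(self._keys)):
            key = self._keys[self._next]
            self._next = (self._next + 1) % len(self._keys)
            if self._blocked_until[key] <= now:
                return key
        return None

    def report_rate_limited(self, key):
        """Put a key on cooldown, e.g. after an upstream HTTP 429."""
        self._blocked_until[key] = self._clock() + self._cooldown_s
```

A caller would `acquire()` a key per request and call `report_rate_limited()` when the provider rejects it; a fixed cooldown is the simplest policy, and a production rotator would likely scale it per error type.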

**Architecture:**

- Two components: FastAPI proxy application + standalone Python resilience library (`rotator_library`)
- The resilience library is independently usable for intelligent key selection, deadline-driven requests, and automatic failover

### 2.8 Instawork llm-proxy

| Attribute | Value |
|---|---|
| GitHub | github.com/Instawork/llm-proxy |
| Stars | ~31 |
| Language | Go (96%) |
| License | MIT |
| Self-hosted | Yes (Docker, binary) |
| Providers | OpenAI, Anthropic, Gemini, AWS Bedrock |

**API Key / Access Control Features:**

- **Per-user/API Key Rate Limiting**: Experimental feature for request/token-based limits per user/API key/model/provider
- **Token Estimation**: Provisional token estimation with post-response reconciliation using `X-LLM-Input-Tokens`
- **Circuit Breaker**: Per-key circuit breaker that classifies upstream failures, retries transient errors, and emits a degraded-signal response
- **Per-provider Rollup**: Detects wholesale outages across multiple keys for the same provider
- **Bypass Safety Valve**: Callers without fallback can opt out of fast-fail via header
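A per-key circuit breaker follows the classic closed / open / half-open pattern. The sketch below shows the general technique with one breaker per upstream key; the class, thresholds, and `breaker_for` helper are assumptions for illustration and do not reproduce Instawork's failure classification or provider rollup.

```python
import time


class CircuitBreaker:
    """Closed until N consecutive failures, then open until a timeout elapses."""

    def __init__(self, failure_threshold=3, reset_after_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the breaker is closed

    def allow(self) -> bool:
        if self._opened_at is None:
            return True
        # Half-open: let a probe request through once the timeout has elapsed.
        return self._clock() - self._opened_at >= self.reset_after_s

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()


breakers = {}  # one breaker per upstream key id


def breaker_for(key_id: str) -> CircuitBreaker:
    return breakers.setdefault(key_id, CircuitBreaker())
```

Keying the breaker by upstream credential means one exhausted or revoked key fails fast without taking healthy keys for the same provider out of rotation.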

**Key Differentiator:**

- Minimalist design — “without all the extra stuff you don’t need”
- AWS Bedrock transparent SigV4 passthrough (clients sign with their own credentials)
- Comprehensive circuit breaker with per-model keying and provider rollup

### 2.9 LLM Gateway (theopenco)

| Attribute | Value |
|---|---|
| GitHub | github.com/theopenco/llmgateway |
| Stars | ~1,300 |
| Language | TypeScript (95%) |
| License | AGPLv3 (open-source) + Enterprise |
| Self-hosted | Yes (Docker, unified container) |
| Providers | 210+ |

**API Key / Access Control Features:**

- **API Key Management**: Unified API interface with authentication
- **Usage Analytics**: Track requests, tokens used, response times, and costs
- **Team and Organization Management**: Enterprise features (paid)
- **Custom Provider Key Configurations**: Enterprise tier

**Architecture:**

- Monorepo with separate apps: UI (Next.js), API (Hono), Gateway (routing), Playground, Admin
- PostgreSQL + Redis for data persistence

### 2.10 llm-budget-proxy (InkByteStudio)

| Attribute | Value |
|---|---|
| GitHub | github.com/InkByteStudio/llm-budget-proxy |
| Stars | ~0 |
| Language | TypeScript (89%) |
| License | MIT |
| Self-hosted | Yes (Docker) |
| Providers | OpenAI only (MVP) |

**API Key / Access Control Features:**

- **Per-Key Token Budgets**: Daily and monthly USD budgets per API key
- **Rate Limiting**: RPM and TPM limits with overrides by key pattern
- **Model Downgrade**: Automatic downgrade to cheaper models when approaching budget thresholds (opt-in)
- **Cost Dashboards**: Single-page Chart.js dashboard showing cost by key, over time, and budget status
- **Alert Webhooks**: Slack/Discord webhook notifications at 80% (warn), 95% (downgrade), 100% (block)
- **Response Headers**: Every response includes `X-Request-Cost`, `X-Estimated-Cost`, `X-Budget-Remaining`, `X-Budget-Warning`
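The 80% / 95% / 100% thresholds map naturally onto a small decision function. The sketch below is written in Python for consistency with the other examples in this report (the project itself is TypeScript); the function name and return values are hypothetical.

```python
def budget_action(spent_usd: float, budget_usd: float) -> str:
    """Map budget utilization to an action: ok, warn, downgrade, or block."""
    if budget_usd <= 0:
        return "block"  # assumption: a zero or negative budget admits nothing
    used = spent_usd / budget_usd
    if used >= 1.00:
        return "block"
    if used >= 0.95:
        return "downgrade"
    if used >= 0.80:
        return "warn"
    return "ok"
```

Checking the thresholds from strictest to loosest guarantees that a key at 100% is blocked rather than merely downgraded.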

**Key Differentiator:**

- Deliberately simpler than LiteLLM: single SQLite database, single Docker container, ~5-minute setup vs. ~30 minutes for LiteLLM

### 2.11 LM-Proxy (Nayjest)

| Attribute | Value |
|---|---|
| GitHub | github.com/Nayjest/lm-proxy |
| Stars | ~134 |
| Language | Python (99%) |
| License | MIT |
| Self-hosted | Yes (pip, source) |
| Providers | OpenAI, Anthropic, Google AI, local PyTorch |

**API Key / Access Control Features:**

- **Virtual API Key Management**: Proxy-level keys separate from upstream provider keys
- **User Groups**: Configurable groups with `api_keys` lists and `allowed_connections` restrictions
- **OIDC Integration**: Validate tokens from OpenID Connect providers (Keycloak, Auth0, Okta) as virtual API keys
- **Custom API Key Validation**: Extensible validator functions for custom authentication logic
- **Rate Limiter Handler**: Sliding window rate limiting scoped per api_key, ip, connection, group, or global
- **Extensible Middleware**: Before/request handlers for auditing, header forwarding, and custom logic
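Scoped sliding-window limiting amounts to keeping a separate window of timestamps per scope value (an API key, an IP address, a group name, or one shared constant for a global limit). A minimal sketch of the technique, with a hypothetical class that is not LM-Proxy's handler interface:

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per window_s seconds for each scope value."""

    def __init__(self, max_requests, window_s, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_s = window_s
        self._clock = clock
        self._hits = defaultdict(deque)  # scope value -> request timestamps

    def allow(self, scope_value: str) -> bool:
        now = self._clock()
        hits = self._hits[scope_value]
        # Discard timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

Choosing the scope is then just a matter of what string is passed in: `limiter.allow(api_key)` limits per key, `limiter.allow(client_ip)` per address, and `limiter.allow("global")` across everyone.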

**Configuration:**

- TOML/YAML/JSON/Python config files
- `api_key_check` can reference a Python function or external HTTP service

### 2.12 LLM Security Gateway (TerminalsandCoffee)

| Attribute | Value |
|---|---|
| GitHub | github.com/TerminalsandCoffee/llm-security-gateway |
| Stars | ~1 |
| Language | Python (93%) |
| License | Not specified |
| Self-hosted | Yes (Docker, AWS Lambda) |
| Providers | OpenAI, AWS Bedrock |

**API Key / Access Control Features:**

- **Per-Client Authentication**: `X-API-Key` header with constant-time comparison (`hmac.compare_digest`)
- **Per-Client Configuration**: JSON file or DynamoDB backend for per-client settings
- **Rate Limiting**: Sliding window counter, per-client RPM, returns `X-RateLimit-*` headers
- **Model Allowlist**: Per-client model restrictions (empty = all allowed)
- **AWS Lambda Deployment**: Terraform-managed infrastructure with CI/CD

**Security Pipeline:**

1. Authentication
2. Rate Limiting
3. Model Allowlist
4. Injection Detection (20 patterns, 4 categories)
5. PII Detection (SSN, CC, email, phone, IPv4)
6. Forward
7. Response Scan
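The authentication and model-allowlist stages of this pipeline can be sketched with the standard library alone. `hmac.compare_digest` is the real constant-time comparison named above; the `CLIENTS` table, function names, and the 401/403 status codes are illustrative assumptions, and the rate-limiting, injection, and PII stages are omitted.

```python
import hmac

# Hypothetical per-client configuration (the project stores this in JSON or DynamoDB).
CLIENTS = {
    "svc-a": {"api_key": "key-a", "allowed_models": ["gpt-4o-mini"]},
    "svc-b": {"api_key": "key-b", "allowed_models": []},  # empty = all allowed
}


def authenticate(presented_key: str):
    """Return the matching client id, comparing keys in constant time."""
    match = None
    for client_id, cfg in CLIENTS.items():
        # compare_digest avoids leaking how many leading characters matched.
        if hmac.compare_digest(presented_key.encode(), cfg["api_key"].encode()):
            match = client_id
    return match


def model_allowed(client_id: str, model: str) -> bool:
    allowed = CLIENTS[client_id]["allowed_models"]
    return not allowed or model in allowed


def check_request(presented_key: str, model: str) -> int:
    """Return an HTTP-style status for the first two gatekeeping checks."""
    client_id = authenticate(presented_key)
    if client_id is None:
        return 401
    if not model_allowed(client_id, model):
        return 403
    return 200
```

Ordering the cheap checks (authentication, allowlist) before the content scans means unauthenticated traffic is rejected without spending any time on pattern matching.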

### 2.13 Paperclip (paperclipai)

| Attribute | Value |
|---|---|
| GitHub | github.com/paperclipai/paperclip |
| Stars | ~69,700 |
| Language | TypeScript (98%) |
| License | MIT |
| Self-hosted | Yes (npx, Docker) |
| Scope | AI agent orchestration (not LLM proxy) |

**API Key / Access Control Features:**

- **Agent API Keys**: Short-lived run JWTs for agent execution
- **Per-Agent Monthly Budgets**: Token and cost tracking by company, agent, project, goal, issue, provider, and model
- **Budget Hard Stops**: Overspend pauses agents and cancels queued work
- **Org Chart Governance**: Board approval workflows, execution policies with review/approval stages
- **Multi-Company Isolation**: Complete data isolation between organizations

**Note:** Paperclip is not an LLM proxy — it’s an orchestration/control plane for AI agents. It manages who works on what and how spend is capped, but delegates the actual LLM routing to underlying tools (Claude Code, Codex, HTTP adapters).

### 2.14 WSO2 AI Gateway

| Attribute | Value |
|---|---|
| GitHub | github.com/wso2/wso2-envoy-ai-gateway |
| Language | Go (89%) |
| License | Apache 2.0 |
| Self-hosted | Yes (Docker, Kubernetes) |
| Providers | OpenAI, Anthropic, Google Vertex, Azure AI, AWS Bedrock, Mistral |

**API Key / Access Control Features:**

- **Token-Based Rate Limiting**: Calibrated to how LLMs actually charge (per-token, not per-request)
- **MCP Governance**: Convert REST APIs into MCP-compatible servers, proxy external MCP servers with centralized policy enforcement
- **PII Masking**: Scrub sensitive data before prompts leave the network
- **SOC 2 Type 2 + ISO 27001**: Verified compliance credentials

**Key Differentiator:**

- Built on Envoy Proxy — established Kubernetes-native deployment patterns
- Unbundled adoption: can deploy just the AI Gateway without the full platform

### 2.15 Kong AI Gateway

| Attribute | Value |
|---|---|
| Platform | Kong API Management Platform |
| Language | Go core + Lua plugins |
| License | Partial open-source (core) |
| Self-hosted | Yes |
| Providers | Via plugin architecture |

**API Key / Access Control Features:**

- **Token-Based Rate Limiting**: Enterprise tier
- **RBAC & Audit Logs**: Enterprise API governance
- **AI MCP Proxy Plugin**: Dedicated MCP traffic governance
- **OAuth2 Plugins**: For MCP authentication

**Key Differentiator:**

- Existing enterprise API management penetration — natural adoption path for teams already running Kong

### 2.16 TensorZero — The ML-Optimized Rust Gateway

| Attribute | Value |
|---|---|
| GitHub | github.com/tensorzero/tensorzero |
| Stars | ~11,400 |
| Language | Rust |
| License | Apache 2.0 |
| Self-hosted | Yes (Docker, K8s/Helm examples) |
| Providers | 15+ direct, any OpenAI-compatible via extension |

**API Key / Access Control Features:**

- **Custom API Keys**: Create and manage custom API keys for different clients or services [tensorzero.com/docs/operations/set-up-auth-for-tensorzero]
- **Tag-based rate limits**: Granular scopes; rate limits apply per tag (e.g., per-project, per-team, per-environment)
- **Usage/cost tracking**: Per-tag attribution for cost and usage analytics
- **Structured inference with schema validation**: Enforces input/output schemas, data used for downstream optimization
- **GitOps orchestration**: Prompts, models, parameters, tools, experiments managed via version-controlled config

**Performance:**

- <1ms P99 latency at 10,000+ QPS (Rust)
- LiteLLM @ 100 QPS adds 25-100x more latency than TensorZero @ 10,000 QPS [tensorzero.com/docs/gateway]

**Differentiator:**

- Combines gateway + observability + evaluation + optimization + A/B testing in one platform
- “Autopilot” feature: automated AI engineer that analyzes observability data, sets up evals, optimizes prompts/models, runs A/B tests
- Team: includes Rust compiler maintainer, J.P. Morgan AI Research VP, Columbia postdoc

### 2.17 TrueFoundry AI Gateway — Enterprise Control Plane

| Attribute | Value |
|---|---|
| URL | truefoundry.com/ai-gateway |
| Stars | N/A (enterprise SaaS + self-hosted) |
| Language | Go (internal) |
| License | Proprietary (self-hosted available) |
| Self-hosted | Yes (SaaS, VPC, on-prem, air-gapped) |
| Providers | 250+ models |

**API Key / Access Control Features:**

- **Virtual Accounts (VAT)**: Non-human production identity; gateway-managed keys that map to real provider credentials centrally [truefoundry.com/blog/ai-governance-audit-enterprise-llm-gateway]
- **Personal Access Tokens (PATs)**: For development workflows
- **RBAC**: Scoped per provider account; policy-as-code with Cedar and OPA engines at MCP-tool boundary
- **Rate-limit & budget rules**: Expressed as YAML with per-user/per-team/per-model/per-metadata scopes; first-match-wins evaluation
- **Sliding-window enforcement**: Twelve 5-second buckets summed across 60-second window; bursty but strict [truefoundry.com/docs/ai-gateway/ratelimiting]
- **Audit-grade traces**: Every request, rate-limit decision, guardrail outcome, and fallback hop lands on the same trace ID (`x-tfy-trace-id`), exportable via OpenTelemetry to SIEM
- **Data residency routing**: Region-aware routing keeps regulated data within jurisdiction; provider restrictions block data classes from certain providers
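The bucketed sliding window described above (twelve 5-second buckets covering 60 seconds) trades a little precision for constant memory per limiter. The sketch below shows the general counting scheme; the `BucketedWindow` class is hypothetical and is not TrueFoundry's implementation.

```python
import time


class BucketedWindow:
    """Approximate 60-second sliding window built from twelve 5-second buckets."""

    BUCKET_S = 5
    BUCKETS = 12  # 12 * 5 s = 60 s

    def __init__(self, limit, clock=time.time):
        self.limit = limit
        self._clock = clock
        self._counts = {}  # bucket index -> requests counted in that bucket

    def allow(self) -> bool:
        current = int(self._clock() // self.BUCKET_S)
        oldest = current - self.BUCKETS + 1
        # Drop buckets that have slid out of the 60-second window.
        self._counts = {b: c for b, c in self._counts.items() if b >= oldest}
        if sum(self._counts.values()) >= self.limit:
            return False
        self._counts[current] = self._counts.get(current, 0) + 1
        return True
```

This matches the "bursty but strict" description: the whole allowance can be spent inside one 5-second bucket, but the sum over the window never exceeds the limit, and capacity returns one bucket at a time as old buckets expire.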

**Compliance:**

- SOC 2 Type 2, HIPAA, ITAR certified
- Recognized in Gartner Market Guide for AI Gateways 2026 [truefoundry.com/ai-gateway]

**Differentiator:**

- Most complete governance feature set: virtual keys + RBAC + policy-as-code + compliance-grade audit logs + residency routing
- Deployment flexibility: SaaS, VPC, on-prem, air-gapped
- GPU orchestration and fractional GPU support built in

### 2.18 RelayPlane — The npm-Native Cost-Intelligence Proxy

| Attribute | Value |
|---|---|
| GitHub | github.com/RelayPlane/proxy |
| Stars | ~200 |
| Language | Node.js / TypeScript |
| License | MIT |
| Self-hosted | Yes (npm install -g) |
| Providers | 11+ providers + Ollama |

**API Key / Access Control Features:**

- **Not a virtual-key system**: Uses your own provider keys directly (Anthropic, OpenAI, Google, xAI, Moonshot)
- **Cost intelligence proxy**: Classifies tasks using heuristics (token count, prompt patterns, keyword matching) and routes to cheapest capable model
- **Budget enforcement**: In free tier; configurable cascade fallback when models hit limits
- **Dashboard**: Tracks every request, shows where money goes
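Heuristic cost routing can be pictured as two steps: estimate how demanding a prompt is, then pick the cheapest model that meets that bar. The sketch below is in Python for consistency with the other examples (RelayPlane itself is Node.js); the model catalog, prices, keywords, and thresholds are all invented for illustration.

```python
# Hypothetical catalog: names, prices, and capability tiers are made up.
MODELS = [
    {"name": "small", "usd_per_mtok": 0.2, "tier": 1},
    {"name": "medium", "usd_per_mtok": 3.0, "tier": 2},
    {"name": "large", "usd_per_mtok": 15.0, "tier": 3},
]
HARD_KEYWORDS = ("refactor", "prove", "architecture", "debug")


def required_tier(prompt: str) -> int:
    """Guess task difficulty from length and keywords."""
    approx_tokens = len(prompt) / 4  # rough rule of thumb: ~4 characters per token
    text = prompt.lower()
    if approx_tokens > 2000 or any(k in text for k in HARD_KEYWORDS):
        return 3
    if approx_tokens > 200:
        return 2
    return 1


def pick_model(prompt: str) -> str:
    """Return the cheapest model whose tier meets the estimated requirement."""
    tier = required_tier(prompt)
    capable = [m for m in MODELS if m["tier"] >= tier]
    return min(capable, key=lambda m: m["usd_per_mtok"])["name"]
```

Heuristics like these are cheap to evaluate locally, which suits a local-first proxy, but they misclassify some prompts, so a cascade fallback to a stronger model remains useful.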

**Key Differentiator:**

- npm-native: `npm install -g @relayplane/proxy` takes about 30 seconds, with no Docker, no Python env, and no Go toolchain
- Local-first architecture: runs in-process with your app; zero network-hop overhead
- MCP server support shipped in v1.0.0
- Designed for Claude Code / Cursor / coding agent workflows

### 2.19 OpenZiti LLM Gateway — Zero-Trust Proxy

| Attribute | Value |
|---|---|
| GitHub | github.com/openziti/llm-gateway |
| Stars | ~65 |
| Language | Go (100%) |
| License | Apache 2.0 |
| Self-hosted | Yes (single binary, no DB) |
| Providers | OpenAI, Anthropic, Ollama, vLLM, llama-server, SGLang |

**API Key / Access Control Features:**

- **Virtual API Keys**: Generated via `llm-gateway genkey`; stored in config YAML
- **Model-level restrictions**: Keys can be restricted to specific models using glob patterns (`allowed_models: ["*"]` or `"gpt-4o"`) [github.com/openziti/llm-gateway]
- **Client authentication**: `Authorization: Bearer <key>` header; `/health` and `/metrics` endpoints remain unauthenticated
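Glob-based model restriction is easy to approximate with Python's standard `fnmatch` module. This is a sketch of the matching idea only: the gateway is written in Go and its exact glob semantics may differ, and the `model_permitted` helper and the `claude-*` pattern are illustrative.

```python
from fnmatch import fnmatchcase


def model_permitted(allowed_models, model: str) -> bool:
    """True if the requested model matches any glob pattern on the key."""
    return any(fnmatchcase(model, pattern) for pattern in allowed_models)


unrestricted = model_permitted(["*"], "gpt-4o")              # wildcard allows everything
exact_only = model_permitted(["gpt-4o"], "gpt-4o-mini")      # exact name does not match variants
family = model_permitted(["claude-*"], "claude-example")     # prefix glob matches the family
```

Note that a pattern without a wildcard behaves as an exact match, so `"gpt-4o"` does not implicitly grant access to longer model names that share the prefix.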

**Unique Features:**

- **Zero-trust networking via zrok/OpenZiti overlay**: Expose gateway or reach backends across NAT, air-gapped networks, or cloud boundaries without firewall rules
- **Semantic routing**: Three-layer cascade (keyword heuristics → embedding similarity → LLM classifier) to automatically select best model when `model` field is omitted
- **Multi-endpoint load balancing**: Weighted round-robin with health checks and passive failover across inference backends

**Architecture:**

- Single binary, zero infrastructure — one YAML config, no database, no message queue, no sidecar
- Prometheus metrics endpoint for request counts, latency histograms, token counters

### 2.20 lazy-llm-proxy (Xu-pixel) — Per-Key Security Proxy

| Attribute | Value |
|---|---|
| GitHub | github.com/Xu-pixel/lazy-llm-proxy |
| Stars | ~100 |
| Language | TypeScript |
| License | MIT |
| Self-hosted | Yes (npm) |

**API Key / Access Control Features:**

- **Per-key token budgets**: Daily/monthly limits per API key
- **IP allowlists**: Per-key IP whitelisting, which is rare among LLM proxies
- **Model allowlists**: Per-key model restrictions
- **System prompts**: Optional forced or non-forced system prompts per key
- **Usage breakdowns**: Per-key and per-provider usage tracking
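A per-key IP allowlist check can be written with Python's standard `ipaddress` module, accepting both single addresses and CIDR ranges. This is a sketch in Python for consistency with the other examples (the project is TypeScript); the `ip_allowed` helper is hypothetical, and treating an empty allowlist as "deny all" is an assumption rather than the project's documented behavior.

```python
import ipaddress
from typing import List


def ip_allowed(client_ip: str, allowlist: List[str]) -> bool:
    """True if client_ip falls inside any allowlisted address or CIDR range."""
    addr = ipaddress.ip_address(client_ip)
    return any(
        # strict=False accepts entries such as "10.1.2.3/8" with host bits set.
        addr in ipaddress.ip_network(entry, strict=False)
        for entry in allowlist
    )
```

In a deployment behind a load balancer, the address checked has to come from a trusted forwarding header rather than the socket peer, otherwise every request appears to originate from the balancer.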

**Differentiator:**

- One of the few projects to offer per-key IP allowlisting natively
- Cursor skill integration for agent-side deployment

### 2.21 Other Notable Projects (Self-Hostable)

| Project | URL | Stars | Key Access Control Features |
|---|---|---|---|
| OpenZiti LLM Gateway | github.com/openziti/llm-gateway | ~65 | Virtual API keys with model glob restrictions, zrok zero-trust overlay |
| TensorZero | github.com/tensorzero/tensorzero | ~11.4K | Custom API keys, tag-based granular rate limits |
| TrueFoundry AI Gateway | truefoundry.com/ai-gateway | N/A | Virtual Accounts, PATs, RBAC, policy-as-code (Cedar/OPA), compliance-grade audit logs |
| RelayPlane | github.com/RelayPlane/proxy | ~200 | Provider-key-based (no virtual keys); cost-intelligence proxy; MCP support |
| lazy-llm-proxy | github.com/Xu-pixel/lazy-llm-proxy | ~100 | Per-key budgets, IP allowlists, model allowlists |
| InferXgate | inferxgate.com | New | Rust-based gateway with caching, analytics, cost optimization |
| Barbacane | Dev.to article | — | Rust-native; composable plugins for AI proxy |
| OpenClaw Gateway | openclaw.ai | — | Authentication, rate limiting, routing; open source |
| API7 (API7.ai) | api7.ai | API Gateway | LLM request proxying, ai-proxy plugin with key validation and rate limiting |
| llm-proxy-server | pypi.org/project/llm-proxy-server/ | — | Virtual API keys, user groups, OIDC integration |

### 2.22 Other Notable Projects (Managed SaaS Only)

| Project | URL | Type | Key Access Control Features |
|---|---|---|---|
| OpenRouter | openrouter.ai | Marketplace | Per-user API keys with credit limits, workspace guardrails (2026), budget enforcement, zero data retention |
| Kilo Gateway | kilo.ai/gateway | SaaS | One API key for 500+ models, BYOK support, organization-level access controls |
| nexos.ai | nexos.ai | SaaS | Unified API to 200+ LLMs, governance/observability; founded by Nord Security creators ($35M funding) |
| Cloudflare AI Gateway | cloudflare.com | Edge CDN | Basic analytics, caching, rate limiting at edge network; no self-host option |
| n1n.ai | n1n.ai | SaaS | Single unified key for 500+ models |
| AIMLAPI | aimlapi.com | SaaS | One API for 500+ models |
| FreeLLMAPI | github.com/tashfeenahmed/freellmapi | SaaS | Unified API key, auto-selects best free provider |
| Datawiza Access Proxy | datawiza.com | SaaS | LLM API key management, identity-aware rate limiting, virtual tokens, SSO/mTLS |
| Requesty | requesty.ai | SaaS | API keys, routing policies, MCP gateway |
| Braintrust Gateway | braintrust.dev | SaaS (beta) | Organization-scoped (`sk-`) and project-scoped API keys; service tokens (`bt-st-`); AES-GCM per-user caching |
| LiteRouter | literouter.com | SaaS | Unlimited API creations, concurrent requests |

### 2.23 Projects Not Primarily LLM Proxies

| Project | URL | Scope | Notes |
|---|---|---|---|
| Vellum | vellum.ai | LLMOps/prompt management | “Deployments” are thin API proxies for versioned prompts; not an LLM proxy in the LiteLLM sense |
| Paperclip | github.com/paperclipai/paperclip | AI agent orchestration | Agent API keys, short-lived JWTs, per-agent budgets, org chart governance; delegates LLM routing to underlying tools |
| Unify AI | unify.ai | Unified inference layer | Single API key management and access control; dynamic intent-aware model routing |

## 3. Comparison Matrix

### 3.1 Core Feature Comparison

| Project | Virtual Keys | Per-Key Budgets | Rate Limiting | RBAC | SSO/OIDC | MCP Support | Self-Hosted | Open Source |
|---|---|---|---|---|---|---|---|---|
| LiteLLM | ✅ | ✅ | ✅ | Partial* | Paid only | ❌ | ✅ | MIT |
| Bifrost | ✅ | ✅ | ✅ | ✅ | ✅ (Google/GitHub) | ✅ | ✅ | Apache 2.0 |
| Portkey | ✅ | ✅ | ✅ | ✅ | Enterprise | ✅ | ✅ | MIT |
| Helicone | ✅ | Limited | Limited | ❌ | Enterprise | ❌ | ✅ | Apache 2.0 |
| VoidLLM | ✅ | ✅ | ✅ | ✅ | Enterprise (OIDC) | ✅ | ✅ | BSL 1.1 |
| OmniRoute | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | MIT |
| LLM-API-Key-Proxy | Single key | ❌ | Partial | ❌ | ❌ | ❌ | ✅ | MIT+LGPL |
| Instawork llm-proxy | Per-key limits | ❌ | ✅ (exp.) | ❌ | ❌ | ❌ | ✅ | MIT |
| LLM Gateway | ✅ | Enterprise | Enterprise | Enterprise | Enterprise | ❌ | ✅ | AGPLv3 |
| llm-budget-proxy | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | MIT |
| LM-Proxy | ✅ | ❌ | ✅ | Groups | ✅ (OIDC) | ❌ | ✅ | MIT |
| LLM Security Gateway | Per-client key | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | — |
| WSO2 AI Gateway | ✅ | ✅ (token-based) | ✅ (token-based) | ✅ | Enterprise | ✅ | ✅ | Apache 2.0 |
| Kong AI Gateway | ✅ | Enterprise | Enterprise (tier) | ✅ | Enterprise | ✅ (plugin) | ✅ | Partial |
| TensorZero | ✅ | ✅ (tags) | ✅ (granular) | ❌ | ❌ | ❌ | ✅ | Apache 2.0 |
| TrueFoundry | ✅ (VAT/PAT) | ✅ (YAML rules) | ✅ (per-user/team/model/metadata) | ✅ | Enterprise | ✅ (Cedar/OPA) | ✅ (SaaS/VPC/on-prem) | Proprietary |
| RelayPlane | ❌ | ✅ (free tier) | ✅ | ❌ | ❌ | ✅ (v1.0.0) | ✅ | MIT |
| OpenZiti LLM Gateway | ✅ (glob models) | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | Apache 2.0 |
| lazy-llm-proxy | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | MIT |

\* LiteLLM’s RBAC and SSO are gated behind the enterprise license.

### 3.4 Managed vs Self-Hosted Classification

| Category | Projects |
|---|---|
Self-hostable (open-source) | LiteLLM, Bifrost, Portkey (gateway), Helicone, VoidLLM, OmniRoute, LLM-API-Key-Proxy, Instawork llm-proxy, LLM Gateway (theopenco), llm-budget-proxy, LM-Proxy, LLM Security Gateway, TensorZero, RelayPlane, OpenZiti LLM Gateway, WSO2 AI Gateway, Kong |
Managed SaaS only | OpenRouter, Kilo Gateway, nexos.ai, Cloudflare AI Gateway, Braintrust Gateway (beta hosted), n1n.ai, AIMLAPI, FreeLLMAPI, LiteRouter, Requesty, Datawiza, Unify AI |
Hybrid (SaaS + self-host on enterprise) | Portkey (gateway MIT + managed platform), LLM Gateway (AGPLv3 + enterprise), Braintrust (self-host on enterprise plan), TrueFoundry (SaaS + VPC/on-prem) |

### 3.2 Performance Comparison

| Project | Language | Overhead | Throughput | Notes |
|---|---|---|---|---|
Bifrost | Go | ~8µs P50, ~11µs P95 | 5,000+ RPS | Fastest measured; compiled binary [TECHSY] |
TensorZero | Rust | <1ms P99 | 10,000+ QPS | LiteLLM @ 100 QPS adds 25-100x more latency [tensorzero.com/docs/gateway] |
Instawork llm-proxy | Go | Not published | — | Minimalist design |
VoidLLM | Go | Sub-2ms | — | Zero-knowledge architecture |
WSO2 AI Gateway | Go (Envoy) | Envoy-native | — | Established proxy infrastructure |
Helicone (Rust) | Rust | ~5ms P50, ~8ms P95 | ~3,000 RPS | Cloudflare Workers for logging [TECHSY] |
LiteLLM | Python | ~4ms P50, ~8ms P95 | ~1,000 RPS | GIL-bound under high concurrency [TECHSY] |
Portkey | TypeScript | ~5ms P50, ~12ms P95 | ~2,000 RPS | Claimed <1ms; independent benchmarks not found [TECHSY] |
Kong AI Gateway | Lua/Go | ~3ms P50, ~8ms P95 | ~3,000 RPS | Enterprise API management backbone [TECHSY] |
RelayPlane | Node.js | ~0ms (local) | — | In-process with app; no network hop [relayplane.com] |

**Source notes:**

- Bifrost benchmarks from DEV Community article by Pranay Batta (Jan 2026), with open-source benchmarking suite at github.com/maximhq/bifrost-benchmarking. On t3.medium @ 500 RPS: Bifrost p99=1.68s vs LiteLLM p99=90.72s (54x faster); throughput 424/s vs 44.84/s (9.4x higher). On t3.xlarge @ 5,000 RPS: Bifrost mean overhead=11µs vs LiteLLM ~500µs (45x higher). [DEV Community, Jan 16 2026]
- TECHSY comparison table from “Stop Juggling LLM APIs: 8 Gateway Tools Ranked for 2026” (Jun 6, 2026) [TECHSY]
- TensorZero benchmarks from official docs [tensorzero.com/docs/gateway]

**Important caveat**: All benchmarks above are self-reported by project authors. Independent third-party benchmarks across a common test harness do not yet exist for this market.
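
The headline ratios in the Bifrost source note can be re-derived from the raw figures it quotes. The short Python check below only repeats that arithmetic; it does not validate the benchmark itself, and the variable names are illustrative.

```python
# Figures as quoted in the Bifrost source note (self-reported).
p99_ratio = 90.72 / 1.68          # LiteLLM p99 / Bifrost p99, t3.medium @ 500 RPS
throughput_ratio = 424 / 44.84    # Bifrost req/s / LiteLLM req/s, same run
overhead_ratio = 500 / 11         # LiteLLM ~500 µs / Bifrost 11 µs, t3.xlarge @ 5,000 RPS

print(f"p99 latency ratio: {p99_ratio:.1f}x")
print(f"throughput ratio:  {throughput_ratio:.2f}x")
print(f"overhead ratio:    {overhead_ratio:.1f}x")
```

The results are consistent with the quoted "54x" and "45x" claims; the throughput ratio works out slightly above the quoted "9.4x".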

### 3.3 Provider Coverage

| Project | Provider Count | Notable Providers |
|---|---|---|
Portkey | 1,600+ | Largest catalog |
LiteLLM | 100+ | Bedrock, Azure, Vertex, Cohere, Sagemaker, HuggingFace, NVIDIA NIM |
Helicone | 100+ | OpenAI, Anthropic, Ollama, AWS Bedrock, Gemini |
OmniRoute | 160+ | 50+ free providers |
LLM Gateway | 210+ | OpenAI, Anthropic, Google Vertex AI |
Bifrost | 23+ | OpenAI, Anthropic, AWS Bedrock, Google Vertex, Azure, Cerebras |
LM-Proxy | 4+ | OpenAI, Anthropic, Google AI, local PyTorch |
VoidLLM | 5+ | OpenAI, Anthropic, Azure, Ollama, vLLM |
Instawork llm-proxy | 4 | OpenAI, Anthropic, Gemini, AWS Bedrock |
TensorZero | 15+ direct, any OpenAI-compatible | OpenAI, Anthropic, AWS Bedrock, Azure, Groq, Mistral, vLLM, xAI |
OpenZiti LLM Gateway | 3+ | OpenAI, Anthropic, Ollama, vLLM, llama-server, SGLang |
RelayPlane | 11+ | Anthropic, OpenAI, Google, xAI, Moonshot, Ollama |
nexos.ai | 200+ | Via managed SaaS platform |
Kilo Gateway | 500+ | Anthropic, OpenAI, Mistral, and more |

## 4. API Key Architecture Patterns

### 4.1 Pattern A: Simple Proxy Key (Single Shared Key)

**Example**: LLM-API-Key-Proxy

A single `PROXY_API_KEY` is shared by all clients. The proxy validates this key and forwards requests to upstream providers with their own keys (rotated automatically).

- **Pros**: Simplest setup; one credential to manage
- **Cons**: No per-client attribution; no granular limits; if compromised, the entire deployment is exposed

### 4.2 Pattern B: Virtual Keys (Indirection Layer)

**Examples**: LiteLLM, Bifrost, Portkey, VoidLLM, llm-budget-proxy

The proxy accepts client-specific keys (e.g., `sk-virtual-...`) and maps them internally to upstream provider credentials. Each virtual key can have its own budget, rate limits, and access scope.

- **Pros**: Full per-client attribution; granular budgets; isolation between services
- **Cons**: More configuration; requires key management workflow
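
A minimal sketch of the indirection layer, assuming an in-memory key store. The `VirtualKey` fields, the `authorize` helper, and the key strings are hypothetical and not taken from any surveyed project's API.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualKey:
    """Client-facing key record; every field name here is hypothetical."""
    owner: str
    provider: str                      # which upstream credential to use
    budget_usd: float                  # hard spend cap for this key
    allowed_models: set = field(default_factory=set)
    spent_usd: float = 0.0


# Upstream provider credentials stay inside the proxy (placeholder value).
UPSTREAM_KEYS = {"openai": "sk-upstream-placeholder"}

VIRTUAL_KEYS = {
    "sk-virtual-team-a": VirtualKey("team-a", "openai", 50.0, {"gpt-4o-mini"}),
}


def authorize(virtual_key: str, model: str, est_cost_usd: float) -> str:
    """Return the upstream credential to use, or raise PermissionError."""
    record = VIRTUAL_KEYS.get(virtual_key)
    if record is None:
        raise PermissionError("unknown virtual key")
    if model not in record.allowed_models:
        raise PermissionError(f"model {model!r} not allowed for this key")
    if record.spent_usd + est_cost_usd > record.budget_usd:
        raise PermissionError("budget exceeded")
    record.spent_usd += est_cost_usd   # attribute spend to the client key
    return UPSTREAM_KEYS[record.provider]
```

The client only ever sees `sk-virtual-team-a`; revoking or re-scoping it does not require touching the upstream credential.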

### 4.3 Pattern C: Per-Client JSON/DynamoDB Config

**Examples**: LLM Security Gateway, LM-Proxy

Each client is configured with a set of API keys and settings stored in JSON or DynamoDB. The proxy validates against this store.

- **Pros**: Flexible per-client configuration; can include model allowlists, provider selection
- **Cons**: Requires external store; no built-in dashboard

### 4.4 Pattern D: OIDC / Identity Provider Integration

**Examples**: LM-Proxy, Bifrost, VoidLLM (Enterprise), Portkey (Enterprise)

Client API keys are validated against an OIDC provider (Keycloak, Auth0, Okta). The token becomes the virtual API key.

- **Pros**: Enterprise identity integration; SSO experience; existing IAM infrastructure
- **Cons**: Requires OIDC infrastructure; adds latency for validation

### 4.5 Pattern E: Hierarchical Organization

**Examples**: Bifrost, VoidLLM, Paperclip, LiteLLM (paid)

Keys are scoped within an organizational hierarchy: Customer → Team → User → Key. Each level can have its own budget and limits.

- **Pros**: Enterprise-grade governance; cost allocation by org unit
- **Cons**: Complex configuration; often gated behind enterprise license
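
The hierarchy described above can be sketched as a chain of budget nodes in which a request must fit within every ancestor's budget. This is an illustrative model only; the `BudgetNode` class, the `charge` function, and the example names are assumptions, not any project's schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BudgetNode:
    """One level of a hypothetical Customer -> Team -> User -> Key hierarchy."""
    name: str
    budget_usd: float
    parent: Optional["BudgetNode"] = None
    spent_usd: float = 0.0


def charge(node: BudgetNode, cost_usd: float) -> bool:
    """Charge a request to a key and all its ancestors, or reject it.

    The request is allowed only if every level still has headroom, so a
    generous key budget cannot exceed a tighter team or customer budget.
    """
    chain = []
    current: Optional[BudgetNode] = node
    while current is not None:
        if current.spent_usd + cost_usd > current.budget_usd:
            return False               # some level in the chain is exhausted
        chain.append(current)
        current = current.parent
    for level in chain:                # commit only after all checks pass
        level.spent_usd += cost_usd
    return True


customer = BudgetNode("acme", budget_usd=100.0)
team = BudgetNode("ml-team", budget_usd=10.0, parent=customer)
user = BudgetNode("alice", budget_usd=8.0, parent=team)
key = BudgetNode("alice-ci-key", budget_usd=5.0, parent=user)
```

Checking the whole chain before committing keeps the counters consistent when a request is rejected partway up the hierarchy.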

## 5. Granular Access Control — Deep Feature Analysis

This section goes beyond cataloging which projects support “virtual keys” to analyze *how granular* the access control actually is. The user’s core question (how to use API keys to split access control into finer-grained levels) requires examining per-model scoping, IP allowlisting, key expiration/rotation, webhook-based validation, and multi-tenancy isolation.

### 5.1 Per-Model / Per-Endpoint Scoping

Can you create a key that only accesses GPT-4 but not Claude? Only `/chat/completions` but not `/embeddings`?

| Project | Per-Model Scoping | Per-Endpoint Scoping | Mechanism |
|---|---|---|---|
LiteLLM | ✅ | ❌ | `model_list` config; per-key TPM/RPM limits [docs.litellm.ai/docs/proxy/users] |
Bifrost | ✅ | ❌ | Per-key model restrictions, spend caps at proxy layer; hierarchical Customer → Team → Virtual Key → Provider Config |
TensorZero | ✅ (via tags) | ❌ | Custom API keys with tag-based scoping; rate limits by tag scopes [tensorzero.com/docs/operations/set-up-auth-for-tensorzero] |
OpenZiti LLM Gateway | ✅ (glob) | ❌ | Keys restricted to models via glob patterns in config (`allowed_models: ["gpt-4o"]`) [github.com/openziti/llm-gateway] |
LLM Security Gateway | ✅ | ❌ | Per-client model allowlists stored in JSON or DynamoDB |
TrueFoundry | ✅ | ❌ | RBAC scoped per provider account and model; policy-as-code with Cedar/OPA [truefoundry.com/blog/ai-governance-audit-enterprise-llm-gateway] |
Portkey | Partial | ❌ | Workspace-level access control; inbound rules for IPs/geographies but not fine-grained per-model at key level |
lazy-llm-proxy | ✅ | ❌ | Per-key model allowlists |

**Assessment**: LiteLLM, Bifrost, and TrueFoundry offer the most mature per-model scoping. OpenZiti uses glob patterns, which are flexible but less precise than tag-based systems. None of the surveyed projects support true per-endpoint scoping (e.g., a key that can only call `/chat/completions` but not `/embeddings`). This is a gap in the market.
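
To make the distinction concrete, the sketch below combines glob-based model scoping with the per-endpoint check that the surveyed projects lack. It uses Python's standard `fnmatch` module; the policy layout and the `is_allowed` helper are hypothetical and do not reproduce any project's config format.

```python
from fnmatch import fnmatchcase

# Hypothetical key policy: patterns and endpoint list are illustrative only.
KEY_POLICY = {
    "allowed_models": ["gpt-4o*", "text-embedding-3-small"],
    "allowed_endpoints": ["/chat/completions"],
}


def is_allowed(policy: dict, model: str, endpoint: str) -> bool:
    """Allow a request only if both the model and the endpoint match."""
    model_ok = any(fnmatchcase(model, pat) for pat in policy["allowed_models"])
    endpoint_ok = endpoint in policy["allowed_endpoints"]
    return model_ok and endpoint_ok
```

With this policy, `gpt-4o-mini` on `/chat/completions` passes, while any request to `/embeddings` is rejected even for an allowed model.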

### 5.2 IP Allowlisting and CORS Restrictions

| Project | IP Allowlisting | Geographic Restrictions | CORS Control |
|---|---|---|---|
Portkey | ✅ | ✅ (geography) | ❌ |
lazy-llm-proxy | ✅ | ❌ | ❌ |
Most others | ❌ | ❌ | ❌ |

**Assessment**: IP allowlisting is notably absent from most LLM proxies. This is a significant gap for teams that need to restrict which networks can reach their proxy. Portkey is the only major project with built-in IP/geo controls. Others require deploying behind a network firewall or VPC.
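
For proxies without this feature, a per-key allowlist check is straightforward to sketch with Python's standard `ipaddress` module. The `KEY_ALLOWLIST` mapping and `ip_allowed` helper are assumptions for illustration, and the CIDR ranges are documentation examples.

```python
import ipaddress

# Hypothetical per-key allowlist; the CIDR ranges are documentation examples.
KEY_ALLOWLIST = {
    "sk-virtual-team-a": ["10.0.0.0/8", "203.0.113.0/24"],
}


def ip_allowed(virtual_key: str, client_ip: str) -> bool:
    """Return True if client_ip falls inside any network allowed for the key."""
    networks = KEY_ALLOWLIST.get(virtual_key, [])
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in networks)
```

A key with no allowlist entry is denied by default here; a real deployment would also need to decide how far to trust forwarded client-IP headers.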

### 5.3 Key Expiration and Rotation Policies

| Project | Auto-Expiry | One-Time-Use Keys | Auto-Rotation of Provider Keys | Webhook-Based Validation |
|---|---|---|---|---|
LiteLLM | ❌ | ❌ | ❌ | ❌ |
Bifrost | ❌ | ❌ | ❌ | ❌ |
TrueFoundry | ❌ (revocation) | ❌ | ✅ (VAT/PAT revocation without downtime) | ❌ |
TensorZero | ❌ | ❌ | ❌ | ❌ |
LLM-API-Key-Proxy | N/A | ❌ | ✅ (automatic multi-provider key rotation with cooldowns) | ❌ |
LM-Proxy (Nayjest) | ❌ | ❌ | ❌ | ✅ (`api_key_check` can reference external HTTP service) [pypi.org/project/lm-proxy/] |

**Assessment**: No surveyed project supports auto-expiry or one-time-use keys. Automatic rotation of upstream provider credentials is handled only by LLM-API-Key-Proxy (multi-provider rotation with cooldowns); the TrueFoundry entry in the table refers to revoking VAT/PAT tokens without downtime rather than rotating provider keys. Webhook-based key validation, useful for integrating with custom IAM systems, is available only in LM-Proxy (Nayjest) via configurable `api_key_check` functions or external HTTP services.
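
As an illustration of what the missing features would involve, the sketch below adds an expiry timestamp and a use counter to a key record. It is a hypothetical design, not something any surveyed project ships; `KeyRecord` and `consume` are invented names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class KeyRecord:
    """Hypothetical key metadata; not taken from any surveyed project."""
    expires_at: Optional[datetime] = None   # None means the key never expires
    max_uses: Optional[int] = None          # 1 gives a one-time-use key
    uses: int = 0


def consume(record: KeyRecord, now: Optional[datetime] = None) -> bool:
    """Validate a key and count the use; return False if expired or spent."""
    now = now or datetime.now(timezone.utc)
    if record.expires_at is not None and now >= record.expires_at:
        return False
    if record.max_uses is not None and record.uses >= record.max_uses:
        return False
    record.uses += 1
    return True


# Example: a key that is valid for one hour from creation.
temporary = KeyRecord(expires_at=datetime.now(timezone.utc) + timedelta(hours=1))
```

In a multi-instance proxy the use counter would need an atomic update in a shared store, which this in-memory sketch does not attempt.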

### 5.4 Multi-Tenancy Isolation Strength

| Project | Isolation Model | Org/Team Hierarchy | Physical Isolation Option |
|---|---|---|---|
Bifrost | Logical (tenant_id in DB) | ✅ Full CRUD for org/team/business unit | ❌ |
TrueFoundry | Logical (VAT/PAT) | ✅ Virtual Accounts, per-provider RBAC | ✅ VPC/on-prem/air-gapped deployment |
LiteLLM | Logical (team_id in PostgreSQL) | ✅ Teams with shared budgets | ❌ |
Portkey | Logical (workspace) | ✅ Workspaces | ❌ |
Braintrust | Logical (org/project) | ✅ Organization-scoped and project-scoped keys | ❌ (enterprise plan for self-host) |
TensorZero | Logical (tags/namespaces) | ❌ (no org hierarchy) | ❌ |
OpenZiti LLM Gateway | Flat config | ❌ | ❌ |

**Assessment**: TrueFoundry is the only project offering both logical multi-tenancy AND physical isolation options (VPC, on-prem, air-gapped). Bifrost offers the most granular org/team hierarchy with full CRUD. LiteLLM’s team-based model is widely used but provides only logical separation within a shared PostgreSQL database.

## 7. Security Considerations

### 7.1 The Supply-Chain Risk

LiteLLM’s March 2026 breach demonstrated that even popular open-source projects are vulnerable to supply-chain attacks through CI/CD pipeline compromise (poisoned GitHub Action → stolen PyPI credentials). Projects with smaller dependency trees and compiled binaries (Go, Rust) present a lower attack surface:

- **Bifrost** (Go, Apache 2.0): Single binary, minimal dependencies
- **VoidLLM** (Go, BSL 1.1): Compiled binary, OpenSSF Scorecard
- **WSO2 AI Gateway** (Go, Envoy): Enterprise-grade supply chain practices
- **TensorZero** (Rust, Apache 2.0): Compiled binary; team includes Rust compiler maintainer
- **OpenZiti LLM Gateway** (Go, Apache 2.0): Single binary, no DB, minimal attack surface
- **RelayPlane** (Node.js, MIT): npm-native, local-first; no server-side attack surface for remote adversaries

### 7.2 Prompt/Data Privacy

Most proxies log request/response metadata for observability. However:

- **VoidLLM** is architecturally zero-knowledge: it never stores or logs prompt/response content
- **LiteLLM**, **Portkey**, and **Helicone** store full request/response data by default (configurable)
- **LLM Security Gateway** includes PII scanning with configurable redact/block/log actions
- **Braintrust Gateway**: Uses AES-GCM encryption tied to each user’s API key; cached results are scoped to the individual user; Braintrust cannot see your data and does not store or log API keys
- **nexos.ai**: Founded by Nord Security creators; positions data governance as a core value proposition

### 7.3 Content Security

Only the LLM Security Gateway provides built-in injection detection (20 patterns across 4 categories) and response scanning. Other projects leave this to external guardrails (Portkey has 50+ guardrails, LiteLLM has guardrails in enterprise tier). TrueFoundry adds policy-as-code guardrails (Cedar/OPA) at the MCP-tool boundary.

## 8. Emerging Trends

### 8.1 MCP Gateway as a New Governance Frontier

As AI agents proliferate, they need governed access to external tools and APIs. Bifrost, Portkey, VoidLLM, WSO2, Kong, RelayPlane, and TrueFoundry all now offer MCP gateway capabilities:

- **Bifrost**: MCP client + server with OAuth 2.0 auth, tool filtering per virtual key
- **Portkey**: Centralized control plane for MCP servers with access control and observability
- **VoidLLM**: External MCP server proxying with scoped access control; Code Mode (WASM-sandboxed JS)
- **WSO2**: Converts REST APIs to MCP-compatible servers
- **Kong**: AI MCP Proxy Plugin with OAuth2 authentication
- **TrueFoundry**: MCP Gateway with OAuth 2.0 secured access; policy-as-code guardrails (Cedar/OPA) at MCP-tool boundary
- **RelayPlane**: MCP server support shipped in v1.0.0

### 8.2 Token-Aware Rate Limiting

Traditional API gateways rate-limit on request count. LLM costs scale with tokens, not requests. Projects like WSO2 and Instawork llm-proxy implement token-based rate limiting calibrated to how LLMs actually charge.
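
The difference from request-count limiting can be sketched with a token bucket whose unit is LLM tokens rather than requests. This is a generic illustration, not the WSO2 or Instawork implementation; the class name and the tokens-per-minute parameter are assumptions.

```python
import time
from typing import Optional


class TokenBudgetLimiter:
    """Token bucket that meters LLM tokens instead of request counts."""

    def __init__(self, tokens_per_minute: int, now: Optional[float] = None):
        self.capacity = float(tokens_per_minute)
        self.available = float(tokens_per_minute)
        self.refill_per_sec = tokens_per_minute / 60.0
        self.last = time.monotonic() if now is None else now

    def try_consume(self, llm_tokens: int, now: Optional[float] = None) -> bool:
        """Spend llm_tokens from the bucket, or return False if over the limit."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last)
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        self.available = min(self.capacity,
                             self.available + elapsed * self.refill_per_sec)
        self.last = now
        if llm_tokens > self.available:
            return False
        self.available -= llm_tokens
        return True
```

Under this scheme one 5,000-token request consumes as much quota as fifty 100-token requests, which tracks provider billing more closely than a per-request counter. The optional `now` argument exists only to make the behavior testable with a fixed clock.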

### 8.3 Semantic Caching

Bifrost, Portkey, and Helicone offer semantic caching (embedding-based similarity search) in addition to exact-match caching, reducing costs for repeated or near-identical queries.
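
The idea can be sketched with a similarity threshold over prompt vectors. The example below uses a toy bag-of-words vector in place of a real embedding model, so it shows only the lookup logic; the `SemanticCache` class, the `embed` stand-in, and the 0.8 threshold are all assumptions.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class SemanticCache:
    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold
        self.entries = []              # list of (vector, response) pairs

    def get(self, prompt: str):
        """Return the response cached for the most similar prompt, if close enough."""
        query = embed(prompt)
        best = max(self.entries, key=lambda e: cosine(query, e[0]), default=None)
        if best is not None and cosine(query, best[0]) >= self.threshold:
            return best[1]
        return None

    def put(self, prompt: str, response: str) -> None:
        self.entries.append((embed(prompt), response))
```

A production cache would use a learned embedding model and an approximate nearest-neighbor index instead of this linear scan, and the threshold trades cost savings against the risk of returning a stale or mismatched answer.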

### 8.4 Model Downgrade on Budget Pressure

llm-budget-proxy introduces automatic model downgrade (e.g., GPT-4 → GPT-4o-mini) when approaching budget thresholds — a cost optimization pattern that could spread to other gateways.
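
A minimal sketch of the downgrade rule follows, assuming a fixed threshold expressed as a fraction of the budget. The 80% threshold, the `DOWNGRADE` mapping, and the `choose_model` helper are illustrative and not llm-budget-proxy's actual configuration.

```python
# Hypothetical downgrade ladder and threshold; values are illustrative.
DOWNGRADE = {"gpt-4": "gpt-4o-mini"}
DOWNGRADE_AT = 0.8  # start downgrading once 80% of the budget is spent


def choose_model(requested: str, spent_usd: float, budget_usd: float) -> str:
    """Return the model to call given how much of the budget is used."""
    if budget_usd <= 0 or spent_usd >= budget_usd:
        raise PermissionError("budget exhausted")
    if spent_usd / budget_usd >= DOWNGRADE_AT:
        # Fall back to the cheaper model if one is mapped, else keep the request.
        return DOWNGRADE.get(requested, requested)
    return requested
```

Downgrading before the hard cap lets a client keep working at lower cost instead of failing outright once the budget is gone.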

## 9. Operational Complexity and Deployment Comparison

The user’s use case (granular access control via API keys) implies a production deployment scenario. The following tables compare operational complexity across the most relevant projects.

### 9.1 Infrastructure Dependencies

| Project | Database Required | Cache Required | Message Queue | Cloud Provider Lock-in |
|---|---|---|---|---|
LiteLLM | PostgreSQL (required for team features) | Redis (recommended) | ❌ | ❌ |
Bifrost | PostgreSQL (for governance features) | ❌ | ❌ | ❌ |
TensorZero | Postgres (optional), ClickHouse (optional), Valkey/Redis (optional) | Optional (Valkey/Redis) | ❌ | ❌ |
TrueFoundry | Managed by platform | Managed | ❌ | Partial (SaaS on AWS) |
Helicone | Cloudflare Workers (managed) or self-hosted DB | Cloudflare cache | ❌ | Yes (Cloudflare for managed) |
OpenZiti LLM Gateway | None — single binary | ❌ | ❌ | ❌ |
RelayPlane | None — npm package | ❌ | ❌ | ❌ |
llm-budget-proxy | SQLite only | ❌ | ❌ | ❌ |
Portkey (self-hosted) | Depends on managed features used | Redis (recommended) | ❌ | ❌ |
WSO2 AI Gateway | PostgreSQL/MySQL | Redis | Kafka (optional for async logging) | ❌ |

### 9.2 Kubernetes Readiness

| Project | Official K8s Support | Helm Chart | Operator | Notes |
|---|---|---|---|---|
LiteLLM | ✅ | ✅ (official) | ❌ | Well-documented; large community |
Bifrost | ✅ | ✅ | ❌ | Docker + Helm charts available |
TensorZero | ✅ | ✅ (examples/) | ❌ | K8s/Helm/Argo examples on GitHub |
TrueFoundry | ✅ | N/A (platform-managed) | ✅ | Kubernetes-native; GPU orchestration built-in |
Helicone | ✅ | ✅ | ❌ | Self-hosting noted as “not recommended” for manual deployment |
WSO2 AI Gateway | ✅ | ✅ | ✅ | Built on Envoy Proxy — established K8s patterns |
Kong AI Gateway | ✅ | ✅ | ✅ | Mature Kubernetes operators |
Portkey (self-hosted) | ✅ | ❌ | ❌ | Docker + Cloudflare Workers deployment |
OpenZiti LLM Gateway | ❌ | ❌ | ❌ | Not officially documented for K8s |
RelayPlane | ❌ | ❌ | ❌ | Local npm package, not a service |
llm-budget-proxy | ❌ | ❌ | ❌ | Single Docker container |

### 9.3 Onboarding Time Estimates

| Project | Setup Time | Complexity Level | Notes |
|---|---|---|---|
RelayPlane | ~30 seconds | Trivial | `npm install -g @relayplane/proxy` — local proxy, no infra |
OpenZiti LLM Gateway | 2–3 minutes | Low | One YAML config file; single binary |
llm-budget-proxy | ~5 minutes | Low | Single Docker container with SQLite |
TensorZero | ~5 minutes (quickstart) | Medium | GitOps-friendly config; optional DBs |
Bifrost | <1 minute (Docker) | Low | Web UI configuration; single command deploy |
LiteLLM | 20–30 minutes | Medium-High | Requires PostgreSQL + Redis; YAML config complexity |
Portkey (self-hosted) | 10–15 minutes | Medium | Docker/Node.js; moderate learning curve |
TrueFoundry (SaaS) | Account creation | Low | Enterprise: sales discussion for VPC/on-prem deployment |
WSO2 AI Gateway | 30–60 minutes | High | Envoy-based; full platform integration |

## 10. Recommendations by Use Case

### For Enterprise Teams Needing Full Governance

**TrueFoundry** or **Bifrost**

- **TrueFoundry**: Most complete governance feature set: virtual accounts, RBAC per provider account, policy-as-code (Cedar/OPA), compliance-grade audit logs with SIEM export, data residency routing. SOC 2 Type 2, HIPAA, ITAR certified. VPC/on-prem/air-gapped deployment options. Recognized in Gartner Market Guide for AI Gateways 2026.
- **Bifrost**: Virtual keys with hierarchical budgets (Customer → Team → Virtual Key → Provider Config), RBAC, Google/GitHub SSO (open-source, not paywalled), HashiCorp Vault integration, OIDC provisioning. 11µs overhead at 5,000 RPS. MCP gateway with tool filtering per virtual key.
- **WSO2 AI Gateway**: Alternative for teams already on WSO2 platform; Envoy-based with SOC 2 Type 2 and ISO 27001 compliance; token-based rate limiting calibrated to LLM pricing.

### For Teams Already on Kong

**Kong AI Gateway**

- Natural adoption path; extend existing API management to LLM traffic
- Token-based rate limiting (enterprise tier), RBAC, audit logs, MCP proxy plugin

### For Quick Setup / Single Provider

**llm-budget-proxy** or **RelayPlane**

- llm-budget-proxy: 5-minute Docker deploy, SQLite, perfect for dev/staging with per-key token budgets and Slack/Discord alert webhooks
- RelayPlane: 30-second npm install, local-first (zero network hop), MCP support — ideal for coding agent workflows

### For Privacy-Conscious Deployments

**VoidLLM** or **OpenZiti LLM Gateway**

- VoidLLM: Zero-knowledge architecture by design; never stores or logs prompt/response content; sub-2ms proxy overhead; RBAC with org/team/user scoping
- OpenZiti LLM Gateway: Zero-trust networking via zrok/OpenZiti overlay; single binary, no DB; model-level key restrictions via glob patterns

### For Observability-First Teams

**Helicone** or **Portkey**

- Helicone: YC-backed, Rust-based observability-first with light gateway features; ~3,000 RPS per instance
- Portkey: Strong guardrails (50+ pre-built checks), full LLMOps platform; now fully open-source (March 2026)

### For Maximum Provider Coverage

**Portkey** (1,600+ models) or **LiteLLM** (100+ providers)

### For ML-Optimized Routing

**TensorZero**

- <1ms P99 latency at 10,000+ QPS (Rust); tag-based granular rate limits; Autopilot automated optimization; GitOps-friendly orchestration

### For Minimalist / Single-Concern Proxies

- **Security**: LLM Security Gateway (injection detection, PII scanning)
- **Budget enforcement**: llm-budget-proxy (per-key daily/monthly USD budgets)
- **Key rotation**: LLM-API-Key-Proxy (automatic multi-provider key rotation)
- **Circuit breaking**: Instawork llm-proxy (per-key circuit breaker with provider rollup)
- **Zero-trust networking**: OpenZiti LLM Gateway (zrok overlay, no firewall rules)
- **Local-first proxy**: RelayPlane (npm-native, in-process with your app)

### For Webhook-Based Key Validation

**LM-Proxy (Nayjest)**

- Extensible `api_key_check` can reference a Python function or external HTTP service for custom authentication logic, which is unique among surveyed projects

### For Per-Key IP Allowlisting

**Portkey** or **lazy-llm-proxy**

- Portkey: Inbound rules for IPs and geographies on deployments
- lazy-llm-proxy: Per-key IP whitelisting (rare feature)

## 11. Decision Framework by Requirement

The following matrix maps specific requirements to top candidates with pros and cons for each scenario.

| Requirement | Top Pick | Runner-Up | Key Trade-off |
|---|---|---|---|
Per-model scoping | LiteLLM / Bifrost | TrueFoundry (RBAC per provider account) | LiteLLM: widest provider coverage; Bifrost: lower latency |
Per-endpoint scoping | None available | — | No project supports restricting keys to specific API endpoints (e.g., `/chat/completions` only) |
IP allowlisting | Portkey | lazy-llm-proxy | Portkey requires managed platform; lazy-llm-proxy is small/new |
Auto key expiry | None available | — | Gap in the market; all projects use manual management |
One-time-use keys | None available | — | Gap in the market |
Webhook-based validation | LM-Proxy (Nayjest) | — | Only project with configurable external HTTP validation |
Physical isolation (air-gap) | TrueFoundry | WSO2 AI Gateway | TrueFoundry: VPC/on-prem/air-gapped; WSO2: Envoy-based K8s native |
Lowest overhead | TensorZero (<1ms P99) | Bifrost (11µs mean @ 5k RPS) | Figures are self-reported and not directly comparable (P99 vs mean); TensorZero: steeper learning curve; Bifrost: smaller provider coverage (23+) |
Fastest setup | RelayPlane (30s) | OpenZiti (2-3 min) | RelayPlane: local-only, not a service; OpenZiti: no K8s support |
Maximum provider coverage | Portkey (1,600+) | LiteLLM (100+ providers) | Portkey: managed platform pricing; LiteLLM: Python GIL limits |
MCP governance | TrueFoundry / Bifrost | Portkey / VoidLLM | TrueFoundry: Cedar/OPA policy-as-code; Bifrost: open-source, not paywalled |
Multi-tenant org hierarchy | Bifrost (full CRUD) | TrueFoundry (VAT/PAT) | Bifrost: logical only; TrueFoundry: physical isolation options |

## 13. Methodology Note

This research was conducted on June 9, 2026 through systematic web searches using the `mcp__search__web_search` tool with more than 40 distinct search queries across these categories:

- **Technical terms**: “LLM proxy”, “AI gateway”, “LLM router”, “LLM middleware”
- **Lay phrasings**: “proxy LLM APIs”, “unified LLM API”
- **Named-entity queries**: Project names discovered through comparison articles (Bifrost, TensorZero, TrueFoundry, RelayPlane, OpenZiti, etc.)
- **Feature-specific queries**: “virtual keys budgets RBAC”, “per-model per-endpoint API key scoping IP allowlist expiration rotation”, “LLM proxy webhook key validation”
- **Benchmark queries**: “Bifrost LiteLLM benchmark independent verification”, “TensorZero benchmarks latency”
- **Missing competitor discovery**: “Vellum AI platform LLM routing proxy API keys open source”, “LangFuse self-hosted API key scoping multi-tenant access control”, “Unify AI unified inference layer API key management access control”, “TrueFoundry AI gateway API key management virtual keys RBAC”, “Braintrust gateway API key management multi-tenant access control”, “Kilo Gateway universal LLM inference API key management”, “nexos.ai unified AI API gateway access control”

**Primary sources fetched and read in full** (via `mcp__search__web_fetch`):

- TECHSY, “Stop Juggling LLM APIs: 8 Gateways Ranked for 2026” (Jun 6, 2026)
- getmaxim.ai, “Best LLM Gateways in 2026” (Apr 14, 2026)
- Braintrust.dev, “6 best LLM gateways for developers in 2026” (May 16, 2026)
- TrueFoundry, “Top 6 LLM Gateways in 2026” (Sep 23, 2025)
- RelayPlane, “LLM Gateway Comparison” (Mar 2026)
- DEV Community, “How We Benchmarked Bifrost against LiteLLM” by Pranay Batta (Jan 16, 2026)
- TensorZero docs, “Gateway Overview”
- TrueFoundry, “AI Governance and Audit for Enterprise LLMs” (Jun 8, 2026)
- Braintrust, “Use the Braintrust gateway” docs
- Kilo Gateway landing page (kilo.ai/gateway)
- OpenZiti LLM Gateway README (github.com/openziti/llm-gateway)
- Vellum platform overview (deepchecks.com)

**Benchmark verification**: All performance claims are attributed to their source. Bifrost’s “54x faster” claim comes from self-reported benchmarks with an open-source benchmarking suite (github.com/maximhq/bifrost-benchmarking). TensorZero’s “<1ms P99” claims come from official docs. TECHSY’s performance overhead table provides a cross-project comparison. Independent third-party benchmarks across a common test harness do not yet exist for this market — all numbers should be treated as self-reported until independently verified.

**Limitations**: The research focused on open-source and self-hostable projects primarily, with managed-only platforms tracked separately. Some smaller or newer projects may have been missed. Provider counts and star counts are approximate.

## 14. References

- Sonatype, “Compromised LiteLLM PyPI Package Delivers Multi-Stage Credential Stealer,” March 2026. [https://www.sonatype.com/blog/compromised-litellm-pypi-package-delivers-multi-stage-credential-stealer](https://www.sonatype.com/blog/compromised-litellm-pypi-package-delivers-multi-stage-credential-stealer)
- TechCrunch, “Delve Accused of Misleading Customers with Fake Compliance,” March 2026. [https://techcrunch.com/2026/03/22/delve-accused-of-misleading-customers-with-fake-compliance/](https://techcrunch.com/2026/03/22/delve-accused-of-misleading-customers-with-fake-compliance/)
- LiteLLM Documentation, “Budgets, Rate Limits.” [https://docs.litellm.ai/docs/proxy/users](https://docs.litellm.ai/docs/proxy/users)
- LiteLLM Documentation, “Budgets, Rate Limits.” (also referenced in [3]); budget management features for virtual keys.
- Bifrost Documentation, “Governance — Virtual Keys, Budgets & Enterprise RBAC.” [https://www.getmaxim.ai/bifrost/resources/governance](https://www.getmaxim.ai/bifrost/resources/governance)
- Bifrost Documentation, “Enterprise — User Provisioning (OIDC).” [https://docs.getbifrost.ai/enterprise/user-provisioning](https://docs.getbifrost.ai/enterprise/user-provisioning)
- Portkey Documentation, “Role-based access control.” [https://portkey.ai/docs/product/ai-gateway/rbac](https://portkey.ai/docs/product/ai-gateway/rbac)
- Portkey Blog, “Manage LLM API keys with secret references,” April 2026. [https://portkey.ai/blog/secret-references-ai-api-key-management/](https://portkey.ai/blog/secret-references-ai-api-key-management/)
- VoidLLM README. [https://github.com/voidmind-io/voidllm](https://github.com/voidmind-io/voidllm)
- OmniRoute GitHub, v1.4.0 release notes: “Dedicated API Key Manager.” [https://newreleases.io/project/github/diegosouzapw/OmniRoute/release/v1.4.0](https://newreleases.io/project/github/diegosouzapw/OmniRoute/release/v1.4.0)
- LiteLLM GitHub. [https://github.com/BerriAI/litellm](https://github.com/BerriAI/litellm)
- Bifrost GitHub. [https://github.com/maximhq/bifrost](https://github.com/maximhq/bifrost)
- Portkey AI Gateway GitHub. [https://github.com/Portkey-AI/gateway](https://github.com/Portkey-AI/gateway)
- Helicone GitHub. [https://github.com/Helicone/helicone](https://github.com/Helicone/helicone)
- VoidLLM GitHub. [https://github.com/voidmind-io/voidllm](https://github.com/voidmind-io/voidllm)
- OmniRoute GitHub. [https://github.com/diegosouzapw/OmniRoute](https://github.com/diegosouzapw/OmniRoute)
- LLM-API-Key-Proxy GitHub. [https://github.com/Mirrowel/LLM-API-Key-Proxy](https://github.com/Mirrowel/LLM-API-Key-Proxy)
- Instawork llm-proxy GitHub. [https://github.com/Instawork/llm-proxy](https://github.com/Instawork/llm-proxy)
- LLM Gateway (theopenco) GitHub. [https://github.com/theopenco/llmgateway](https://github.com/theopenco/llmgateway)
- llm-budget-proxy GitHub. [https://github.com/InkByteStudio/llm-budget-proxy](https://github.com/InkByteStudio/llm-budget-proxy)
- LM-Proxy (Nayjest) GitHub. [https://github.com/Nayjest/lm-proxy](https://github.com/Nayjest/lm-proxy)
- LLM Security Gateway GitHub. [https://github.com/TerminalsandCoffee/llm-security-gateway](https://github.com/TerminalsandCoffee/llm-security-gateway)
- Paperclip GitHub. [https://github.com/paperclipai/paperclip](https://github.com/paperclipai/paperclip)
- WSO2 AI Gateway GitHub. [https://github.com/wso2/wso2-envoy-ai-gateway](https://github.com/wso2/wso2-envoy-ai-gateway)
- WSO2 Blog, “Best LiteLLM Alternatives in 2026: Secure AI Gateways,” April 2026. [https://wso2.com/library/blogs/litellm-alternatives/](https://wso2.com/library/blogs/litellm-alternatives/)
- getmaxim.ai, “Top LiteLLM Alternatives in 2026,” April 2026. [https://www.getmaxim.ai/articles/top-litellm-alternatives-in-2026/](https://www.getmaxim.ai/articles/top-litellm-alternatives-in-2026/)
- DEV Community, “Top LiteLLM Alternatives in 2026” by Kuldeep Paul, March 2026. [https://dev.to/kuldeep_paul/top-litellm-alternatives-in-2026-1gi1](https://dev.to/kuldeep_paul/top-litellm-alternatives-in-2026-1gi1)
- DEV Community, “This Open-Source LLM Gateway is 54x Faster Than LiteLLM” by Deb McKinney, January 2026. [https://dev.to/debmckinney/this-open-source-llm-gateway-is-54x-faster-than-litellm-heres-why-1h](https://dev.to/debmckinney/this-open-source-llm-gateway-is-54x-faster-than-litellm-heres-why-1h)
- Pomerium Blog, “LiteLLM Alternatives: Best Open-Source and Secure LLM Gateways in 2025.” [https://www.pomerium.com/blog/litellm-alternatives](https://www.pomerium.com/blog/litellm-alternatives)
- TECHSY, “Stop Juggling LLM APIs: 8 Gateways Ranked 2026,” June 2026. [https://techsy.io/en/blog/best-llm-gateway-tools](https://techsy.io/en/blog/best-llm-gateway-tools)
- TrueFoundry, “Top 5 LiteLLM Alternatives for Enterprises in 2026,” January 2026. [https://www.truefoundry.com/blog/litellm-alternatives](https://www.truefoundry.com/blog/litellm-alternatives)
- OpenRouter Documentation, “Organization Management.” [https://openrouter.ai/docs/cookbook/administration/organization-management](https://openrouter.ai/docs/cookbook/administration/organization-management)
- OpenRouter Release Notes, June 2026: “workspace guardrails for budget limits, zero data retention.” [https://releasebot.io/updates/openrouter](https://releasebot.io/updates/openrouter)
- Datawiza Blog, “LLM API Key Management and Identity-Aware Rate Limiting,” May 2026. [https://www.datawiza.com/blog/industry/llm-api-key-management-and-identity-aware-rate-limiting/](https://www.datawiza.com/blog/industry/llm-api-key-management-and-identity-aware-rate-limiting/)
- Requesty. [https://www.requesty.ai](https://www.requesty.ai)
- LLM Security Gateway Medium article by Terminals & Coffee, February 2026. [https://medium.com/@terminalsandcoffee/i-built-a-security-proxy-for-llm-apis-8c44f7c26730](https://medium.com/@terminalsandcoffee/i-built-a-security-proxy-for-llm-apis-8c44f7c26730)
- InkByteStudio, “LLM API Rate Limiting & Cost Control: Token Budgets & Throttling,” March 2026. [https://igotasite4that.com/blog/llm-api-rate-limiting-cost-control/](https://igotasite4that.com/blog/llm-api-rate-limiting-cost-control/)
- Paperclip.ing. [https://paperclip.ing](https://paperclip.ing)
- API7 (API7.ai) Learning Center. [https://api7.ai/learning-center/api-gateway-guide/api-gateway-proxy-llm-requests](https://api7.ai/learning-center/api-gateway-guide/api-gateway-proxy-llm-requests)
- OpenClaw Gateway Authentication. [https://docs.openclaw.ai/gateway/authentication](https://docs.openclaw.ai/gateway/authentication)
- LiteRouter. [https://literouter.com](https://literouter.com)
- n1n.ai. [https://n1n.ai](https://n1n.ai)
- AIMLAPI. [https://aimlapi.com](https://aimlapi.com)
- FreeLLMAPI GitHub. [https://github.com/tashfeenahmed/freellmapi](https://github.com/tashfeenahmed/freellmapi)
- TensorZero, “Gateway Overview.” [https://www.tensorzero.com/docs/gateway](https://www.tensorzero.com/docs/gateway)
- TensorZero, “Set up auth for TensorZero” (operations docs). [https://www.tensorzero.com/docs/operations/set-up-auth-for-tensorzero](https://www.tensorzero.com/docs/operations/set-up-auth-for-tensorzero)
- TensorZero GitHub. [https://github.com/tensorzero/tensorzero](https://github.com/tensorzero/tensorzero)
- TrueFoundry, “AI Governance and Audit for Enterprise LLMs: Virtual Keys, RBAC, and Compliance-Grade Logs,” June 8, 2026.
[https://www.truefoundry.com/blog/ai-governance-audit-enterprise-llm-gateway](https://www.truefoundry.com/blog/ai-governance-audit-enterprise-llm-gateway) - TrueFoundry AI Gateway overview.
[https://www.truefoundry.com/ai-gateway](https://www.truefoundry.com/ai-gateway) - RelayPlane, “LLM Gateway Comparison.”
[https://relayplane.com/compare/llm-gateways](https://relayplane.com/compare/llm-gateways) - RelayPlane GitHub.
[https://github.com/RelayPlane/proxy](https://github.com/RelayPlane/proxy) - OpenZiti LLM Gateway README.
[https://github.com/openziti/llm-gateway](https://github.com/openziti/llm-gateway) - Kilo Gateway landing page.
[https://kilo.ai/gateway](https://kilo.ai/gateway) - nexos.ai API SDK features.
[https://nexos.ai/features/api-sdk/](https://nexos.ai/features/api-sdk/) - Braintrust, “Use the Braintrust gateway” docs.
[https://www.braintrust.dev/docs/deploy/gateway](https://www.braintrust.dev/docs/deploy/gateway) - Braintrust, “Organizations and Authentication.”
[https://deepwiki.com/braintrustdata/braintrust-go/4.5-organizations-and-authentication](https://deepwiki.com/braintrustdata/braintrust-go/4.5-organizations-and-authentication) - DEV Community, “How We Benchmarked Bifrost against LiteLLM” by Pranay Batta, January 16, 2026.
[https://dev.to/pranay_batta/how-we-benchmarked-bifrost-against-litellmand-what-we-learned-about-performance-c1o](https://dev.to/pranay_batta/how-we-benchmarked-bifrost-against-litellmand-what-we-learned-about-performance-c1o) - getmaxim.ai, “Best LLM Gateways in 2026,” April 14, 2026.
[https://www.getmaxim.ai/articles/best-llm-gateways-in-2026/](https://www.getmaxim.ai/articles/best-llm-gateways-in-2026/) - Braintrust.dev, “6 best LLM gateways for developers in 2026,” May 16, 2026.
[https://www.braintrust.dev/articles/best-llm-gateways-2026](https://www.braintrust.dev/articles/best-llm-gateways-2026) - TrueFoundry, “Top 6 LLM Gateways in 2026.”
[https://www.truefoundry.com/blog/best-llm-gateways](https://www.truefoundry.com/blog/best-llm-gateways) - deepchecks.com, “What is Vellum AI? Features & Getting Started,” January 2025.
[https://deepchecks.com/llm-tools/vellum-ai/](https://deepchecks.com/llm-tools/vellum-ai/) - lazy-llm-proxy GitHub.
[https://github.com/Xu-pixel/lazy-llm-proxy](https://github.com/Xu-pixel/lazy-llm-proxy) - Forbes, “Nord Security Founders Launch Nexos.ai For Governed Enterprise AI,” November 2025.
[https://www.forbes.com/sites/ronschmelzer/2025/11/25/nord-security-founders-launch-nexosai-for-governed-enterprise-ai/](https://www.forbes.com/sites/ronschmelzer/2025/11/25/nord-security-founders-launch-nexosai-for-governed-enterprise-ai/) - Vellum Documentation.
[https://docs.vellum.ai/](https://docs.vellum.ai/) - Braintrust.dev, “How we chose the best LLM gateways.”
[https://www.braintrust.dev/articles/best-llm-gateways-2026#how-we-chose-the-best-llm-gateways](https://www.braintrust.dev/articles/best-llm-gateways-2026#how-we-chose-the-best-llm-gateways)
