# LLM Proxy Projects with Granular API Key Access Control: A Comprehensive Survey

A new research report analyzing 38 open-source and managed LLM proxy projects finds the market rapidly maturing from simple protocol translators into comprehensive governance platforms. LiteLLM leads with the most complete feature set, including virtual keys per user and team, while newer entrants like Bifrost and TensorZero offer lower latency and enterprise-grade access controls in their open-source tiers. The landscape is bifurcating between self-hostable proxies and managed SaaS platforms, with implications for data sovereignty and compliance.

The report examines 38+ open-source and managed LLM proxy/gateway projects that provide API key management and granular access control, from LiteLLM and Bifrost to TensorZero and TrueFoundry, comparing their architectures, features, and trade-offs.

## Executive Summary

This report identifies and analyzes 38+ open-source and managed LLM proxy/gateway projects that provide API key management and granular access control capabilities. The market has matured rapidly from simple protocol-translating proxies into comprehensive governance platforms. The leading project, LiteLLM (~40K GitHub stars), offers the most complete feature set, with virtual keys scoped per user, team, and budget, but it faces performance limitations (Python GIL) and suffered a supply-chain security breach in March 2026. Newer projects like Bifrost (5.6K stars, Go-based) claim 54× lower latency and deliver enterprise governance features in their open-source tier without paywalling RBAC or SSO.
This edition expands significantly on the prior survey and adds several notable new entrants:

- **TensorZero** (~11.4K stars, Rust-based, <1ms P99 latency) brings an ML-optimized gateway with tag-based rate limits.
- **TrueFoundry** offers enterprise-grade virtual accounts, RBAC, policy-as-code (Cedar/OPA), and SOC 2/HIPAA/ITAR compliance with on-prem deployment.
- **RelayPlane** (npm-native, Node.js) provides a local-first cost-intelligence proxy with MCP support and zero network-hop overhead.
- **OpenZiti LLM Gateway** (Go, Apache 2.0) delivers virtual API keys with model-level glob restrictions and zero-trust networking via the zrok overlay.

Additionally, managed-only platforms like Kilo Gateway (500+ models, BYOK), nexos.ai (founded by the Nord Security creators, $35M funding), and Braintrust Gateway (organization-scoped and project-scoped API keys with AES-GCM caching) are tracked separately from self-hostable projects.

The landscape spans five architectural categories: Python libraries (LiteLLM, LM-Proxy), Go binaries (Bifrost, Instawork llm-proxy, VoidLLM, OpenZiti), TypeScript services (Portkey, Helicone, LLM Gateway, OmniRoute), Rust gateways (TensorZero, Helicone’s Rust rewrite), and enterprise platforms built on established proxies (WSO2 on Envoy, Kong on NGINX/OpenResty). A significant trend is the emergence of MCP (Model Context Protocol) gateway capabilities as a new governance frontier: Bifrost, Portkey, VoidLLM, WSO2, RelayPlane, and TrueFoundry all now offer MCP tool management with access control.

Critically, the market is bifurcating between self-hostable open-source proxies (LiteLLM, Bifrost, TensorZero, VoidLLM, etc.) and managed SaaS-only platforms (OpenRouter, Kilo Gateway, nexos.ai, Cloudflare AI Gateway). This distinction has direct implications for data sovereignty, compliance, and operational burden, a theme explored throughout this report.
### Key Findings on API Key Granularity

The deepest analysis in this report examines how granular access control actually is across projects:

- **Per-model scoping**: LiteLLM, Bifrost, TensorZero, OpenZiti, LLM Security Gateway, and TrueFoundry all support restricting keys to specific models or model patterns.
- **IP allowlisting**: Only Portkey (geography/IP inbound rules) and lazy-llm-proxy (per-key IP allowlists) provide this natively. Most projects rely on network-level controls (VPC, firewall).
- **Webhook-based validation**: LM-Proxy (Nayjest) supports an external HTTP service or a Python function for custom key validation.
- **Multi-tenancy isolation**: Bifrost (hierarchical org CRUD), TrueFoundry (Virtual Accounts, per-provider RBAC), and LiteLLM (`team_id` with PostgreSQL-backed logical separation) lead in multi-tenant depth. Physical/infra-level isolation requires self-hosting on-prem or in a VPC (TrueFoundry, WSO2).

## 1. Background and Context

### 1.1 What is an LLM Proxy?

An LLM proxy (also called an AI gateway, LLM router, or LLM middleware) sits between client applications and LLM inference providers. Its core functions are:

- **Protocol normalization**: Translating between different provider API formats (OpenAI’s `/v1/chat/completions`, Anthropic’s `/v1/messages`, Google’s Vertex AI, AWS Bedrock’s SigV4-signed requests) into a single unified interface
- **Access control**: Managing which clients/teams/users can access which models and with what limits
- **Cost governance**: Tracking token usage, enforcing budgets, and providing cost attribution
- **Reliability**: Automatic failover, retry, load balancing, and circuit breaking
- **Observability**: Logging, metrics, tracing

### 1.2 Why API Key Granularity Matters

In production environments, a single shared provider API key (e.g., one OpenAI `sk-...` key used across all services) creates several problems:

- **No cost attribution**: You cannot determine which team or service drove spending
- **No isolation**: A runaway script in one service burns the entire budget
- **No audit trail**: You cannot attribute a specific request to a user or application
- **Security risk**: If the shared key is compromised, all services are exposed
- **No rate management**: You cannot enforce per-service or per-user limits

Virtual API keys solve this by providing a layer of indirection: the proxy accepts client-specific keys and maps them to upstream provider credentials internally.

### 1.3 Market Context: The LiteLLM Breach

In March 2026, two significant events shook confidence in the LLM proxy ecosystem:

- A supply-chain attack via compromised PyPI versions (1.82.7, 1.82.8) of LiteLLM that deployed a credential-stealing payload through a poisoned trivy-action GitHub Action [1]
- The revelation that LiteLLM’s SOC 2 certification was based on fabricated compliance reports from Delve, a YC-backed startup later exposed for producing 500+ structurally identical audit reports [2]

These events accelerated evaluation of alternatives with better security postures (compiled languages, auditable builds) and verified compliance credentials.

## 2. Detailed Project Analysis

### 2.1 LiteLLM (BerriAI): The Incumbent

| Attribute | Value |
|---|---|
| GitHub | github.com/BerriAI/litellm |
| Stars | ~40,000 |
| Language | Python |
| License | MIT |
| Self-hosted | Yes (Docker, PyPI) |
| Providers | 100+ |

**API Key / Access Control Features:**

- **Virtual keys**: Create verification tokens that act as client-facing API keys.
Each key can be scoped to a specific user or team.
- **Budget management**: Per-key personal budgets, per-team shared budgets, and per-team-member individual limits within a team’s shared budget [3]
- **Rate limiting**: Configurable RPM (requests per minute) and TPM (tokens per minute) per key
- **Admin UI**: Web dashboard for managing models, keys, teams, and budgets
- **Team management**: Keys can be assigned to teams with `team_id`, enabling hierarchical budget structures
- **Cost tracking**: Automatic mapping of model-specific token pricing; cost data exposed at key, user, and team level [4]
- **Guardrails**: Input/output content filtering (enterprise tier)
- **Load balancing**: Distribute across multiple deployments of the same model

**Limitations:**

- Python GIL limits concurrency under high traffic
- PostgreSQL-backed logging degrades after ~1M records
- Enterprise features (SSO, RBAC, team budgets) gated behind paid license
- Supply-chain vulnerability in March 2026

### 2.2 Bifrost (Maxim AI): The High-Performance Challenger

| Attribute | Value |
|---|---|
| GitHub | github.com/maximhq/bifrost |
| Stars | ~5,600 |
| Language | Go (75%), TypeScript (17%) |
| License | Apache 2.0 |
| Self-hosted | Yes (Docker, npx, Helm) |
| Providers | 23+ |

**API Key / Access Control Features:**

- **Virtual Keys**: Create separate keys for different applications with independent budgets, rate limits, and access controls [5]
- **Hierarchical Budgets**: Four-tier hierarchy (Customer → Team → Virtual Key → Provider Config).
Per-key rate limits, model restrictions, and spend caps are enforced at the proxy layer.
- **SSO Integration**: Google and GitHub SSO available in the open-source version (not paywalled)
- **RBAC**: Role-based access control for admin, team, and user roles
- **Vault Support**: HashiCorp Vault integration for secure API key management
- **OIDC User Provisioning**: OAuth 2.0 / OIDC login with background directory sync for teams, roles, and business units [6]
- **Per-provider budgets**: Budgets scoped per virtual key (top level) and per provider, wired from model configs
- **Business unit CRUD**: Full create/read/update/delete for organizational units

**Performance (as claimed by the project):**

- 11µs overhead at 5,000 RPS vs. LiteLLM’s ~8ms P95 at 1,000 RPS
- 54× faster at P99 latency

### 2.3 Portkey AI Gateway

| Attribute | Value |
|---|---|
| GitHub | github.com/Portkey-AI/gateway |
| Stars | ~12,000 |
| Language | TypeScript (96%) |
| License | MIT (open-source gateway) |
| Self-hosted | Yes (Docker, Node.js, Cloudflare Workers) |
| Providers | 1,600+ |

**API Key / Access Control Features:**

- **Virtual Keys**: On-the-fly virtual key generation for secure key management
- **Role-Based Access Control (RBAC)**: Granular access control for users, workspaces, and API keys [7]
- **Secure Key Vault**: Store LLM provider keys in Portkey’s vault; manage access with virtual keys
- **Secret References**: Reference secrets stored in AWS Secrets Manager, Azure Key Vault, or HashiCorp Vault; the gateway fetches credentials at runtime without storing them [8]
- **MCP Gateway**: Centralized control plane for MCP servers with authentication, access control (per-team and per-user), identity forwarding (email, team, roles), and observability
- **Access Control & Inbound Rules**: Control which IPs and geographies can connect to deployments
- **PII Redaction**: Automatically remove sensitive data from requests

**Enterprise Features:**

- SOC2, HIPAA, GDPR, CCPA compliance
- Professional support with feature prioritization

### 2.4 Helicone

| Attribute | Value |
|---|---|
| GitHub | github.com/Helicone/helicone |
| Stars | ~5,800 |
| Language | TypeScript (91%) |
| License | Apache 2.0 |
| Self-hosted | Yes (Docker, Helm) |
| Providers | 100+ |

**API Key / Access Control Features:**

- **AI Gateway**: Single endpoint (https://ai-gateway.helicone.ai) accepting a Helicone API key, routing to 100+ models
- **Unified API Key**: One key provides access across all providers
- **Self-hosted deployment**: Docker Compose and Helm charts available
- **Observability-first**: Request logging, cost tracking, latency monitoring; the proxy is primarily an observability layer with light gateway features

**Limitations:**

- Lighter on routing and governance compared to full-featured gateways
- Self-hosting noted as “not recommended” for manual deployment
- Enterprise features (compliance, governance) remain thinner than enterprise-focused alternatives

### 2.5 VoidLLM

| Attribute | Value |
|---|---|
| GitHub | github.com/voidmind-io/voidllm |
| Stars | ~104 |
| Language | Go (80%) |
| License | BSL 1.1 (source-available) |
| Self-hosted | Yes (Docker, Helm, binary) |
| Providers | OpenAI, Anthropic, Azure, Ollama, vLLM, custom |

**API Key / Access Control Features:**

- **Virtual Keys**: Organization-wide access control with org/team/user scoping and RBAC [9]
- **RBAC Hierarchy**: Org → Team → User → Key hierarchy with 4 roles
- **Rate Limits**: Per-key, per-team, per-org request limits (RPM/RPD), most-restrictive-wins across levels
- **Token Budgets**: Daily/monthly token budgets with real-time enforcement
- **Usage Tracking**: Tokens, cost, duration, TTFT (time-to-first-token) per request
- **Model Aliases**: Clients call `default` and the proxy routes the request to any configured backend, which decouples client code from the provider
- **MCP Gateway**: Proxy external MCP servers with access control and session management; Code Mode (WASM-sandboxed JS) for multi-tool orchestration
- **Zero-Knowledge Architecture**: By design, never stores or logs prompt/response content.
Only metadata (who, what model, how many tokens) is tracked.

**Pricing Model:**

- Free tier: Core features
- Pro ($49/mo): Cost reports, usage export, extended retention
- Enterprise ($149/mo): SSO/OIDC, per-org SSO, auto-provisioning, audit logs, OpenTelemetry

### 2.6 OmniRoute

| Attribute | Value |
|---|---|
| GitHub | github.com/diegosouzapw/OmniRoute |
| Stars | ~5,700 |
| Language | TypeScript (100%) |
| License | MIT |
| Self-hosted | Yes (Docker, npm) |
| Providers | 160+ |

**API Key / Access Control Features:**

- **Dedicated API Key Manager**: `/dashboard/api-manager` page for managing API keys with create, delete, and permissions management [10]
- **Encryption at Rest**: Credentials encrypted with AES-256-GCM
- **Authentication Methods**: OAuth, API Key, or Web Cookie
- **Free Providers**: 50+ providers with free tiers aggregated
- **Multi-modal APIs**: Text, image, audio support

**Key Differentiator:**

- Completely free and open-source with no cloud dependency: “No OmniRoute cloud sits in the request path”

### 2.7 LLM-API-Key-Proxy (Mirrowel)

| Attribute | Value |
|---|---|
| GitHub | github.com/Mirrowel/LLM-API-Key-Proxy |
| Stars | ~507 |
| Language | Python (100%) |
| License | MIT (proxy) + LGPL-3.0 (resilience library) |
| Self-hosted | Yes (Docker, binary, source) |
| Providers | Via LiteLLM fallback |

**API Key / Access Control Features:**

- **Single `PROXY_API_KEY`**: One API key for all clients; configured via environment variable or TUI
- **Multi-provider Key Rotation**: Automatic rotation across multiple provider keys with intelligent cooldowns
- **Usage Tracking**: Per-provider usage statistics persisted to disk
- **Quota Viewer**: Alpha feature for viewing quota windows and fair-cycle status
- **Credential Management**: Interactive TUI for managing API keys and OAuth credentials

**Architecture:**

- Two components: a FastAPI proxy application and a standalone Python resilience library (`rotator_library`)
- The resilience library is independently usable for intelligent key selection, deadline-driven requests, and automatic failover

### 2.8 Instawork llm-proxy

| Attribute | Value |
|---|---|
| GitHub | github.com/Instawork/llm-proxy |
| Stars | ~31 |
| Language | Go (96%) |
| License | MIT |
| Self-hosted | Yes (Docker, binary) |
| Providers | OpenAI, Anthropic, Gemini, AWS Bedrock |

**API Key / Access Control Features:**

- **Per-user/API Key Rate Limiting**: Experimental feature for request/token-based limits per user/API key/model/provider
- **Token Estimation**: Provisional token estimation with post-response reconciliation using `X-LLM-Input-Tokens`
- **Circuit Breaker**: Per-key circuit breaker that classifies upstream failures, retries transient errors, and emits a degraded-signal response
- **Per-provider Rollup**: Detects wholesale outages across multiple keys for the same provider
- **Bypass Safety Valve**: Callers without fallback can opt out of fast-fail via header

**Key Differentiator:**

- Minimalist design: “without all the extra stuff you don’t need”
- Transparent AWS Bedrock SigV4 passthrough (clients sign with their own credentials)
- Comprehensive circuit breaker with per-model keying and provider rollup

### 2.9 LLM Gateway (theopenco)

| Attribute | Value |
|---|---|
| GitHub | github.com/theopenco/llmgateway |
| Stars | ~1,300 |
| Language | TypeScript (95%) |
| License | AGPLv3 (open-source) + Enterprise |
| Self-hosted | Yes (Docker, unified container) |
| Providers | 210+ |

**API Key / Access Control Features:**

- **API Key Management**: Unified API interface with authentication
- **Usage Analytics**: Track requests, tokens used, response times, and costs
- **Team and Organization Management**: Enterprise features (paid)
- **Custom Provider Key Configurations**: Enterprise tier

**Architecture:**

- Monorepo with separate apps: UI (Next.js), API (Hono), Gateway (routing), Playground, Admin
- PostgreSQL + Redis for data persistence

### 2.10 llm-budget-proxy (InkByteStudio)

| Attribute | Value |
|---|---|
| GitHub | github.com/InkByteStudio/llm-budget-proxy |
| Stars | ~0 |
| Language | TypeScript (89%) |
| License | MIT |
| Self-hosted | Yes (Docker) |
| Providers | OpenAI only (MVP) |

**API Key / Access Control Features:**

- **Per-Key Token Budgets**: Daily and monthly USD budgets per API key
- **Rate Limiting**: RPM and TPM limits with overrides by key pattern
- **Model Downgrade**: Automatic downgrade to cheaper models when approaching budget thresholds (opt-in)
- **Cost Dashboards**: Single-page Chart.js dashboard showing cost by key, over time, and budget status
- **Alert Webhooks**: Slack/Discord webhook notifications at 80% (warn), 95% (downgrade), 100% (block)
- **Response Headers**: Every response includes `X-Request-Cost`, `X-Estimated-Cost`, `X-Budget-Remaining`, `X-Budget-Warning`

**Key Differentiator:**

- Deliberately simpler than LiteLLM: single SQLite database, single Docker container, ~5-minute setup vs. ~30 minutes for LiteLLM

### 2.11 LM-Proxy (Nayjest)

| Attribute | Value |
|---|---|
| GitHub | github.com/Nayjest/lm-proxy |
| Stars | ~134 |
| Language | Python (99%) |
| License | MIT |
| Self-hosted | Yes (pip, source) |
| Providers | OpenAI, Anthropic, Google AI, local PyTorch |

**API Key / Access Control Features:**

- **Virtual API Key Management**: Proxy-level keys separate from upstream provider keys
- **User Groups**: Configurable groups with `api_keys` lists and `allowed_connections` restrictions
- **OIDC Integration**: Validate tokens from OpenID Connect providers (Keycloak, Auth0, Okta) as virtual API keys
- **Custom API Key Validation**: Extensible validator functions for custom authentication logic
- **Rate Limiter Handler**: Sliding window rate limiting scoped per api key, ip, connection, group, or global
- **Extensible Middleware**: Before/request handlers for auditing, header forwarding, and custom logic

**Configuration:**

- TOML/YAML/JSON/Python config files
- `api_key_check` can reference a Python function or an external HTTP service

### 2.12 LLM Security Gateway (TerminalsandCoffee)

| Attribute | Value |
|---|---|
| GitHub | github.com/TerminalsandCoffee/llm-security-gateway |
| Stars | ~1 |
| Language | Python (93%) |
| License | Not specified |
| Self-hosted | Yes (Docker, AWS Lambda) |
| Providers | OpenAI, AWS Bedrock |

**API Key / Access Control Features:**

- **Per-Client Authentication**: `X-API-Key` header with constant-time comparison (`hmac.compare_digest`)
- **Per-Client Configuration**: JSON file or DynamoDB backend for per-client settings
- **Rate Limiting**: Sliding window counter, per-client RPM, returns `X-RateLimit-*` headers
- **Model Allowlist**: Per-client model restrictions (empty = all allowed)
- **AWS Lambda Deployment**: Terraform-managed infrastructure with CI/CD

**Security Pipeline:**

1. Authentication → 2. Rate Limiting → 3. Model Allowlist → 4. Injection Detection (20 patterns, 4 categories) → 5. PII Detection (SSN, CC, email, phone, IPv4) → 6. Forward → 7. Response Scan

### 2.13 Paperclip (paperclipai)

| Attribute | Value |
|---|---|
| GitHub | github.com/paperclipai/paperclip |
| Stars | ~69,700 |
| Language | TypeScript (98%) |
| License | MIT |
| Self-hosted | Yes (npx, Docker) |
| Scope | AI agent orchestration (not LLM proxy) |

**API Key / Access Control Features:**

- **Agent API Keys**: Short-lived run JWTs for agent execution
- **Per-Agent Monthly Budgets**: Token and cost tracking by company, agent, project, goal, issue, provider, and model
- **Budget Hard Stops**: Overspend pauses agents and cancels queued work
- **Org Chart Governance**: Board approval workflows, execution policies with review/approval stages
- **Multi-Company Isolation**: Complete data isolation between organizations

Note: Paperclip is not an LLM proxy; it is an orchestration/control plane for AI agents. It manages who works on what and how spend is capped, but delegates the actual LLM routing to underlying tools (Claude Code, Codex, HTTP adapters).
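
To make the budget hard-stop behaviour described for Paperclip concrete, here is a minimal Python sketch. It is not Paperclip's code (Paperclip is written in TypeScript): the `AgentBudget` class, its field names, and the choice to trip the stop once spend reaches the limit are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class AgentBudget:
    """Hypothetical per-agent monthly budget with a hard stop."""

    monthly_limit_usd: float
    spent_usd: float = 0.0
    paused: bool = False
    queue: deque = field(default_factory=deque)  # names of pending tasks

    def enqueue(self, task: str) -> bool:
        """Queue a task; refuse new work while the agent is paused."""
        if self.paused:
            return False
        self.queue.append(task)
        return True

    def record_spend(self, cost_usd: float) -> None:
        """Add a cost; once the limit is reached, pause and cancel queued work."""
        self.spent_usd += cost_usd
        # Assumption: the stop trips when spend reaches or exceeds the limit.
        if self.spent_usd >= self.monthly_limit_usd:
            self.paused = True
            self.queue.clear()


budget = AgentBudget(monthly_limit_usd=10.0)
budget.enqueue("triage-issue")
budget.record_spend(4.0)   # still under budget, queue untouched
budget.enqueue("write-patch")
budget.record_spend(7.5)   # total 11.5 USD: agent pauses, queue is cleared
```

The point of the design is that the stop acts on both sides: queued work is cancelled and new work is refused, so a paused agent cannot keep accumulating spend.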
### 2.14 WSO2 AI Gateway

| Attribute | Value |
|---|---|
| GitHub | github.com/wso2/wso2-envoy-ai-gateway |
| Language | Go (89%) |
| License | Apache 2.0 |
| Self-hosted | Yes (Docker, Kubernetes) |
| Providers | OpenAI, Anthropic, Google Vertex, Azure AI, AWS Bedrock, Mistral |

**API Key / Access Control Features:**

- **Token-Based Rate Limiting**: Calibrated to how LLMs actually charge: per token, not per request
- **MCP Governance**: Convert REST APIs into MCP-compatible servers, proxy external MCP servers with centralized policy enforcement
- **PII Masking**: Scrub sensitive data before prompts leave the network
- **SOC 2 Type 2 + ISO 27001**: Verified compliance credentials

**Key Differentiator:**

- Built on Envoy Proxy, with established Kubernetes-native deployment patterns
- Unbundled adoption: can deploy just the AI Gateway without the full platform

### 2.15 Kong AI Gateway

| Attribute | Value |
|---|---|
| Platform | Kong API Management Platform |
| Language | Lua (OpenResty/NGINX core) + Lua plugins |
| License | Partial (open-source core) |
| Self-hosted | Yes |
| Providers | Via plugin architecture |

**API Key / Access Control Features:**

- **Token-Based Rate Limiting**: Enterprise tier
- **RBAC & Audit Logs**: Enterprise API governance
- **AI MCP Proxy Plugin**: Dedicated MCP traffic governance
- **OAuth2 Plugins**: For MCP authentication

**Key Differentiator:**

- Existing enterprise API management penetration: a natural adoption path for teams already running Kong

### 2.16 TensorZero: The ML-Optimized Rust Gateway

| Attribute | Value |
|---|---|
| GitHub | github.com/tensorzero/tensorzero |
| Stars | ~11,400 |
| Language | Rust |
| License | Apache 2.0 |
| Self-hosted | Yes (Docker, K8s/Helm examples) |
| Providers | 15+ direct, any OpenAI-compatible via extension |

**API Key / Access Control Features:**

- **Custom API Keys**: Create and manage custom API keys for different clients or services (tensorzero.com/docs/operations/set-up-auth-for-tensorzero)
- **Tag-based rate limits**: Granular scopes; rate limits apply per tag (e.g., per-project, per-team, per-environment)
- **Usage/cost tracking**: Per-tag attribution for cost and usage analytics
- **Structured inference with schema validation**: Enforces input/output schemas; the data is used for downstream optimization
- **GitOps orchestration**: Prompts, models, parameters, tools, and experiments managed via version-controlled config

**Performance (as reported by the project):**

- <1ms P99 latency at 10,000+ QPS (Rust)
- LiteLLM @ 100 QPS adds 25-100x more latency than TensorZero @ 10,000 QPS (tensorzero.com/docs/gateway)

**Differentiator:**

- Combines gateway + observability + evaluation + optimization + A/B testing in one platform
- “Autopilot” feature: an automated AI engineer that analyzes observability data, sets up evals, optimizes prompts/models, and runs A/B tests
- Team: includes a Rust compiler maintainer, a J.P. Morgan AI Research VP, and a Columbia postdoc

### 2.17 TrueFoundry AI Gateway: Enterprise Control Plane

| Attribute | Value |
|---|---|
| URL | truefoundry.com/ai-gateway |
| Stars | N/A (enterprise SaaS + self-hosted) |
| Language | Go (internal) |
| License | Proprietary (self-hosted available) |
| Self-hosted | Yes (SaaS, VPC, on-prem, air-gapped) |
| Providers | 250+ models |

**API Key / Access Control Features:**

- **Virtual Accounts (VAT)**: Non-human production identity; gateway-managed keys that map to real provider credentials centrally (truefoundry.com/blog/ai-governance-audit-enterprise-llm-gateway)
- **Personal Access Tokens (PATs)**: For development workflows
- **RBAC**: Scoped per provider account; policy-as-code with Cedar and OPA engines at the MCP-tool boundary
- **Rate-limit & budget rules**: Expressed as YAML with per-user/per-team/per-model/per-metadata scopes; first-match-wins evaluation
- **Sliding-window enforcement**: Twelve 5-second buckets summed across a 60-second window; bursty but strict (truefoundry.com/docs/ai-gateway/ratelimiting)
- **Audit-grade traces**: Every request, rate-limit decision, guardrail outcome, and fallback hop lands on the same trace ID (`x-tfy-trace-id`), exportable via OpenTelemetry to SIEM
- **Data residency routing**: Region-aware routing
keeps regulated data within jurisdiction; provider restrictions block data classes from certain providers

**Compliance:**

- SOC 2 Type 2, HIPAA, ITAR certified
- Recognized in Gartner Market Guide for AI Gateways 2026 (truefoundry.com/ai-gateway)

**Differentiator:**

- Most complete governance feature set: virtual keys + RBAC + policy-as-code + compliance-grade audit logs + residency routing
- Deployment flexibility: SaaS, VPC, on-prem, air-gapped
- GPU orchestration and fractional GPU support built in

### 2.18 RelayPlane: The npm-Native Cost-Intelligence Proxy

| Attribute | Value |
|---|---|
| GitHub | github.com/RelayPlane/proxy |
| Stars | ~200 |
| Language | Node.js / TypeScript |
| License | MIT |
| Self-hosted | Yes (`npm install -g`) |
| Providers | 11+ providers + Ollama |

**API Key / Access Control Features:**

- **Not a virtual-key system**: Uses your own provider keys directly (Anthropic, OpenAI, Google, xAI, Moonshot)
- **Cost intelligence proxy**: Classifies tasks using heuristics (token count, prompt patterns, keyword matching) and routes to the cheapest capable model
- **Budget enforcement**: In free tier; configurable cascade fallback when models hit limits
- **Dashboard**: Tracks every request, shows where money goes

**Key Differentiator:**

- npm-native: `npm install -g @relayplane/proxy` takes 30 seconds, with no Docker, no Python env, and no Go toolchain
- Local-first architecture: runs in-process with your app; zero network-hop overhead
- MCP server support shipped in v1.0.0
- Designed for Claude Code / Cursor / coding agent workflows

### 2.19 OpenZiti LLM Gateway: Zero-Trust Proxy

| Attribute | Value |
|---|---|
| GitHub | github.com/openziti/llm-gateway |
| Stars | ~65 |
| Language | Go (100%) |
| License | Apache 2.0 |
| Self-hosted | Yes (single binary, no DB) |
| Providers | OpenAI, Anthropic, Ollama, vLLM, llama-server, SGLang |

**API Key / Access Control Features:**

- **Virtual API Keys**: Generated via `llm-gateway genkey`; stored in config YAML
- **Model-level restrictions**: Keys can be restricted to specific models using
glob patterns (allowed models: " " or "gpt-4o"; source: github.com/openziti/llm-gateway)
- **Client authentication**: Authorization: Bearer