Nexus – open-source AI gateway for enterprise LLM traffic

Nexus, an open-source AI gateway, now intercepts enterprise large language model traffic at the SDK, network, and OS layers to enforce compliance, audit, and control policies across all LLM applications. The gateway normalizes requests from any provider to a canonical OpenAI format, supports 20 provider adapters, and offers features including multi-axis quotas, semantic caching, PII detection, and role-based access control. Enterprises can deploy the three independent components (AI Gateway, Compliance Proxy, and Desktop Agent) to secure and govern LLM traffic without requiring application code changes.

Make AI safe to use across the enterprise.

Nexus Gateway intercepts enterprise LLM traffic at three layers and runs all of it through one compliance engine, one audit pipeline, and one control plane.

| Mode | Where it intercepts | Code |
|---|---|---|
| 🔑 AI Gateway | SDK layer: virtual keys on `/v1/chat/`, `/v1/responses`, `/v1/embeddings`, `/v1/messages` | `packages/ai-gateway/` |
| 🌐 Compliance Proxy | Network layer: transparent TLS bump (CONNECT + MITM) | `packages/compliance-proxy/` |
| 💻 Desktop Agent | OS layer: macOS / Linux / Windows builds all in development, awaiting QA | `packages/agent/platform/{darwin,linux,windows}/` |

The three pipes are independent: AI Gateway, Compliance Proxy, and Agent each run the full hooks pipeline on their own traffic (`packages/shared/policy/hooks/`), plus the per-service compliance pipeline, e.g. `packages/agent/internal/compliance/pipeline.go`.

The Agent always egresses directly to the upstream provider; it does not care whether enterprise network policy then routes that traffic through the Compliance Proxy. When it does, the Agent stamps an Ed25519-signed `X-Nexus-Attestation` header on the outbound request (E60, `packages/agent/internal/identity/attestation/`).
The Compliance Proxy peeks this header before the TLS bump (`packages/shared/transport/tlsbump/forward handler.go:119`); if the signature verifies, the CONNECT becomes pure passthrough: no MITM, no hooks, no audit on that flow, since the Agent already ran them.

Applications speak the OpenAI SDK. Nexus normalises every request to a canonical OpenAI shape, then translates wire format on the way to the actual provider.

Shipped adapter codecs today (`packages/ai-gateway/internal/providers/specs/`):

- First-class codecs (11): `openai`, `anthropic`, `gemini`, `vertex`, `azure`, `bedrock`, `cohere`, `minimax`, `glm`, `replicate`, `voyage`.
- OpenAI-compatible passthrough (9): `deepseek`, `moonshot`, `mistral`, `groq`, `fireworks`, `together`, `perplexity`, `xai`, `huggingface`, all under `packages/ai-gateway/internal/providers/specs/compat/`.

Reasoning tokens, function calls, vision inputs, structured outputs are carried through the translation. Adding a new provider is a documented procedure under `.claude/skills/add-provider-adapter/`.

- Exact-match response cache: Valkey-backed, Redis-wire-compatible.
- Provider-native cache accounting: surfaces Anthropic cached tokens and Gemini `cachedContentTokenCount` in billing when the provider reports them.
- Semantic vector cache via the valkey-search module: `packages/ai-gateway/internal/cache/semantic/` (lookup, writer, client, circuit breaker, singleflight, poison guard, index lifecycle).
- In-flight singleflight: concurrent identical prompts fold into one upstream call.

- Multi-axis quotas: per organization, per virtual key, per provider, per model. Each axis has its own budget and sliding-window enforcement.
- Token-based or USD-based budgets.
- Hard limits and soft limits: soft fires an alert; hard rejects with 429.
- Real-time accounting: counters update on every traffic event, no batch lag.

Routing strategies in `packages/ai-gateway/internal/routing/strategies/`: `single`, `fallback`, `loadbalance`, `conditional`, `absplit`, `policy`, `smart`.
PII detection · data classification · keyword filtering · content safety · rate limiting · IP allowlists · request-size validation · webhook forwarders · per-stage audit (request hooks and response hooks recorded independently) · body capture (256 KiB inline + spillstore for the rest, see `packages/shared/storage/spillstore/`) · SIEM forwarder (`packages/compliance-proxy/internal/siem/`) · three-tier kill switch · emergency passthrough (`bypassHooks` / `bypassCache` / `bypassNormalize`).

Chat · Embeddings · Structured outputs · Function / tool calling · Vision input · Reasoning tokens. Multimodal epic E62 in development.

- IAM: RBAC + ABAC with an NRN resource model (`packages/shared/identity/iam/`).
- Virtual keys with per-key model scope.
- OIDC federation with JIT user provisioning (`packages/control-plane/internal/identity/authserver/login/oidc.go`, JIT flag in `scim store.go`).
- Organization / project hierarchy with per-org quota.
- Credential vault: AES-256-GCM (`packages/control-plane/internal/platform/crypto/aes gcm.go`, `packages/ai-gateway/internal/credentials/decrypt/decrypt.go`) with key rotation.
- Agent fleet management: Hub CA, Thing-based config sync, drift detection.

Five Go services + one React control console. The diagram below shows only the traffic plane: the three independent intercept pipes and where each one egresses. Control plane (Hub-centric) and storage are summarized in the component table immediately after.

flowchart TB SDK "SDK app