{"slug": "governance-first-ai-gateway-for-teams-that-aren-t-ready-for-enterprise-tooling", "title": "Governance-first AI gateway for teams that aren't ready for enterprise tooling", "summary": "A developer has released Synapse AI Gateway, an open-source governance-first AI gateway designed for regulated teams that need audit trails and policy enforcement without waiting for enterprise procurement. The gateway enforces governance at the infrastructure level by binding API keys to system prompts, model allowlists, team identities, and rate limits, and includes DLP, hybrid routing, and immutable audit logging. It can be deployed in under five minutes using Docker Compose.", "body_md": "If you work in a regulated organisation, you have probably seen this play out: leadership wants AI in production, security wants an audit trail, and the team in the middle has two options. Either ship something fast with no governance (shadow tools, no DLP, no audit log) or wait twelve to eighteen months for an enterprise platform to get procured and approved. Neither is good.\n\nMost of the tools available to bridge that gap fall into one of three camps:\n\nI have been working on a small Apache-2.0 project called **Synapse AI Gateway** that aims at the space between those options. `docker compose up` brings up the whole stack (postgres, backend, admin console), and you have it running in under five minutes. Governance controls run on every inference request before it ever reaches a model.\n\nGitHub: [synapse-ai-gateway/synapse-ai-gateway](https://github.com/synapse-ai-gateway/synapse-ai-gateway)\n\nThe design hinges on one decision: **every API key is bound at creation to a system prompt, a model allowlist, a team identity, and rate limits.** The team that gets a key for an approved HR-assistant use case cannot quietly repurpose that key for something else. 
They need a new key, which means a new approval.\n\nThat is the difference between governance-as-policy (a wiki page nobody reads) and governance-as-infrastructure (the gateway refuses the request). Policies do not enforce themselves. Controls in the request path do.\n\n```\nclient app\n   │\n   ▼\n┌─────────────────────────────────────────┐\n│ 1. auth + use-case scoping              │  →  inject system prompt, check model allowlist\n├─────────────────────────────────────────┤\n│ 2. prompt DLP                           │  →  block / redact / alert\n├─────────────────────────────────────────┤\n│ 3. hybrid routing (on-prem vs cloud)    │  →  classification decides backend\n├─────────────────────────────────────────┤\n│ 4. immutable audit log                  │  →  PostgreSQL append-only, SHA-256 hashes\n├─────────────────────────────────────────┤\n│ 5. response DLP + anomaly detection     │  →  webhook alerts\n└─────────────────────────────────────────┘\n   │\n   ▼\nLLM backend (Ollama, vLLM, OpenAI, Anthropic, Azure, Google)\n```\n\n**Layer 1** validates the key, injects the bound system prompt, and checks the model allowlist. An invalid key or unapproved model returns 403 immediately.\n\n**Layer 2** is a built-in regex DLP engine. Three outcomes per category: `block` (HTTP 400), `redact` (sanitise and forward), `alert` (log and forward). Patterns live in a config file you can hot-reload. No external service is required, which matters if your data sovereignty rules say PII cannot leave your perimeter even for a scan.\n\n**Layer 3** routes by data classification. A key tagged `sensitive` is routed only to on-premises backends (Ollama, vLLM). A key tagged `non_sensitive` can go to a cloud provider for higher capability. Consuming applications do not change: they always speak the OpenAI API.\n\n**Layer 4** writes one row per request to PostgreSQL: timestamp, team, model, token count, latency, DLP outcome, HTTP status. 
Prompt and response are stored as SHA-256 hashes, never plaintext. That preserves forensic hash-matching while protecting staff privacy.\n\n**Layer 5** scans responses on the way back out and surfaces anomalies (usage spikes, repeated DLP blocks, off-hours bursts) via webhook.\n\n```\ngit clone https://github.com/synapse-ai-gateway/synapse-ai-gateway\ncd synapse-ai-gateway\ndocker compose up -d\n```\n\nEvery setting has a working default, so that genuinely is the whole quick start for a local trial. Before exposing the stack beyond localhost, copy `.env.example` to `.env` and set real values for `JWT_SECRET`, `ADMIN_PASSWORD`, and `POSTGRES_PASSWORD`.\n\nThe admin console is at `http://localhost:5173`. Log in, create a team in the UI, copy the API key (shown once), and you're ready:\n\n```\ncurl -X POST http://localhost:8080/v1/chat/completions \\\n  -H \"Authorization: Bearer <YOUR_TEAM_API_KEY>\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"model\": \"llama3.2:latest\",\n    \"messages\": [{\"role\": \"user\", \"content\": \"hello\"}]\n  }'\n```\n\nIt is OpenAI-API-compatible, so any OpenAI SDK works. Point `base_url` at `http://localhost:8080/v1` and pass the team key. The backend is fully transparent to the client.\n\nLiteLLM is excellent. It is the right tool for a different problem: routing across 100+ providers with maximum flexibility, at scale, with a team that already has DevOps capacity. Its per-worker footprint is correspondingly larger, which is appropriate for the high-throughput case it is built for.\n\nSynapse AI Gateway's full stack runs at ~113 MB at idle (backend 73 MB + postgres 32 MB + frontend 8 MB). The whole stack is three containers (postgres, backend, admin console) brought up by a single `docker compose up`. No Redis, no message broker, no Kubernetes required to get started. If you are just starting out and need a governance layer you can deploy in an afternoon, that footprint matters. 
If you are running millions of requests per day, it does not, and LiteLLM is the better choice.\n\nThe other meaningful difference is DLP. LiteLLM's DLP integrates with PromptGuard, Pangea, or Azure Content Safety, which are external services with their own pricing, accounts, and data flows. Synapse's DLP is built in. For an organisation whose data residency rules say PII does not leave the perimeter, \"built in\" is not a feature preference; it is a hard requirement.\n\nWorth saying clearly:\n\nSpecific facts, all verified in the repo:\n\n`/v1/chat/completions`\n\nThe `README.md` has a deployment checklist for production: rotate every default secret, terminate TLS at a reverse proxy, use managed PostgreSQL, restrict CORS, review the DLP patterns for your jurisdiction.\n\nThe repo is Apache-2.0. There is a `CONTRIBUTING.md` with DCO sign-off and a list of good first issues: DLP patterns for additional jurisdictions, additional backend adapters, a Helm chart. If there is a use case the current design does not cover, open an issue.\n\nGitHub: [synapse-ai-gateway/synapse-ai-gateway](https://github.com/synapse-ai-gateway/synapse-ai-gateway)\n\nIf your organisation is staring at the gap between \"ship AI now with no controls\" and \"wait two years for the enterprise platform,\" this is meant for you.", "url": "https://wpnews.pro/news/governance-first-ai-gateway-for-teams-that-aren-t-ready-for-enterprise-tooling", "canonical_source": "https://dev.to/zakaulhaque/governance-first-ai-gateway-for-teams-that-arent-ready-for-enterprise-tooling-2hdl", "published_at": "2026-06-15 17:30:12+00:00", "updated_at": "2026-06-15 18:07:08.203330+00:00", "lang": "en", "topics": ["ai-infrastructure", "ai-safety", "developer-tools", "generative-ai", "ai-policy"], "entities": ["Synapse AI Gateway", "Ollama", "vLLM", "OpenAI", "Anthropic", "Azure", "Google", "PostgreSQL"], "alternates": {"html": 
"https://wpnews.pro/news/governance-first-ai-gateway-for-teams-that-aren-t-ready-for-enterprise-tooling", "markdown": "https://wpnews.pro/news/governance-first-ai-gateway-for-teams-that-aren-t-ready-for-enterprise-tooling.md", "text": "https://wpnews.pro/news/governance-first-ai-gateway-for-teams-that-aren-t-ready-for-enterprise-tooling.txt", "jsonld": "https://wpnews.pro/news/governance-first-ai-gateway-for-teams-that-aren-t-ready-for-enterprise-tooling.jsonld"}}