
ToTra – open-source LLM gateway with GDPR/EU AI Act compliance

ToTra, an open-source AI gateway written in Go, now provides GDPR and EU AI Act compliance for any large language model with a single line of code change. The platform enforces per-user quota limits, blocks personally identifiable information across 18 languages at the network edge, and generates immutable audit logs before any data reaches providers like OpenAI or Anthropic. Organizations can self-host the gateway to maintain full control over their API keys and infrastructure while adding cost tracking and chargeback reporting.

8 min read · published Jun 6, 2026

AI Gateway & Governance Platform

Open-source LLM proxy written in Go. Add quota enforcement, PII blocking, cost tracking, and compliance to any LLM in one line of code.


ToTra is an open-source AI gateway and governance platform that sits in front of any LLM provider.

Point your existing apps at ToTra instead of OpenAI, Anthropic, or any other provider — and instantly get:

  • Quota enforcement: per-user and per-team hard budget caps
  • PII blocking: 18 language groups scanned at the edge before any data leaves your network
  • Cost tracking: per-user, per-team, per-model token and USD spend with chargeback reports
  • Compliance: GDPR workflows, EU AI Act checklist, hash-chained immutable audit log
  • Zero code changes: 100% OpenAI-compatible; swap one line in your config

flowchart LR
    A["🖥️ Your App\n(OpenAI SDK / curl\n/ LangChain)"] -->|"1 · API request"| B

    subgraph B["ToTra Gateway  :8080"]
        direction TB
        B1["🔑 Auth & API Key"]
        B2["📊 Quota Check\n(per user / team)"]
        B3["🔒 PII Scan\n(18 languages)"]
        B4["⚡ Semantic Cache"]
        B5["🔀 Route & Load Balance"]
        B1 --> B2 --> B3 --> B4 --> B5
    end

    B -->|"2 · forward request"| C["☁️ LLM Providers\nOpenAI · Anthropic\nGemini · Mistral · Azure\nBedrock · Ollama"]
    C -->|"3 · response"| A

    B -->|"4 · usage events"| D

    subgraph D["ToTra Admin  :8081"]
        direction TB
        D1["💸 Cost Tracking"]
        D2["📋 Compliance & Audit"]
        D3["🔔 Budget Alerts"]
    end

    D --> E["📊 Dashboard  :3000\nAdmin Console · Reports\nEmployee Self-Service"]

  • 🚀 Written in Go: < 2 ms p95 overhead. Native binary, no Python runtime, no warm-up.
  • 🔒 PII blocked at the edge: email, IDs, credit cards, health records across 18 language groups. Sensitive data is redacted before it ever reaches an LLM.
  • 💸 Hard budget caps: requests over the limit get 429 before touching any provider. Real-time Slack / webhook alerts.
  • 📋 Compliance out of the box: GDPR data-subject workflows, EU AI Act checklist, and an immutable hash-chained audit log on every request.
  • 📊 Finance-ready reporting: department chargeback CSV, budget forecasts, spend anomaly detection.
  • 🏠 Self-hosted: your keys, your infrastructure, your data. No external dependency.
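
The request path in the diagram (auth, quota check, PII scan, then forwarding) can be pictured as a chain of checks that stops at the first failure. The following is a minimal Python sketch of that ordering only; it is not ToTra's Go implementation. The `Request` type, the stage functions, the in-memory stores, and the 401 status are illustrative assumptions. The 429 and 422 codes are the ones this README mentions for over-budget and blocked requests.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Request:
    api_key: str
    user: str
    prompt: str


# Each stage returns None to continue, or an HTTP status code to stop early.
Stage = Callable[[Request], Optional[int]]

VALID_KEYS = {"demo-key"}                  # hypothetical key store
REMAINING_BUDGET = {"alice": 5, "bob": 0}  # hypothetical per-user quota


def auth(req: Request) -> Optional[int]:
    # 401 is an assumption; the README does not state the auth failure code.
    return None if req.api_key in VALID_KEYS else 401


def quota(req: Request) -> Optional[int]:
    return None if REMAINING_BUDGET.get(req.user, 0) > 0 else 429


def pii_scan(req: Request) -> Optional[int]:
    # Stand-in check; a real scanner would use per-language detectors.
    return 422 if "@" in req.prompt else None


PIPELINE: list[Stage] = [auth, quota, pii_scan]


def handle(req: Request) -> int:
    for stage in PIPELINE:
        status = stage(req)
        if status is not None:
            return status  # rejected before any provider is contacted
    return 200             # a real gateway would forward to the provider here
```

Because every stage runs before the forward step, a rejected request never consumes provider tokens, which is the property the budget-cap bullet describes.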

Prerequisites: Docker + Docker Compose

git clone https://github.com/SugaC-275/ToTra.git
cd ToTra
cp .env.example .env          # fill in your provider API keys
docker-compose --profile app up -d --wait

Open http://localhost:3000 and sign in:

| Field | Value |
|---|---|
| Email | admin@acme.com |
| Password | totra123 |

Change the default credentials immediately after first login via Settings → Security.

One line change. Every other line of code stays the same.

Python (OpenAI SDK)

import openai

# Before: calling OpenAI directly
# client = openai.OpenAI(api_key="sk-...")

# After: point the same client at ToTra
client = openai.OpenAI(
    api_key="your-totra-api-key",      # issued from the ToTra admin panel
    base_url="http://your-totra-host:8080/v1"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

Node.js / TypeScript (OpenAI SDK)

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "your-totra-api-key",
  baseURL: "http://your-totra-host:8080/v1",
});

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);

curl

curl http://your-totra-host:8080/v1/chat/completions \
  -H "Authorization: Bearer your-totra-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

LangChain

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4o",
    openai_api_key="your-totra-api-key",
    openai_api_base="http://your-totra-host:8080/v1",
)

response = llm.invoke("Hello!")
print(response.content)

Once connected, every request is automatically routed through quota enforcement, PII scanning, semantic caching, and cost tracking.

🔒 PII Protection — 18 Language Groups

Every request body is scanned in real time before it reaches any LLM. Detected PII is redacted and the event is logged. Blocked requests return 422.

| Language Group | Detected Types |
|---|---|
| Universal | Email, credit card, IBAN, SWIFT/BIC, ICD medical codes |
| Chinese | National ID, phone, bank account, unified credit code, securities account |
| English | US SSN, phone, NI number, passport, driver's license, medical record number |
| Japanese | My Number (個人番号), phone, postal code, health insurance number |
| Korean | RRN (주민등록번호), phone, passport, business registration number |
| EU (14 countries) | National IDs, tax numbers, social security (DE/FR/ES/IT/NL/PL/SE/PT/BE/CH/DK/FI/NO/AT) |
| Arabic (GCC + MENA) | National ID, Iqama, Emirates ID, QID, CIN, NIN, phone |

Configure rules per team, per model, or globally in the admin panel.
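
To make the scan-and-redact step concrete, here is a small Python sketch covering two of the "Universal" types, email addresses and credit card numbers. It is an illustration under stated assumptions, not ToTra's detector: the regular expressions, the placeholder tokens, and the `redact` function are all hypothetical, and the Luhn checksum is used here only as a common way to cut false positives on card-like digit runs.

```python
import re

# Illustrative patterns only; production detectors are far more thorough.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_ok(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact(text: str) -> tuple[str, list[str]]:
    """Return the redacted text and the list of detected PII types."""
    found: list[str] = []

    def card_sub(m: re.Match) -> str:
        digits = re.sub(r"\D", "", m.group())
        if luhn_ok(digits):
            found.append("credit_card")
            return "[CREDIT_CARD]"
        return m.group()  # digit run that fails the checksum is left alone

    def email_sub(m: re.Match) -> str:
        found.append("email")
        return "[EMAIL]"

    text = CARD_RE.sub(card_sub, text)
    text = EMAIL_RE.sub(email_sub, text)
    return text, found
```

The returned list of types is the kind of information a gateway could write to its audit log without storing the sensitive values themselves.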

💸 Cost & Spend Management

  • Per-user, per-team, per-model token and USD cost tracking
  • Hard budget caps: requests over the limit get 429 before touching the provider
  • Configurable alert thresholds with Slack / Feishu / webhook notifications
  • Monthly budget forecasts based on current burn rate
  • Department chargeback reports with CSV export for finance
  • Procurement analytics and ROI dashboards
  • Spend anomaly detection with automatic alerts
Dashboard → Cost → Reports → Export CSV
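
The budget features above reduce to simple arithmetic. The sketch below assumes a linear burn-rate projection and a cap that triggers once spend reaches the limit; the function names, the default thresholds, and the exact comparison are assumptions for illustration, since the README does not specify how ToTra computes them.

```python
import calendar
from datetime import date


def forecast_month_end(spend_usd: float, today: date) -> float:
    """Linear projection: average daily spend so far times days in the month."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    daily_burn = spend_usd / today.day
    return round(daily_burn * days_in_month, 2)


def check_budget(spend_usd: float, cap_usd: float) -> int:
    """Return 429 once the cap is reached, otherwise 200."""
    return 429 if spend_usd >= cap_usd else 200


def alerts_crossed(spend_usd: float, cap_usd: float,
                   thresholds: tuple[float, ...] = (0.5, 0.8, 1.0)) -> list[float]:
    """Thresholds (as fractions of the cap) that current spend has reached."""
    return [t for t in thresholds if spend_usd >= t * cap_usd]
```

For example, $150 spent by June 10 projects to $450 for the 30-day month, and $85 against a $100 cap has crossed the 50% and 80% thresholds but not the cap itself.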

📋 Compliance & Audit

  • GDPR: data-subject export and deletion request workflows, configurable retention policies
  • EU AI Act: compliance checklist with per-model status tracking
  • Immutable audit chain: every request is hash-chained; the log cannot be tampered with
  • SIEM integration: configurable webhook targets for security event forwarding
  • Data residency controls: keep all data on-premises or in a specific region
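
A hash-chained log works by storing, with each entry, a hash that covers both the entry and the previous entry's hash, so changing any earlier record invalidates everything after it. The Python sketch below shows the general technique with SHA-256; the entry layout, field names, and the all-zero genesis value are assumptions, not ToTra's actual schema.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first entry


def entry_hash(prev_hash: str, record: dict) -> str:
    # Canonical JSON so the same record always hashes the same way.
    payload = prev_hash + json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(log: list[dict], record: dict) -> None:
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev": prev, "hash": entry_hash(prev, record)})


def verify(log: list[dict]) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True
```

Verification only detects tampering after the fact; keeping a copy of the latest hash somewhere the log writer cannot modify is what makes the detection trustworthy.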

⚡ Gateway & Routing

  • OpenAI-compatible: drop-in replacement for the OpenAI API (/v1/chat/completions, /v1/embeddings, streaming)
  • Anthropic-compatible: native Anthropic messages API support
  • Multi-provider routing: automatic fallback across providers and models
  • Semantic cache: SimHash LSH deduplication; repeated prompts skip the LLM entirely
  • Prompt compression: reduce token spend on long context
  • Streaming proxy: full text/event-stream support
  • File pipeline: upload PDF / DOCX / PPTX → parse → chat in one API call
  • Rate limiting, IP allowlist, API-key authentication
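
The semantic cache is described as SimHash-based. As a rough illustration of the idea, the sketch below computes a 64-bit SimHash fingerprint over whitespace-separated tokens and compares fingerprints by Hamming distance. The tokenization, the use of MD5 as the per-token hash, and the distance threshold of 3 are all assumptions; the LSH bucketing that makes lookups fast at scale is omitted.

```python
import hashlib

BITS = 64


def simhash(text: str) -> int:
    """64-bit SimHash over lowercase whitespace-separated tokens."""
    weights = [0] * BITS
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        for i in range(BITS):
            weights[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, w in enumerate(weights):
        if w > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: str, b: str, max_distance: int = 3) -> bool:
    return hamming(simhash(a), simhash(b)) <= max_distance
```

A cache built on this would store responses keyed by fingerprint and serve a stored response when an incoming prompt falls within the distance threshold, which is how repeated prompts could skip the provider call.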

🔐 Administration

  • JWT authentication + OIDC / SSO integration
  • Role-based access control (admin / employee)
  • User and team management with quota request / approval workflow
  • Model catalogue — enable, disable, and configure providers per team
  • Bot notifications — Slack, Feishu, custom webhooks
  • HR sync connector (CSV import)
  • Agent session tracking: detects and terminates dead-loop agent sessions automatically

Provider Chat Embeddings Streaming Files
OpenAI (GPT-4o, o1, o3, o4)
Anthropic (Claude 3.5, 4)
Google Gemini
Mistral AI
Meta Llama (via Ollama)
Cohere Command
Azure OpenAI
AWS Bedrock
Local / Ollama
Any OpenAI-compatible endpoint

ToTra is written entirely in Go. In the benchmark below, measured gateway overhead at p95 is 2 ms at 10 virtual users (VUs) and rises to 6 ms at 200 VUs.

| Concurrency | p50 | p95 | p99 |
|---|---|---|---|
| 10 VUs | < 1 ms | 2 ms | 4 ms |
| 50 VUs | 1 ms | 3 ms | 8 ms |
| 200 VUs | 2 ms | 6 ms | 15 ms |

Measured against a 100 ms mock upstream.
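
Overhead figures like these are typically derived by subtracting the known upstream delay from each measured round trip and then taking percentiles. The sketch below uses the nearest-rank percentile definition; the function names and the choice of percentile method are assumptions, and the actual benchmark tooling may compute them differently.

```python
import math


def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile for 0 < p <= 100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def overhead_summary(total_ms: list[float], upstream_ms: float = 100.0) -> dict[str, float]:
    """Subtract the fixed mock-upstream latency, then report p50/p95/p99."""
    overhead = [t - upstream_ms for t in total_ms]
    return {name: percentile(overhead, p)
            for name, p in (("p50", 50), ("p95", 95), ("p99", 99))}
```

With only a handful of samples the high percentiles collapse onto the slowest request, so a meaningful p99 needs at least a few hundred measurements.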


Your Apps  (OpenAI SDK / curl / LangChain / any HTTP client)
    │
    ▼
ToTra Gateway  :8080
    auth · quota · PII scan · policy · semantic cache · routing
    │
    ▼
OpenAI · Anthropic · Gemini · Mistral · Local Models
    │
    │ (usage events)
    ▼
ToTra Admin  :8081
    cost · compliance · budgets · audit trail · notifications
    │
    ▼
Dashboard  :3000
    admin console · department reports · employee self-service

| Service | Stack | Port |
|---|---|---|
| gateway | Go 1.25 / Fiber | 8080 |
| admin | Go 1.25 / Fiber | 8081 |
| parser | Python 3.12 / FastAPI | 8090 |
| dashboard | React 19 / Vite | 3000 |
| postgres | PostgreSQL 16 | 5432 |
| redis | Redis 7 | 6379 |

Screenshots: Cost Dashboard, Department Reports, User Management, Employee Self-Service

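# Start the backing services
docker-compose up -d postgres redis

# Run each service in its own terminal, starting from the repo root
cd gateway   && go run .
cd admin     && go run .
cd parser    && uvicorn main:app --port 8090
cd dashboard && npm install && npm run dev

# Set the development passwords (from the repo root)
cd scripts/set-dev-passwords
POSTGRES_HOST=localhost POSTGRES_DB=totra \
POSTGRES_USER=totra POSTGRES_PASSWORD=totra_secret go run .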

Default dev credentials: admin@acme.com / totra123

Copy .env.example to .env. Key variables:

| Variable | Description |
|---|---|
| POSTGRES_HOST/PORT/DB/USER/PASSWORD | PostgreSQL connection |
| JWT_SECRET | Shared secret for JWT signing |
| ENCRYPTION_KEY | 32-byte hex key (admin credential store) |
| GATEWAY_ENCRYPTION_KEY | 32-byte hex key (gateway credential store) |
| OPENAI_API_KEY | Your OpenAI key (set per provider) |
| ANTHROPIC_API_KEY | Your Anthropic key |

See .env.example for the full list including Redis, SMTP, and notification settings.
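
The two encryption keys are described as 32-byte hex keys. Assuming that means 32 random bytes written as 64 hexadecimal characters (check .env.example for the exact format expected), a value can be generated and sanity-checked with Python's standard library. The helper names below are hypothetical.

```python
import re
import secrets

HEX_64 = re.compile(r"[0-9a-fA-F]{64}")


def new_key() -> str:
    """32 cryptographically random bytes, hex-encoded (64 characters)."""
    return secrets.token_hex(32)


def is_valid_key(value: str) -> bool:
    """True if the value is exactly 64 hexadecimal characters."""
    return HEX_64.fullmatch(value) is not None


if __name__ == "__main__":
    # Print one value per variable to paste into .env
    for name in ("ENCRYPTION_KEY", "GATEWAY_ENCRYPTION_KEY"):
        print(f"{name}={new_key()}")
```

Generate a separate value for each variable rather than reusing one key for both stores.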

make test

cd gateway   && go test ./...
cd admin     && go test ./...
cd dashboard && npm run test:run
cd parser    && pytest

We welcome contributions — bug fixes, new provider integrations, docs improvements, and feature requests.

git clone https://github.com/SugaC-275/ToTra.git
cd ToTra

make test

  • Fork the repo and create a branch from main
  • Make your change and add tests where relevant
  • Ensure make test passes
  • Open a pull request

For larger features, open a Discussion first to align on direction.

MIT — free to use, self-host, fork, and modify.
