Three weeks ago I was benchmarking GPT-4o against a local Llama model. I was copying prompts from a real support ticket database to make the test realistic. Midway through the run I glanced at the terminal and saw this in the logs:

    prompt="Hi, my name is Sarah Johnson, my account number is 4532-1234-5678-9012..."
    provider=cloud
    model=gpt-4o

A real customer's name. A real credit card number. Already sent to OpenAI.
I had not noticed because the benchmark UI just showed a token count, not the actual prompt content. The PII was in the data. I had forgotten to sanitise it. OpenAI's API terms say they don't train on API data, but that's not the point: the data left my infrastructure. Under GDPR, that's a potential breach.
I spent the rest of that weekend building a firewall so it could never happen again. This post is the full story of what I built, how it works, and how you can run it in one command.
The code is at github.com/sochaty/llm-governance-engine, tag `governance-post-1`.
Every LLM observability tool I have used (LangSmith, Helicone, Arize Phoenix) works the same way: it records what happened after the fact. You get a dashboard, a trace, a cost breakdown. None of them stop the request.
That distinction matters enormously under GDPR, HIPAA, and the EU AI Act. "We logged that PII was sent" is not a compliance posture. "PII was blocked before it left the building" is.
By the end of this post you will have:
Everything runs with `docker compose up`.
The key insight is where enforcement happens: before the model call, not after.

    User Prompt
        │
        ▼
    FastAPI /benchmark/stream
        │
        ├── enforce_governance_policy()   ← Presidio scan + policy evaluation
        │       │
        │       ├── PII detected + cloud model → HTTP 403 (prompt never sent)
        │       ├── Safety score low → warn + log + continue
        │       └── All rules pass → verdict returned to endpoint
        │
        ├── LLMOrchestrator.get_streaming_response()
        │       │
        │       ├── OpenAI / Groq / Google / Anthropic (cloud)
        │       └── Ollama (local)
        │
        └── AuditService → PostgreSQL

The `enforce_governance_policy` function is a FastAPI `Depends()` dependency injected into the streaming endpoint. If a blocking rule fires, it raises HTTP 403 before the orchestrator is even called. The prompt never touches the wire.
The entire governance model is a YAML file. No code changes, no restarts: edit the file, `POST /api/v1/policies/reload`, and the rules are live.

    version: "1.0"
    name: "default"
    rules:
      - id: pii-cloud-block
        name: "Block PII from cloud models"
        condition: pii_detected
        threshold: 0.7            # Presidio confidence ≥ 0.7 triggers this rule
        models: [cloud, gpt-4o]
        action: block             # returns HTTP 403
        severity: critical
        webhook_url: null         # set to your Slack URL to get alerted

      - id: low-safety-warn
        name: "Warn on low safety score"
        condition: safety_score_below
        threshold: 0.5
        action: warn              # logs + audits, passes through
        severity: medium

      - id: pii-local-alert
        name: "Alert on PII sent to local models"
        condition: pii_detected
        threshold: 0.85
        models: [local]
        action: alert             # fires webhook, does not block
        severity: high

Four conditions: `pii_detected`, `safety_score_below`, `cost_exceeds`, `model_is`.
Three actions: `block` (HTTP 403), `warn` (audit + continue), `alert` (webhook + continue).
Starter templates are shipped in the repo for GDPR (`policies/gdpr.yaml`) and HIPAA (`policies/hipaa.yaml`).
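To make the rule schema concrete, here is a minimal sketch of how one rule mapping (as a YAML loader would return it) could be validated before the engine uses it. The condition and action names come from the list above; the `Rule` class, the `parse_rule` helper, the default severity, and the error messages are hypothetical and not taken from the repo.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

# Names from the post; the rest of this module is an illustrative assumption.
CONDITIONS = {"pii_detected", "safety_score_below", "cost_exceeds", "model_is"}
ACTIONS = {"block", "warn", "alert"}


@dataclass(frozen=True)
class Rule:
    id: str
    name: str
    condition: str
    action: str
    severity: str
    threshold: Optional[float] = None
    models: List[str] = field(default_factory=list)  # empty list = applies to every model
    webhook_url: Optional[str] = None


def parse_rule(raw: Dict[str, Any]) -> Rule:
    """Validate one rule mapping and reject unknown conditions or actions."""
    if raw.get("condition") not in CONDITIONS:
        raise ValueError(f"unknown condition: {raw.get('condition')!r}")
    if raw.get("action") not in ACTIONS:
        raise ValueError(f"unknown action: {raw.get('action')!r}")
    return Rule(
        id=raw["id"],
        name=raw["name"],
        condition=raw["condition"],
        action=raw["action"],
        severity=raw.get("severity", "medium"),  # assumed default
        threshold=raw.get("threshold"),
        models=list(raw.get("models", [])),
        webhook_url=raw.get("webhook_url"),
    )
```

Failing fast on a misspelled `action` at load time is one way to keep a bad edit from silently disabling a blocking rule after a reload.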
Presidio is Microsoft's open-source PII detection library. It runs locally: no API call, no data leaving your machine.
It detects 50+ entity types out of the box: `PERSON`, `EMAIL_ADDRESS`, `CREDIT_CARD`, `US_SSN`, `PHONE_NUMBER`, `IBAN_CODE`, `IP_ADDRESS`, and more. It uses a combination of regex patterns, checksums, and a spaCy NLP model for name recognition.
The scan returns a confidence score per entity. The policy engine compares that score against the rule's `threshold`. An entity with 0.95 confidence on `CREDIT_CARD` and a threshold of 0.7 triggers the `pii-cloud-block` rule.

    from presidio_analyzer import AnalyzerEngine

    # ScanResult and EntityResult are the project's own result models
    # (their definitions are not shown in this post).


    class AuditService:
        def __init__(self):
            self.analyzer = AnalyzerEngine()

        def scan_for_pii_details(self, text: str) -> ScanResult:
            results = self.analyzer.analyze(text=text, language="en")
            detected = len(results) > 0
            entities = [
                EntityResult(
                    entity_type=r.entity_type,
                    confidence=r.score,
                    start=r.start,
                    end=r.end,
                )
                for r in results
            ]
            # Highest confidence across all detected entities; 0.0 when nothing was found.
            max_confidence = max((r.score for r in results), default=0.0)
            return ScanResult(
                detected=detected,
                entities=entities,
                max_confidence=max_confidence,
            )

The safety score is calculated separately. It is a 0.0 to 1.0 measure that combines PII confidence, entity density, and sensitive keyword presence. A score below 0.5 triggers the `low-safety-warn` rule.
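The post does not show how the three signals are combined, so the following is only a sketch of one plausible weighting. The keyword list, the weights, the density scaling, and the function name `safety_score_sketch` are all assumptions made for illustration; the repo's `calculate_safety_score` may work quite differently.

```python
from typing import Sequence

# Hypothetical keyword list and weights, chosen only for illustration.
SENSITIVE_KEYWORDS = ("password", "ssn", "credit card", "diagnosis")
W_CONFIDENCE, W_DENSITY, W_KEYWORDS = 0.6, 0.2, 0.2


def safety_score_sketch(text: str, entity_confidences: Sequence[float]) -> float:
    """Return 1.0 for a clean prompt, moving toward 0.0 as risk signals add up."""
    words = max(len(text.split()), 1)
    max_conf = max(entity_confidences, default=0.0)
    # Entity density: saturates at one detected entity per ten words.
    density = min(len(entity_confidences) / words * 10, 1.0)
    lowered = text.lower()
    keyword_hit = 1.0 if any(k in lowered for k in SENSITIVE_KEYWORDS) else 0.0
    risk = W_CONFIDENCE * max_conf + W_DENSITY * density + W_KEYWORDS * keyword_hit
    return round(max(0.0, 1.0 - risk), 4)
```

With these assumed weights, a prompt with no detected entities and no sensitive keywords scores 1.0, while a short prompt containing a high-confidence entity and a keyword falls well below the 0.5 warning threshold.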
The engine follows a Chain of Responsibility pattern. Each rule evaluates the `GovernanceContext` independently:

    from dataclasses import dataclass
    from typing import List, Optional

    from pydantic import BaseModel


    @dataclass
    class GovernanceContext:
        prompt: str
        provider: str
        model_id: str
        pii_detected: bool
        pii_entity_types: List[str]
        pii_max_confidence: float
        safety_score: float
        estimated_prompt_cost_usd: float


    class PolicyVerdict(BaseModel):
        passed: bool
        violated_rules: List[ViolatedRule] = []  # ViolatedRule is defined elsewhere in the project
        blocking_rule: Optional[ViolatedRule] = None
        warnings: List[str] = []

`DefaultPolicyEngine.evaluate()` iterates over all rules in order. Block rules short-circuit. Warn and alert rules accumulate into the verdict. The verdict is returned to the FastAPI dependency, which raises HTTP 403 if `blocking_rule` is set.
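The evaluation loop described above can be sketched in a few lines. This is not the repo's `DefaultPolicyEngine`; the `Rule`, `Context`, and `Verdict` classes below are trimmed-down, hypothetical stand-ins, and only two of the four conditions are handled. The model-scoping logic (a rule applies when its `models` list contains the provider or the model id) is also an assumption based on the YAML shown earlier.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Rule:  # hypothetical shape mirroring the YAML fields
    id: str
    condition: str
    action: str  # "block", "warn", or "alert"
    threshold: float = 0.0
    models: List[str] = field(default_factory=list)


@dataclass
class Context:  # trimmed-down stand-in for GovernanceContext
    provider: str
    model_id: str
    pii_detected: bool
    pii_max_confidence: float
    safety_score: float


@dataclass
class Verdict:
    passed: bool = True
    violated: List[str] = field(default_factory=list)
    blocking_rule: Optional[str] = None


def rule_fires(rule: Rule, ctx: Context) -> bool:
    # A rule with a models list applies only when the provider or model id matches.
    if rule.models and not {ctx.provider, ctx.model_id} & set(rule.models):
        return False
    if rule.condition == "pii_detected":
        return ctx.pii_detected and ctx.pii_max_confidence >= rule.threshold
    if rule.condition == "safety_score_below":
        return ctx.safety_score < rule.threshold
    return False  # cost_exceeds and model_is are omitted from this sketch


def evaluate(rules: List[Rule], ctx: Context) -> Verdict:
    verdict = Verdict()
    for rule in rules:
        if not rule_fires(rule, ctx):
            continue
        verdict.violated.append(rule.id)  # warn and alert rules accumulate
        if rule.action == "block":
            verdict.passed = False
            verdict.blocking_rule = rule.id
            break  # short-circuit: later rules are not evaluated
    return verdict
```

Because block rules stop the loop, rule order matters in this sketch: a warn rule listed after a block rule will not appear in the verdict for a blocked request.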
This is the part that makes everything composable. One line wires the entire governance stack into any endpoint:

    @router.get("/stream")
    async def stream_benchmark(
        verdict: PolicyVerdict = Depends(enforce_governance_policy),
        db: AsyncSession = Depends(get_db),
    ):
        ...

The dependency itself:

    from typing import Annotated

    from fastapi import Depends, HTTPException, Query

    # AsyncSession, get_db, and the underscore-prefixed helpers are imported
    # from elsewhere in the project.


    async def enforce_governance_policy(
        prompt: Annotated[str, Query(min_length=1)],
        provider: Annotated[str, Query(pattern="^(cloud|local)$")] = "cloud",
        db: AsyncSession = Depends(get_db),
    ) -> PolicyVerdict:
        engine = get_policy_engine()
        audit = _get_audit_service()

        # Scan locally with Presidio before anything leaves the process.
        scan = audit.scan_for_pii_details(prompt)
        context = GovernanceContext(
            prompt=prompt,
            provider=provider,
            model_id="gpt-4o" if provider == "cloud" else "llama3.2:latest",
            pii_detected=scan.detected,
            pii_entity_types=[e.entity_type for e in scan.entities],
            pii_max_confidence=scan.max_confidence,
            safety_score=audit.calculate_safety_score(prompt),
            # Rough estimate: whitespace-separated words stand in for tokens.
            estimated_prompt_cost_usd=(len(prompt.split()) * 0.00003)
            if provider == "cloud" else 0.0,
        )

        verdict = engine.evaluate(context)

        # Persist every violation, whether or not it blocks the request.
        for violation in verdict.violated_rules:
            webhook_url = _get_webhook_url(engine, violation.rule_id)
            await _record_violation(db, violation, context, webhook_url)

        if not verdict.passed and verdict.blocking_rule:
            br = verdict.blocking_rule
            raise HTTPException(
                status_code=403,
                detail={
                    "error": "governance_violation",
                    "rule_id": br.rule_id,
                    "rule_name": br.rule_name,
                    "severity": br.severity,
                    "message": br.message,
                },
            )
        return verdict

Every violation, blocked or not, is persisted to `policy_violations` in PostgreSQL before the function returns. Webhook delivery is fire-and-forget via `asyncio.create_task()`, so it never adds latency to the response path.
When a rule fires with a `webhook_url`, a CloudEvents-compatible payload is POSTed:

    {
      "specversion": "1.0",
      "type": "com.governance.policy.violation",
      "source": "llm-governance-engine",
      "id": "uuid",
      "time": "2026-06-19T09:00:00Z",
      "data": {
        "rule_id": "pii-cloud-block",
        "rule_name": "Block PII from cloud models",
        "severity": "critical",
        "action": "block",
        "message": "PII detected (CREDIT_CARD, confidence=0.95) on cloud provider",
        "provider": "cloud",
        "model_id": "gpt-4o"
      }
    }

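Delivery is attempted three times with exponential backoff. Note that Slack, Teams, and PagerDuty incoming webhooks each expect their own body schema (Slack, for instance, requires a `text` or `blocks` field), so in practice `webhook_url` should point at a small relay or adapter that translates this payload for those services.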
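The retry behaviour can be sketched independently of any HTTP client. In the version below, `send` is a caller-supplied coroutine that returns `True` on success, and the one-second base delay is an assumption (the post states only the attempt count and that the backoff is exponential). The `sleep` parameter exists so the function can be exercised without real waiting.

```python
import asyncio
from typing import Awaitable, Callable


async def deliver_with_backoff(
    send: Callable[[], Awaitable[bool]],
    attempts: int = 3,
    base_delay: float = 1.0,  # assumed value, not taken from the repo
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> bool:
    """Call `send` up to `attempts` times, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            if await send():
                return True
        except Exception:
            pass  # treat a transport error like any other failed attempt
        if attempt < attempts - 1:
            # Waits base_delay, then 2 * base_delay, and so on between attempts.
            await sleep(base_delay * (2 ** attempt))
    return False
```

Returning `False` rather than raising after the last attempt suits a fire-and-forget caller, which has nowhere to propagate the error; a real implementation would likely log the failure at that point.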

    git clone https://github.com/sochaty/llm-governance-engine
    cd llm-governance-engine
    git checkout governance-post-1
    cp .env.example .env
    docker compose up

Dashboard: http://localhost:4200
API docs: http://localhost:8000/docs
Pull a local model to enable the side-by-side comparison:

    curl -X POST http://localhost:11434/api/pull -d '{"name":"llama3.2:latest"}'

Trigger your first governance block:
Open the dashboard, type a prompt containing a fake SSN (`My SSN is 123-45-6789`), select the Cloud provider, and hit Run. You will get a red Governance Violation banner instead of a response. The prompt never reached GPT-4o.
Open http://localhost:8000/api/v1/policies/violations to see the audit record of the block.
Every inference, blocked or not, is stored in PostgreSQL:
| Field | Example |
|---|---|
| `prompt` (preview) | "My SSN is 123-45..." |
| `provider` | cloud |
| `model_name` | gpt-4o |
| `pii_detected` | true |
| `safety_score` | 0.12 |
| `latency_ms` | 0 (blocked before model) |
| `estimated_cost` | $0.0000 |
| `version_tag` | openai/gpt-4o |
The Audit Vault page in the dashboard is filterable by prompt, provider, and PII flag. Every row has a "Generate Report" button that exports a PDF, which is useful when a compliance officer asks for evidence.
The orchestrator supports five provider types with a single interface:
| Provider | How it connects |
|---|---|
| OpenAI | `AsyncOpenAI` (native) |
| Groq | AsyncOpenAI(base_url="https://api.groq.com/openai/v1") |
| Google Gemini | AsyncOpenAI(base_url="https://generativelanguage.googleapis.com/v1beta/openai") |
| Anthropic | Lazy import anthropic β separate streaming path |
| Ollama (local) | AsyncOpenAI(base_url="http://ollama-service:11434/v1", api_key="ollama") |
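Since four of the five providers speak the OpenAI wire format, the table above reduces to a small lookup. The sketch below builds the keyword arguments for an OpenAI-compatible client from that table; the provider keys, the `client_kwargs` helper, and the error handling are hypothetical, and the orchestrator in the repo may be organised differently.

```python
from typing import Dict, Optional

# Base URLs copied from the table above; None means the SDK's default endpoint.
OPENAI_COMPATIBLE_BASE_URLS: Dict[str, Optional[str]] = {
    "openai": None,
    "groq": "https://api.groq.com/openai/v1",
    "google": "https://generativelanguage.googleapis.com/v1beta/openai",
    "ollama": "http://ollama-service:11434/v1",
}


def client_kwargs(provider: str, api_key: str) -> Dict[str, str]:
    """Build keyword arguments for an OpenAI-compatible async client."""
    if provider == "anthropic":
        # Anthropic is the one provider with its own SDK and streaming path.
        raise ValueError("anthropic uses its own SDK and streaming path")
    if provider not in OPENAI_COMPATIBLE_BASE_URLS:
        raise ValueError(f"unknown provider: {provider!r}")
    # Ollama ignores the key, but the client still requires a non-empty value.
    kwargs = {"api_key": "ollama" if provider == "ollama" else api_key}
    base_url = OPENAI_COMPATIBLE_BASE_URLS[provider]
    if base_url is not None:
        kwargs["base_url"] = base_url
    return kwargs
```

The result can be passed straight to the client constructor, for example `AsyncOpenAI(**client_kwargs("groq", key))`.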
API keys are stored in PostgreSQL (Fernet-encrypted) and resolved live on every request via `settings_service.get()`. Change a key in the Settings UI and it takes effect on the next request, with no restart needed.
The codebase is production-ready for single-tenant use. The roadmap from here:
`faithfulness_score` populates in the audit log 2 to 3 seconds after the benchmark completes.
The incident that started this, a real customer's credit card number sent to GPT-4o because I forgot to sanitise a test dataset, took about 30 seconds to happen and would have taken weeks to untangle from a compliance perspective.
The fix took a weekend. It should have existed before the first prompt was ever sent.
Full code: github.com/sochaty/llm-governance-engine
Reproduce this post exactly: `git checkout governance-post-1`
PRs and issues welcome. If you build a custom Presidio recogniser for your domain (medical records, legal documents, financial instruments), I would love to include it in the default policy templates.
All my writing lives at blogs.sourishchakraborty.com. Subscribe there for future posts.