
I Built a PII Firewall for LLMs in a Weekend (and Caught My Own Leak)

A developer built an open-source PII firewall for LLMs after accidentally sending a customer's credit card number to OpenAI during a benchmark test. The tool, called LLM Governance Engine, intercepts prompts before they reach the model, using Microsoft's Presidio library to detect and block sensitive data based on configurable YAML policies. It runs locally with Docker and supports actions like blocking, warning, or alerting for GDPR and HIPAA compliance.

Published Jun 19, 2026 · 7 min read

Three weeks ago I was benchmarking GPT-4o against a local Llama model. I was copying prompts from a real support ticket database to make the test realistic. Midway through the run I glanced at the terminal and saw this in the logs:

prompt="Hi, my name is Sarah Johnson, my account number is 4532-1234-5678-9012..."
provider=cloud
model=gpt-4o

A real customer's name. A real credit card number. Already sent to OpenAI.

I had not noticed because the benchmark UI just showed a token count, not the actual prompt content. The PII was in the data. I had forgotten to sanitise it. OpenAI's API terms say they don't train on API data, but that's not the point: the data left my infrastructure. Under GDPR, that's a potential breach.

I spent the rest of that weekend building a firewall so it could never happen again. This post is the full story of what I built, how it works, and how you can run it in one command.

The code is at github.com/sochaty/llm-governance-engine (tag governance-post-1).

Every LLM observability tool I have used (LangSmith, Helicone, Arize Phoenix) works the same way: it records what happened after the fact. You get a dashboard, a trace, a cost breakdown. None of them stop the request.

That distinction matters enormously under GDPR, HIPAA, and the EU AI Act. "We logged that PII was sent" is not a compliance posture. "PII was blocked before it left the building" is.

By the end of this post you will have:

Everything runs with docker compose up.

The key insight is where enforcement happens: before the model call, not after.

User Prompt
    │
    ▼
FastAPI /benchmark/stream
    │
    ├── enforce_governance_policy()   ← Presidio scan + policy evaluation
    │       │
    │       ├── PII detected + cloud model → HTTP 403 (prompt never sent)
    │       ├── Safety score low → warn + log + continue
    │       └── All rules pass → verdict returned to endpoint
    │
    ├── LLMOrchestrator.get_streaming_response()
    │       │
    │       ├── OpenAI / Groq / Google / Anthropic (cloud)
    │       └── Ollama (local)
    │
    └── AuditService → PostgreSQL

The enforce_governance_policy function is a FastAPI Depends(), injected into the streaming endpoint. If a blocking rule fires, it raises HTTP 403 before the orchestrator is even called. The prompt never touches the wire.

The entire governance model is a YAML file. No code changes, no restarts: edit the file, POST /api/v1/policies/reload, and the rules are live.

version: "1.0"
name: "default"

rules:
  - id: pii-cloud-block
    name: "Block PII from cloud models"
    condition: pii_detected
    threshold: 0.7          # Presidio confidence ≥ 0.7 triggers this rule
    models: [cloud, gpt-4o]
    action: block           # returns HTTP 403
    severity: critical
    webhook_url: null       # set to your Slack URL to get alerted

  - id: low-safety-warn
    name: "Warn on low safety score"
    condition: safety_score_below
    threshold: 0.5
    action: warn            # logs + audits, passes through
    severity: medium

  - id: pii-local-alert
    name: "Alert on PII sent to local models"
    condition: pii_detected
    threshold: 0.85
    models: [local]
    action: alert           # fires webhook, does not block
    severity: high

Four conditions: pii_detected, safety_score_below, cost_exceeds, model_is.

Three actions: block (HTTP 403), warn (audit + continue), alert (webhook + continue).

Starter templates are shipped in the repo for GDPR (policies/gdpr.yaml) and HIPAA (policies/hipaa.yaml).
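To make the rule fields concrete, here is a minimal sketch of how one rule from the YAML above might be held in memory and matched against a request. The names `Rule` and `rule_applies` are hypothetical and not taken from the repo, and the behaviour for a rule without a `models` list (apply to every model) is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical in-memory form of one YAML rule; field names mirror the policy file.
@dataclass
class Rule:
    id: str
    condition: str   # e.g. "pii_detected" or "safety_score_below"
    threshold: float
    action: str      # "block", "warn", or "alert"
    severity: str
    models: List[str] = field(default_factory=list)  # empty = every model (assumption)


def rule_applies(rule: Rule, provider: str, model_id: str) -> bool:
    """Return True when the rule targets this provider or this model."""
    if not rule.models:
        return True
    return provider in rule.models or model_id in rule.models


pii_cloud_block = Rule(
    id="pii-cloud-block", condition="pii_detected", threshold=0.7,
    action="block", severity="critical", models=["cloud", "gpt-4o"],
)
low_safety_warn = Rule(
    id="low-safety-warn", condition="safety_score_below", threshold=0.5,
    action="warn", severity="medium",
)

print(rule_applies(pii_cloud_block, "cloud", "gpt-4o"))           # True
print(rule_applies(pii_cloud_block, "local", "llama3.2:latest"))  # False
print(rule_applies(low_safety_warn, "local", "llama3.2:latest"))  # True
```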

Presidio is Microsoft's open-source PII detection library. It runs locally: no API call, no data leaving your machine.

It detects 50+ entity types out of the box: PERSON, EMAIL_ADDRESS, CREDIT_CARD, US_SSN, PHONE_NUMBER, IBAN_CODE, IP_ADDRESS, and more. It uses a combination of regex patterns, checksums, and a spaCy NLP model for name recognition.
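The "regex plus checksum" idea is easiest to see with card numbers. The sketch below is not Presidio's code: it pairs a loose digit pattern with the Luhn checksum, the standard check for payment card numbers, so that random 16-digit strings are not flagged. `CARD_PATTERN`, `luhn_valid`, and `find_card_numbers` are illustrative names.

```python
import re

# Loose pattern: 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    # Walk from the rightmost digit and double every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return len(digits) > 0 and total % 10 == 0


def find_card_numbers(text: str) -> list:
    """Return pattern matches that also pass the checksum."""
    return [m.group() for m in CARD_PATTERN.finditer(text) if luhn_valid(m.group())]


print(find_card_numbers("card 4111-1111-1111-1111 on file"))  # ['4111-1111-1111-1111']
print(find_card_numbers("ref 4111-1111-1111-1112 on file"))   # []
```

The checksum step is what lets a detector report a high confidence for CREDIT_CARD: a pattern match alone says little, a pattern match that also passes Luhn says a lot.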

The scan returns a confidence score per entity. The policy engine compares that score against the rule's threshold. An entity with 0.95 confidence on CREDIT_CARD and a threshold of 0.7 triggers the pii-cloud-block rule.
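The comparison itself is small enough to show in full. This is a sketch under the assumption that the engine compares the highest entity confidence against the threshold with `>=`, as the YAML comment suggests; `pii_rule_triggered` is a hypothetical helper.

```python
from typing import List, Tuple


def pii_rule_triggered(entities: List[Tuple[str, float]], threshold: float) -> bool:
    """Compare the highest entity confidence against a rule threshold."""
    max_confidence = max((score for _, score in entities), default=0.0)
    return max_confidence >= threshold


# The example from the text: 0.95 on CREDIT_CARD against the 0.7 cloud threshold.
print(pii_rule_triggered([("CREDIT_CARD", 0.95)], 0.7))  # True
# A weaker PERSON match does not reach the stricter 0.85 local threshold.
print(pii_rule_triggered([("PERSON", 0.8)], 0.85))       # False
# No entities at all never triggers.
print(pii_rule_triggered([], 0.7))                       # False
```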

from presidio_analyzer import AnalyzerEngine

class AuditService:
    def __init__(self):
        self.analyzer = AnalyzerEngine()

    def scan_for_pii_details(self, text: str) -> ScanResult:
        results = self.analyzer.analyze(text=text, language="en")
        detected = len(results) > 0
        entities = [
            EntityResult(
                entity_type=r.entity_type,
                confidence=r.score,
                start=r.start,
                end=r.end,
            )
            for r in results
        ]
        max_confidence = max((r.score for r in results), default=0.0)
        return ScanResult(
            detected=detected,
            entities=entities,
            max_confidence=max_confidence,
        )

The safety score is calculated separately. It is a 0.0 to 1.0 measure that combines PII confidence, entity density, and sensitive keyword presence. A score below 0.5 triggers the low-safety-warn rule.

The engine follows a Chain of Responsibility pattern. Each rule evaluates the GovernanceContext independently:

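from dataclasses import dataclass
from typing import List, Optional

from pydantic import BaseModel

# ViolatedRule is another model from the same codebase (not shown in this excerpt).

@dataclass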
class GovernanceContext:
    prompt: str
    provider: str
    model_id: str
    pii_detected: bool
    pii_entity_types: List[str]
    pii_max_confidence: float
    safety_score: float
    estimated_prompt_cost_usd: float

class PolicyVerdict(BaseModel):
    passed: bool
    violated_rules: List[ViolatedRule] = []
    blocking_rule: Optional[ViolatedRule] = None
    warnings: List[str] = []

The DefaultPolicyEngine.evaluate() iterates all rules in order. Block rules short-circuit. Warn and alert rules accumulate into the verdict. The verdict is returned to the FastAPI dependency, which raises HTTP 403 if blocking_rule is set.
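The evaluation loop described above can be sketched with plain dataclasses. This is not the repo's DefaultPolicyEngine: `Rule`, `Context`, `Verdict`, and `rule_fires` are simplified stand-ins, only two conditions are handled, and leaving `passed` as True when only warn or alert rules fire is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Rule:
    id: str
    condition: str   # "pii_detected" or "safety_score_below" in this sketch
    threshold: float
    action: str      # "block", "warn", or "alert"


@dataclass
class Context:
    pii_detected: bool
    pii_max_confidence: float
    safety_score: float


@dataclass
class Verdict:
    passed: bool = True
    violated_rules: List[str] = field(default_factory=list)
    blocking_rule: Optional[str] = None


def rule_fires(rule: Rule, ctx: Context) -> bool:
    if rule.condition == "pii_detected":
        return ctx.pii_detected and ctx.pii_max_confidence >= rule.threshold
    if rule.condition == "safety_score_below":
        return ctx.safety_score < rule.threshold
    return False  # unknown conditions never fire in this sketch


def evaluate(rules: List[Rule], ctx: Context) -> Verdict:
    verdict = Verdict()
    for rule in rules:
        if not rule_fires(rule, ctx):
            continue
        verdict.violated_rules.append(rule.id)  # warn/alert accumulate here
        if rule.action == "block":
            verdict.passed = False
            verdict.blocking_rule = rule.id
            break  # block rules short-circuit; later rules are skipped
    return verdict


rules = [
    Rule("low-safety-warn", "safety_score_below", 0.5, "warn"),
    Rule("pii-cloud-block", "pii_detected", 0.7, "block"),
    Rule("never-reached", "safety_score_below", 0.9, "warn"),
]
blocked = evaluate(rules, Context(True, 0.95, 0.12))
allowed = evaluate(rules, Context(False, 0.0, 0.95))
print(blocked.blocking_rule, blocked.violated_rules)
print(allowed.passed, allowed.violated_rules)
```

Rule order matters in this shape: anything listed after a block rule is never recorded when the block fires, which is worth keeping in mind when ordering the YAML.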

This is the part that makes everything composable. One line wires the entire governance stack into any endpoint:

@router.get("/stream")
async def stream_benchmark(
    verdict: PolicyVerdict = Depends(enforce_governance_policy),
    db: AsyncSession = Depends(get_db),
):
    ...

The dependency itself:

async def enforce_governance_policy(
    prompt: Annotated[str, Query(min_length=1)],
    provider: Annotated[str, Query(pattern="^(cloud|local)$")] = "cloud",
    db: AsyncSession = Depends(get_db),
) -> PolicyVerdict:
    engine = get_policy_engine()
    audit = _get_audit_service()

    scan = audit.scan_for_pii_details(prompt)

    context = GovernanceContext(
        prompt=prompt,
        provider=provider,
        model_id="gpt-4o" if provider == "cloud" else "llama3.2:latest",
        pii_detected=scan.detected,
        pii_entity_types=[e.entity_type for e in scan.entities],
        pii_max_confidence=scan.max_confidence,
        safety_score=audit.calculate_safety_score(prompt),
        estimated_prompt_cost_usd=(len(prompt.split()) * 0.00003)
        if provider == "cloud" else 0.0,
    )

    verdict = engine.evaluate(context)

    for violation in verdict.violated_rules:
        webhook_url = _get_webhook_url(engine, violation.rule_id)
        await _record_violation(db, violation, context, webhook_url)

    if not verdict.passed and verdict.blocking_rule:
        br = verdict.blocking_rule
        raise HTTPException(
            status_code=403,
            detail={
                "error": "governance_violation",
                "rule_id": br.rule_id,
                "rule_name": br.rule_name,
                "severity": br.severity,
                "message": br.message,
            },
        )

    return verdict

Every violation, blocked or not, is persisted to policy_violations in PostgreSQL before the function returns. Webhook delivery is fire-and-forget via asyncio.create_task(), so it never adds latency to the response path.
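A minimal sketch of the fire-and-forget pattern with asyncio.create_task(), assuming a stand-in `deliver_webhook` coroutine that sleeps instead of making an HTTP call. The helper names and the example URL are hypothetical. The one detail worth copying is the set of strong references: the asyncio documentation notes that the event loop keeps only weak references to tasks, so an unreferenced task can be garbage-collected before it finishes.

```python
import asyncio

# Strong references keep background tasks alive until they complete.
_background_tasks = set()


async def deliver_webhook(url: str, payload: dict, delivered: list) -> None:
    # Stand-in for an HTTP POST; a real version would use an async HTTP client.
    await asyncio.sleep(0.05)
    delivered.append((url, payload["rule_id"]))


def fire_and_forget(coro) -> asyncio.Task:
    task = asyncio.create_task(coro)
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
    return task


async def handle_request(delivered: list) -> str:
    fire_and_forget(
        deliver_webhook("https://example.invalid/hook", {"rule_id": "pii-cloud-block"}, delivered)
    )
    return "403 governance_violation"  # returned before the webhook finishes


async def main() -> tuple:
    delivered = []
    response = await handle_request(delivered)
    delivered_at_response = len(delivered)    # still 0: delivery has not run yet
    await asyncio.gather(*_background_tasks)  # only so the demo can observe delivery
    return response, delivered_at_response, delivered


result = asyncio.run(main())
print(result)
```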

When a rule fires with a webhook_url, a CloudEvents-compatible payload is POSTed:

{
  "specversion": "1.0",
  "type": "com.governance.policy.violation",
  "source": "llm-governance-engine",
  "id": "uuid",
  "time": "2026-06-19T09:00:00Z",
  "data": {
    "rule_id": "pii-cloud-block",
    "rule_name": "Block PII from cloud models",
    "severity": "critical",
    "action": "block",
    "message": "PII detected (CREDIT_CARD, confidence=0.95) on cloud provider",
    "provider": "cloud",
    "model_id": "gpt-4o"
  }
}
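A payload of this shape can be assembled with the standard library alone. The sketch below assumes a hypothetical `build_violation_event` helper; the field values come from the example above, and `id`, `source`, `specversion`, and `type` are the attributes CloudEvents 1.0 requires.

```python
import json
import uuid
from datetime import datetime, timezone


def build_violation_event(rule_id: str, rule_name: str, severity: str, action: str,
                          message: str, provider: str, model_id: str) -> dict:
    """Build a CloudEvents-style envelope around one policy violation."""
    return {
        "specversion": "1.0",
        "type": "com.governance.policy.violation",
        "source": "llm-governance-engine",
        "id": str(uuid.uuid4()),  # unique per event
        "time": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "data": {
            "rule_id": rule_id,
            "rule_name": rule_name,
            "severity": severity,
            "action": action,
            "message": message,
            "provider": provider,
            "model_id": model_id,
        },
    }


event = build_violation_event(
    rule_id="pii-cloud-block",
    rule_name="Block PII from cloud models",
    severity="critical",
    action="block",
    message="PII detected (CREDIT_CARD, confidence=0.95) on cloud provider",
    provider="cloud",
    model_id="gpt-4o",
)
print(json.dumps(event, indent=2))
```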

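Delivery is attempted three times with exponential backoff. The payload is plain JSON over HTTP POST, so any endpoint that accepts arbitrary JSON can consume it directly. Slack, Teams, and PagerDuty webhooks each expect their own message schema (Slack incoming webhooks, for instance, require a text or blocks field), so check whether your target needs a small adapter in front of it.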
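Three attempts with exponential backoff can be sketched as below. This is a synchronous illustration rather than the repo's delivery code: `deliver_with_retry`, the 1-second base delay, and the injectable `sleep` parameter (there so the logic can be exercised without waiting) are all assumptions.

```python
import time
from typing import Callable


def deliver_with_retry(send: Callable[[], bool], attempts: int = 3,
                       base_delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call `send` up to `attempts` times, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            if send():
                return True
        except Exception:
            pass  # treat any transport error as a failed attempt
        if attempt < attempts - 1:
            sleep(base_delay * (2 ** attempt))  # waits 1s, then 2s, with the defaults
    return False


# Demo: fail twice, succeed on the third attempt, recording delays instead of sleeping.
outcomes = iter([False, False, True])
delays = []
ok = deliver_with_retry(lambda: next(outcomes), sleep=delays.append)
print(ok, delays)  # True [1.0, 2.0]
```

There is no sleep after the final attempt, so a permanently failing endpoint costs three calls and two waits.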

git clone https://github.com/sochaty/llm-governance-engine
cd llm-governance-engine
git checkout governance-post-1
cp .env.example .env
docker compose up

Dashboard → http://localhost:4200

API docs → http://localhost:8000/docs

Pull a local model to enable the side-by-side comparison:

curl -X POST http://localhost:11434/api/pull -d '{"name":"llama3.2:latest"}'

Trigger your first governance block:

Open the dashboard, type a prompt containing a fake SSN (My SSN is 123-45-6789), select the Cloud provider and hit Run. You will get a red Governance Violation banner instead of a response. The prompt never reached GPT-4o.

Open http://localhost:8000/api/v1/policies/violations to see the audit record of the block.

Every inference, blocked or not, is stored in PostgreSQL:

Field | Example
prompt (preview) | "My SSN is 123-45..."
provider | cloud
model_name | gpt-4o
pii_detected | true
safety_score | 0.12
latency_ms | 0 (blocked before model)
estimated_cost | $0.0000
version_tag | openai/gpt-4o

The Audit Vault page in the dashboard is filterable by prompt, provider, and PII flag. Every row has a "Generate Report" button that exports a PDF, which is useful when a compliance officer asks for evidence.

The orchestrator supports five provider types with a single interface:

Provider | How it connects
OpenAI | AsyncOpenAI (native)
Groq | AsyncOpenAI(base_url="https://api.groq.com/openai/v1")
Google Gemini | AsyncOpenAI(base_url="https://generativelanguage.googleapis.com/v1beta/openai")
Anthropic | Lazy import of anthropic, separate streaming path
Ollama (local) | AsyncOpenAI(base_url="http://ollama-service:11434/v1", api_key="ollama")
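Because four of the five providers speak the OpenAI wire format, the "single interface" largely comes down to choosing a base URL. The sketch below assumes a hypothetical `client_kwargs` helper that returns the keyword arguments one would pass to AsyncOpenAI(...). It does not import the openai package, and it leaves Anthropic out because the table gives it a separate path.

```python
from typing import Optional

# Base URLs from the table above; None means "use the client's default endpoint".
OPENAI_COMPATIBLE = {
    "openai": None,
    "groq": "https://api.groq.com/openai/v1",
    "google": "https://generativelanguage.googleapis.com/v1beta/openai",
    "ollama": "http://ollama-service:11434/v1",
}


def client_kwargs(provider: str, api_key: Optional[str] = None) -> dict:
    """Return keyword arguments for an OpenAI-compatible client."""
    if provider not in OPENAI_COMPATIBLE:
        raise ValueError(f"{provider!r} is not OpenAI-compatible in this sketch")
    # Ollama ignores the key, but the client requires a non-empty value.
    kwargs = {"api_key": "ollama" if provider == "ollama" else api_key}
    base_url = OPENAI_COMPATIBLE[provider]
    if base_url is not None:
        kwargs["base_url"] = base_url
    return kwargs


print(client_kwargs("groq", "gsk-example"))
print(client_kwargs("ollama"))
```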

API keys are stored in PostgreSQL (Fernet-encrypted) and resolved live on every request via settings_service.get(). Change a key in the Settings UI and it takes effect on the next request, with no restart needed.

The codebase is production-ready for single-tenant use. The roadmap from here:

faithfulness_score populates in the audit log 2 to 3 seconds after the benchmark completes.

The incident that started this, a real customer's credit card number sent to GPT-4o because I forgot to sanitise a test dataset, took about 30 seconds to happen and would have taken weeks to untangle from a compliance perspective.

The fix took a weekend. It should have existed before the first prompt was ever sent.

Full code: github.com/sochaty/llm-governance-engine

Reproduce this post exactly: git checkout governance-post-1

PRs and issues welcome. If you build a custom Presidio recogniser for your domain (medical records, legal documents, financial instruments), I would love to include it in the default policy templates.

All my writing lives at blogs.sourishchakraborty.com. Subscribe there for future posts.
