Tiny prompt-injection firewall for LLM chat apps. ~14 MB. CPU-only. Drop-in guard between your user input and your LLM: runs on the same box, no GPU, no API, no extra service.
Built by the SecureLayer7 red team. Most OSS guardrails are hundreds of MB, want a GPU, and still miss the attacks we see in production. We needed something we could ship inside our own AI products and our customers' apps without any of that.
| | promptpurify | typical OSS guardrail |
|---|---|---|
| Install size | ~14 MB ONNX | 180 MB to 7 GB |
| Inference | CPU, single-digit ms | GPU recommended |
| Where it runs | In your Node process | Sidecar or hosted API |
| Cost per call | $0 | $ or GPU compute |
Benchmark comparison vs OSS baselines: docs/BENCHMARKS.md.
```sh
npm i promptpurify
npm i onnxruntime-node
curl -L -o promptpurify-model.tar.gz \
  https://github.com/securelayer7/PROMPTPurify/releases/download/v0.0.1/promptpurify-model.tar.gz
curl -L -o promptpurify-model.tar.gz.sha256 \
  https://github.com/securelayer7/PROMPTPurify/releases/download/v0.0.1/promptpurify-model.tar.gz.sha256
sha256sum -c promptpurify-model.tar.gz.sha256   # MUST print "OK"
tar xzf promptpurify-model.tar.gz               # creates models/l5e/
```
The model isn't in the npm tarball, so the SDK stays tiny for people who only want the structural firewall (browser, edge, RAG). Full distribution options: docs/SAMPLE-DATA.md.
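If `sha256sum` is not available (for example on Windows or in a minimal container), the checksum step above can be approximated from Node itself. This is a minimal sketch, not part of promptpurify: `sha256Hex`, `parseChecksumLine`, and `verifyArchive` are hypothetical helper names, and it assumes the `.sha256` file holds a single line in the usual `sha256sum` format (`<digest>  <filename>`).

```javascript
import { createHash } from "node:crypto";
import { readFileSync } from "node:fs";

// Hex-encoded SHA-256 digest of a Buffer or string.
function sha256Hex(data) {
  return createHash("sha256").update(data).digest("hex");
}

// Parse one line of sha256sum output: "<64 hex chars>  <filename>".
// A leading "*" before the filename (binary mode) is tolerated.
function parseChecksumLine(line) {
  const match = /^([0-9a-fA-F]{64})\s+\*?(.+)$/.exec(line.trim());
  if (!match) throw new Error("Unrecognized checksum line");
  return { digest: match[1].toLowerCase(), file: match[2] };
}

// Hypothetical helper: true only if the archive matches the recorded digest.
// Assumes the checksum file contains exactly one entry.
function verifyArchive(archivePath, checksumPath) {
  const { digest } = parseChecksumLine(readFileSync(checksumPath, "utf8"));
  return sha256Hex(readFileSync(archivePath)) === digest;
}
```

Calling `verifyArchive("promptpurify-model.tar.gz", "promptpurify-model.tar.gz.sha256")` before extracting gives the same yes/no answer as `sha256sum -c`; treat a `false` result as a reason to stop, not to retry blindly.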
```js
import { createL5eRunner } from "promptpurify/l5";

const guard = await createL5eRunner();

// In your /chat handler.
// refusal(), flagForReview(), and yourLLM are placeholders for your own code.
async function handleChat(userMessage) {
  const score = await guard.score(userMessage);
  if (score >= 0.95) return refusal();             // hard block
  if (score >= 0.85) flagForReview(userMessage);   // advisory
  return yourLLM.complete(userMessage);            // pass through
}
```
Works with Groq, OpenAI, Anthropic, vLLM, and local LLMs: promptpurify never talks to your LLM, only to your input.
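Because the guard only sees the input, the block / review / allow decision can be kept separate from any particular LLM client. The sketch below is one way to do that, using the 0.95 and 0.85 thresholds from the snippet above. `decide` and `guardedComplete` are hypothetical names; the sketch assumes `guard.score()` resolves to a number (higher meaning more likely an injection), and treating a non-numeric score as a block is a fail-closed choice of this sketch, not documented promptpurify behavior.

```javascript
// Thresholds taken from the snippet above; tune them for your own traffic.
const BLOCK_THRESHOLD = 0.95;
const REVIEW_THRESHOLD = 0.85;

// Map a guard score to one of three actions.
function decide(score) {
  // Assumption: fail closed if the score is missing or not a number.
  if (typeof score !== "number" || Number.isNaN(score)) return "block";
  if (score >= BLOCK_THRESHOLD) return "block";
  if (score >= REVIEW_THRESHOLD) return "review";
  return "allow";
}

// Hypothetical wrapper. `guard` needs an async score(text) method,
// `llm` an async complete(text) method; `onReview` is your own hook.
async function guardedComplete(guard, llm, userMessage, onReview = () => {}) {
  const score = await guard.score(userMessage);
  const action = decide(score);
  if (action === "block") return { action, score, reply: null };
  if (action === "review") onReview(userMessage, score);
  return { action, score, reply: await llm.complete(userMessage) };
}
```

Passing `guard` and `llm` in as arguments keeps the wrapper provider-agnostic and makes it easy to unit-test with stubs.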
For the deterministic structural firewall (Unicode neutralization, role-fenced messages, output exfil guard) see docs/QUICKSTART.md.
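To give a feel for what "Unicode neutralization" generally means, here is a generic sketch. It is not promptpurify's implementation (see docs/QUICKSTART.md for the real API); `neutralizeUnicode` and the exact character ranges are assumptions chosen for illustration. It strips invisible characters that can hide instructions from a human reviewer, then applies NFKC normalization so look-alike compatibility forms collapse to their plain equivalents.

```javascript
// Zero-width characters, bidi controls, invisible operators, the BOM,
// and Unicode tag characters (U+E0000..U+E007F).
// Illustrative ranges only, not an exhaustive list.
const INVISIBLE =
  /[\u200B-\u200F\u202A-\u202E\u2060-\u2064\u2066-\u2069\uFEFF\u{E0000}-\u{E007F}]/gu;

// Hypothetical helper: remove invisible characters, then fold
// compatibility forms (e.g. fullwidth letters) with NFKC.
function neutralizeUnicode(text) {
  return text.replace(INVISIBLE, "").normalize("NFKC");
}
```

A step like this is deterministic and cheap, which is why it usually runs before any model-based scoring.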
We built our model from random initialization because no existing OSS guardrail gave us the size / latency tradeoff we wanted to ship in our own products.
- From-scratch. No teacher weights from any vendor classifier are redistributed.
- Benchmarked against public datasets for direct comparison with OSS baselines (ProtectAI v2, deepset, fmops, Meta Prompt-Guard-2). Held-out evaluation; false positives reported alongside recall.
- MIT-licensed weights. Use in production, paid or free.
Full architecture overview: docs/HOW-IT-WORKS.md.
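Reporting false positives alongside recall matters because a guard that blocks everything has perfect recall and is useless. As a rough sketch of how the two numbers relate to a threshold, the function below computes both from labeled scores. `evaluateAtThreshold` and the sample shape are hypothetical and are not the project's benchmark harness; see docs/BENCHMARKS.md and docs/REPRODUCE.md for that.

```javascript
// samples: array of { score: number, isAttack: boolean }
// Returns recall (share of attacks flagged) and false-positive rate
// (share of benign inputs flagged) at the given threshold.
function evaluateAtThreshold(samples, threshold) {
  let tp = 0, fn = 0, fp = 0, tn = 0;
  for (const { score, isAttack } of samples) {
    const flagged = score >= threshold;
    if (isAttack) {
      if (flagged) tp++; else fn++;
    } else {
      if (flagged) fp++; else tn++;
    }
  }
  return {
    recall: tp + fn === 0 ? NaN : tp / (tp + fn),
    falsePositiveRate: fp + tn === 0 ? NaN : fp / (fp + tn),
  };
}
```

Sweeping `threshold` over your own held-out traffic is a reasonable way to check whether 0.95 and 0.85 suit your application.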
We run a live adversarial challenge at **anton.securelayer7.net**. Ask Son of Anton for the password. If you can get it past the guard, tell us how.
A fintech customer-support chatbot wired up with promptpurify, ready to run locally:
```sh
cd examples/customer-support && npm install
GROQ_API_KEY=gsk_... node server.mjs
```
See examples/customer-support/README.md.
- docs/QUICKSTART.md: install paths, structural firewall, browser bundle, integration patterns.
- docs/HOW-IT-WORKS.md: the layers, what each catches.
- docs/BENCHMARKS.md: comparison with OSS baselines, methodology.
- docs/SAMPLE-DATA.md: what ships in the repo for benchmarking.
- docs/REPRODUCE.md: run the bench yourself.
- docs/HONEST-LIMITS.md: what to pair promptpurify with for full coverage.
- Not a guarantee. There is no `.safe` boolean.
- Not a content classifier. Catches prompt injection, not toxicity / CSAM / hate. Pair with a content filter.
- Not a multi-turn auditor. Pair with conversation-level monitoring.
The name and the design philosophy are inspired by DOMPurify by Cure53: the same idea, applied to LLM prompts instead of HTML. Thanks to Mario Heiderich for suggesting the name.
MIT for the SDK and the model weights. Benchmark sources we evaluate against are listed in training/CORPUS_LICENSES.json.
Security disclosures: SECURITY.md.