The open standard AI gateway for production systems.
PII protection · audit trails · reliability · cost control, all as composable interceptors. Same API in Python, Java, and JavaScript.
Docs: manojmallick.github.io/gavio
Gavio sits between your application and any LLM provider. Every request passes through a pre/post interceptor chain (PII redaction, retries, cost tracking, audit logging) before and after the provider call:
Request → [ PII Guard · Secret Scanner · … ] → Provider → [ … · PII Restore · Audit ] → Response
Every team re-implements the same production concerns around LLM calls: redact PII before it leaves the building, retry on 429s, fall back to a second provider, log an audit trail, track spend. Gavio ships them once, as swappable interceptors, with identical behaviour across three languages, enforced by shared test vectors.
- **Provider-agnostic**: OpenAI, Anthropic, Gemini, Azure, Ollama, Mock. Switching is a config change.
- **Zero mandatory dependencies** in every core (stdlib HTTP everywhere, no vendor SDKs).
- **Dev mode**: the whole stack runs in-process with a mock provider. No API key, no network.
- **Audit by default**: every call is logged as metadata + SHA-256 content hashes (never raw text).
- **Inspector**: opt-in dev-time visualizer with live traces, per-interceptor waterfall, PII redaction diffs, and pipeline lints at `http://127.0.0.1:7411` (`inspect(true)` or `GAVIO_INSPECT=1`).
- **Inspector agentic & production mode**: multi-agent call graphs and session views, trace replay & edit-resend (full mode only), RED stats, hash-chain verification, PII-sanitized export of any trace as a test case, and a read-only dashboard over a persisted audit store: `gavio inspect --store audit.jsonl`.
**Status:** v0.9.0 (Embedding call guard, `F-SEC-10`). Semver stability has held since v0.2.0; pre-1.0, some APIs may still change. See the [CHANGELOG].
Gavio is a thin core (`Gateway`, `InterceptorChain`, and the request/response model) that everything else plugs into. A request flows through a pre pipeline, hits a provider adapter, then flows back through a post pipeline in reverse order:

              ┌─────────────────────── Gateway.complete(request) ────────────────────────┐
              │                                                                          │
     request  │ PRE ─▶ PiiGuard ─▶ SecretScanner ─▶ PromptInjectionGuard ─▶ RateLimiter  │
     ──────▶  │     ─▶ CostControl ─▶ CostRouter ─▶ SemanticCache ──┐                    │
              │                                                     │ (cache miss)       │
              │                                         ┌───────────▼───────────┐        │
              │                                         │   Provider Adapter    │        │
              │                                         │ OpenAI · Anthropic ·  │        │
              │                                         │ Gemini · Azure ·      │        │
              │                                         │ Ollama · Mock         │        │
              │                                         └───────────┬───────────┘        │
              │        Guardrails ◀─ RiskScorer ◀─ PiiRestore ◀─────┘                    │
     ◀──────  │ POST ◀─ Metrics ◀─ AuditInterceptor (hash-chained record)                │ response
              │                                                                          │
              └──────────────────────────────────────────────────────────────────────────┘

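
The pre/post flow described above can be sketched in a few lines. This is a toy model, not Gavio's implementation: `Call`, `Tag`, and `run_chain` are hypothetical names, and the only behaviour taken from the text is that pre hooks run in declared order and post hooks run in reverse.

```python
from dataclasses import dataclass, field


@dataclass
class Call:
    """Hypothetical stand-in for a request/response pair."""
    prompt: str
    reply: str = ""
    log: list = field(default_factory=list)


class Tag:
    """Toy interceptor that only records when its hooks fire."""

    def __init__(self, name):
        self.name = name

    def before(self, call):
        call.log.append(f"pre:{self.name}")

    def after(self, call):
        call.log.append(f"post:{self.name}")


def run_chain(interceptors, provider, call):
    # Pre pipeline: hooks fire in the order the interceptors were declared.
    for interceptor in interceptors:
        interceptor.before(call)
    call.reply = provider(call.prompt)
    call.log.append("provider")
    # Post pipeline: reverse order, so the first interceptor wraps all others.
    for interceptor in reversed(interceptors):
        interceptor.after(call)
    return call


def demo_chain():
    chain = [Tag("pii"), Tag("secrets"), Tag("rate")]
    return run_chain(chain, lambda prompt: prompt.upper(), Call("hello"))
```

Calling `demo_chain().log` shows the nesting: the interceptor that ran first on the way in runs last on the way out.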
- **Interceptors** implement `before()` / `after()` / `onError()`. Order is explicit: PII redaction runs before audit; audit runs last so it records what every other interceptor did. See docs/architecture.md.
- **Executor policies** (cache, retry, circuit breaker, load balancer, fallback) wrap the provider call itself; a cache hit or an open circuit short-circuits the provider entirely.
- **The audit record is metadata-only.** Prompts and responses are stored as SHA-256 hashes, never raw text; PII entity *types and counts* are logged, never values. Records are hash-chained (`F-OBS-02`) so any tampering is detectable.
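
To make the tamper-evidence claim concrete, here is a minimal sketch of a hash-chained, metadata-only log. It is an illustration under assumptions, not Gavio's record format: `record_hash`, `GENESIS`, and the JSON canonicalisation are invented for the example; only `prompt_hash`, `response_hash`, and `previous_hash` come from the data model.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for the first record's previous_hash


def sha256_hex(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def make_record(prompt, response, previous_hash):
    """Build a metadata-only record: content hashes, never raw text."""
    body = {
        "prompt_hash": sha256_hex(prompt),
        "response_hash": sha256_hex(response),
        "previous_hash": previous_hash,
    }
    # The record's own hash covers its fields, including the backward link.
    body["record_hash"] = sha256_hex(json.dumps(body, sort_keys=True))
    return body


def build_chain(pairs):
    records, prev = [], GENESIS
    for prompt, response in pairs:
        rec = make_record(prompt, response, prev)
        records.append(rec)
        prev = rec["record_hash"]
    return records


def verify_chain(records):
    """Return True if every record is unmodified and links to its predecessor."""
    prev = GENESIS
    for rec in records:
        fields = {k: v for k, v in rec.items() if k != "record_hash"}
        if rec["previous_hash"] != prev:
            return False
        if sha256_hex(json.dumps(fields, sort_keys=True)) != rec["record_hash"]:
            return False
        prev = rec["record_hash"]
    return True
```

Editing any field of an earlier record changes its hash, which breaks the `previous_hash` link of every record after it, so a single pass over the log is enough to detect the change.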
Core data model: identical fields across all three SDKs, defined once in `spec/` as JSON Schema and enforced by the shared test vectors.

| `GavioRequest` | `GavioResponse` | `AuditRecord` |
|---|---|---|
| `trace_id` (UUID v7) | `trace_id` | `trace_id` · `parent_trace_id` |
| `agent_id` · `parent_trace_id` | `content` (PII restored) | `prompt_hash` · `response_hash` |
| `messages` · `model` · `provider` | `usage` · `cost_usd` · `latency_ms` | `pii_entity_types` · `risk_score` |
| `options` · `lineage` · `metadata` | `cache_hit` · `cache_type` | `previous_hash` · `lineage` · `schema_version` |
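
The `trace_id` field is specified as UUID v7, which places a 48-bit Unix millisecond timestamp in the most significant bits so that IDs sort by creation time. The sketch below builds one by hand from the layout in RFC 9562; the function name and its `now_ms` / `rand` parameters are assumptions for illustration, not part of any Gavio SDK.

```python
import os
import time
import uuid


def uuid7(now_ms=None, rand=None):
    """Build a UUID v7: 48-bit ms timestamp, version, variant, random tail."""
    ts = int(time.time() * 1000) if now_ms is None else now_ms
    tail = os.urandom(10) if rand is None else rand  # 80 bits of randomness

    value = (ts & ((1 << 48) - 1)) << 80
    value |= int.from_bytes(tail, "big")

    # Version nibble (bits 76-79) is forced to 7.
    value &= ~(0xF << 76)
    value |= 0x7 << 76
    # Variant (bits 62-63) is forced to binary 10, the RFC variant.
    value &= ~(0x3 << 62)
    value |= 0x2 << 62
    return uuid.UUID(int=value)
```

Because the timestamp occupies the top bits, comparing two such IDs created in different milliseconds orders them chronologically, which is convenient for trace storage and range scans.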
**Python**

    from gavio import Gateway
    from gavio.interceptors.pii import PiiGuard

    gw = (Gateway.builder()
          .dev_mode(True)
          .use(PiiGuard())
          .build())

    # Uses top-level await: run inside an async function or `python -m asyncio`.
    r = await gw.complete(messages=[
        {"role": "user",
         "content": "mail jan@example.com"}])
    print(r.content)  # PII restored
    print(r.audit.pii_entity_types)

**JavaScript / TypeScript**

    import { Gateway } from 'gavio'
    import { piiGuard } from 'gavio/interceptors/pii'

    const gw = new Gateway({ devMode: true })
      .use(piiGuard())

    const r = await gw.complete({ messages: [
      { role: 'user',
        content: 'mail jan@example.com' }] })
    console.log(r.content) // PII restored
    console.log(r.audit.piiEntityTypes)

**Java**

    Gateway gw = Gateway.builder()
        .devMode(true)
        .use(new PiiGuard())
        .build();

    var r = gw.complete(GavioRequest.builder()
        .message("user", "mail jan@example.com")
        .build()).join();
    System.out.println(r.content());
    System.out.println(r.audit().piiEntityTypes());

All three print the reply with the email restored, and an audit record showing `EMAIL` was detected and redacted before the (mock) provider ever saw it.
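
The redact-then-restore round trip can be illustrated without the SDK. The following is a simplified sketch, assuming a single regex-based email detector and an echoing mock provider; the placeholder format `<EMAIL_n>` and all function names are hypothetical and say nothing about how `PiiGuard` is actually implemented.

```python
import re

# Deliberately simple email pattern; real detectors are stricter.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact(text):
    """Swap each email for a placeholder; return redacted text and the mapping."""
    mapping = {}

    def _substitute(match):
        token = f"<EMAIL_{len(mapping) + 1}>"
        mapping[token] = match.group(0)
        return token

    return EMAIL_RE.sub(_substitute, text), mapping


def restore(text, mapping):
    """Put the original values back into the provider's reply."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text


def mock_provider(prompt):
    # Echoes the prompt so we can check what the provider actually saw.
    return f"Sure, I will {prompt}"


def complete(prompt):
    redacted, mapping = redact(prompt)
    reply = mock_provider(redacted)
    return {
        "seen_by_provider": redacted,
        "content": restore(reply, mapping),
        "pii_entity_types": ["EMAIL"] if mapping else [],
    }
```

The mapping from placeholder to original value never leaves the process, which is what lets the reply be restored while the provider only ever sees the placeholder.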
| Language | Command | Docs |
|---|---|---|
| Python 3.10+ | `pip install gavio` | |
| JavaScript (Node 18+) | `npm install gavio` | packages/gavio-js · docs/packages/javascript.md |
| Java 17+ (Maven) | `io.github.manojmallick:gavio-core:0.9.0` | packages/gavio-java · docs/packages/java.md |

Gavio is a monorepo. The SDKs are versioned in lockstep, and each is published to its native registry.
The Python SDK is the reference implementation. Async-first (`await gw.complete(...)`), sync wrapper (`complete_sync`), full type hints + `py.typed`. Zero mandatory deps; `gavio[redis]` adds a distributed cache backend, and other optional extras (`gavio[presidio]`, …) land in later versions.

`pip install gavio`

→ **Full Python guide**
The JavaScript SDK is written in TypeScript and ships full type definitions, with a dual ESM + CJS build and per-subpath `exports` for tree-shaking. Native `fetch`, `node:crypto`. Node 18+, Deno, Bun.

`npm install gavio`

→ **Full JavaScript guide**
The Java SDK is a multi-artifact Maven project: `gavio-core` plus one artifact per interceptor family (`gavio-interceptor-pii`, `-audit`, `-reliability`, `-cache`, `-governance`, `-guardrails`, `-metrics`, `-quality`), one per provider (`gavio-provider-openai`, `-anthropic`, `-gemini`, `-azure`, `-ollama`), and `gavio-testing`. Immutable records + builders, `CompletableFuture` async, Java 17+.

    <dependency>
      <groupId>io.github.manojmallick</groupId>
      <artifactId>gavio-core</artifactId>
      <version>0.9.0</version>
    </dependency>

→ **Full Java guide**
Every feature below lands in all three SDKs in lockstep, at the same version, gated by the same shared test vectors.
| Feature | ID | Since |
|---|---|---|
| PII Guard: Email, IBAN·mod-97, BSN·11-proef, CreditCard·Luhn, Phone, IP, SSN | `F-SEC-01` | 0.1.0 |
| Secret scanner: API keys, AWS `AKIA`, GitHub tokens, JWT, PEM, DB URLs | `F-SEC-04` | 0.1.0 |
| Prompt-injection defense: pattern corpus + optional semantic similarity | `F-SEC-05` | 0.2.0 |
| Embedding call guard: `gw.embed(texts)` runs the same PII pipeline before embedding APIs | `F-SEC-10` | 0.9.0 |
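
The PII Guard row names three checksums (mod-97, 11-proef, Luhn). These are standard public algorithms, and a detector would typically use them to reject digit strings that merely look like an IBAN, a Dutch BSN, or a card number. The sketch below shows the checks themselves; the function names are illustrative, and how Gavio wires them into detection is not specified here.

```python
def luhn_ok(number):
    """Luhn check for card numbers: double every second digit from the right."""
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def iban_ok(iban):
    """ISO 13616 mod-97: move 4 chars to the end, map A-Z to 10-35, mod 97 == 1."""
    s = iban.replace(" ", "").upper()
    if len(s) < 5 or not (s.isascii() and s.isalnum()):
        return False
    rearranged = s[4:] + s[:4]
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1


def bsn_ok(bsn):
    """Dutch 11-proef for a 9-digit BSN: weights 9..2, then -1 on the last digit."""
    if len(bsn) != 9 or not (bsn.isascii() and bsn.isdigit()):
        return False
    weights = (9, 8, 7, 6, 5, 4, 3, 2, -1)
    return sum(w * int(d) for w, d in zip(weights, bsn)) % 11 == 0
```

A checksum pass does not prove that a value is real, only that it is well-formed, so it works best as a filter applied after a pattern match.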

| Feature | ID | Since |
|---|---|---|
| Retry (exp backoff + jitter), Fallback chain, Timeout | `F-REL-01/02/07` | 0.1.0 |
| Circuit breaker, Load balancer (weighted round-robin) | `F-REL-03/04` | 0.2.0 |
| Streaming reliability: buffer response before post-interceptors run | `F-REL-06` | 0.3.0 |
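
As an illustration of the retry row, here is a small sketch of exponential backoff with full jitter. The delay formula, the defaults, and the `RateLimited` exception are assumptions chosen for the example; Gavio's actual retry policy and its parameters may differ.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error standing in for an HTTP 429 from a provider."""


def backoff_delays(attempts, base=0.5, cap=8.0, rng=random.random):
    """Full jitter: each delay is rng() * min(cap, base * 2**n)."""
    return [rng() * min(cap, base * (2 ** n)) for n in range(attempts)]


def call_with_retry(fn, max_retries=3, sleep=time.sleep, base=0.5, cap=8.0):
    """Call fn, retrying on RateLimited up to max_retries times."""
    delays = backoff_delays(max_retries, base=base, cap=cap)
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimited:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            sleep(delays[attempt])
```

Jitter spreads retries from many clients over time, so a rate-limited provider is not hit again by all of them at the same instant. Passing `sleep` as a parameter keeps the function testable without real waiting.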

| Feature | ID | Since |
|---|---|---|
| Per-request `cost_usd` tracking (all providers) | `F-GOV-01` | 0.1.0 |
| Budget caps (soft/hard), rate limiting, model RBAC | `F-GOV-02/03/04` | 0.2.0 |
| Cost-optimiser routing: reroute simple prompts to a cheaper model | `F-GOV-06` | 0.5.0 |
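
One plausible reading of soft and hard budget caps is that a soft cap warns while a hard cap blocks. The sketch below follows that reading; the `Budget` class, the `"ok"` / `"warn"` return values, and the per-1k-token prices are all made up for the example and are not taken from Gavio.

```python
class BudgetExceeded(Exception):
    """Raised when a charge would push spend past the hard cap."""


def cost_usd(prompt_tokens, completion_tokens, in_per_1k, out_per_1k):
    """Per-request cost from token counts and per-1k-token prices."""
    return prompt_tokens / 1000 * in_per_1k + completion_tokens / 1000 * out_per_1k


class Budget:
    def __init__(self, soft_usd, hard_usd):
        self.soft_usd = soft_usd
        self.hard_usd = hard_usd
        self.spent = 0.0

    def charge(self, amount_usd):
        """Record spend. Returns 'ok' or 'warn'; raises past the hard cap."""
        if self.spent + amount_usd > self.hard_usd:
            raise BudgetExceeded(
                f"{self.spent + amount_usd:.2f} USD exceeds hard cap {self.hard_usd:.2f}"
            )
        self.spent += amount_usd
        return "warn" if self.spent > self.soft_usd else "ok"
```

A rejected charge leaves `spent` unchanged, so a blocked request does not count against the budget.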

| Feature | ID | Since |
|---|---|---|
| Semantic + exact cache (cosine + SHA-256), in-memory backends | `F-CACHE-01/02/03` | 0.2.0 |
| Redis distributed backend (shared hits across processes, zero-dep RESP2) | `F-CACHE-04` | 0.4.0 |
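
The pairing of "cosine + SHA-256" suggests a two-tier lookup: an exact match on a hash of the prompt, then a nearest-neighbour match on embeddings. The sketch below assumes exactly that; the class name, the 0.9 threshold, and the letter-frequency `toy_embed` function are stand-ins, since a real deployment would use an embedding model.

```python
import hashlib
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def toy_embed(text):
    # Letter-frequency vector: a crude placeholder for a real embedding model.
    lowered = text.lower()
    return [lowered.count(c) for c in "abcdefghijklmnopqrstuvwxyz"]


class TwoTierCache:
    def __init__(self, embed=toy_embed, threshold=0.9):
        self.embed = embed
        self.threshold = threshold
        self.exact = {}      # sha256(prompt) -> response
        self.semantic = []   # list of (embedding, response)

    @staticmethod
    def _key(prompt):
        return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

    def put(self, prompt, response):
        self.exact[self._key(prompt)] = response
        self.semantic.append((self.embed(prompt), response))

    def get(self, prompt):
        """Return (response, cache_type) or (None, None) on a miss."""
        key = self._key(prompt)
        if key in self.exact:
            return self.exact[key], "exact"
        vec = self.embed(prompt)
        best = max(self.semantic, key=lambda e: cosine(vec, e[0]), default=None)
        if best is not None and cosine(vec, best[0]) >= self.threshold:
            return best[1], "semantic"
        return None, None
```

The exact tier is cheap and has no false positives, so it is checked first; the semantic tier trades some risk of a wrong hit for a higher hit rate, which is why the threshold matters.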

| Feature | ID | Since |
|---|---|---|
| Audit interceptor + `AuditRecord` (SHA-256 hashes), stdout sink | `F-OBS-01/05` | 0.1.0 |
| Hash-chain (tamper-evident) audit, multi-agent DAG trace | `F-OBS-02/03` | 0.2.0 |
| Prompt lineage (template + variables + RAG sources) | `F-OBS-04` | 0.3.0 |
| Prometheus metrics (zero-dep text exposition) | `F-OBS-08` | 0.3.0 |
| Guardrails: JSON-schema + regex allow/deny | `F-QUA-01/02` | 0.2.0 |
| Composite risk scoring (PII + guardrail + injection signals) | `F-QUA-06` | 0.3.0 |
| JSONL audit sink (`jsonl://<path>`): the store the production dashboard reads | `F-DX-08` | 0.7.0 |

| Feature | ID | Since |
|---|---|---|
| Dev-time visualizer: live traces (SSE), waterfalls, PII diffs, pipeline lints, embedded UI | `F-DX-09/10` | 0.6.0 |
| Agent call graphs + session views (`/api/dag`, `/api/sessions`) | `F-OBS-10` | 0.7.0 |
| Trace replay & edit-resend (full capture mode only) | `F-DX-11` | 0.7.0 |
| Read-only production dashboard: RED stats, hash-chain verifier, `gavio inspect --store` | `F-DX-08` | 0.7.0 |
| Export any trace as a PII-sanitized `GavioTestKit` test / test vector | `F-DX-12` | 0.7.0 |
| Overhead benchmarks with CI-enforced budget (<1% metadata / <5% full p50) | `F-DX-09` | 0.8.0 |

| Feature | ID | Since |
|---|---|---|
| Dev mode, dry-run mode, `GavioTestKit` | `F-DX-01/02/03` | 0.1.0 |
| OpenAI drop-in shim, config | `F-DX-04/05` | 0.2.0 |
| Providers: OpenAI · Anthropic · Gemini · Azure OpenAI · Ollama · Mock (all stdlib HTTP, no vendor SDKs) | - | 0.1–0.2 |

Conformance-tested across all three SDKs on every push and PR (`ci.yml` runs Python 3.10–3.12, Node 18/20/22, Java 17/21). Per-release test totals are in the CHANGELOG; see the interceptors guide for every built-in interceptor.
**Not yet shipped** (tracked on the roadmap): image PII ([#29], `F-SEC-09`), drift detection ([#31], `F-GOV-07`), right-to-erasure ([#32], `F-QUA-09`), license detection ([#33], `F-QUA-10`).
- **P1 · Interface-first**: every feature is a public interface you can swap or extend.
- **P2 · Interceptor chain**: pre/post hooks, explicit composition, no hidden magic.
- **P3 · Provider-agnostic**: no provider-specific code leaks into your app.
- **P4 · Zero infra in dev**: `dev_mode` runs everything in-process.
- **P5 · Audit by default**: opt-out, not opt-in.
- **P6 · Embeddable library**: runs in-process, no sidecar or proxy required.
- **P7 · Dry-run first**: log what *would* happen without blocking.
- **P8 · Typed everywhere**: TS generics, Python hints, Java generics.
Further documentation:

- docs/architecture.md
- docs/interceptors.md
- docs/inspector.md
- docs/otel-mapping.md
- Grafana dashboard
- docs/packages/
- examples/
- spec/
- test-vectors/
- RELEASING.md
- CONTRIBUTING.md

    gavio/
    ├── spec/                 canonical data model (JSON Schema)
    ├── test-vectors/         shared cases every SDK must pass
    ├── packages/
    │   ├── gavio-py/         Python SDK  (PyPI: gavio)
    │   ├── gavio-js/         JS/TS SDK   (npm: gavio)
    │   └── gavio-java/       Java SDK    (Maven: io.github.manojmallick:gavio-*)
    ├── docs/                 documentation
    └── .github/workflows/    ci.yml (test all 3) · release.yml (publish all 3)

MIT © 2026 Manoj Mallick · Made in Amsterdam 🇳🇱