Show HN: DriftGuard – response drift detection for LangGraph agents

DriftGuard, a new open-source tool for detecting response drift in LangGraph agents, was shared on Hacker News as a Show HN. It uses embedding-based comparison to flag when an LLM strays from its intended domain, without requiring ground-truth labels or a separate classifier. The tool derives adaptive thresholds from the reference corpus and can be integrated as a blocking guardrail or as a monitoring callback.
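As a rough illustration of what embedding-based comparison with an adaptive threshold can look like, here is a minimal NumPy sketch. It is not DriftGuard's code: the function names (`fit_reference`, `is_drift`), the leave-one-out way of computing the nearest-neighbour threshold, and the toy 2-D "embeddings" are assumptions made for this example. Only the general idea, a centroid signal plus a nearest-neighbour signal, each cut at the 5th percentile of within-corpus similarity, is taken from the project README.

```python
import numpy as np


def _unit(x):
    """L2-normalise along the last axis so dot products are cosine similarities."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def fit_reference(ref_embeddings, percentile=5.0):
    """Derive the centroid and both thresholds from the reference set alone."""
    ref = _unit(ref_embeddings)
    centroid = _unit(ref.mean(axis=0))
    centroid_sims = ref @ centroid
    pairwise = ref @ ref.T
    np.fill_diagonal(pairwise, -np.inf)  # assumption: ignore self-similarity
    nn_sims = pairwise.max(axis=1)
    return {
        "ref": ref,
        "centroid": centroid,
        "threshold": float(np.percentile(centroid_sims, percentile)),
        "nn_threshold": float(np.percentile(nn_sims, percentile)),
    }


def is_drift(model, embedding):
    """Flag drift only when BOTH signals fall below their thresholds."""
    v = _unit(embedding)
    centroid_sim = float(v @ model["centroid"])
    nn_sim = float((model["ref"] @ v).max())
    return centroid_sim < model["threshold"] and nn_sim < model["nn_threshold"]


def at(deg):
    """Toy 2-D 'embedding' pointing at the given angle in degrees."""
    r = np.radians(deg)
    return [np.cos(r), np.sin(r)]


# Toy reference corpus: unit vectors fanned out around the x-axis.
reference = [at(d) for d in (-20, -10, -5, 0, 5, 10, 20)]
model = fit_reference(reference)

print(is_drift(model, at(2)))   # False: close to the centroid
print(is_drift(model, at(25)))  # False: off-centroid, but near a reference vector
print(is_drift(model, at(90)))  # True: far from the centroid and from every reference
```

The middle case shows why requiring agreement between the two signals can reduce false positives: the vector at 25° falls below the centroid threshold but is still within 5° of a reference vector, so it is not flagged.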

Embedding-based response drift detection for LangChain agents.

Detects when an LLM starts answering outside its intended domain (a legal assistant drifting into cooking advice, a medical chatbot wandering into finance) without ground-truth labels or a separate classifier.

- [How it works](#how-it-works)
- [Installation](#installation)
- [Quick start](#quick-start)
- [Integration patterns](#integration-patterns)
- [LangGraph guardrail](#langgraph-guardrail)
- [Async support](#async-support)
- [Alert sinks](#alert-sinks)
- [Multi-topic corpora](#multi-topic-corpora-clustering)
- [Domain auditing](#domain-auditing)
- [Building a corpus with FPS](#building-a-corpus-with-fps)
- [Distribution-level detection](#distribution-level-detection-windowed)
- [Visualisation](#visualisation)
- [Persisting a corpus](#persisting-a-corpus)
- [DriftResult reference](#driftresult-reference)
- [Development](#development)

## How it works

1. Build a reference corpus from representative on-topic texts.
2. Embed each LLM response with the same model.
3. Compare using two complementary signals:
   - **Centroid distance**: how close is the response to the centre of the corpus (or its nearest cluster)?
   - **Nearest-neighbour distance**: is the response close to at least one reference text?
4. Flag drift when both signals agree the response is far from the reference domain.

Using both signals reduces false positives: a paraphrase that sits slightly off the centroid is rescued when it's still close to a known reference text.

The threshold for each signal is **adaptive**: it is the 5th percentile of within-corpus similarity scores, so ~95% of reference texts clear it with no manual tuning.

## Installation

```bash
git clone https://github.com/vinerya/driftguard.git
cd driftguard
pip install -r requirements.txt
pip install -e .
```

Requires Python ≥ 3.9. The only runtime dependencies are `langchain-core` and `numpy`.

Optional extras:

```bash
pip install -e ".[viz]"   # matplotlib + scikit-learn for corpus.plot()
pip install langgraph     # LangGraph guardrail nodes
```

## Quick start

```python
from driftguard import ReferenceCorpus, DriftDetector
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()

# 1. Build the reference corpus from representative on-topic texts
corpus = ReferenceCorpus(embeddings_model=embeddings)
corpus.add_texts([
    "tort law",
    "contract formation",
    "negligence standard",
    "criminal intent",
    "due process rights",
])

# 2. Create the detector
detector = DriftDetector(corpus=corpus)

# 3. Check a response
result = detector.check("habeas corpus")
print(result.is_drift)                  # False (on-topic)
print(result.centroid_similarity)       # e.g. 0.91
print(result.max_reference_similarity)  # e.g. 0.95
print(result.threshold)                 # e.g. 0.87

result = detector.check("best pasta recipe")
print(result.is_drift)                  # True (off-topic)
```

## Integration patterns

`DriftCallbackHandler` attaches to any LangChain LLM or chat model. It runs on every response without interrupting the pipeline; use it for monitoring, logging, or metrics.

```python
from driftguard import DriftCallbackHandler, AlertManager
from langchain_openai import ChatOpenAI

alerts = AlertManager(sinks=["log"])
handler = DriftCallbackHandler(detector=detector, alerts=alerts)

llm = ChatOpenAI(callbacks=[handler])
response = llm.invoke("What is the recipe for tiramisu?")
# Drift is logged as a WARNING; the response still returns normally.
print(handler.history[-1].is_drift)  # True
```

`DriftRunnable.as_guard()` inserts a check as a step in a LangChain chain. It raises `DriftError` on drift and passes the text through unchanged otherwise.

```python
from driftguard import DriftRunnable, DriftError
from langchain_core.output_parsers import StrOutputParser

drift = DriftRunnable(detector=detector)
chain = llm | StrOutputParser() | drift.as_guard()

try:
    result = chain.invoke("What is the recipe for tiramisu?")
except DriftError as e:
    print(
        f"Blocked: centroid sim={e.result.centroid_similarity:.3f} "
        f"< threshold={e.result.threshold:.3f}"
    )
```

`DriftRunnable.as_passthrough()` annotates the chain output with drift metadata without halting. This is useful when you want to observe drift but let the response through for the user to see.
```python
chain = llm | StrOutputParser() | drift.as_passthrough()
output = chain.invoke("habeas corpus")
# {"output": "Habeas corpus is a legal right...", "drift": DriftResult(...)}
print(output["drift"].is_drift)  # False
```

## LangGraph guardrail

driftguard ships a first-class LangGraph integration. The node and routing helpers are plain callables that match LangGraph's expected signatures. There is no LangGraph import inside the library itself, so the module loads fine even if LangGraph isn't installed.

```python
from typing import Any

from langgraph.graph import StateGraph
from typing_extensions import TypedDict

from driftguard.langgraph import drift_node, route_on_drift


class AgentState(TypedDict):
    query: str
    response: str
    drift: Any  # holds the DriftResult written by the drift node


graph = StateGraph(AgentState)
graph.add_node("llm", call_llm)                      # writes state["response"]
graph.add_node("drift_check", drift_node(detector))  # reads "response", writes "drift"
graph.add_node("fallback", handle_fallback)
graph.add_node("respond", finalize)

graph.set_entry_point("llm")
graph.add_edge("llm", "drift_check")
graph.add_conditional_edges(
    "drift_check",
    route_on_drift,  # returns "drift" or "ok"
    {"drift": "fallback", "ok": "respond"},
)

app = graph.compile()
```

**Custom state key**: if your LLM node writes to a key other than `"response"`:

```python
graph.add_node("drift_check", drift_node(detector, text_key="output"))
```

**Async graphs**: swap `drift_node` for `adrift_node`:

```python
from driftguard.langgraph import adrift_node

graph.add_node("drift_check", adrift_node(detector))
```

**Custom route labels**: use `make_route_on_drift` when your edge map uses different names:

```python
from driftguard.langgraph import make_route_on_drift

router = make_route_on_drift(on_drift="blocked", on_ok="continue")
graph.add_conditional_edges(
    "drift_check", router, {"blocked": "fallback", "continue": "respond"}
)
```

## Async support

Every public method has an async counterpart:

```python
await corpus.aadd_texts(["tort law", "negligence"])
result = await detector.acheck("contract formation")
```

`AsyncDriftCallbackHandler` mirrors `DriftCallbackHandler` for async LangChain pipelines.

## Alert sinks

`AlertManager` dispatches drift alerts to one or more sinks simultaneously:

```python
from driftguard import AlertManager

alerts = AlertManager(sinks=[
    "log",                                   # WARNING via Python logging
    "https://your-service.example/webhook",  # POST JSON payload
    lambda result: my_queue.put(result),     # arbitrary sync or async callable
])
```

Pass an `AlertManager` instance to `DriftCallbackHandler`, `DriftRunnable`, or the LangGraph nodes; all accept one via the `alerts` argument.

## Multi-topic corpora (clustering)

When your reference corpus spans several distinct topics, a single global centroid produces false positives for texts that are on-topic but far from the average. Set `n_clusters` to partition the corpus into groups; each query is then compared to its nearest cluster rather than the global centre.

```python
corpus = ReferenceCorpus(embeddings_model=embeddings, n_clusters=2)
corpus.add_texts([
    # Legal cluster
    "tort law", "contract formation", "negligence",
    # Medical cluster
    "malpractice", "diagnosis", "clinical trial",
])

detector = DriftDetector(corpus=corpus)
detector.check("habeas corpus").is_drift  # False (routes to legal cluster)
detector.check("prognosis").is_drift      # False (routes to medical cluster)
detector.check("pasta recipe").is_drift   # True (far from both clusters)
```

Clustering uses numpy k-means internally with no extra dependencies.

## Domain auditing

The `Auditor` class runs drift detection over a batch of historical responses and returns a structured report: pass rate, score distribution, flagged outliers. Use it before deployment to validate your corpus, after incidents to understand what went wrong, or in CI to catch domain regressions between prompt versions.
```python
from driftguard import Auditor

auditor = Auditor(detector)
report = auditor.run(production_responses)

print(f"Pass rate: {report.pass_rate:.1%}")
print(f"Drift rate: {report.drift_rate:.1%}")
print(f"Flagged: {report.flagged} / {report.total}")
```

Export the report for a compliance doc or CI artifact:

```python
json_report = report.to_json()   # structured JSON string
with open("report.html", "w") as f:
    f.write(report.to_html())    # self-contained HTML report
```

The HTML report includes a summary dashboard, the centroid similarity distribution (p5 → p95), and a table of all flagged responses with their scores.

**Async**: all responses are checked concurrently:

```python
report = await auditor.arun(production_responses)
```

To detect domain shift between prompt versions, model upgrades, or dataset changes, compare two corpora:

```python
comparison = corpus_v1.compare(corpus_v2)
print(f"Centroid shift: {comparison.centroid_shift:.4f}")     # cosine distance
print(f"Threshold delta: {comparison.threshold_delta:+.4f}")
print(f"Significant: {comparison.is_significant}")            # shift > 0.05
```

A centroid shift above 0.05 (configurable via `significant_shift_threshold`) means the two corpora represent meaningfully different domains, which is worth investigating before swapping one for the other.

## Building a corpus with FPS

Hand-picking reference texts is tedious and easy to get wrong. `ReferenceCorpus.from_texts` accepts a large pool of candidates and uses Farthest Point Sampling to automatically select the `n` most coverage-maximising texts; each new selection is the one farthest in cosine distance from all already-chosen texts.

```python
# 500 example legal responses; pick the 30 most diverse ones.
corpus = ReferenceCorpus.from_texts(
    candidates=my_500_legal_responses,
    embeddings_model=embeddings,
    n=30,
)
```

The result is a fully initialised `ReferenceCorpus` ready for use with `DriftDetector`. An async variant is also available:

```python
corpus = await ReferenceCorpus.afrom_texts(candidates, embeddings_model=embeddings, n=30)
```

## Distribution-level detection (windowed)

Per-response checks are sensitive to one-off anomalies.
`WindowedDriftDetector` accumulates a sliding window of responses and checks whether the window's embedding distribution has shifted from the reference. Two signals can trigger drift:

- **Centroid shift**: the window's mean embedding has moved away from the reference.
- **Drift fraction**: more than `drift_fraction_threshold` (default 30%) of recent responses are individually off-topic.

```python
from driftguard import WindowedDriftDetector

wd = WindowedDriftDetector(corpus=corpus, window_size=20, drift_fraction_threshold=0.3)

for response in llm_responses:
    result = wd.update(response)
    if result is None:
        continue  # window still filling
    if result.is_drift:
        print(
            f"Window drift detected: "
            f"centroid sim={result.window_centroid_similarity:.3f}, "
            f"drift fraction={result.drift_fraction:.0%}"
        )
```

`result` is a `WindowDriftResult` returned on every call once the window is full.

Use `on_drift` for async-friendly callbacks:

```python
wd = WindowedDriftDetector(corpus=corpus, on_drift=lambda r: alert_queue.put(r))
```

Async usage mirrors the sync API:

```python
result = await wd.aupdate(response)
```

## Visualisation

`corpus.plot()` projects the reference corpus into 2D via t-SNE and optionally overlays texts colour-coded by drift status, which is useful for debugging false positives and tuning `threshold_percentile`.

```python
# pip install "driftguard[viz]"  (adds matplotlib + scikit-learn)
corpus.plot(check_texts=["habeas corpus", "pasta recipe", "clinical trial"])
```

Blue circles are reference texts; green triangles are on-topic detections; red X markers are flagged as drift.

For more control, call `plot_corpus` directly:

```python
from driftguard.viz import plot_corpus
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(10, 7))
plot_corpus(corpus, check_texts=probe_texts, ax=ax)
plt.show()
```

## Persisting a corpus

Save a trained corpus to disk and reload it on the next run, so there is no need to re-embed reference texts every time.
```python
corpus.save("legal_corpus")
# writes:
#   legal_corpus.npz         embeddings, centroid, thresholds, cluster data
#   legal_corpus.texts.json  original texts

loaded = ReferenceCorpus(embeddings_model=embeddings)
loaded.load("legal_corpus")
```

Cluster data (centroids, per-cluster thresholds) is persisted alongside the embeddings.

## DriftResult reference

Every call to `detector.check()` or `detector.acheck()` returns a frozen `DriftResult`:

| Field | Type | Description |
|---|---|---|
| `is_drift` | `bool` | `True` when both centroid and NN signals indicate drift |
| `centroid_similarity` | `float` | Cosine similarity to the nearest cluster (or global) centroid |
| `max_reference_similarity` | `float` | Cosine similarity to the closest individual reference text |
| `threshold` | `float` | Adaptive centroid threshold for this check |
| `nn_threshold` | `float` | Adaptive nearest-neighbour threshold |
| `text` | `str` | The checked text |
| `timestamp` | `float` | Unix timestamp |
| `metadata` | `dict` | Any kwargs passed to `check()`, e.g. `run_id` |

`DriftError` (raised by `as_guard()`) exposes the full `DriftResult` on its `.result` attribute.

`WindowedDriftDetector.update()` returns a `WindowDriftResult` once the window is full:

| Field | Type | Description |
|---|---|---|
| `is_drift` | `bool` | `True` when centroid or fraction signal fires |
| `window_centroid_similarity` | `float` | Cosine similarity of window centroid to reference |
| `drift_fraction` | `float` | Fraction of window responses individually flagged |
| `window_size` | `int` | Number of responses in the window |
| `threshold` | `float` | Reference threshold used for centroid check |
| `drift_fraction_threshold` | `float` | Configured fraction threshold |
| `timestamp` | `float` | Unix timestamp |

## Development

```bash
pip install -e ".[dev]"
pytest
```

All tests use deterministic `FakeEmbeddings`, so no API key or network access is required.

License: MIT
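For readers curious about the Farthest Point Sampling step that `ReferenceCorpus.from_texts` is described as using, the selection rule can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the library's implementation: the function name `farthest_point_sample`, the choice of index 0 as the starting point, and the max-min reading of "farthest from all already-chosen texts" are all hypothetical choices made for this example.

```python
import numpy as np


def farthest_point_sample(embeddings, n, start=0):
    """Greedy FPS under cosine distance; returns indices in selection order.

    Assumption: "farthest from all already-chosen" is read as maximising the
    distance to the *closest* already-chosen point (the usual max-min rule).
    """
    x = np.asarray(embeddings, dtype=float)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    n = min(n, len(x))
    chosen = [start]
    # Cosine distance from every candidate to its closest chosen point so far.
    min_dist = 1.0 - x @ x[start]
    while len(chosen) < n:
        nxt = int(np.argmax(min_dist))
        chosen.append(nxt)
        min_dist = np.minimum(min_dist, 1.0 - x @ x[nxt])
    return chosen


# Toy candidates: three near-duplicates around 0°, two around 90°, one at 180°.
angles = np.radians([0, 1, 2, 90, 91, 180])
candidates = np.stack([np.cos(angles), np.sin(angles)], axis=1)

print(farthest_point_sample(candidates, 3))  # [0, 5, 3]
```

In the toy run the near-duplicates at 1° and 2° are skipped in favour of the points at 180° and 90°, which is the coverage-maximising behaviour the README attributes to FPS.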