[ARTICLE · art-38945] src=dev.to ↗ pub= topic=developer-tools verified=true sentiment=↑ positive

How we stopped our AI assistant from hallucinating bug fixes

LightShield, a SIEM built by LS-SIEM LLP, developed qa-probe, an open-source tool that stops AI coding assistants from hallucinating bug fixes by providing ground-truth evidence. The tool analyzes source code, probes live endpoints, and classifies root causes with calibrated confidence, enabling AI assistants to reason from evidence rather than guessing from status codes. Released under Apache-2.0, qa-probe supports frameworks like FastAPI, Express, and Next.js, and integrates via MCP with tools like Claude and Cursor.

3 min read · published Jun 25, 2026

Cover: a real qa-probe run against our own stack, cropped to the summary - internal product detail withheld.

We are building LightShield, a SIEM that is in active demo right now. We built most of it pair-programming with an AI coding assistant wired in over MCP - it ran our stack, read the errors, and patched its own code. For a small team that is a superpower. Until an endpoint failed.

Here is the loop we kept hitting. A route returns a 500, or a 404, or an empty []. The assistant looks at the status code and announces the cause with total confidence. Then it rewrites a handler that was never broken - because a status code is not a cause, and it had nothing else to go on. So it guessed, and it guessed wrong, and the diff made things worse.

The thing is, that empty [] had at least six possible causes.

Same symptom, six different fixes. We could bisect to the real one. The AI could not - it had no ground truth, so it manufactured one.

So we built qa-probe. It analyzes the app, probes the live endpoints, and classifies each failure with a root cause and a fix hint. It runs as three decoupled, cached phases:

qa-probe analyze   # parse source + OpenAPI -> route graph
qa-probe probe     # hit live endpoints (HTTP/SSE/WS), record evidence
qa-probe report    # classify root cause -> HTML / Markdown / JSON / AI-context
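
To make "decoupled, cached phases" concrete, here is a minimal sketch of the general pattern, not qa-probe's actual implementation. The function names, the JSON cache files, and the canned route and response are all assumptions made for illustration. Each phase writes its output to disk, so a later phase can be re-run without repeating the earlier ones.

```python
import json
import tempfile
from pathlib import Path


def run_phase(name, cache_dir, compute):
    """Run one phase, reusing its cached JSON output when it exists."""
    cache_file = cache_dir / f"{name}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text())
    result = compute()
    cache_file.write_text(json.dumps(result))
    return result


def analyze():
    # Stand-in for parsing source + OpenAPI into a route graph (hypothetical data).
    return {"routes": [{"method": "GET", "path": "/items"}]}


def probe(route_graph):
    # Stand-in for hitting live endpoints; the evidence here is canned, not real.
    return [
        {"route": route, "status": 200, "body_sample": "[]"}
        for route in route_graph["routes"]
    ]


def report(evidence):
    # Stand-in for classification; this only summarizes the recorded evidence.
    return [
        f'{e["route"]["method"]} {e["route"]["path"]} -> {e["status"]}'
        for e in evidence
    ]


def run_pipeline(cache_dir):
    # Each phase depends only on the previous phase's stored output.
    graph = run_phase("analyze", cache_dir, analyze)
    evidence = run_phase("probe", cache_dir, lambda: probe(graph))
    return run_phase("report", cache_dir, lambda: report(evidence))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(run_pipeline(Path(tmp)))
```

In this sketch, deleting only the report cache file would re-run classification against the stored evidence without touching the live endpoints again, which is the practical payoff of keeping the phases separate.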

It has adapters for FastAPI, Express, Next.js, tRPC, GraphQL, and a generic fallback, so it discovers your routes instead of you hand-listing them.
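
One common way to structure "framework adapters plus a generic fallback" is an ordered registry where the first adapter that recognizes the project wins. The sketch below assumes detection by declared dependency names; that rule, the Adapter class, and pick_adapter are hypothetical and are not taken from qa-probe's source.

```python
from dataclasses import dataclass
from typing import Callable, Set


@dataclass
class Adapter:
    name: str
    detects: Callable[[Set[str]], bool]  # does the project look like this framework?


# Hypothetical detection rules keyed on declared dependencies.
# Order matters: the first adapter that matches is used.
ADAPTERS = [
    Adapter("fastapi", lambda deps: "fastapi" in deps),
    Adapter("nextjs", lambda deps: "next" in deps),
    Adapter("express", lambda deps: "express" in deps),
]

# The fallback matches everything, so route discovery never has no adapter.
GENERIC = Adapter("generic", lambda deps: True)


def pick_adapter(dependencies: Set[str]) -> Adapter:
    """Return the first framework adapter that matches, else the generic one."""
    for adapter in ADAPTERS:
        if adapter.detects(dependencies):
            return adapter
    return GENERIC


if __name__ == "__main__":
    print(pick_adapter({"fastapi", "uvicorn"}).name)
    print(pick_adapter({"flask"}).name)
```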

Each result carries the evidence (the real request, a bounded response sample, the timing), a root cause from ~25 categories, and a calibrated confidence - high, medium, or none. When it cannot tell, it returns none instead of bluffing. No neural network, no black box - transparent rules plus per-endpoint stat memory, so you can always read why it landed on a verdict. An AI consuming this needs to verify the claim, not trust a vibe.
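
As an illustration of what "transparent rules that return none instead of bluffing" can look like, here is a small sketch. It is not qa-probe's rule set: the field names, the category labels, and the table_row_count signal are assumptions, and only the three confidence levels come from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Evidence:
    method: str
    url: str
    status: int
    body_sample: str   # bounded sample of the response body
    elapsed_ms: float


@dataclass
class Verdict:
    root_cause: Optional[str]  # None when no rule can justify a cause
    confidence: str            # "high", "medium", or "none"
    reason: str                # readable trace of the rule that fired


def classify(ev: Evidence, table_row_count: Optional[int] = None) -> Verdict:
    """Apply ordered, readable rules; fall through to 'none' when unsure.

    table_row_count is a hypothetical extra signal (rows in the backing table).
    """
    if ev.status == 200 and ev.body_sample.strip() == "[]":
        if table_row_count == 0:
            return Verdict(
                "empty_database", "high",
                "200 with [] and the backing table has 0 rows",
            )
        return Verdict(
            "empty_result", "medium",
            "200 with [] but the row count is unknown or non-zero",
        )
    # A bare status code is not a cause, so refuse to guess.
    return Verdict(None, "none", "no rule matched the evidence")


if __name__ == "__main__":
    empty = Evidence("GET", "http://localhost:8000/items", 200, "[]", 12.5)
    print(classify(empty, table_row_count=0))
    failing = Evidence("GET", "http://localhost:8000/items", 500, "Internal Server Error", 3.0)
    print(classify(failing))
```

The reason string is the part that matters for an assistant: it can check the stated rule against the attached evidence rather than accept the label on faith.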

qa-probe mcp   # exposes 8 tools to Claude, Cursor, any MCP client

The assistant stopped reasoning from a status code and started reasoning from evidence: "empty database, high confidence, here is the response that proves it." It seeded the DB instead of rewriting the handler. It fixed the right layer. The guessing basically stopped.

It helped us debug faster. It helped the AI more - because an AI is only as good as the evidence you hand it, and "the endpoint is failing" is not evidence.

It is early and it is open. The fastest way to help:

One housekeeping note: contributions are sign-off based (DCO) - commit with git commit -s so the project's licensing stays clean. That is the only hoop.

We built it for ourselves. It worked well enough that we cleaned it up and released it under Apache-2.0.

npm i -g qa-probe

Built by LS-SIEM LLP. If you run it against your own API, I would genuinely like to know what it found - that feedback is how the rules get sharper.
