Two weeks ago I launched AgentGuard, an open-source static analysis tool for AI agent security. Here is what the data says.
PyPI discoverability is real. The pip install
flow means people find the package through PyPI search, not just GitHub. 920 downloads with 1 star means the conversion from "found it" to "starred it" is low, but the install-to-star ratio is normal for developer tools.
Dev.to drives traffic. The "Beyond Regex" technical deep-dive got 44 views -- the most of any article. Developers want technical depth, not marketing. The MCP security guide got a real comment from a peer building a complementary tool.
OWASP ASI Top 10 is a strong positioning. Nobody else covers all 10 categories in an open-source tool. That differentiation matters.
Awesome-list PRs are slow. Two PRs submitted, both mergeable, neither merged after a week. Maintainers of these lists have their own priorities. Not a failure, just patience required.
Zero community contributions. No PRs from external contributors. The good-first-issues are there (Go support, Java support) but nobody has picked them up. The project needs more visibility before contributors arrive.
The 0-stars problem is self-reinforcing. Developers check star counts before trying a tool. 1 star does not signal "trusted." This is the hardest loop to break for new open-source projects.
Launch with a comparison table. AgentGuard vs. Semgrep vs. CodeQL for AI agent security. Developers want to know "why not just use Semgrep?" before they install anything.
Ship the GitHub Action on day one. The action.yml
was added in v0.3.4 -- two weeks late. CI/CD integration is the #1 thing developers look for in a security tool.
Write a "how to break an AI agent" post first. Show the vulnerability, then the tool. The MCP security guide performed best because it led with the problem, not the product.
The biggest engineering mistake was scanning my own code. AgentGuard's regex rules match patterns like eval\(
and os\.system
-- which appear in the rule definitions themselves. First self-scan: 94 findings, 69 critical. All false positives.
The fix: skip the rules/
directory and test files by default. Add --include-tests
for explicit test scanning. Self-scan went from 94 to 2 (acceptable patterns in setup.py).
Lesson: your security tool should be able to scan itself without screaming. If it cannot, users will not trust it on their code either.
The next milestone is AST-based taint tracking (v0.4.0). Regex gets you 100% on a curated benchmark, but real codebases have patterns regex cannot see:
template = "Answer: {input}"
prompt = template.format(input=user_data)
This requires parsing the AST, tracking variable assignments, and following data flow from sources to sinks. Same approach as Semgrep and CodeQL, but specialized for LLM-specific sinks (openai.chat.completions.create
, messages
, prompt
).
If you want to follow along or contribute: github.com/dockfixlabs/agentguard
AgentGuard is MIT-licensed. Install with pip install dfx-agentguard.