What I learned from my first AI-assisted bug bounty submissions


Source: dev.to · Published 2026-05-29


A developer used Claude (Opus) to find, verify, and write up vulnerabilities in public open-source bug bounty programs, discovering that the biggest risk to a submission is not whether a bug is real, but whether someone already reported it. The engineer built a novelty-checking toolchain using GitHub and OSV APIs to reduce duplicate risk, but found that privately submitted reports remain invisible until disclosure, causing one correct finding to be closed as a duplicate of an earlier submission. The project revealed that AI-assisted discovery has congested the open-source bounty landscape, with fewer programs and lower payouts in early 2026, and that the optimal strategy requires prioritizing verification over generation since each submission costs scarce reputation points.


Third post in my "AI-assisted OSS contribution" series. The first two were about pre-fork due diligence and shipping a fix to ONNX with my own PR scanner. This one is about a harder game: security research and coordinated disclosure.

For a while my AI-assisted open-source work was about contributions — typo fixes, docs, small bug fixes, the occasional feature. Pull requests have a forgiving feedback loop: if a PR is wrong, a maintainer comments and you iterate. Bug bounty work is different. The feedback loop is slower, the bar for "novel and correct" is much higher, and a lot of the difficulty has nothing to do with the vulnerability itself. I ran a small experiment: use Claude (Opus) to help me find, verify, and write up vulnerabilities in public, in-scope open-source bug bounty programs — the kind that publish a scope and a safe-harbor policy and explicitly invite testing. Here's what actually mattered, mostly the things I didn't expect.

The single biggest risk to a bounty submission is not "is it a real bug" — it's "did someone already report it." And you usually cannot see the answer.

I built a small novelty-checking toolchain around the assistant: query published advisories (GHSA via the GitHub API), aggregate cross-ecosystem advisory data (OSV), search the target repository's own issues and PRs, and pull recent security-research feeds. It catches a lot. But it has a fundamental blind spot: privately submitted reports are invisible until they're disclosed. One of my submissions was closed as a duplicate of a report filed months earlier that I had no way of seeing. The finding was correct. It just wasn't first.
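A minimal sketch of what the public-advisory part of such a check could look like, using only the Python standard library. It is not the toolchain described above: the function names are hypothetical, and the package and ecosystem arguments are placeholders. It assumes the OSV `POST /v1/query` endpoint, GitHub's global advisories listing, and GitHub's issue search. Note that OSV and GitHub name ecosystems differently (for example `PyPI` versus `pip`), so each helper takes its own ecosystem string.

```python
import json
import urllib.parse
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"
GITHUB_API = "https://api.github.com"


def osv_request(package: str, ecosystem: str) -> urllib.request.Request:
    """Build a POST to OSV asking for known advisories on one package."""
    body = json.dumps({"package": {"name": package, "ecosystem": ecosystem}})
    return urllib.request.Request(
        OSV_QUERY_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ghsa_url(package: str, ecosystem: str) -> str:
    """URL for GitHub's global advisory list, filtered to one package."""
    query = urllib.parse.urlencode({"ecosystem": ecosystem, "affects": package})
    return f"{GITHUB_API}/advisories?{query}"


def issue_search_url(repo: str, keywords: str) -> str:
    """URL for searching one repository's issues and PRs for similar reports."""
    query = urllib.parse.urlencode({"q": f"repo:{repo} {keywords}"})
    return f"{GITHUB_API}/search/issues?{query}"


def fetch_json(request):
    """Perform a request (URL string or Request) and decode the JSON body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def known_ids(osv_response: dict, ghsa_response: list) -> set:
    """Collect every advisory ID and alias seen across both sources."""
    ids = set()
    for vuln in osv_response.get("vulns", []):
        ids.add(vuln["id"])
        ids.update(vuln.get("aliases", []))
    for advisory in ghsa_response:
        ids.add(advisory["ghsa_id"])
        if advisory.get("cve_id"):
            ids.add(advisory["cve_id"])
    return ids
```

In use, `fetch_json(osv_request(...))` and `fetch_json(ghsa_url(...))` would feed `known_ids`, and the resulting set is compared against whatever the candidate finding maps to. An empty match only means nothing public was found; privately submitted reports never show up in these responses.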

The lesson isn't "check harder." Public OSINT can only ever reduce duplicate risk, never eliminate it. The realistic takeaways:

It is very easy to read code, build a clean mental model of a bug, write a confident report — and be wrong, because the runtime doesn't behave the way the reference manual says it does. I got burned by exactly this kind of gap between "what the spec says" and "what the implementation does."

The discipline that fixed it: no claim without a runnable proof of concept, executed against the actual runtime. Not pseudocode. Not "this should work." A minimal, contained reproduction on my own machine — localhost only, no third-party or production systems touched — that either fires or it doesn't. An AI assistant is genuinely good at the first 80% of building that PoC fast; the last 20% (does it actually reproduce?) is non-negotiable and is where most false positives die.
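One way to make the "localhost only" rule mechanical is to have the PoC harness refuse any target that is not a loopback address. The sketch below is an illustration, not the harness used for these submissions; `is_loopback_target` and `require_loopback` are hypothetical names. It deliberately does no DNS resolution, so any hostname other than the literal `localhost` is treated as remote.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_target(url: str) -> bool:
    """Return True only for URLs whose host is localhost or a loopback IP."""
    host = urlsplit(url).hostname
    if host is None:
        return False  # no parseable host, so do not assume it is local
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # any other hostname is treated as remote


def require_loopback(url: str) -> str:
    """Return the URL unchanged, or raise before a PoC can touch a remote host."""
    if not is_loopback_target(url):
        raise ValueError(f"refusing to run PoC against non-local target: {url}")
    return url
```

Calling `require_loopback(target)` as the first line of a PoC script turns a policy into a failure mode: a mistyped or copy-pasted production URL stops the run instead of sending traffic.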

Modern bounty platforms ration your ability to submit. New researchers get a limited number of "trial" reports and a reputation/signal score that drops when they file invalid or duplicate reports. If it drops low enough, you are blocked from submitting at all.

This completely changes the optimal strategy. When submissions are cheap, volume wins. When each submission costs scarce signal, quality dominates volume, and a single duplicate or "informative" close is genuinely expensive. With an AI assistant that can generate plausible-looking reports quickly, this is the most important guardrail: the bottleneck must be verification, not generation.
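To keep verification as the bottleneck, the decision to submit can be gated on an explicit checklist rather than on how convincing a draft report reads. The sketch below is a hypothetical illustration: the `Finding` fields and the `blockers` function are assumptions, with the checks drawn from the practices described in this post (in-scope target, a PoC that reproduced, a public novelty check, and disclosed AI assistance).

```python
from dataclasses import dataclass


@dataclass
class Finding:
    title: str
    in_scope: bool          # target confirmed against the program's published scope
    poc_reproduced: bool    # PoC actually fired against the real runtime, locally
    novelty_checked: bool   # public advisories, issues, and PRs were searched
    ai_use_disclosed: bool  # the report states that an AI assistant was used


def blockers(finding: Finding) -> list[str]:
    """List every reason this finding should not be submitted yet."""
    checks = [
        (finding.in_scope, "target is not confirmed in scope"),
        (finding.poc_reproduced, "PoC has not reproduced against the real runtime"),
        (finding.novelty_checked, "public novelty check has not been run"),
        (finding.ai_use_disclosed, "AI assistance is not disclosed in the report"),
    ]
    return [reason for ok, reason in checks if not ok]


draft = Finding(
    title="example finding",
    in_scope=True,
    poc_reproduced=False,
    novelty_checked=True,
    ai_use_disclosed=True,
)
for reason in blockers(draft):
    print("blocked:", reason)
```

An empty list from `blockers` does not mean the report will be accepted. It only means the cheap, checkable reasons for losing signal have been ruled out before a scarce submission is spent.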

Some honest context, because it shaped my results. The open-source bounty landscape contracted noticeably in early 2026:

A widely-cited reason: AI-assisted discovery started producing vulnerability reports faster than open-source maintainers could triage and remediate them. The irony isn't lost on me — the same tooling that makes an individual researcher more productive, in aggregate, helped congest the system that pays them. If you're starting now, plan for fewer open programs and lower-but-real payouts than the headline numbers from a year ago.

I disclose AI assistance in every submission. Not as a disclaimer-shaped apology — as a fact, the same way you'd note any tool in your methodology. Two practical reasons beyond honesty:

The model does the heavy lifting on code review, hypothesis generation, and drafting. I own scope selection, the decision to submit, the ethics, and the final verification. That division of labor is the whole point.

I'm still early — a couple of submissions in, one under triage as I write this, plenty unproven. But the meta-lessons above transferred cleanly from the PR work in my earlier posts: the assistant compresses the mechanical effort, and that just relocates all the value to judgment — what to look at, whether it's really true, and whether you should hit submit.

Developed with AI assistance (Claude Opus); all findings were reviewed, reproduced locally, and verified by me before submission. No unpatched or undisclosed vulnerability details are included in this post.


Read original on dev.to → dev.to/taiman724/what-i-learned-from-my-first-ai…
