Multi-Agent LLM System for Automated Vulnerability Discovery and Reproduction

wpnews.pro


[ARTICLE · art-15590] src=arxiv.org ↗ pub=2026-05-27T17:42Z topic=large-language-models verified=true sentiment=· neutral


Researchers have developed FuzzingBrain V2, a multi-agent large language model system that automatically discovers and reproduces software vulnerabilities. The system achieved a 90% detection rate on a competition dataset and found 29 zero-day vulnerabilities across 12 open-source projects, all confirmed and fixed by maintainers. The approach addresses high false positive rates and complex cross-function vulnerability reasoning by integrating fuzzer-reproducible verification and a novel control-flow-based abstraction for precise localization.

Published May 27, 2026 · 2 min read

[Submitted on 20 May 2026]



Abstract: Software vulnerabilities pose critical security threats, with nearly 50,000 CVEs reported in 2025. While Large Language Models (LLMs) show promise for automated vulnerability detection, three key challenges remain. First, LLM-generated vulnerability reports suffer from high false positive rates and lack reproducible verification. Second, existing LLM-based approaches use suboptimal granularities for vulnerability localization: function-level analysis overlooks bugs when context becomes extensive, while line-level analysis lacks sufficient context. Third, existing approaches have difficulty reasoning about vulnerabilities with complex cross-function dependencies and triggering conditions.

We present FuzzingBrain V2, a multi-agent system that addresses these gaps through four key contributions: (1) fully automated vulnerability analysis built on Google's OSS-Fuzz, ensuring all reported vulnerabilities are fuzzer-reproducible; (2) Suspicious Point, a novel control-flow-based abstraction for precise vulnerability localization at the optimal granularity; (3) logic-driven hierarchical function analysis with dual-layer fuzzing enhancing function coverage under resource constraints; (4) MCP-based static and dynamic analysis tools with context engineering enhancing complex vulnerability reasoning.
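The first contribution, reporting only fuzzer-reproducible vulnerabilities, can be read as a filter over model-generated reports: a candidate survives only if its proposed input actually crashes a harness. The following is a minimal Python sketch of that idea, not the paper's implementation. The names `Candidate`, `keep_reproducible`, and `toy_harness` are hypothetical, and the toy harness stands in for a real OSS-Fuzz fuzz target, which the abstract does not describe in detail.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Candidate:
    """A hypothetical LLM-produced vulnerability report with a proposed input."""
    function: str       # function the report points at
    description: str    # free-text claim produced by the model
    test_input: bytes   # input the agent believes triggers the bug


def keep_reproducible(
    candidates: Iterable[Candidate],
    crashes: Callable[[bytes], bool],
) -> List[Candidate]:
    """Keep only the candidates whose input makes the harness crash."""
    return [c for c in candidates if crashes(c.test_input)]


def toy_harness(data: bytes) -> bool:
    """Stand-in for a fuzz harness (assumption): 'crashes' on inputs over 8 bytes."""
    return len(data) > 8


reports = [
    Candidate("parse_header", "possible overflow", b"A" * 16),
    Candidate("parse_footer", "possible null dereference", b"ok"),
]
confirmed = keep_reproducible(reports, toy_harness)
print([c.function for c in confirmed])  # prints ['parse_header']
```

In this sketch the unreproduced report is simply dropped, which is how a reproduction gate would suppress false positives; a real system would replace `toy_harness` with a call that runs the project's fuzz target on the input and checks for a sanitizer crash.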

On the AIxCC 2025 Final Competition C/C++ dataset, FuzzingBrain V2 achieved 90% detection rate (36 of 40 vulnerabilities). In real-world deployment, FuzzingBrain V2 discovered 29 zero-day vulnerabilities across 12 open-source projects, all confirmed and fixed by maintainers, with 2 assigned CVE IDs.


source & further reading

arxiv.org — original article


Read original on arxiv.org → arxiv.org/abs/2605.21779
