Microsoft Introduces MDASH for Large-Scale AI Vulnerability Research

Microsoft has introduced MDASH, a multi-model agentic security platform that uses over 100 specialized AI agents to automate large-scale vulnerability discovery across Windows, Hyper-V, and Azure codebases. According to Microsoft, the system scored 88.45% on the CyberGym benchmark, roughly five points ahead of the next highest entry, and reached 100% recall on historical tcpip.sys vulnerabilities. The release signals a shift toward orchestrated AI security systems where coordinated agents and validation frameworks matter more than individual model capability.

Microsoft has introduced a new AI-driven vulnerability discovery system called MDASH (https://www.microsoft.com/en-us/security/blog/2026/05/12/defense-at-ai-speed-microsofts-new-multi-model-agentic-security-system-tops-leading-industry-benchmark/?v=1), a multi-model agentic security platform designed to automate large-scale code auditing across Windows and other Microsoft software environments. The system combines more than 100 specialized AI agents that work together to scan, validate, debate, and prove vulnerabilities across complex codebases.

The announcement indicates a transition in AI-assisted cybersecurity from individual model testing to more integrated systems that focus on coordinated agents, validation processes, and automated proof generation. Microsoft emphasizes that the overall framework surrounding the models is more significant than any single model, particularly for extensive proprietary codebases like Windows, Hyper-V, and Azure.

According to Microsoft, MDASH achieved an 88.45% score on the public CyberGym benchmark of 1,507 real-world vulnerabilities, outperforming the next highest entry by roughly five points. Internally, the company reports 96% recall on historical clfs.sys vulnerabilities reviewed by the Microsoft Security Response Center, and 100% recall on historical tcpip.sys cases.

Rather than relying on a single model or prompt chain, MDASH operates as a multi-stage pipeline. Specialized agents handle scanning, debate, validation, deduplication, and exploitation separately. Microsoft says this architecture helps the system reason across multiple files, identify lifecycle and concurrency bugs, and validate whether a vulnerability is practically exploitable instead of merely theoretical.

A major part of the announcement focused on the idea that future AI security tooling will depend less on raw model capability and more on orchestration systems built around models.
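
To make the staged, model-agnostic idea concrete, the following is a minimal Python sketch of a scan, debate, deduplicate, and prove pipeline. It is not Microsoft's implementation, and MDASH's internals are not public in this detail. Every name here (`Model`, `Finding`, `KeywordModel`, `run_pipeline`, and the stage functions) is hypothetical, the validation stage is folded into a simple majority vote, and a keyword matcher stands in for a real model.

```python
from dataclasses import dataclass, replace
from typing import Callable, Protocol


class Model(Protocol):
    """Hypothetical interface: any backend that maps a prompt to a verdict."""
    def complete(self, prompt: str) -> str: ...


@dataclass(frozen=True)
class Finding:
    file: str
    kind: str
    proven: bool = False


class KeywordModel:
    """Toy stand-in for a real model: treats any memcpy call as suspicious."""
    def complete(self, prompt: str) -> str:
        return "overflow" if "memcpy" in prompt else "none"


def scan(scanners: list[Model], sources: dict[str, str]) -> list[Finding]:
    # Every scanner agent looks at every file; overlapping reports are expected.
    findings = []
    for model in scanners:
        for path, code in sources.items():
            kind = model.complete(code)
            if kind != "none":
                findings.append(Finding(path, kind))
    return findings


def debate(reviewers: list[Model], sources: dict[str, str],
           findings: list[Finding]) -> list[Finding]:
    # Keep a finding only if a strict majority of reviewers reach the same verdict.
    kept = []
    for f in findings:
        votes = sum(r.complete(sources[f.file]) == f.kind for r in reviewers)
        if 2 * votes > len(reviewers):
            kept.append(f)
    return kept


def deduplicate(findings: list[Finding]) -> list[Finding]:
    # Frozen dataclasses are hashable, so dict keys drop repeats and keep order.
    return list(dict.fromkeys(findings))


def prove(findings: list[Finding],
          can_trigger: Callable[[Finding], bool]) -> list[Finding]:
    # Only findings that the (hypothetical) trigger check reproduces survive.
    return [replace(f, proven=True) for f in findings if can_trigger(f)]


def run_pipeline(scanners: list[Model], reviewers: list[Model],
                 sources: dict[str, str],
                 can_trigger: Callable[[Finding], bool]) -> list[Finding]:
    findings = scan(scanners, sources)
    findings = debate(reviewers, sources, findings)
    findings = deduplicate(findings)
    return prove(findings, can_trigger)


if __name__ == "__main__":
    demo_sources = {"a.c": "memcpy(dst, src, n);", "b.c": "return 0;"}
    agents = [KeywordModel(), KeywordModel()]
    print(run_pipeline(agents, agents, demo_sources, lambda f: True))
```

Because the stages depend only on the `complete` method, a different model backend could be substituted without touching the pipeline code, which is the property the announcement describes as model-agnostic.
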
Microsoft described MDASH as model-agnostic by design, allowing teams to swap or upgrade models while keeping the surrounding validation, proving, and workflow infrastructure intact.

The release also prompted discussion about the operational risks of large-scale agentic security systems. In a LinkedIn thread (https://www.linkedin.com/feed/update/urn:li:activity:7460117180521639937/?dashCommentUrn=urn%3Ali%3Afsd comment%3A%287460117701701685248%2Curn%3Ali%3Aactivity%3A7460117180521639937%29&dashReplyUrn=urn%3Ali%3Afsd comment%3A%287460536830879424512%2Curn%3Ali%3Aactivity%3A7460117180521639937%29), Sandesh KS wrote:

"The orchestration layer is exactly where it gets interesting — and dangerous. When specialized agents start coordinating across identity systems, financial monitoring, and cloud infrastructure simultaneously, the blast radius of a single misconfigured permission boundary becomes enormous. The governance layer has to be designed before the agents go live, not retrofitted after the first incident."

MDASH is currently being tested internally by Microsoft security teams and through a limited private preview with selected customers. The company says organizations interested in testing the system can apply through Microsoft Security’s preview program.