{"slug": "scoring-ai-hackers-when-there-is-no-answer-key", "title": "Scoring AI hackers when there is no answer key", "summary": "AI models are saturating existing offensive-cyber benchmarks, which often rely on bugs with public writeups, limiting their ability to differentiate top systems. The AI security lab Irregular introduced FrontierCyber, a benchmark designed to score AI hackers without a fixed answer key, addressing the problem of benchmarks becoming obsolete once models solve most tests.", "body_md": "AI models are solving more and more of the offensive-cyber tests built to measure them. Once a model solves most of a benchmark, that benchmark runs out of room and says little about the best systems anymore. Many of those tests also lean on bugs that already have public writeups, so a strong score can come partly from a model repeating something it has read. FrontierCyber, a benchmark from the AI security lab Irregular, goes …\n\nThe post [Scoring AI hackers when there is no answer key](https://www.helpnetsecurity.com/2026/06/25/ai-offensive-cyber-evaluations-benchmark/) appeared first on [Help Net Security](https://www.helpnetsecurity.com).", "url": "https://wpnews.pro/news/scoring-ai-hackers-when-there-is-no-answer-key", "canonical_source": "https://www.helpnetsecurity.com/2026/06/25/ai-offensive-cyber-evaluations-benchmark/", "published_at": "2026-06-25 05:30:01+00:00", "updated_at": "2026-06-25 05:45:30.355994+00:00", "lang": "en", "topics": ["ai-safety", "ai-research", "ai-products"], "entities": ["Irregular", "FrontierCyber", "Help Net Security"], "alternates": {"html": "https://wpnews.pro/news/scoring-ai-hackers-when-there-is-no-answer-key", "markdown": "https://wpnews.pro/news/scoring-ai-hackers-when-there-is-no-answer-key.md", "text": "https://wpnews.pro/news/scoring-ai-hackers-when-there-is-no-answer-key.txt", "jsonld": 
"https://wpnews.pro/news/scoring-ai-hackers-when-there-is-no-answer-key.jsonld"}}