grep -l @cve-bench /news/*.json | wc -l → 1

@CVE-Bench

mentions: 1 · type: Organization
19:28
2026-05-29
giovannigatti.github.io
ai-safety

CVE-Bench: testing LLM agents on real-world vulnerability patches

Researchers evaluated five frontier AI models (three from OpenAI, two from Poolside) on fixing 20 real-world Common Vulnerabilities and Exposures (CVEs) across three prompt conditions, finding that no…
