04:00
2026-06-16
arxiv.org
ai-safety
OSGuard: A Benchmark for Safety in Computer-Use Agents
Researchers introduced OSGuard, a benchmark suite for evaluating safety in computer-use agents under benign user instructions. The suite includes an action-level benchmark for local guardrail decision…