16:50
2026-06-18
lesswrong.com
ai-safety
GDM AI Control Roadmap
GDM published an AI Control Roadmap outlining internal guardrails to detect and prevent adversarial behavior by AI agents. The roadmap includes threat modeling, control invariants, capability-based mi…