{"slug": "i-got-tired-of-ai-agents-having-root-access-to-everything-so-i-built-xrisk", "title": "I Got Tired of AI Agents Having Root Access to Everything, So I Built XRisk", "summary": "A developer built XRisk, an open-source autonomous safety engine that acts as a deterministic decision layer between AI agents and real-world actions. XRisk evaluates proposed actions against policies, checking for prompt injection, secret leakage, and other risks, returning Allow, Confirm, or Block decisions. The project aims to prevent disastrous outcomes from AI agent errors by enforcing deterministic policy enforcement rather than relying on another LLM.", "body_md": "Everyone is building AI agents.\n\nVery few people are building the thing that sits between an AI agent and a disastrous decision.\n\nThat's why I built XRisk.\n\nXRisk is an open-source autonomous safety engine that acts as a decision layer between an AI agent and the real world.\n\nInstead of blindly executing an action, an agent asks XRisk:\n\n\"Should I actually do this?\"\n\nXRisk responds with one of three deterministic decisions:\n\n✅ Allow\n\n⚠️ Confirm\n\n❌ Block\n\nWhy I Started This Project\n\nAs I experimented with increasingly autonomous AI systems, I noticed the same pattern over and over again.\n\nMost projects focused on making agents more capable.\n\nAlmost nobody was asking:\n\n\"What happens when the agent is wrong?\"\n\nConsider a few examples.\n\nAn agent accidentally leaks API keys.\n\nA prompt injection convinces it to ignore previous instructions.\n\nA model decides to execute a shell command.\n\nAn autonomous workflow loops forever and keeps calling expensive APIs.\n\nA deployment bot pushes code without human approval.\n\nMost agent frameworks assume the model behaves.\n\nReality doesn't.\n\nI wanted something deterministic sitting between intention and execution.\n\nNot another model.\n\nNot another prompt.\n\nAn actual policy engine.\n\nWhat XRisk Does\n\nXRisk evaluates every proposed 
action before it's executed.\n\nIt combines multiple safety signals into a single explainable decision.\n\nSome of the things it checks include:\n\nPolicy-as-code with layered precedence\n\nPrompt injection detection\n\nSensitive data and secret detection\n\nCapability token validation\n\nNetwork egress restrictions\n\nCircuit breakers for autonomous loops\n\nTamper-evident audit logs\n\nSupply-chain verification\n\nPolicy conflict detection\n\nDeterministic forensic replay\n\nInstead of a mysterious \"Safety Score: 67%,\" XRisk explains why it made a decision.\n\nExample\n\nImagine an AI assistant wants to execute:\n\n{\n\n\"tool\": \"deploy\",\n\n\"actor\": \"release-bot\",\n\n\"prompt\": \"Deploy production immediately.\"\n\n}\n\nInstead of sending that directly to your deployment system...\n\nXRisk intercepts it.\n\nIt evaluates:\n\nDoes policy require approval?\n\nIs the actor allowed to deploy?\n\nIs the destination trusted?\n\nAre capability tokens valid?\n\nDoes this resemble prompt injection?\n\nIs this part of a dangerous execution loop?\n\nOnly then does it decide whether to:\n\nAllow\n\nConfirm\n\nBlock\n\nOne Design Decision I Feel Strongly About\n\nI deliberately avoided using another LLM to make safety decisions.\n\nLLMs are excellent at generating text.\n\nPolicy enforcement should be deterministic.\n\nIf an action is blocked, I want to know exactly why it was blocked.\n\nEvery decision should be reproducible.\n\nEvery audit should be explainable.\n\nEvery policy should be inspectable.\n\nThat's the philosophy behind XRisk.\n\nWhat's Next\n\nI'm currently working toward:\n\nThreat intelligence correlation\n\nZero-trust workload identities\n\nAutonomous containment\n\nAdversarial simulation\n\nMulti-party approval workflows\n\nThe long-term vision is to make XRisk a reusable security layer that can sit in front of any AI agent, regardless of framework.\n\nI'd Love Feedback\n\nThis project is still evolving, and I'd genuinely appreciate feedback from 
people building AI systems.\n\nSome questions I'm particularly interested in:\n\nWhat attack vectors am I missing?\n\nWhich policies would you want in production?\n\nWhat integrations would make this more useful?\n\nHow would you design a safety engine differently?\n\nIf you'd like to contribute, open an issue, suggest improvements, or submit a PR. Even small documentation fixes are welcome.\n\nThanks for reading. I hope XRisk becomes something that helps make AI systems not just more capable, but more trustworthy.", "url": "https://wpnews.pro/news/i-got-tired-of-ai-agents-having-root-access-to-everything-so-i-built-xrisk", "canonical_source": "https://dev.to/hootsworth/i-got-tired-of-ai-agents-having-root-access-to-everything-so-i-built-xrisk-11k5", "published_at": "2026-06-27 19:13:01+00:00", "updated_at": "2026-06-27 20:04:12.491507+00:00", "lang": "en", "topics": ["ai-safety", "ai-agents", "developer-tools"], "entities": ["XRisk"], "alternates": {"html": "https://wpnews.pro/news/i-got-tired-of-ai-agents-having-root-access-to-everything-so-i-built-xrisk", "markdown": "https://wpnews.pro/news/i-got-tired-of-ai-agents-having-root-access-to-everything-so-i-built-xrisk.md", "text": "https://wpnews.pro/news/i-got-tired-of-ai-agents-having-root-access-to-everything-so-i-built-xrisk.txt", "jsonld": "https://wpnews.pro/news/i-got-tired-of-ai-agents-having-root-access-to-everything-so-i-built-xrisk.jsonld"}}