BoxAgnts Runtime (4) — Capability Security, Not Root Access

BoxAgnts has implemented a capability-based security model for AI agents that replaces root-level access with granular, explicit permissions. The system's WASM execution model requires agents to declare specific capabilities—such as limited filesystem access, tool restrictions, and turn caps—rather than granting implicit authority over the entire system. This approach treats LLMs as untrusted execution authorities that require containment boundaries, addressing the fundamental security architecture problem of giving probabilistic models unrestricted operational privileges.

Modern AI agents are rapidly gaining operational authority: executing shell commands, modifying repositories, accessing local files, operating cloud infrastructure, managing developer environments. The problem is that most AI infrastructure still relies on a security model designed for trusted human operators. That assumption no longer holds.

LLMs are not trustworthy execution authorities. They are probabilistic systems exposed to prompt injection, adversarial context, untrusted documents, manipulated tool outputs, and reasoning instability. Yet many AI agents still run with privileges equivalent to root. This isn't a tooling problem; it's a security architecture problem.

BoxAgnts' query loop clearly demonstrates how LLMs become runtime controllers: the model decides which tool to call, what arguments to pass, what resources to access. In boxagnts/query/src/query.rs:

    // Each turn, the model's generated content is parsed.
    // If it contains tool use blocks, the system executes the corresponding tools.
    for tool_use_block in tool_uses {
        let tool_name = &tool_use_block.name;
        let tool = find_tool(&tools, tool_name);
        let result = tool.execute(tool_input, tool_ctx).await;
        // Result is fed back to the model as a ToolResult message
    }

The key issue is that runtimes typically grant the model overly broad implicit authority: unrestricted filesystem, unrestricted network, unrestricted shell. An LLM doesn't understand operational risk, privilege escalation, production safety, or organizational boundaries; it only predicts plausible continuations. Malicious instructions can be embedded in webpages, Markdown files, source code, emails, PDFs, and API responses, and the model cannot reliably distinguish "trusted instruction" from "malicious instruction" through prompts alone. So the core question isn't "Can the model behave safely sometimes?" The core problem is that unrestricted permissions amplify every reasoning failure.

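
To make the gap concrete, here is a minimal sketch of the one place a runtime can refuse a call in a loop like the one above: between parsing a tool-use block and executing it. The types ToolUse, Grant, and Denied are hypothetical names chosen for illustration, not BoxAgnts APIs, and the policy is deliberately simple (a tool-name allowlist plus a turn cap).

```rust
use std::collections::HashSet;

/// A tool call parsed from model output (hypothetical, simplified shape).
#[derive(Debug, Clone)]
pub struct ToolUse {
    pub name: String,
    pub input: String,
}

/// Why the runtime refused to proceed.
#[derive(Debug, PartialEq, Eq)]
pub enum Denied {
    ToolNotAllowed(String),
    TurnLimitReached { max_turns: u32 },
}

/// The authority granted to one agent run (hypothetical type).
pub struct Grant {
    allowed_tools: HashSet<String>,
    max_turns: u32,
    turns_used: u32,
}

impl Grant {
    pub fn new(allowed_tools: &[&str], max_turns: u32) -> Self {
        Grant {
            allowed_tools: allowed_tools.iter().map(|t| t.to_string()).collect(),
            max_turns,
            turns_used: 0,
        }
    }

    /// Call once per model turn, before any tool in that turn runs.
    pub fn begin_turn(&mut self) -> Result<(), Denied> {
        if self.turns_used >= self.max_turns {
            return Err(Denied::TurnLimitReached { max_turns: self.max_turns });
        }
        self.turns_used += 1;
        Ok(())
    }

    /// Call for each tool-use block. The decision depends only on the grant,
    /// never on the prompt or on the arguments the model produced.
    pub fn check_tool(&self, call: &ToolUse) -> Result<(), Denied> {
        if self.allowed_tools.contains(&call.name) {
            Ok(())
        } else {
            Err(Denied::ToolNotAllowed(call.name.clone()))
        }
    }
}
```

In a dispatch loop, begin_turn would run before the tool-use blocks are processed and check_tool before each execute call; a Denied value would be returned to the model as a tool error instead of running the tool.
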
The goal isn't to make the model trustworthy; it's to make unsafe behavior containable. This requires capability boundaries. BoxAgnts' Agent tool design (boxagnts/tools/src/agent/mod.rs) embodies this principle. An agent can be restricted to a specific tool set (tools), given a hard turn cap (max_turns), and run in an isolated Git worktree (isolation: "worktree"). These are all instances of capability constraints:

    #[derive(Debug, Deserialize)]
    struct AgentInput {
        description: String,
        prompt: String,
        tools: Option<Vec<String>>,  // Limit sub-agent's available tools
        max_turns: Option<u32>,      // Hard turn cap
        isolation: Option<String>,   // Isolation mode (worktree)
        model: Option<String>,       // Model restriction
        run_in_background: bool,     // Async isolation
    }

RBAC, ACL, IAM: these identity-based security models assume stable identities, predictable workflows, and human operators. AI agents violate all three: they generate workflows dynamically, invoke tools probabilistically, and coordinate across multiple agents. BoxAgnts' PermissionMode configuration offers a more flexible approach:

    pub enum PermissionMode {
        BypassPermissions, // Skip permission checks (not recommended for production)
        Default,           // Standard permission checks
        AcceptEdits,       // Auto-accept edit operations
        Plan,              // Planning mode (read-only)
    }

But even this model isn't granular enough. What's really needed is a precise description like "Agent can read /workspace/project, write /workspace/tmp, cannot access ~/.ssh, cannot access production secrets." The core idea of capability security is simple: don't give the agent the root password; give it a precise permission list. BoxAgnts' WASM execution model is the engineering implementation of this idea.

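
As an illustration of what such a "precise permission list" might look like in code, the sketch below checks a path against explicit read grants, write grants, and denials. It is an assumption-laden example rather than BoxAgnts code: FsCapabilities, Access, and normalize are hypothetical names, paths are Unix-style, the home directory is spelled out as /home/agent instead of ~, and normalization is purely lexical, so symlinks are not resolved.

```rust
use std::path::{Component, Path, PathBuf};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Access {
    Read,
    Write,
}

/// Hypothetical permission list: explicit grants plus explicit denials.
/// Grant and deny roots are assumed to be absolute and already normalized.
pub struct FsCapabilities {
    read: Vec<PathBuf>,
    write: Vec<PathBuf>,
    deny: Vec<PathBuf>,
}

fn to_paths(items: &[&str]) -> Vec<PathBuf> {
    items.iter().map(|s| PathBuf::from(*s)).collect()
}

/// Lexically resolve `.` and `..` so a request such as
/// `/workspace/project/../../etc` cannot slip past a prefix check.
/// Relative paths are rejected. The filesystem is never touched.
fn normalize(path: &Path) -> Option<PathBuf> {
    if !path.is_absolute() {
        return None;
    }
    let mut out = PathBuf::new();
    for component in path.components() {
        match component {
            Component::ParentDir => {
                out.pop();
            }
            Component::CurDir => {}
            other => out.push(other.as_os_str()),
        }
    }
    Some(out)
}

impl FsCapabilities {
    pub fn new(read: &[&str], write: &[&str], deny: &[&str]) -> Self {
        FsCapabilities {
            read: to_paths(read),
            write: to_paths(write),
            deny: to_paths(deny),
        }
    }

    /// Default-deny: a path is allowed only if no denial covers it and
    /// some grant for the requested access does. `Path::starts_with`
    /// compares whole components, so `/workspace/project2` does not
    /// match a grant on `/workspace/project`.
    pub fn allows(&self, path: &str, access: Access) -> bool {
        let p = match normalize(Path::new(path)) {
            Some(p) => p,
            None => return false,
        };
        if self.deny.iter().any(|d| p.starts_with(d)) {
            return false;
        }
        let roots = match access {
            Access::Read => &self.read,
            Access::Write => &self.write,
        };
        roots.iter().any(|r| p.starts_with(r))
    }
}
```

The policy from the text would then be constructed as FsCapabilities::new(&["/workspace/project"], &["/workspace/tmp"], &["/home/agent/.ssh"]). Denials are checked first, so a broad grant can never re-open a denied subtree.
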
In RunOption, every capability is explicitly declared:

    work_dir               → Filesystem capability: only expose specified directories
    allowed_outbound_hosts → Network capability: allowlist-style outbound connections
    env_vars               → Environment capability: selectively pass environment variables
    wasm_timeout           → Time capability: time-limited execution
    wasm_max_memory_size   → Memory capability: hard memory ceiling
    wasm_fuel              → Compute capability: instruction count limit

The network-level capability control is especially fine-grained. Look at boxagnts/wasm-sandbox/src/extension/net.rs:

    // Outbound connection check
    pub async fn socket_addr_check(
        addr: SocketAddr,
        addr_use: SocketAddrUse,
        allowed_outbound_hosts: OutboundAllowedHosts,
        blocked_networks: BlockedNetworks,
    ) -> bool {
        // TCP bind? Denied
        // UDP bind? Denied
        // Outbound connection? Check allowlist and blocklist
    }

The model cannot override these constraints. No matter how the LLM "reasons" in prompts, a TCP bind inside the WASM sandbox is always denied: the check returns false. This is the core advantage of capability security: safety doesn't depend on model intent; it depends on runtime enforcement.

Traditional operating systems evolved around trusted human users: humans have contextual understanding, organizational awareness, long-term reasoning, and accountability. LLMs have none of these. They cannot consistently evaluate whether a file is sensitive, whether a command is dangerous, or whether an API call violates policy. That's why capability security fits AI better than RBAC: it removes the dependency on model judgment. It's not about expecting the agent to make correct decisions; it's about ensuring the runtime constrains possible decisions. Security should not depend on model alignment; it should depend on runtime guarantees.

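
The outbound check described above can be sketched in a few lines. This is a simplified, synchronous stand-in and not the BoxAgnts implementation: AddrUse and NetPolicy are hypothetical names, the allowlist here holds exact ip:port pairs, and the blocklist holds single IP addresses, whereas a real runtime would more plausibly match hostnames and network ranges.

```rust
use std::net::{AddrParseError, IpAddr, SocketAddr};

/// What the guest wants to do with the address (hypothetical enum that
/// mirrors the bind/connect distinction in the excerpt above).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AddrUse {
    TcpBind,
    TcpConnect,
    UdpBind,
    UdpConnect,
}

/// Hypothetical policy: an exact ip:port allowlist plus blocked IPs.
pub struct NetPolicy {
    allowed: Vec<SocketAddr>,
    blocked: Vec<IpAddr>,
}

impl NetPolicy {
    /// Fails on any unparsable entry rather than silently dropping it,
    /// so a typo in the blocklist cannot quietly weaken the policy.
    pub fn new(allowed: &[&str], blocked: &[&str]) -> Result<Self, AddrParseError> {
        let allowed = allowed
            .iter()
            .map(|s| s.parse::<SocketAddr>())
            .collect::<Result<Vec<_>, _>>()?;
        let blocked = blocked
            .iter()
            .map(|s| s.parse::<IpAddr>())
            .collect::<Result<Vec<_>, _>>()?;
        Ok(NetPolicy { allowed, blocked })
    }

    /// Binds are always refused. Outbound connections must be on the
    /// allowlist and must not hit a blocked address.
    pub fn check(&self, addr: SocketAddr, addr_use: AddrUse) -> bool {
        match addr_use {
            AddrUse::TcpBind | AddrUse::UdpBind => false,
            AddrUse::TcpConnect | AddrUse::UdpConnect => {
                !self.blocked.contains(&addr.ip()) && self.allowed.contains(&addr)
            }
        }
    }
}
```

Nothing in check takes model output as input except the address itself, which is what makes the boundary deterministic: the same request always gets the same answer, whatever the prompt said.
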
BoxAgnts' ToolContext carries the elements of this design:

    pub struct ToolContext {
        pub permission_mode: PermissionMode,
        pub session_id: Option<String>,
        pub current_turn: Arc<AtomicUsize>,
        pub non_interactive: bool,
        pub mcp_manager: Option<Arc<boxagnts_mcp::McpManager>>,
        pub config: Config,
        pub allowed_outbound_hosts: Vec<String>,
        pub block_url: Option<String>,
    }

Every tool execution carries this context. Note that allowed_outbound_hosts and block_url aren't suggestions; they are hard constraints passed to the WASM runtime.

In BoxAgnts' Managed Agent mode, the Manager distributes tasks to multiple Executors. Each Executor can have a different capability set, a different model, and different tool access. In boxagnts/query/src/managed_orchestrator.rs, the system prompt explicitly defines this layering:

    You are the MANAGER, responsible for the planning and reasoning layer.
    You cannot directly use file/bash tools; you must delegate to executor agents.
    Each executor uses model {executor_model}, with at most {max_turns} turns.
    At most {max_concurrent} executors run in parallel.

This layering itself is capability security: the Manager's capability is "delegation"; the Executor's capability is "execution." The Manager won't accidentally execute dangerous shell commands because it simply doesn't have shell tools.

As AI system complexity grows, capabilities themselves may become orchestratable resources. Future runtimes may manage capability delegation, temporary permissions, capability revocation, execution tracing, capability inheritance, and resource accounting. BoxAgnts' current architecture already leaves room for this extension: ToolContext, as the context carrier for every tool execution, can naturally expand into a "capability context" that carries not just the current agent's permission set, but also inheritance chains, delegation relationships, and audit logs.

AI agents are evolving from conversational systems into execution systems.
This shift fundamentally changes security requirements. LLMs are inherently exposed to adversarial instructions, untrusted context, and probabilistic execution paths—as long as they run with broad implicit permissions, they remain structurally unsafe. The solution isn't better prompts—it's runtime-enforced capability isolation. BoxAgnts' practice demonstrates that capability-driven runtimes provide constrained execution, explicit permissions, deterministic boundaries, and governable infrastructure. AI agents should receive capabilities, not root access.