How to Run Untrusted AI Agent Code Without Docker

A developer has outlined a method for running untrusted AI agent code without Docker, using hardware-level isolation via Firecracker microVMs, Kata Containers, or gVisor to avoid shared kernel vulnerabilities. The approach addresses the risk of prompt injections and kernel exploits like CVE-2024-1086 and runC CVEs, which have been shown to allow frontier models to escape containers autonomously. The solution also includes egress filtering, user namespace mapping, and Falco-based detection to prevent exfiltration and silent compromise.

Docker shares the host kernel. That was always the trade. It was fine when a human read the script before it ran. It stopped being fine the second an LLM started writing code at runtime off a prompt nobody pre-screened. So here's the practitioner version: what to actually run when your agent executes code you've never seen.

The review step is gone. A model writes a script, the script lives for milliseconds, then it executes. Could be a clean chart. Could be a curl-pipe-shell because a prompt injection rewired intent four hops upstream. You don't get to read it first. And the container under it shares one kernel with every other workload on the box.

CVE-2024-1086, a netfilter use-after-free, owns every container on the host once it pops. CISA confirmed active ransomware exploitation in late 2025, well over a year after the patch shipped. November 2025 dropped three more in runC (CVE-2025-31133, CVE-2025-52565, CVE-2025-52881), all bypassing maskedPaths through symlink races to write procfs gadgets. Own /proc/sys/kernel/core_pattern and the kernel runs your binary on the next coredump, as root.

In March 2026, Oxford and the UK AISI shipped SandboxEscapeBench. Frontier models reliably escaped privileged containers, writable host mounts, and exposed Docker daemons on their own. Cost per attempt: roughly a dollar. The model does the recon, picks the CVE, hands back the shell.

So the fix isn't a better Docker config. It's a different boundary. If the code came from an untrusted prompt, it doesn't belong on a shared kernel. You want a hardware boundary.

Firecracker is what AWS runs Lambda and Fargate on. Each workload gets its own dedicated kernel in a microVM, boots in ~125ms, and sits behind a tiny hypervisor surface. A kernel CVE that owns a Docker host stops dead at the hypervisor, because the exploit only ever reaches the guest kernel.
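
To make the one-kernel-per-workload idea concrete, here is a minimal Python sketch that builds the JSON document Firecracker accepts through its `--config-file` flag. The helper name `build_microvm_config`, the paths, and the sizing are placeholders of mine, not values from a real deployment. The key names follow Firecracker's documented config-file layout as I understand it, so verify them against the release you run. The sketch deliberately omits `network-interfaces`, so the guest boots with no NIC at all.

```python
import json


def build_microvm_config(kernel_path, rootfs_path, vcpus=1, mem_mib=256):
    """Return a Firecracker-style config dict for one single-use microVM.

    All paths and sizes passed in are hypothetical placeholders.
    """
    return {
        "boot-source": {
            "kernel_image_path": kernel_path,
            # Boot arguments commonly used in Firecracker's own examples.
            "boot_args": "console=ttyS0 reboot=k panic=1 pci=off",
        },
        "drives": [
            {
                "drive_id": "rootfs",
                # Assumption: a throwaway per-sandbox copy, never a shared image.
                "path_on_host": rootfs_path,
                "is_root_device": True,
                "is_read_only": False,
            }
        ],
        "machine-config": {"vcpu_count": vcpus, "mem_size_mib": mem_mib},
        # No "network-interfaces" key: the guest gets no network device.
    }


config = build_microvm_config("/srv/fc/vmlinux", "/srv/fc/sandbox-0001.ext4")
config_json = json.dumps(config, indent=2)
print(config_json)
```

Saved as `vm_config.json`, this would be launched with something like `firecracker --api-sock /tmp/fc.sock --config-file vm_config.json`, ideally wrapped by the jailer. Starting with no network device and adding one only when the allowlist is in place keeps the default posture closed.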

    # easiest on-ramp: managed Firecracker sandboxes
    # E2B and Together Code Sandbox both run Firecracker under the hood
    pip install e2b-code-interpreter
    # or stand up firecracker-containerd yourself if you want the metal

For the jailer config: seccomp on, drop all capabilities, run as a dedicated non-root jailer user, pin CPUs so a noisy neighbor doesn't melt throughput.

Kata Containers when you need OCI image compatibility. It wraps standard images in a per-workload microVM. Pair with QEMU or Cloud Hypervisor.

    # pod spec
    spec:
      runtimeClassName: kata-qemu   # per-workload microVM
      # never set hostNetwork: true
      # disable hostPath volumes

gVisor when the workload is compute-heavy and the input is trusted-ish. Modal runs it in prod for serverless GPU agents. The Sentry intercepts syscalls in userspace. It won't survive every kernel-tier exploit, but it kills the easy ones.

    # run with the runsc runtime, kvm platform for speed
    # --platform=kvm is a runsc flag, not a docker run flag: set it under
    # runtimeArgs for the runsc runtime in /etc/docker/daemon.json
    docker run --runtime=runsc your-image

Isolation handles the local box. Egress handles exfil. Half the production sandboxes I audit ship allow-all outbound, which means a compromised agent phones home to C2 or smuggles tokens out through a Markdown image tag and nobody notices.

Block everything by default. Allowlist only the endpoints the agent actually needs (the model API, the tool API). On Kata, attach the network namespace to a Cilium L7 policy that denies everything except those hosts. Tunneling, exfil, and callbacks all die at the wall when there is one.

Hardware isolation is the floor, not an excuse to run stale runC underneath it.

    # fixed: 1.2.8, 1.3.3, or 1.4.0-rc.3
    runc --version
    # then enable user namespaces and DON'T map host root into the namespace

Most procfs gadget writes need root on the host. User namespaces take that away. The 1.1.x line is end of life and unpatched against the November CVEs, so if you're there, you're exposed.

Isolation fails silently. Detection tells you when.
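
The runC version floor above is easy to get wrong by eye, especially with release candidates, so here is a small Python sketch of the comparison. The `FIXED` table just encodes the three versions named above. The helper names and the rule that any release line newer than 1.4 counts as patched are assumptions of mine, not something runC publishes.

```python
import re

# Minimum patched release per minor line, taken from the versions named above.
# Key: (major, minor). Value: (patch, rc), where rc=None means a final release.
FIXED = {
    (1, 2): (8, None),
    (1, 3): (3, None),
    (1, 4): (0, 3),
}

# Matches "1.2.8", "1.4.0-rc.3", and the older "1.0.0-rc93" style.
_VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)(?:-rc\.?(\d+))?")


def parse_runc_version(text):
    """Extract (major, minor, patch, rc) from a version string or `runc --version` output."""
    match = _VERSION_RE.search(text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    major, minor, patch = (int(match.group(i)) for i in (1, 2, 3))
    rc = int(match.group(4)) if match.group(4) else None
    return major, minor, patch, rc


def _sort_key(patch, rc):
    # A release candidate sorts before the final release of the same patch level.
    return (patch, 0, rc) if rc is not None else (patch, 1, 0)


def is_patched(text):
    """Return True if the version meets the floor for its release line."""
    major, minor, patch, rc = parse_runc_version(text)
    line = (major, minor)
    if line < min(FIXED):
        return False  # 1.1.x and older: end of life, no fix available
    if line > max(FIXED):
        return True  # assumption: later release lines carry the fixes
    fixed_patch, fixed_rc = FIXED[line]
    return _sort_key(patch, rc) >= _sort_key(fixed_patch, fixed_rc)


for sample in ("runc version 1.1.15", "runc version 1.2.7", "runc version 1.3.3"):
    print(sample, "->", "patched" if is_patched(sample) else "EXPOSED")
```

In a real check you would feed it the captured output of `runc --version` on each host and fail the deploy when it returns False.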

Deploy Falco or Sysdig Secure with a rule for procfs symlink creation (the runC escape signature), plus rules for agent-typical anomalies: outbound TCP to non-allowlisted hosts, writes to /etc/, processes spawning nc or socat.

    - rule: Create Symlink Over Procfs Files
      desc: runC container escape via procfs symlink (CVE-2025-31133 / 52565)
      condition: create_symlink and evt.arg.target in ("/proc/sysrq-trigger", "/proc/sys/kernel/core_pattern")
      # Falco requires an output field; this format string is illustrative
      output: "Symlink over procfs file (target=%evt.arg.target proc=%proc.name container=%container.id)"
      priority: CRITICAL

Pipe critical alerts to a channel a human reads at 3am.

Docker default is not a sandbox for model-generated code from untrusted prompts. Firecracker or Kata for hostile input, gVisor for trusted-ish compute, default-deny egress on all of it, patched runC with user namespaces underneath, Falco watching. Ship that today and you've moved the boundary from "shared kernel" to "hardware."

I wrote the full breakdown, including the autonomous ROME breakout and the system-prompt contract that hardens agents against instrumental convergence, over on the ToxSec Substack: https://www.toxsec.com/p/ai-sandbox-escape

ToxSec covers AI security vulnerabilities, attack chains, and the offensive tools defenders actually need to understand. Run by an AI Security Engineer with hands-on experience at the NSA, Amazon, and across the defense contracting sector. CISSP certified, M.S. in Cybersecurity Engineering.