# How to Run Untrusted AI Agent Code Without Docker

> Source: <https://dev.to/toxsec/how-to-run-untrusted-ai-agent-code-without-docker-37k5>
> Published: 2026-05-29 13:37:20+00:00

Docker shares the host kernel. That was always the trade. It was fine when a human read the script before it ran. It stopped being fine the second an LLM started writing code at runtime off a prompt nobody pre-screened. So here's the practitioner version: what to actually run when your agent executes code you've never seen.

The review step is gone. A model writes a script, the script lives for milliseconds, then it executes. Could be a clean chart. Could be a curl-pipe-shell because a prompt injection rewired intent four hops upstream. You don't get to read it first.

And the container under it shares one kernel with every other workload on the box. CVE-2024-1086, a netfilter use-after-free, is a local privilege escalation: pop it from inside one container and you hold the host kernel, which means every container on the host. CISA confirmed active ransomware exploitation in late 2025, well over a year after the patch. November 2025 dropped three more in runC (CVE-2025-31133, CVE-2025-52565, CVE-2025-52881), all bypassing maskedPaths through symlink races to write procfs gadgets. Own core_pattern and the kernel runs your binary on the next coredump, as root.

In March 2026, Oxford and the UK AISI shipped SandboxEscapeBench. Frontier models reliably escaped privileged containers, writable host mounts, and exposed Docker daemons on their own. Cost per attempt: roughly a dollar. The model does the recon, picks the CVE, hands back the shell. So the fix isn't a better Docker config. It's a different boundary.

If the code came from an untrusted prompt, it doesn't belong on a shared kernel. You want a hardware boundary.

**Firecracker** is what AWS runs Lambda and Fargate on. Each workload gets its own dedicated kernel in a microVM, boots in ~125ms, tiny hypervisor surface. A kernel CVE that owns a Docker host only owns the guest kernel here; to reach the host, the attacker still needs a separate hypervisor escape.

```
# easiest on-ramp: managed firecracker sandboxes
# E2B and Together Code Sandbox both run firecracker under the hood
pip install e2b-code-interpreter
# or stand up firecracker-containerd yourself if you want the metal
```
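
Here's a minimal sketch of the calling pattern, so model-written code never touches your own interpreter. It assumes the sandbox object exposes `run_code(code)` and returns an execution with `.error` and `.logs.stdout`, which is how the E2B Python SDK behaved when I last checked; verify against the current SDK docs before relying on it. `run_untrusted` is a hypothetical helper name.

```python
def run_untrusted(sandbox, code: str) -> str:
    """Run model-written code inside the sandbox, never in this process.

    `sandbox` is assumed to behave like e2b_code_interpreter's Sandbox:
    run_code() returns an object with .error and .logs.stdout (list of str).
    """
    execution = sandbox.run_code(code)
    if execution.error is not None:
        # Surface the failure to the agent loop instead of retrying locally.
        raise RuntimeError(f"sandboxed code failed: {execution.error}")
    return "".join(execution.logs.stdout)
```

Passing the sandbox in, rather than constructing it inside the helper, keeps the isolation backend swappable and lets you test the agent loop with a fake.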

For the jailer config: seccomp on, drop all capabilities, run as a dedicated non-root jailer user, pin CPUs so a noisy neighbor doesn't melt throughput.
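
That checklist can be sketched as a jailer command line. This is a rough example, not a production launcher: `build_jailer_argv` is a hypothetical helper, the Firecracker binary path is an assumption, and the flags used (`--id`, `--exec-file`, `--uid`, `--gid`, `--chroot-base-dir`, `--cgroup`) are the ones I know from the jailer docs, so check them against your Firecracker version.

```python
from typing import List


def build_jailer_argv(
    vm_id: str,
    uid: int,
    gid: int,
    cpus: str = "0",
    firecracker_bin: str = "/usr/bin/firecracker",  # assumed install path
    chroot_base: str = "/srv/jailer",
) -> List[str]:
    """Return a jailer command line for one microVM."""
    if uid == 0 or gid == 0:
        raise ValueError("jailer must drop to a dedicated non-root uid/gid")
    return [
        "jailer",
        "--id", vm_id,
        "--exec-file", firecracker_bin,
        "--uid", str(uid),
        "--gid", str(gid),
        "--chroot-base-dir", chroot_base,
        # Pin the VM to specific host CPUs through the cpuset cgroup.
        "--cgroup", f"cpuset.cpus={cpus}",
        "--cgroup", "cpuset.mems=0",
        # Deliberately absent: --no-seccomp, so the default filters stay on.
    ]
```

Refusing uid 0 in code means a misconfigured deploy fails loudly instead of quietly running the VMM as root.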

**Kata Containers** when you need OCI image compatibility. Wraps standard images in a per-workload microVM. Pair with QEMU or Cloud Hypervisor.

```
# pod spec
spec:
  runtimeClassName: kata-qemu   # per-workload microVM
  # never set hostNetwork: true
  # disable hostPath volumes
```
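
If you generate pod specs programmatically, a small pre-flight check catches the three mistakes above before anything is scheduled. A sketch, assuming the spec is already parsed into a dict; `audit_pod_spec` is a hypothetical name and the `kata` prefix check assumes your runtime classes follow the usual `kata-*` naming.

```python
from typing import Any, Dict, List


def audit_pod_spec(spec: Dict[str, Any]) -> List[str]:
    """Return a list of isolation problems found in a pod's `spec` dict."""
    problems: List[str] = []
    runtime = spec.get("runtimeClassName") or ""
    if not runtime.startswith("kata"):
        problems.append(f"runtimeClassName is {runtime!r}, expected a kata-* class")
    if spec.get("hostNetwork") is True:
        problems.append("hostNetwork is enabled")
    for volume in spec.get("volumes") or []:
        if "hostPath" in volume:
            problems.append(f"volume {volume.get('name', '?')!r} uses hostPath")
    return problems
```

An empty list means the spec passed; anything else should block the deploy.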

**gVisor** when the workload is compute-heavy and the input is trusted-ish. Modal runs it in prod for serverless GPU agents. The Sentry intercepts syscalls in userspace. It won't survive every kernel-tier exploit, but it kills the easy ones.

```
# run with the runsc runtime
# the KVM platform is a runsc flag, not a docker flag: set
# "runtimeArgs": ["--platform=kvm"] for runsc in /etc/docker/daemon.json
docker run --runtime=runsc your-image
```
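
Docker hands runsc its flags through the runtime registration in `/etc/docker/daemon.json`, so that is where the platform choice lives. A sketch that emits the fragment; the runsc path is an assumption, and `runsc_daemon_config` is a hypothetical helper.

```python
import json
from typing import Any, Dict


def runsc_daemon_config(runsc_path: str = "/usr/local/bin/runsc") -> Dict[str, Any]:
    """Build the daemon.json fragment that registers runsc on the KVM platform."""
    return {
        "runtimes": {
            "runsc": {
                "path": runsc_path,  # assumed install location
                "runtimeArgs": ["--platform=kvm"],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(runsc_daemon_config(), indent=2))
```

Merge the output into your existing daemon.json and restart the Docker daemon for it to take effect.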

Isolation handles the local box. Egress handles exfil. Half the production sandboxes I audit ship allow-all outbound, which means a compromised agent can phone home to C2 or smuggle tokens out through a Markdown image tag, and nobody notices.

Block everything by default. Allowlist only the endpoints the agent actually needs (the model API, the tool API). On Kata, put the sandbox pods under a Cilium network policy with DNS-aware (FQDN) or L7 rules that deny everything except those hosts. Tunneling, exfil, and callbacks all die at the wall when there is one.
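
The same default-deny logic is worth mirroring in whatever tool layer makes HTTP calls on the agent's behalf. A minimal sketch with hypothetical hostnames; it complements the network policy and does not replace it, since code inside the sandbox can always skip your helper.

```python
from typing import FrozenSet
from urllib.parse import urlsplit

# Hypothetical allowlist: swap in your real model and tool API hosts.
ALLOWED_HOSTS: FrozenSet[str] = frozenset({"api.model.example", "api.tools.example"})


def egress_allowed(url: str, allowed: FrozenSet[str] = ALLOWED_HOSTS) -> bool:
    """Default-deny: permit only HTTPS requests to exact allowlisted hosts."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    host = (parts.hostname or "").lower().rstrip(".")
    # Exact match only, so "api.model.example.evil.com" does not slip through.
    return host in allowed
```

Exact host matching is deliberate: suffix or substring checks are the usual way allowlists get bypassed.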

Hardware isolation is the floor, not an excuse to run stale runC underneath it.

```
# fixed: 1.2.8, 1.3.3, or 1.4.0-rc.3
runc --version
# then enable user namespaces and DON'T map host root into the namespace
```

Most procfs gadget writes need root on the host. User namespaces take that away. The 1.1.x line is end of life and unpatched against the November CVEs, so if you're there, you're exposed.
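
To check a fleet instead of one box, the version comparison can be scripted. A sketch that parses `runc --version` output against the fixed releases named above; `runc_is_patched` is a hypothetical helper, and treating any line newer than 1.4 as patched is my assumption.

```python
import re

_VERSION_RE = re.compile(r"runc version (\d+)\.(\d+)\.(\d+)(?:-rc\.?(\d+))?")

# First fixed release per maintained line: (patch, rc), rc=None means final.
_FIXED = {(1, 2): (8, None), (1, 3): (3, None), (1, 4): (0, 3)}


def runc_is_patched(version_output: str) -> bool:
    """True if `runc --version` output reports 1.2.8+, 1.3.3+, or 1.4.0-rc.3+."""
    match = _VERSION_RE.search(version_output)
    if match is None:
        raise ValueError("unrecognized `runc --version` output")
    major, minor, patch = (int(match.group(i)) for i in (1, 2, 3))
    rc = int(match.group(4)) if match.group(4) else None
    if (major, minor) < (1, 2):
        return False  # 1.1.x and older: end of life, no fix
    if (major, minor) > (1, 4):
        return True  # assumption: later lines carry the fixes
    fixed_patch, fixed_rc = _FIXED[(major, minor)]
    if patch != fixed_patch:
        return patch > fixed_patch
    if fixed_rc is None:
        return rc is None  # a release candidate of the fixed patch predates it
    return rc is None or rc >= fixed_rc
```

Feed it the stdout of `subprocess.run(["runc", "--version"], capture_output=True, text=True)` on each host and alert on any `False`.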

Isolation fails silently. Detection tells you when. Deploy Falco or Sysdig Secure with a rule for procfs symlink creation (the runC escape signature), plus rules for agent-typical anomalies: outbound TCP to non-allowlisted hosts, writes to /etc/, processes spawning nc or socat.

```
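- rule: Create Symlink Over Procfs Files
  desc: runC container escape via procfs symlink (CVE-2025-31133 / 52565)
  condition: create_symlink and evt.arg.target in ("/proc/sysrq-trigger","/proc/sys/kernel/core_pattern")
  output: Symlink to sensitive procfs file (target=%evt.arg.target linkpath=%evt.arg.linkpath proc=%proc.name container=%container.id)
  priority: CRITICAL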
```

Pipe critical alerts to a channel a human reads at 3am.
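
A rough sketch of that routing step, assuming Falco runs with JSON output enabled and writes one alert object per line with `priority` and `rule` keys. `alerts_to_page` is a hypothetical name, and the actual pager or chat delivery is left out.

```python
import json
from typing import Any, Dict, Iterable, Iterator

# Priorities that should wake someone up; everything else goes to the log.
PAGE_PRIORITIES = frozenset({"emergency", "alert", "critical"})


def alerts_to_page(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield only the Falco JSON alerts severe enough to page a human."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON log noise
        if not isinstance(event, dict):
            continue
        if str(event.get("priority", "")).lower() in PAGE_PRIORITIES:
            yield event
```

Point it at the Falco output stream and hand each yielded event to whatever actually pages your on-call.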

Default Docker is not a sandbox for model-generated code from untrusted prompts. Firecracker or Kata for hostile input, gVisor for trusted-ish compute, default-deny egress on all of it, patched runC with user namespaces underneath, Falco watching. Ship that today and you've moved the boundary from "shared kernel" to "hardware."

I wrote the full breakdown including the autonomous ROME breakout and the system-prompt contract that hardens agents against instrumental convergence over on [the ToxSec Substack](https://www.toxsec.com/p/ai-sandbox-escape).

*ToxSec covers AI security vulnerabilities, attack chains, and the offensive tools defenders actually need to understand. Run by an AI Security Engineer with hands-on experience at the NSA, Amazon, and across the defense contracting sector. CISSP certified, M.S. in Cybersecurity Engineering.*
