{"slug": "show-hn-drydock-vm-sandboxes-for-macos-autonomous-coding-agents", "title": "Show HN: Drydock – VM Sandboxes for macOS Autonomous Coding Agents", "summary": "Drydock, a new open-source tool, runs autonomous coding agents like Claude Code and OpenAI Codex in hardware-isolated VMs on macOS, preventing compromised agents from accessing API keys, filesystems, or the internet. The alpha release (v0.1.4) requires macOS 26+ on Apple silicon and uses a deny-by-default egress policy with short-lived, budget-capped tokens. The project is single-maintainer and has not undergone a third-party security audit.", "body_md": "drydock runs autonomous coding agents (**Claude Code** or **OpenAI Codex**,\nper-task selectable) on **your own Mac** — not someone's cloud — each task\nsealed in its own **hardware-isolated VM**. It starts from the assumption that\nthe agent is already compromised: your real API key **never enters the sandbox**\n(a host-side gateway hands it short-lived, budget-capped tokens), egress is\n**deny-by-default**, and the only thing that crosses back out is a `git diff`\nyou approve before it reaches origin.\n\nMost agent tooling tries to keep the agent *well-behaved* — permission\nprompts, output filters, policy. drydock takes the opposite stance: **contain\nthe blast radius** so a hostile agent (a poisoned repo, a malicious\ndependency, a prompt-injection that turns a fetched URL into a shell command)\ncan't reach your key, your filesystem, your push credentials, or the open\ninternet — regardless of what it tries.\n\nStatus: working alpha (v0.1.4). The full task lifecycle works end-to-end — submit → isolated VM → gated diff → push — and drydock ships through a Homebrew tap.
It is pre-1.0 and single-maintainer: only `main` is supported, behavior and config can change between minor versions, and it hasn't been hardened by real-world use.\n\nThere has been no third-party security audit — the security model is written down in detail in the [threat model], so read that and decide for yourself before trusting it.\n\nHard requirement: macOS 26+ on Apple silicon — it runs on Apple's `container` runtime (itself 1.0), so it won't run anywhere else.\n\nSecurity claims: [THREAT_MODEL.md](/sricola/drydock/blob/main/THREAT_MODEL.md).\n\nWebsite: [https://sricola.github.io/drydock/](https://sricola.github.io/drydock/)\n\n```\n# Prerequisites (anything you don't already have)\nbrew install --cask container\nbrew install squid\n```\n\nThe PR/MR adapters call `gh`, `glab`, or `tea` — install whichever your\nrepos use, and run their respective `auth login`\nbefore submitting a task.\n\n```\nbrew tap sricola/drydock\nbrew trust sricola/drydock     # personal taps require explicit trust\nbrew install drydock\ndrydock init\n```\n\nPulls a pre-built Apple-silicon binary from the latest tagged release\n(currently `v0.1.4`); no Go toolchain required.\n\n```\nbrew install go\ngit clone https://github.com/sricola/drydock && cd drydock\nmake install                             # PREFIX=/usr/local by default\nmake install PREFIX=$HOME/.local         # …or a user-owned prefix\ndrydock init\n```\n\nEither way, `drydock init` walks the remaining prereqs (container\nservice, `drydock-egress` network, sandbox + anchor images) and reports\nper-step status. Idempotent — re-run any time.\n\nAt least one vendor key is required. Both are host-only — they never go to disk and never enter the VM:\n\n```\nexport ANTHROPIC_API_KEY=sk-ant-...   # required for Claude Code tasks\nexport OPENAI_API_KEY=sk-...          # required for Codex tasks\ndrydock start              # foreground; ^C to stop.
backgrounds via & or your launchd plist.\n```\n\nQuick liveness:\n\n```\ndrydock status\n# brokerd     up\n# in flight   0 running · 0 awaiting egress · 0 awaiting diff · 0 pushing\n# tasks       0 total · 0 in last 24h\n# audit dir   ~/.drydock/audit\n```\n\nFirst time? Walk through [examples/hello-task.md](/sricola/drydock/blob/main/examples/hello-task.md) —\na copy-paste first task that exercises every layer, fits inside the\ndefault budget, and tells you exactly what each step proves.\n\nIn one shell, fire the task. **It blocks until the agent runs and you\napprove the diff** (typical: a few seconds to a few minutes, plus your\nreview time):\n\n```\ndrydock submit \\\n  --repo git@github.com:your-org/your-repo \\\n  --instruction \"Add a one-line comment to README.md explaining the project.\"\n```\n\nA macOS notification fires when the diff is ready. In another shell:\n\n```\ndrydock pending               # awaiting tasks (egress + diff gates both shown)\ndrydock review <id>           # diff in $PAGER, then prompt y/N — the one-shot path\n                              # ─ or, step by step ─\nless ~/.drydock/audit/<id>.diff\ndrydock approve <id>          # … or: drydock deny <id>\n```\n\nThe submit shell unblocks with the push outcome:\n\n```\ntask ab12cd34: pushed agent/ab12cd34 (github)\ndrydock status                # brokerd up?, breakdown (running · egress · diff · pushing)\ndrydock tasks                 # recent runs: id, age, duration, cost, outcome\ndrydock logs <id> [-f]        # stream-json audit (use -f to follow)\ndrydock kill <id>             # cancel the in-flight task (VM down + gate unblocked)\ndrydock doctor                # smoke-test the sandbox setup (no API spend)\n# Use OpenAI Codex instead of Claude Code for this task\ndrydock submit --repo … --instruction \"…\" --agent codex\n\n# Long prompt from a file\ndrydock submit --repo … --instruction-file ./task.md\n\n# Pipe from stdin\necho \"Refactor the egress compiler\" | drydock submit
--repo … -\n\n# Pick a specific model (overrides default_model in config)\ndrydock submit --repo … --instruction \"…\" --model claude-sonnet-4-6\n\n# Skip the approval gate (trusted batch run; see threat model)\ndrydock submit --repo … --instruction \"…\" --auto-approve\n\n# Request additional egress (host:port[,port], repeatable; gated)\ndrydock submit --repo … --instruction \"…\" \\\n  --egress-extra internal.example.com:443 \\\n  --egress-extra files.example.com:443,8443\n\n# Scripting — emit the raw response shape\ndrydock submit --repo … --instruction \"…\" --json | jq .branch\n```\n\nIf you'd rather hit the HTTP API directly:\n\n```\nSOCK=$TMPDIR/drydock-$(id -u)/drydock.sock\ncurl --unix-socket \"$SOCK\" http://_/tasks \\\n  -H 'content-type: application/json' \\\n  -d '{ \"repo_ref\": \"git@github.com:o/r\", \"instruction\": \"...\" }'\n```\n\nNotifications opt-out: `DRYDOCK_NO_NOTIFY=1`.\n\n`repo_ref` must be a git URL (`https://`, `git@`, or `ssh://`); local\npaths are rejected because adapters can't operate on filesystem origins.\nThe PR/MR adapter is chosen by `platform`:\n\n- `\"platform\": \"github\"` → `gh pr create --head <branch> --fill` (needs `gh` authed)\n- `\"platform\": \"gitlab\"` → `glab mr create --fill --yes` (needs `glab` authed)\n- `\"platform\": \"gitea\"` (alias `forgejo`) → `tea pr create --head <branch>` (needs `tea` authed)\n- `\"platform\": \"none\"` → push only; no PR/MR\n- *omitted* → hostname autodetect (`github.com`, `gitlab.com`, `gitea.com` / `codeberg.org`; else push-only — covers Bitbucket and other self-hosted)\n\nSelf-hosted GitLab and Gitea need explicit `\"platform\"`. Bitbucket has no\nwidely-adopted CLI to wrap and falls back to push-only; contributions\nwelcome. The push response includes `\"platform\"` so the caller can see\nwhich adapter ran.
`\"auto_approve\": true` skips the gate — see the threat\nmodel before using it.\n\n`drydock init` creates `~/.drydock/` at mode `0700` and seeds two files:\n\n| Path | What |\n|---|---|\n| `~/.drydock/config.yaml` | Operator settings (network name, gateway IP, per-task budget + timeout, max concurrent tasks, paths, broker listener, behavior flags) |\n| `~/.drydock/egress.yaml` | Squid + gateway allowlist (hosts and ports the sandbox may reach) |\n\nBoth files are seeded from defaults the first time; `drydock init` never\noverwrites them. Env vars still win over file values (e.g.\n`BROKER_ADDR=…` in the shell overrides `broker.addr` in the YAML), so\nexisting scripts keep working. `ANTHROPIC_API_KEY` is intentionally\n**not** in either file — by design, it never goes to disk.\n\n`~/.drydock/egress.yaml` is the source of truth (seed template lives at\n`$HOMEBREW_PREFIX/share/drydock/config/egress.yaml`). The default:\n\n```\ndefault:\n  domains:\n    - { host: api.anthropic.com,      ports: [443] }   # routed via gateway\n    # JavaScript\n    - { host: registry.npmjs.org,     ports: [443] }   # routed via squid\n    # Python\n    - { host: pypi.org,               ports: [443] }   # routed via squid\n    - { host: files.pythonhosted.org, ports: [443] }   # routed via squid\n    # Go module ecosystem\n    - { host: proxy.golang.org,       ports: [443] }   # routed via squid\n    - { host: sum.golang.org,         ports: [443] }   # routed via squid\nper_task_widening:\n  requires_approval: true\n```\n\nThe sandbox image ships **Node 22, Python 3.11, and Go 1.26** so JS,\nPython, and Go tasks work without operator customization. Other\ntoolchains can be added by extending `image/Dockerfile` and rebuilding\nvia `make image` (or `drydock init`, which detects stale images and\nrebuilds).\n\n`api.anthropic.com` is intentionally excluded from the squid allowlist —\nit routes through the credential gateway, not the proxy.
Per-task widening\nvia `egress_extra` goes through the same human-driven gate as the diff\npush (when `per_task_widening.requires_approval: true`, which is the\ndefault): brokerd blocks the request, writes the requested hosts to\n`AUDIT_ROOT/<id>.widen.json`, and shows the task in `drydock pending`\nunder gate `egress`. Approve with `drydock approve <id>` once you've\nreviewed the request. Restart brokerd after editing the default\nallowlist.\n\nThe canonical location is `~/.drydock/config.yaml` — seeded by `drydock init`\nwith the defaults below as a commented template. Edit and re-run\n`drydock start`. Env vars still override file values for ops/scripting:\n\n| Field (`config.yaml`) | Env override | Default | Meaning |\n|---|---|---|---|\n| — | `ANTHROPIC_API_KEY` | (at least one required) | Real Anthropic key; host-only, never goes to disk |\n| — | `OPENAI_API_KEY` | (at least one required) | Real OpenAI key; host-only, never goes to disk |\n| `default_agent` | `DRYDOCK_DEFAULT_AGENT` | `claude` | Agent to use when `--agent` is not passed; allowed values: `claude` or `codex` |\n| `network` | `DRYDOCK_NETWORK` | `drydock-egress` | vmnet network name |\n| `gateway_ip` | `DRYDOCK_GW_IP` | `192.168.66.1` | gateway + squid bind here |\n| `sandbox_image` | `SANDBOX_IMAGE` | `drydock-sandbox:latest` | per-task agent VM image |\n| `anchor_image` | `DRYDOCK_ANCHOR_IMAGE` | `drydock-anchor:latest` | minimal sleep-forever image holding the vmnet gateway IP |\n| `task_budget_usd` | `DRYDOCK_TASK_BUDGET_USD` | `2.0` | per-task USD ceiling |\n| `max_concurrent_tasks` | `DRYDOCK_MAX_CONCURRENT_TASKS` | `2` | excess POSTs to `/tasks` get HTTP 503 |\n| `task_timeout` | — | `30m` | wall-clock per task |\n| `default_model` | `DRYDOCK_DEFAULT_MODEL` | (empty) | `claude --model` fallback for tasks that don't pass `--model`; empty = claude picks |\n| `stage_root` / `audit_root` / `squid_run_dir` | `STAGE_ROOT` / `AUDIT_ROOT` / `SQUID_RUN_DIR`
| `~/.drydock/{stage,audit,squid}` | per-task scratch (audit dir is `0700`, audit log + diff are `0600`). Pre-v0.1.4 used `/tmp/broker/`; `drydock tasks` and friends still surface that history while it exists. |\n| `broker.socket` | `BROKER_SOCKET` | `$TMPDIR/drydock-$UID/drydock.sock` | Unix socket (per-user parent dir at `0700`, socket at `0600`) |\n| `broker.addr` | `BROKER_ADDR` | (empty) | set `host:port` to expose over TCP (warns at boot — no auth; see |\n| `notifications` | `DRYDOCK_NO_NOTIFY=1` (off) | `true` | macOS notifications on pending approval |\n| `log_json` | `DRYDOCK_LOG_JSON=1` | `false` | force JSON logs even on a TTY (default: terse text on TTY, JSON otherwise) |\n| `strict_container_version` | `DRYDOCK_STRICT_CONTAINER_VERSION=1` | `false` | fail closed when `container` major drifts from the tested range |\n| — | `EGRESS_CONFIG` | `~/.drydock/egress.yaml` | path override for the egress YAML |\n\nGateway port `8088` and squid port `3128` are hard-coded in\n`cmd/brokerd/main.go` and `image/entrypoint.sh`; change both together.\n\n| Symptom | First place to look |\n|---|---|\n| `192.168.66.1 never became bindable` | `container ls -a` (anchor running?), `container network inspect drydock-egress` (gateway IP?)
|\n| Image build fails on `npm install` | Transient registry timeout; rerun `container build` |\n| Squid CONNECT 403 to an expected host | `cat ~/.drydock/squid/squid-allow.txt`; add via `egress.yaml` or `egress_extra` |\n| Stale anchor after a crash | `container rm -f drydock-anchor`; next brokerd start does this for you |\n| Gateway 401 | Key wrong or placeholder (`sk-ant-fake` is expected to 401) |\n| VM reaches a host it shouldn't | Confirm `init-firewall.sh` ran inside the VM — overriding `--entrypoint` skips it |\n\nPer-task stream-json from the agent lands in `$AUDIT_ROOT/<id>.jsonl`; the\ndiff lands in `$AUDIT_ROOT/<id>.diff`.\n\n```\ncmd/brokerd/      # broker daemon\ncmd/drydock/      # operator CLI (init|start|submit|status|tasks|pending|review|approve|deny|kill|logs)\ninternal/\n  broker/         # /tasks + admin handlers, approval + egress gates, concurrency, cancellation\n  creds/          # Grant/Provider interfaces\n  egress/         # YAML loader + allowlist compilation + host/port validation\n  gateway/        # credential gateway (mint/serve/account/revoke), constant-time token check\n  netfw/          # squid conf + allowlist compiler\n  remote/         # PR/MR adapters: github (gh), gitlab (glab), gitea (tea), push-only\n  runner/         # `container run` argv builder\n  sockpath/       # shared per-uid socket path discovery for brokerd + CLI\n  stage/          # work tree, host-side commit + push, curated adapter env\nimage/            # drydock-sandbox: Dockerfile + entrypoint.sh + init-firewall.sh\nimage/anchor/     # drydock-anchor: FROM scratch + static Go sleep binary\ntests/integration # //go:build integration — boots brokerd against real container CLI\nconfig/           # egress.yaml\nsite/             # narrative explainer + launch post\ndocs/superpowers/ # historical design specs\nLICENSE           # MIT\nSECURITY.md       # how to report a security bug + documented residuals\nTHREAT_MODEL.md   # what drydock defends — and
doesn't\nMakefile          # build, install, test, test-integration, image, image-anchor, network, init, clean\nmake build              # bin/brokerd, bin/drydock\nmake test               # go test -race ./...\nmake image              # both images\nmake image-sandbox      # per-task agent image\nmake image-anchor       # minimal anchor image (FROM scratch + static binary)\nmake test-integration   # boot brokerd as subprocess; macOS only, needs container runtime\n```\n\nGitHub Actions runs `go build`, `go test -race`, and `go vet` on every\npush/PR. Integration (`make test-integration`) requires `container` and\nis macOS-only — runs locally, not in CI. No real Anthropic spend.\n\n- Pricing in `internal/gateway/pricing.go` covers the 4.x families (Opus, Sonnet, Haiku) with an Opus-priced default fallback; bump when Anthropic publishes new rates.\n- Audit dir (`~/.drydock/audit/`) grows unbounded — old `<id>.{jsonl,diff}` files aren't pruned. No `drydock tasks --prune` yet.\n- Up to `DRYDOCK_MAX_CONCURRENT_TASKS` tasks in flight per brokerd (default 2); raise on bigger hardware.\n- No Slack/web approval adapters yet — only the local CLI + macOS notifications.\n- Bitbucket PR/MR opening: push-only fallback (no widely-adopted CLI to wrap).
Contribution slot.\n- Apple `container` is v1.0; flag drift is the most likely breakage source. `DRYDOCK_STRICT_CONTAINER_VERSION=1` fails closed on drift.", "url": "https://wpnews.pro/news/show-hn-drydock-vm-sandboxes-for-macos-autonomous-coding-agents", "canonical_source": "https://github.com/sricola/drydock", "published_at": "2026-06-19 00:13:40+00:00", "updated_at": "2026-06-19 00:31:36.935388+00:00", "lang": "en", "topics": ["ai-agents", "ai-safety", "developer-tools", "ai-tools"], "entities": ["Drydock", "Claude Code", "OpenAI Codex", "Apple", "Homebrew", "GitHub", "Squid"], "alternates": {"html": "https://wpnews.pro/news/show-hn-drydock-vm-sandboxes-for-macos-autonomous-coding-agents", "markdown": "https://wpnews.pro/news/show-hn-drydock-vm-sandboxes-for-macos-autonomous-coding-agents.md", "text": "https://wpnews.pro/news/show-hn-drydock-vm-sandboxes-for-macos-autonomous-coding-agents.txt", "jsonld": "https://wpnews.pro/news/show-hn-drydock-vm-sandboxes-for-macos-autonomous-coding-agents.jsonld"}}