Recoverable failures for AI coding agents

A developer proposes using Btrfs snapshots to make AI coding agent failures recoverable by treating agent work as filesystem transactions. The setup involves creating cheap snapshots before agent runs, allowing inspection and rollback of changes. This approach complements Git and trash-backed rm to protect against accidental deletes, overwrites, and generated damage.

Published Jun 19, 2026 · 4 min read

AI coding agents are useful precisely because they can run tools, edit many files, execute tests, install dependencies, and iterate quickly. That same ability makes them risky in YOLO mode: a mistaken command, broad glob, broken script, or overconfident refactor can damage a working tree faster than a human can react.

The goal is not to make agents harmless. The goal is to make common failures recoverable.

The proposed agentic setup has three layers:

Git commits       protect intentional source history
trash-backed rm   protects ordinary accidental deletes
Btrfs snapshots   protect deletes, overwrites, generated damage, and bad runs

These layers cover different failure modes. Git is excellent for source history, but it does not protect ignored files, untracked generated state, local config, or the repository metadata itself. Trash-backed rm helps with deletion, but not with overwrites. Btrfs snapshots cover the whole subvolume state at a point in time.

This post focuses on the Btrfs snapshot layer: making bad AI-agent runs recoverable as filesystem transactions. The trash-backed rm layer is a separate defense for accidental deletion; see Safe rm defaults for agent-heavy Linux machines.

Treat agent work as a controlled filesystem transaction:

  • create a cheap snapshot
  • let the agent work
  • inspect the result
  • keep it, diff it, or roll it back

This is the same basic idea behind several AI-agent sandbox approaches: give the agent real tools, but run those tools in a filesystem layer that can be inspected or discarded.

Examples of related work and discussion:

  • https://perevillega.com/posts/2026-03-03-ai-sandbox-coding-agents/
  • https://github.com/mauro3/sandkasten
  • https://www.agentfs.ai/
  • https://news.ycombinator.com/item?id=47550282
  • https://dev.to/alanwest/sandboxing-ai-agent-filesystems-containers-vs-virtual-fs-layers-ffe

The machine uses:

LVM logical volume
  btrfs filesystem (subvolid=5, flat layout)
    ext2_saved        ← btrfs-convert artifact, can be deleted once stable
    @agent_workflow

@agent_workflow is the important part. It is a separate Btrfs subvolume mounted at:

/home/martin/bin/lib/agent_workflow

Keeping agent_workflow as its own subvolume means it can be snapshotted and rolled back independently from the rest of $HOME.

Verify the mount exactly, not just the nearest parent mount:

findmnt -rn -M /home/martin/bin/lib/agent_workflow
sudo btrfs subvolume show /home/martin/bin/lib/agent_workflow

This matters because findmnt --target can return / when the directory is not actually a mount point. The protected directory should show btrfs, and btrfs subvolume show should succeed.
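The exact-mount check can be wrapped in a small helper. This is a minimal sketch, not part of the original setup: the function name is_btrfs_mountpoint is hypothetical, and it assumes a single mount on the directory (no stacked mounts).

```shell
# Hypothetical helper: succeed only if DIR is itself a Btrfs mount point.
is_btrfs_mountpoint() {
    local dir fstype
    dir=$1
    # -M matches the mount point exactly, so findmnt fails (and prints
    # nothing) when DIR is only a plain directory under a parent mount.
    fstype=$(findmnt -rn -M "$dir" -o FSTYPE) || return 1
    [ "$fstype" = "btrfs" ]
}
```

A call such as is_btrfs_mountpoint /home/martin/bin/lib/agent_workflow && echo ok prints ok only when the directory is an exact Btrfs mount point.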

We use Snapper on top of Btrfs:

sudo apt install btrfs-progs snapper
sudo snapper -c agent_workflow create-config /home/martin/bin/lib/agent_workflow
sudo chown martin:martin /home/martin/bin/lib/agent_workflow

Do not recursively chown the whole subvolume after creating the Snapper configuration. Snapper keeps its metadata in .snapshots, and that directory must remain owned by root. Changing the owner of .snapshots makes snapshot creation fail with:

IO Error (.snapshots must have owner root).
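A quick ownership check can catch this before a snapshot is attempted. The sketch below is an illustration only: the function name check_snapshots_owner is hypothetical, it relies on GNU stat, and the suggested chown is an assumed remedy rather than something the original post prescribes.

```shell
# Hypothetical helper: fail if Snapper's metadata directory is not owned by root.
check_snapshots_owner() {
    local subvol owner
    subvol=$1
    # %U prints the owning user name (GNU coreutils stat).
    owner=$(stat -c %U "$subvol/.snapshots") || return 1
    if [ "$owner" != "root" ]; then
        echo "$subvol/.snapshots is owned by $owner, expected root" >&2
        # Assumed fix, not taken from the original post:
        echo "try: sudo chown root $subvol/.snapshots" >&2
        return 1
    fi
}
```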

Before an agent run:

PRE=$(sudo snapper -c agent_workflow create --print-number --description "before yolo agent run")

After a useful result:

POST=$(sudo snapper -c agent_workflow create --print-number --description "after successful agent run")

Inspect:

sudo snapper -c agent_workflow list
sudo snapper -c agent_workflow status "$PRE..$POST"
sudo snapper -c agent_workflow diff "$PRE..$POST"

If the current run is bad and no post-run snapshot was created, compare or undo against the live filesystem as snapshot 0:

sudo snapper -c agent_workflow status "$PRE..0"
sudo snapper -c agent_workflow diff "$PRE..0"
sudo snapper -c agent_workflow undochange "$PRE..0"

If a post-run snapshot was created and the live filesystem still matches it, PRE..POST is also usable:

sudo snapper -c agent_workflow undochange "$PRE..$POST"

In testing, undochange restored deleted files, reverted overwritten files, and removed newly created files.
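Because undochange rewrites the live tree, it can help to look at the change list before confirming. The following is a minimal sketch under that assumption; review_and_rollback is a hypothetical name, and it always compares against the live filesystem (snapshot 0).

```shell
# Hypothetical helper: show what changed since snapshot PRE, then undo on confirmation.
review_and_rollback() {
    local config pre range answer
    config=$1
    pre=$2
    range="$pre..0"   # 0 is the live filesystem
    # List changed files first; stop if the snapshot range is invalid.
    sudo snapper -c "$config" status "$range" || return 1
    printf 'Undo all changes in %s? [y/N] ' "$range" >&2
    read -r answer
    case $answer in
        [yY]) sudo snapper -c "$config" undochange "$range" ;;
        *)    echo "left unchanged" ;;
    esac
}
```

For example, review_and_rollback agent_workflow "$PRE" lists the changes since the pre-run snapshot and reverts them only after a y answer.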

tools/agent-run does the following:

  • verify it is running inside the protected agent_workflow subvolume
  • create a Snapper snapshot
  • print the snapshot id
  • run the agent command
  • print the compare and rollback commands

The CLI refuses to run if the snapshot cannot be created. That matters: the safety mechanism has to be automatic, because YOLO mode is exactly when humans are least likely to remember manual precautions.

The mount check runs findmnt -rn -T "$PWD" to find the nearest mount, then asserts that the target is /home/martin/bin/lib/agent_workflow and the filesystem type is btrfs.
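The original script is not reproduced here, so the following is only a sketch of how such a wrapper could look, based on the behaviour described above. The function name agent_run, the variable names, and the message wording are assumptions; the snapper and findmnt invocations are the ones already shown in this post.

```shell
# Sketch of an agent-run style wrapper (an illustration, not the author's script).
PROTECTED=/home/martin/bin/lib/agent_workflow
CONFIG=agent_workflow

agent_run() {
    local target fstype pre status
    # The nearest mount for the current directory must be the protected subvolume.
    target=$(findmnt -rn -T "$PWD" -o TARGET) || return 1
    fstype=$(findmnt -rn -T "$PWD" -o FSTYPE) || return 1
    if [ "$target" != "$PROTECTED" ] || [ "$fstype" != "btrfs" ]; then
        echo "agent-run: not inside $PROTECTED on btrfs, refusing to run" >&2
        return 1
    fi
    # Refuse to run when the pre-run snapshot cannot be created.
    if ! pre=$(sudo snapper -c "$CONFIG" create --print-number \
                   --description "before yolo agent run"); then
        echo "agent-run: snapshot creation failed, refusing to run" >&2
        return 1
    fi
    echo "agent-run: pre-run snapshot $pre"
    # Run the agent command and remember its exit status.
    "$@"
    status=$?
    echo "compare:  sudo snapper -c $CONFIG status $pre..0"
    echo "rollback: sudo snapper -c $CONFIG undochange $pre..0"
    return "$status"
}
```

A script built on this sketch would end by calling agent_run "$@". The agent's exit status is passed through, so a failed run is still followed by the printed rollback command.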

Example:

cd /home/martin/bin/lib/agent_workflow
agent-run claude --dangerously-skip-permissions

On a bad run, roll back with the commands printed at exit:

sudo snapper -c agent_workflow undochange 3..0

Remaining risks:

  • network exfiltration
  • writes outside the protected subvolume
  • credential access
  • destructive commands run with elevated privileges
  • snapshot deletion by a process with enough permission

URL: https://gist.github.com/monperrus/a7aa344dc84c76e5ec569a646b31eab9
