# Write returned success. The file was never there.

> Source: <https://dev.to/mrvlad/write-returned-success-the-file-was-never-there-12h4>
> Published: 2026-06-25 13:47:57+00:00

Four issues filed in the past week describe the same failure: an agent writes to persistent storage, the write API returns without error, and the data is gone. No exception, no log entry, no indication that anything went wrong until something tries to read what was written.

The symptoms vary. In one case, a Write tool call reports success while a concurrent disk check from a separate process shows nothing written. In another, 28 concurrent agent workflows report `started=1, result=0` in their journals with no abort marker. In a third, two processes writing to the same data directory produce 157 GB of growth and a kernel panic. The corruption accumulated silently over days before the system failed. In a fourth, a memory layer agent skips writes entirely or writes partial records. The store fills with fragments no future session can act on.

The failure is structural. A write that looks atomic to the caller is not atomic to the filesystem when multiple processes share state. The write API returns when the calling process hands off to the OS or a downstream layer, not when durability is confirmed across all concurrent writers. If two writers race on the same file, one loses. If a shared runtime dies mid-flight, in-progress writes evaporate. The caller gets no signal either way.
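The race is easy to reproduce in miniature. The sketch below is illustrative only and is not taken from any of the filed issues: the file name `memory.json` and the helpers `read_records` and `write_records` are hypothetical, and the interleaving of the two writers is forced by hand so the outcome is deterministic.

```python
import json
import os
import tempfile


def read_records(path):
    """Return the stored list, or an empty list if nothing was written yet."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return []


def write_records(path, records):
    """Overwrite the file. Returns normally whether or not a rival write is lost."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f)


# Hypothetical shared store used by two uncoordinated writers.
path = os.path.join(tempfile.mkdtemp(), "memory.json")

snapshot_a = read_records(path)  # writer A sees []
snapshot_b = read_records(path)  # writer B also sees [], before A has written

write_records(path, snapshot_a + ["note from A"])  # no error
write_records(path, snapshot_b + ["note from B"])  # no error, A's note is gone

final = read_records(path)
print(final)  # ['note from B']
```

Both calls to `write_records` return without raising. Only a later read reveals that one record never survived.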

What makes this hard to debug is where the evidence lands. The write site looks clean. The gap shows up at the read site: a future session, a downstream consumer, or a human checking disk from outside the agent's process. By then, the causal chain is several hops from where the failure occurred.

Closing this class of failure requires three things.

Writes to shared state need to go through a coordination layer that enforces at-most-one-writer semantics. File locks, atomic renames, or a mediating coordinator all work. The mechanism matters less than the invariant: concurrent writes to the same artifact are serialized, not raced.
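As a minimal sketch of that invariant, assuming a POSIX system and writers that all cooperate by taking the same lock, the example below combines an advisory file lock with an atomic rename. The lock serializes the read-modify-write cycle, and the rename ensures readers see either the old file or the new one, never a partial one. The function names `exclusive_lock` and `append_record` and the `.lock` suffix are hypothetical, and this is not how agent-coherence is implemented.

```python
import fcntl
import json
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path):
    """Hold an exclusive advisory lock for the duration of the with-block."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks while another writer holds it
        yield
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)


def append_record(path, record):
    """Append to a JSON list on disk; concurrent callers are serialized."""
    directory = os.path.dirname(os.path.abspath(path))
    with exclusive_lock(path + ".lock"):
        try:
            with open(path, "r", encoding="utf-8") as f:
                records = json.load(f)
        except FileNotFoundError:
            records = []
        records.append(record)

        # Write the new version beside the target, then swap it in atomically.
        fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as tmp:
                json.dump(records, tmp)
                tmp.flush()
                os.fsync(tmp.fileno())
            os.replace(tmp_path, path)
        except BaseException:
            if os.path.exists(tmp_path):
                os.unlink(tmp_path)
            raise
    return len(records)
```

Advisory locks only protect against writers that take them, which is why the next requirement matters: a writer that opens the file directly is not serialized at all.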

That coordination layer needs to sit in the critical path of the write. If the agent can bypass it, the invariant breaks under concurrent load.

And failures need to surface at the write site, not the read site. A write that cannot be confirmed as durable should return an error to the caller. A write that silently succeeds but leaves nothing behind is a lie the next session has to investigate.
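One way to move the failure to the write site is to flush, fsync, and then read the file back through a fresh handle before returning. The sketch below assumes a POSIX filesystem; `confirmed_write` and `WriteNotDurableError` are hypothetical names. The read-back may be served from the page cache, so it confirms that the file is visible with the expected content, while the `fsync` calls are what ask the OS to persist it.

```python
import hashlib
import os


class WriteNotDurableError(OSError):
    """Raised when a write cannot be confirmed (hypothetical error type)."""


def confirmed_write(path, data: bytes) -> str:
    """Write data, sync it, verify it, and return its SHA-256 hex digest."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push file contents to storage

    # Sync the containing directory so the directory entry is persisted too.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)

    # Read back through a new handle, as an outside observer would.
    try:
        with open(path, "rb") as f:
            on_disk = f.read()
    except OSError as exc:
        raise WriteNotDurableError(f"{path}: unreadable after write") from exc

    if on_disk != data:
        raise WriteNotDurableError(
            f"{path}: expected {len(data)} bytes, found {len(on_disk)} that differ"
        )
    return hashlib.sha256(data).hexdigest()
```

A caller that records the returned digest in its journal gives later sessions something to check against, so a mismatch points back to a specific write rather than to an unexplained gap.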

None of this is new. Distributed databases and cache coherence protocols solved this class of problem decades ago. What's changed is that multi-agent systems are hitting it at the filesystem and plugin layer, where the coordination primitives are still thin.

We built agent-coherence to address this for the AI agent case. The coordinator enforces single-writer invariants across concurrent sessions and surfaces write failures at the call site instead of the read site.

The library is at github.com/hipvlady/agent-coherence, with adapters for LangGraph, CrewAI, and Claude Code workflows.
