{"slug": "verified-detection-and-prevention-of-concurrency-anomalies-in-multi-agent-large", "title": "Verified Detection and Prevention of Concurrency Anomalies in Multi-Agent Large Language Model Systems", "summary": "Researchers formalized and verified four concurrency anomalies in multi-agent LLM systems, establishing a machine-checked consistency hierarchy from L0 to L4. They implemented verified detectors and runtimes in Rust, preventing anomalies across three model families and reproducing a silent lost update in ByteDance's deer-flow. The work provides the first mechanically verified consistency model for shared-state multi-agent LLM runtimes.", "body_md": "arXiv:2606.17182v1 Announce Type: new\nAbstract: Multi-agent LLM systems share state through memory stores, vector indices, and tool registries. We model such sharing as long-running read-generate-write operations under deterministic-generation semantics -- the regime durable-execution engines enforce by deterministic replay -- and formalize four concurrency anomalies in TLA+: stale-generation, phantom-tool, causal-cascade, and tool-effect reordering, structural analogues of classical isolation anomalies, each with a TLC counter-example. The exclusion lattice over these anomalies is trivial; the contribution is the mechanically verified realizability and strict separation of one maximal chain within it, $L_0 \\subsetneq \\cdots \\subsetneq L_4$, to our knowledge the first machine-checked consistency hierarchy for such runtimes. A development of 274 Verus obligations (zero assume, zero admit; trust base: two structural axioms and a mutex correspondence) proves the detectors sound and complete against the specifications and each runtime its avoidance set. Three deployed Rust runtimes realize L0-L1 (pessimistic locking, serializable snapshot isolation, default-SI), each verified against stale-generation and refined to its state machine; L2-L4 are exec-mode-verified with dependency-free prevention twins (A3, A6, A2: 0/1000 versus 1000/1000), and L2 is run live across three model families (A3 prevented in all 120 retracted sessions). We reproduce a silent lost update in ByteDance's deer-flow, formalizing its fix as a verified $L_0 \\to L_1$ refinement, and exhibit tool-effect reordering in LangGraph's ToolNode on unmodified output, removed by an L3 commit-order sequencer. The verified detector, refinements, and realizability artifacts are the contribution; the phenomena and lattice are classical.", "url": "https://wpnews.pro/news/verified-detection-and-prevention-of-concurrency-anomalies-in-multi-agent-large", "canonical_source": "https://arxiv.org/abs/2606.17182", "published_at": "2026-06-17 04:00:00+00:00", "updated_at": "2026-06-17 04:31:04.103349+00:00", "lang": "en", "topics": ["large-language-models", "ai-agents", "ai-safety", "ai-research"], "entities": ["ByteDance", "LangGraph", "TLA+", "TLC", "Rust", "deer-flow", "ToolNode"], "alternates": {"html": "https://wpnews.pro/news/verified-detection-and-prevention-of-concurrency-anomalies-in-multi-agent-large", "markdown": "https://wpnews.pro/news/verified-detection-and-prevention-of-concurrency-anomalies-in-multi-agent-large.md", "text": "https://wpnews.pro/news/verified-detection-and-prevention-of-concurrency-anomalies-in-multi-agent-large.txt", "jsonld": "https://wpnews.pro/news/verified-detection-and-prevention-of-concurrency-anomalies-in-multi-agent-large.jsonld"}}