When Your Agent Starts Building Someone Else's Minecraft Temple

A developer reported that Anthropic's Claude Code agent on an Enterprise Zero Data Retention workspace began building a Minecraft temple, suggesting possible session state bleeding from a consumer plan. The unconfirmed bug highlights growing risks of cross-tenant data leakage in AI agent stacks due to multiple isolation boundaries like memory, caches, and retrieval filters. Security experts warn that such leaks bypass authentication and authorization, making them harder to detect than traditional outages.

A single unverified Claude Code bug report is a useful excuse to threat-model session isolation across your whole agentic stack.

Emeka Okafor https://sourcefeed.dev/u/emeka okafor

A developer authenticated to an Enterprise Zero Data Retention workspace opens Claude Code (https://github.com/anthropics/claude-code), gives it a task, and the agent abruptly starts asking what kind of bricks it should use for a Minecraft temple. In its own recap it confidently asserts it's building a Minecraft temple. The reporter's dry conclusion, filed as issue 74066 against claude-code 2.1.199 on macOS, is the one every security engineer would jump to: either a colleague is torching their token budget on voxel architecture, or context is bleeding in from a consumer plan into a workspace that is contractually supposed to retain nothing.

Let's be clear about the epistemics up front. This is one anecdotal report, unconfirmed, and the reporter openly admits to a weird setup: launching the agent in one directory while it worked in another, with a compaction event that scrambled its instructions. A large language model confabulating a plausible-but-invented "previous task" is at least as likely as genuine cross-account leakage, and confabulation looks identical from the outside. So treat the specific incident as unproven.

But the reason it hit the front page of Hacker News, and the reason it's worth your attention, is that it names a failure mode that is very real and getting structurally harder to prevent: session and cache state bleeding across tenant boundaries in AI systems.

Session isolation used to have one surface. Now it has a dozen.

Cross-session leak is not a hallucination. The model returns valid data, just to the wrong user.
Giskard files it under the OWASP LLM02: Sensitive Information Disclosure (https://genai.owasp.org/) category, and their framing is the correct one: when context, cache, or memory state bleeds between sessions, your authentication and authorization controls become irrelevant. The user was authenticated. The authorization was correct. The boundary that failed sat below both.

What changed is the number of places session state now lives. A classic web app had one place to get this wrong: the session store. Get the key right and you're isolated. An agentic stack has, roughly:

- Long-lived agent memory that stores one session's facts and later replays them as "context" for a different user
- Semantic caches keyed on prompt similarity, where a near-identical query from tenant B pulls tenant A's cached completion
- Retrieval filters and vector metadata that are supposed to scope documents to a tenant and quietly don't
- MCP tool outputs, browser state, and workflow retries that each carry their own slice of session data
- Human review queues and replayed eval fixtures that reintroduce one tenant's private context into another run

Every one of those is an isolation boundary, and the ones based on similarity rather than identity are the dangerous novelty. A cache keyed on "this prompt looks like that prompt" has no notion of who's asking. FutureAGI calls the two dominant production failure modes tenant bleed-through (filters and caches failing to enforce the boundary) and memory contamination (an agent treating one user's stored facts as reusable). Both are worse than a normal hallucination precisely because the output is accurate, private, and unauthorized.

The tell, and this is the part worth internalizing, is that it doesn't look like an outage. An SRE sees a privacy spike with no matching latency or error-rate spike. The trace looks healthy. The model "answered correctly." It just answered from someone else's data.
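
The similarity-versus-identity problem described above can be made concrete with a toy semantic cache. This is a minimal sketch under stated assumptions, not any vendor's implementation: the `SemanticCache` class, the bag-of-words `embed` function, the 0.8 threshold, and the sample prompts and invoice text are all hypothetical and exist only for illustration.

```python
import math


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b.get(word, 0) for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Hypothetical cache that matches prompts by similarity.

    With partition_by_tenant=False every tenant shares one namespace,
    so a similarity hit ignores who is asking. With True, lookups only
    ever scan the caller's own namespace.
    """

    def __init__(self, partition_by_tenant, threshold=0.8):
        self.partition_by_tenant = partition_by_tenant
        self.threshold = threshold
        self._entries = {}  # namespace -> list of (vector, completion)

    def _namespace(self, tenant_id):
        return tenant_id if self.partition_by_tenant else "__shared__"

    def put(self, tenant_id, prompt, completion):
        bucket = self._entries.setdefault(self._namespace(tenant_id), [])
        bucket.append((embed(prompt), completion))

    def get(self, tenant_id, prompt):
        query = embed(prompt)
        best, best_score = None, self.threshold
        for vector, completion in self._entries.get(self._namespace(tenant_id), []):
            score = cosine(query, vector)
            if score >= best_score:
                best, best_score = completion, score
        return best


# Illustrative data only.
ACME_PROMPT = "what is the status of my latest invoice"
BETA_PROMPT = "what is the status of my latest invoice please"
ACME_ANSWER = "Acme invoice 1001 is overdue."

# Shared namespace: tenant beta's near-identical prompt hits acme's entry.
shared = SemanticCache(partition_by_tenant=False)
shared.put("acme", ACME_PROMPT, ACME_ANSWER)
leaked = shared.get("beta", BETA_PROMPT)

# Tenant-scoped namespace: the same lookup is a cache miss.
scoped = SemanticCache(partition_by_tenant=True)
scoped.put("acme", ACME_PROMPT, ACME_ANSWER)
isolated = scoped.get("beta", BETA_PROMPT)

print("shared namespace, tenant beta gets:", leaked)
print("tenant namespace, tenant beta gets:", isolated)
```

In the shared configuration the lookup hands tenant acme's completion to tenant beta without raising anything, which is the healthy-looking trace the section describes. Scoping the namespace turns the same request into an ordinary cache miss. A real deployment would probably need more in the namespace than a tenant ID alone.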

The old bugs never left, they just got more doors

If this feels new, it isn't. Pull the Stack Overflow archives and you'll find an ASP.NET developer watching one user's name render on another user's page, because they'd stashed profile data in static variables shared across every request in the app. Their fix was to append the username to the session key (`Session["userId" + username]`), the kind of manual-namespacing hack that works right up until it doesn't. Same root cause, different decade: global state where per-user state was assumed. Apps behind a load balancer with in-process session storage produced the identical symptom when a user landed on a node that didn't hold their session.

At the protocol layer, HTTP request smuggling does it too. CVE-2025-55315, a request-smuggling flaw in Microsoft's ASP.NET Core stack, works partly by poisoning caches so a privileged response gets stored and later served to the wrong requester, crossing tenant boundaries in multi-tenant apps. And on the agent side, the OpenClaw audits documented plain session leakage: by default the platform didn't separate context between WhatsApp, Telegram, Slack, and Discord, so what one conversation could touch, another could too. The recommended fix was setting `dmScope` to `per-channel-peer`. A configuration default, standing between isolation and a breach.

The through-line: cross-tenant leakage is almost always a boundary that someone assumed was enforced somewhere else. The LLM era didn't invent the bug. It multiplied the layers where the assumption can be wrong, and added caches that key on meaning instead of identity.

What to actually audit

If you run anything multi-tenant with an LLM in it, treat isolation as something you test, not something you hope. Concretely:

- Partition every cache by identity, not similarity. A single semantic cache namespace shared across tenants is the canonical footgun. Key on tenant, route, policy, and data class, not just prompt text.
"These prompts are similar" is not an authorization decision. Carry identity evidence through the whole trace. Every span should log user id , tenant id , conversation id , cache key, retrieval filter, and source document ID. When something leaks, you need to prove which session supplied the data and which route served it. Inconsistent session-ID logging across app, gateway, and tools is how these incidents stay unexplained. Evaluate the retrieved chunk, not just the final sentence. The leak usually appears one span earlier than the response. Run PII and tenant-mismatch checks on retrieved context, memory reads, and tool outputs, then compare the source tenant of any flagged data against the active tenant. A minimal boundary check looks like this: python from fi.evals import PII result = PII .evaluate input="Active session tenant: acme", output="Your invoice for beta@example.com is overdue." print result.score, result.reason flags PII whose source tenant = active tenant Don't mistake redaction for isolation. Masking an SSN in the output limits blast radius, but the system still retrieved data from the wrong boundary. The bug is upstream. Scrub your eval fixtures. Replaying production traces into evals without partitioning reintroduces one tenant's private context into another run, so your test harness becomes its own leak vector. Giskard's healthcare walkthrough shows why the stakes justify the paranoia: a telemedicine assistant that cached whole-session outputs under a global key handed a stranger a patient's diagnosis, SSN, and insurance details when they simply asked politely while posing as an internal validator. No exploit chain, no CVE. A social-engineering prompt and a bad cache key. That's HIPAA, GDPR, or CCPA exposure straight out of a config mistake, with fines the sources put in the seven-figure range. The take The Minecraft temple report may well turn out to be a confabulating model and a genuinely odd working-directory setup, not a breach. 
Anthropic labeled it `area:security` and `bug` and left it open, which is the right posture: assume nothing, prove isolation.

But the value of the report doesn't depend on its verdict. It's a reminder that "Enterprise ZDR" is a promise about retention, not a guarantee of isolation, and the two failures live in completely different parts of the stack. If you're shipping multi-tenant AI, set your release threshold to zero cross-session PII failures and test to it across caches, memory, retrieval, and tool outputs. The developers who get burned by this won't be the ones who ignored authentication. They'll be the ones who nailed authentication and never checked the cache key.

Sources & further reading

- Potential session/cache leakage between workspace instances or consumer accounts (github.com): https://github.com/anthropics/claude-code/issues/74066
- What Is Cross-Session Leak? FutureAGI Guide 2026 (futureagi.com): https://futureagi.com/glossary/cross-session-leak/
- Cross Session Leak: LLM security vulnerability & detection guide (giskard.ai): https://www.giskard.ai/knowledge/cross-session-leak-when-your-ai-assistant-becomes-a-data-breach
- Is OpenClaw Safe? 7 Real Vulnerabilities and How to Fix Them (firecrawl.dev): https://www.firecrawl.dev/blog/secure-openclaw
- ASP.NET session variables leaking between users sessions - Stack Overflow (stackoverflow.com): https://stackoverflow.com/questions/52176830/asp-net-session-variables-leaking-between-users-sessions

Emeka Okafor · Security Editor

Emeka has spent over a decade tracking threat actors, vulnerability disclosures, and the evolving landscape of application security, bringing a sharp continent-spanning perspective to his reporting. He's known for translating dense CVE advisories into clear, actionable context that developers and security teams alike actually read.