When Your Agent Starts Building Someone Else's Minecraft Temple

wpnews.pro

A single unverified Claude Code bug report is a useful excuse to threat-model session isolation across your whole agentic stack.

Emeka Okafor

A developer authenticated to an Enterprise Zero Data Retention workspace opens Claude Code, gives it a task, and the agent abruptly starts asking what kind of bricks it should use for a Minecraft temple. In its own recap it confidently asserts it's building a Minecraft temple. The reporter's dry conclusion (filed as issue #74066 against claude-code 2.1.199 on macOS) is the one every security engineer would jump to: either a colleague is torching their token budget on voxel architecture, or context is bleeding in from a consumer plan into a workspace that is contractually supposed to retain nothing.

Let's be clear about the epistemics up front. This is one anecdotal report, unconfirmed, and the reporter openly admits to a weird setup: launching the agent in one directory while it worked in another, with a compaction event that scrambled its instructions. A large language model confabulating a plausible-but-invented "previous task" is at least as likely as genuine cross-account leakage, and confabulation looks identical from the outside. So treat the specific incident as unproven. But the reason it hit the front page of Hacker News, and the reason it's worth your attention, is that it names a failure mode that is very real and getting structurally harder to prevent: session and cache state bleeding across tenant boundaries in AI systems.

Session isolation used to have one surface. Now it has a dozen.

Cross-session leak is not a hallucination. The model returns valid data, just to the wrong user. Giskard files it under the OWASP LLM02: Sensitive Information Disclosure category, and their framing is the correct one: when context, cache, or memory state bleeds between sessions, your authentication and authorization controls become irrelevant. The user was authenticated. The authorization was correct. The boundary that failed sat below both.

What changed is the number of places session state now lives. A classic web app had one place to get this wrong: the session store. Get the key right and you're isolated. An agentic stack has, roughly:

- Long-lived agent memory that stores one session's facts and later replays them as "context" for a different user
- Semantic caches keyed on prompt similarity, where a near-identical query from tenant B pulls tenant A's cached completion
- Retrieval filters and vector metadata that are supposed to scope documents to a tenant and quietly don't
- MCP tool outputs, browser state, and workflow retries that each carry their own slice of session data
- Human review queues and replayed eval fixtures that reintroduce one tenant's private context into another run

Every one of those is an isolation boundary, and the ones based on similarity rather than identity are the dangerous novelty. A cache keyed on "this prompt looks like that prompt" has no notion of who's asking. FutureAGI calls the two dominant production failure modes tenant bleed-through (filters and caches failing to enforce the boundary) and memory contamination (an agent treating one user's stored facts as reusable). Both are worse than a normal hallucination precisely because the output is accurate, private, and unauthorized.

The tell, and this is the part worth internalizing, is that it doesn't look like an outage. An SRE sees a privacy spike with no matching latency or error-rate spike. The trace looks healthy. The model "answered correctly." It just answered from someone else's data.

The old bugs never left, they just got more doors

If this feels new, it isn't. Pull the Stack Overflow archives and you'll find an ASP.NET developer watching one user's name render on another user's page, because they'd stashed profile data in static variables shared across every request in the app. Their fix was to append the username to the session key (Session["userId" + username]), the kind of manual-namespacing hack that works right up until it doesn't. Same root cause, different decade: global state where per-user state was assumed. Load balancers with in-process session storage produced the identical symptom when a user landed on a node that didn't hold their session.
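The shared-state bug described above is easy to reproduce in any language. The following is a minimal Python analogue, not the original ASP.NET code: `_current_profile`, `buggy_login`, `buggy_render`, and `SessionStore` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field

# Buggy pattern: one module-level slot shared by every request in the process.
_current_profile: dict[str, str] = {}


def buggy_login(username: str) -> None:
    """Overwrite the shared slot, so the most recent login wins for everyone."""
    _current_profile["name"] = username


def buggy_render() -> str:
    """Render a greeting from the shared slot, regardless of who is asking."""
    return f"Hello, {_current_profile['name']}"


@dataclass
class SessionStore:
    """Per-session state keyed on an opaque session ID (hypothetical helper)."""

    _data: dict[str, dict[str, str]] = field(default_factory=dict)

    def login(self, session_id: str, username: str) -> None:
        self._data.setdefault(session_id, {})["name"] = username

    def render(self, session_id: str) -> str:
        # An unknown session raises KeyError instead of falling back to shared state.
        return f"Hello, {self._data[session_id]['name']}"
```

With the buggy pair, calling `buggy_login("alice")` and then `buggy_login("bob")` makes every later `buggy_render()` greet bob, including on alice's page. With `SessionStore`, each render has to name the session it is reading from, so there is no shared slot to overwrite.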

At the protocol layer, HTTP request smuggling does it too. CVE-2025-55315, a request smuggling flaw in Microsoft's ASP.NET Core stack, can be used to poison caches so that a privileged response gets stored and later served to the wrong requester, crossing tenant boundaries in multi-tenant apps. And on the agent side, the OpenClaw audits documented plain session leakage: by default the platform didn't separate context between WhatsApp, Telegram, Slack, and Discord, so what one conversation could touch, another could too. The recommended fix was setting dmScope to per-channel-peer. A configuration default, standing between isolation and a breach.

The through-line: cross-tenant leakage is almost always a boundary that someone assumed was enforced somewhere else. The LLM era didn't invent the bug. It multiplied the layers where the assumption can be wrong, and added caches that key on meaning instead of identity.

What to actually audit

If you run anything multi-tenant with an LLM in it, treat isolation as something you test, not something you hope. Concretely:

Partition every cache by identity, not similarity. A single semantic cache namespace shared across tenants is the canonical footgun. Key on tenant, route, policy, and data class, not just prompt text. "These prompts are similar" is not an authorization decision.
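As a rough sketch of what identity-first partitioning can look like, the cache below only runs its similarity search inside the partition that matches the caller's scope. Everything here is an assumption made for illustration: `CacheScope`, `ScopedSemanticCache`, and the token-overlap `toy_similarity` function are hypothetical, and a real system would use embedding similarity in place of the toy metric.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CacheScope:
    """Identity fields that must match exactly before any cached entry is considered."""

    tenant_id: str
    route: str
    policy_version: str
    data_class: str


def toy_similarity(a: str, b: str) -> float:
    """Jaccard overlap of lowercase tokens; a stand-in for embedding similarity."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


class ScopedSemanticCache:
    def __init__(self, threshold: float = 0.8) -> None:
        self._threshold = threshold
        # One partition per scope: similarity search never crosses a partition.
        self._partitions: dict[CacheScope, list[tuple[str, str]]] = {}

    def put(self, scope: CacheScope, prompt: str, completion: str) -> None:
        self._partitions.setdefault(scope, []).append((prompt, completion))

    def get(self, scope: CacheScope, prompt: str) -> Optional[str]:
        best_score, best_completion = 0.0, None
        # Identity selects the partition first; similarity only ranks within it.
        for cached_prompt, completion in self._partitions.get(scope, []):
            score = toy_similarity(prompt, cached_prompt)
            if score >= self._threshold and score > best_score:
                best_score, best_completion = score, completion
        return best_completion
```

The design choice worth noticing is the order of operations: the scope lookup is an exact match on identity, and the fuzzy match happens only afterwards. An identical prompt from a different tenant is a cache miss, which costs a completion but not a disclosure.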

Carry identity evidence through the whole trace. Every span should log user_id, tenant_id, conversation_id, cache key, retrieval filter, and source document ID. When something leaks, you need to prove which session supplied the data and which route served it. Inconsistent session-ID logging across app, gateway, and tools is how these incidents stay unexplained.
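One way to keep that evidence consistent is to make it impossible to emit a span without it. The sketch below uses only the standard library and is not tied to any tracing product; `SpanIdentity` and `span_log_line` are hypothetical names, and the field names simply mirror the list above.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SpanIdentity:
    """Identity evidence attached to every span (hypothetical structure)."""

    user_id: str
    tenant_id: str
    conversation_id: str
    cache_key: str
    retrieval_filter: str
    source_document_ids: tuple[str, ...] = ()


def span_log_line(span_name: str, identity: SpanIdentity) -> str:
    """Serialize one span as JSON, refusing to log if any string field is blank."""
    record = asdict(identity)
    blank = [key for key, value in record.items()
             if isinstance(value, str) and not value.strip()]
    if blank:
        raise ValueError(f"span {span_name!r} is missing identity fields: {blank}")
    # sort_keys keeps the output stable so log lines diff cleanly across services.
    return json.dumps({"span": span_name, **record}, sort_keys=True)
```

If the app, gateway, and tool layers all build their log lines through one helper like this, a missing `tenant_id` fails loudly at write time instead of surfacing as an unexplained gap during an incident.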

Evaluate the retrieved chunk, not just the final sentence. The leak usually appears one span earlier than the response. Run PII and tenant-mismatch checks on retrieved context, memory reads, and tool outputs, then compare the source tenant of any flagged data against the active tenant. A minimal boundary check looks like this:

from fi.evals import PII

result = PII().evaluate(
    input="Active session tenant: acme",
    output="Your invoice for beta@example.com is overdue."
)
print(result.score, result.reason)  # flags PII whose source tenant != active tenant
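The tenant comparison itself does not need an evaluator library. As a minimal sketch, assuming each retrieved chunk carries the tenant it was indexed under (the `RetrievedChunk` type, `tenant_mismatches`, and `enforce_tenant_boundary` are hypothetical names, not part of any library mentioned here), the check can run on retrieved context before it reaches the prompt:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievedChunk:
    """A retrieved passage plus the tenant it was indexed under (assumed metadata)."""

    document_id: str
    source_tenant: str
    text: str


class TenantMismatchError(RuntimeError):
    """Raised when retrieved context belongs to a tenant other than the active one."""


def tenant_mismatches(active_tenant: str,
                      chunks: list[RetrievedChunk]) -> list[RetrievedChunk]:
    """Return every chunk whose source tenant differs from the active tenant."""
    return [chunk for chunk in chunks if chunk.source_tenant != active_tenant]


def enforce_tenant_boundary(active_tenant: str,
                            chunks: list[RetrievedChunk]) -> list[RetrievedChunk]:
    """Pass chunks through unchanged, or fail before they reach the model."""
    leaked = tenant_mismatches(active_tenant, chunks)
    if leaked:
        ids = ", ".join(chunk.document_id for chunk in leaked)
        raise TenantMismatchError(
            f"active tenant {active_tenant!r} retrieved foreign documents: {ids}"
        )
    return chunks
```

Running this one span before generation matches the point above: the mismatch is visible in the retrieved context even when the final response would have looked fine.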

Don't mistake redaction for isolation. Masking an SSN in the output limits blast radius, but the system still retrieved data from the wrong boundary. The bug is upstream.

Scrub your eval fixtures. Replaying production traces into evals without partitioning reintroduces one tenant's private context into another run, so your test harness becomes its own leak vector.
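A small sketch of fixture partitioning, assuming each replayed trace is a dict carrying a `tenant_id` key (that shape, and the names `partition_fixtures` and `fixtures_for_run`, are assumptions made for illustration):

```python
from collections import defaultdict


def partition_fixtures(traces: list[dict]) -> tuple[dict[str, list[dict]], list[dict]]:
    """Group traces by tenant and quarantine any trace with no tenant attribution."""
    by_tenant: dict[str, list[dict]] = defaultdict(list)
    quarantined: list[dict] = []
    for trace in traces:
        tenant = trace.get("tenant_id")
        if tenant:
            by_tenant[tenant].append(trace)
        else:
            # A trace of unknown origin is never safe to replay into any tenant's run.
            quarantined.append(trace)
    return dict(by_tenant), quarantined


def fixtures_for_run(partitions: dict[str, list[dict]], run_tenant: str) -> list[dict]:
    """Return only the fixtures that belong to the tenant under evaluation."""
    return list(partitions.get(run_tenant, []))
```

Quarantining unattributed traces, rather than assigning them to a default tenant, keeps the harness from quietly becoming the shared namespace it is supposed to test for.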

Giskard's healthcare walkthrough shows why the stakes justify the paranoia: a telemedicine assistant that cached whole-session outputs under a global key handed a stranger a patient's diagnosis, SSN, and insurance details when they simply asked politely while posing as an internal validator. No exploit chain, no CVE. A social-engineering prompt and a bad cache key. That's HIPAA, GDPR, or CCPA exposure straight out of a config mistake, with fines the sources put in the seven-figure range.

The take

The Minecraft temple report may well turn out to be a confabulating model and a genuinely odd working-directory setup, not a breach. Anthropic labeled it area:security and bug and left it open, which is the right posture: assume nothing, prove isolation. But the value of the report doesn't depend on its verdict. It's a reminder that "Enterprise ZDR" is a promise about retention, not a guarantee of isolation, and the two failures live in completely different parts of the stack.

If you're shipping multi-tenant AI, set your release threshold to zero cross-session PII failures and test to it across caches, memory, retrieval, and tool outputs. The developers who get burned by this won't be the ones who ignored authentication. They'll be the ones who nailed authentication and never checked the cache key.
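A zero-failure threshold is simple to encode as a release gate. The sketch below assumes your eval harness can emit one record per check with the surface tested, the active tenant, the source tenant of the data, and a PII flag; `EvalResult`, `cross_session_pii_failures`, and `release_allowed` are hypothetical names.

```python
from dataclasses import dataclass

REQUIRED_SURFACES = ("cache", "memory", "retrieval", "tool_output")


@dataclass(frozen=True)
class EvalResult:
    """One isolation check result (hypothetical record shape)."""

    surface: str
    active_tenant: str
    source_tenant: str
    contains_pii: bool


def cross_session_pii_failures(results: list[EvalResult]) -> list[EvalResult]:
    """A failure is PII served from a tenant other than the one that asked."""
    return [r for r in results
            if r.contains_pii and r.source_tenant != r.active_tenant]


def release_allowed(results: list[EvalResult]) -> bool:
    """Allow release only if every surface was tested and no failure was found."""
    covered = {r.surface for r in results}
    if not set(REQUIRED_SURFACES) <= covered:
        # An untested surface blocks the release just like a failing one.
        return False
    return not cross_session_pii_failures(results)
```

Treating missing coverage as a failure matters as much as the threshold itself: a gate that passes because the cache was never exercised says nothing about the cache key.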

Sources & further reading

- github.com: Potential session/cache leakage between workspace instances or consumer accounts
- futureagi.com: What Is Cross-Session Leak? FutureAGI Guide (2026)
- giskard.ai: Cross Session Leak: LLM security vulnerability & detection guide
- firecrawl.dev: Is OpenClaw Safe? 7 Real Vulnerabilities and How to Fix Them
- stackoverflow.com: ASP.NET session variables leaking between users sessions - Stack Overflow

Emeka Okafor · Security Editor

Emeka has spent over a decade tracking threat actors, vulnerability disclosures, and the evolving landscape of application security, bringing a sharp continent-spanning perspective to his reporting. He's known for translating dense CVE advisories into clear, actionable context that developers and security teams alike actually read.

Original article: sourcefeed.dev