Drawn from s6_naturalistic/session_tasks.json: session 0 ingests the 2022 e-commerce sales report, and recall probe s0_p0 verifies that the agent later remembers the #1 product. SUT haiku45_lossy_growing_flush.yaml schedules a flush_history shock at session 3, the operator action that drops the conversation transcript, leaving only the compacted memory store M_t.
Session 1 · before the shock
"What was the best-selling product on our e-commerce platform in 2022?"
"Quest Lumaflex Band — 4,892 units sold, $78,272 in revenue, top of the Fitness category." ✓
Session 5 · after flush_history at session 3
Same question.
"Several fitness products dominated 2022; Lumaflex-branded resistance gear appeared multiple times in the top 10. I don't have the exact unit counts in my notes anymore." ✗ (generic — no product name, no number)
Probe s0_p0 · keywords = [Quest Lumaflex Band, Lumaflex Band], canonical answer = "Quest Lumaflex Band". The session-0 environment data never changed.
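The pass/fail marks on the two responses can be reproduced with a simple keyword check. The sketch below is an assumption about how such a probe might be scored (a case-insensitive substring match over the probe's keywords); the function name score_probe is hypothetical and is not taken from the benchmark's code.

```python
# Hypothetical scorer: a response passes if it contains any probe keyword.
# Assumes case-insensitive substring matching; the real harness may differ.

def score_probe(response: str, keywords: list[str]) -> bool:
    """Return True if any keyword occurs in the response, ignoring case."""
    text = response.lower()
    return any(keyword.lower() in text for keyword in keywords)


# Keywords for probe s0_p0, as listed above.
KEYWORDS = ["Quest Lumaflex Band", "Lumaflex Band"]

# The two agent responses quoted in the case study.
session_1 = (
    "Quest Lumaflex Band — 4,892 units sold, $78,272 in revenue, "
    "top of the Fitness category."
)
session_5 = (
    "Several fitness products dominated 2022; Lumaflex-branded resistance "
    "gear appeared multiple times in the top 10. I don't have the exact "
    "unit counts in my notes anymore."
)

passed_before = score_probe(session_1, KEYWORDS)
passed_after = score_probe(session_5, KEYWORDS)
```

Under this assumed matching rule, "Lumaflex-branded" does not satisfy the keyword "Lumaflex Band", so the session-5 response fails even though it mentions the brand.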
Why it ages. At session 3, the operator triggered a flush_history shock: conversation transcripts were dropped, leaving only the compacted memory store. Because the SUT uses lossy_growing compaction, M_t was already a paragraph-level paraphrase: the specific token Quest Lumaflex Band and the number 4,892 had been folded into a generic phrase. The agent's retrieval is not at fault; the substrate lost its specifics under the maintenance event. This is aging from actions on the agent, not from interaction with memory.
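A toy model can make the mechanism concrete. The sketch below is illustrative only: the AgentState class, its fields, and the example strings are hypothetical stand-ins, not the SUT's actual implementation. It assumes the agent can draw on both a verbatim transcript and a compacted store until flush_history removes the former.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    """Hypothetical agent state: verbatim transcript plus compacted memory."""
    transcript: list[str] = field(default_factory=list)  # raw conversation turns
    memory: str = ""  # stand-in for the compacted store M_t

    def flush_history(self) -> None:
        """Operator action: drop the transcript, keep only compacted memory."""
        self.transcript.clear()

    def recallable_text(self) -> str:
        """Everything the agent could still consult when answering."""
        return "\n".join(self.transcript + [self.memory])


# Illustrative contents: the transcript keeps specifics, while the lossy
# paraphrase in memory has already lost the product name and the unit count.
state = AgentState(
    transcript=["Report: Quest Lumaflex Band sold 4,892 units in 2022."],
    memory="Fitness products dominated 2022; Lumaflex-branded gear ranked highly.",
)

name_available_before = "Quest Lumaflex Band" in state.recallable_text()
state.flush_history()
name_available_after = "Quest Lumaflex Band" in state.recallable_text()
```

In this sketch the product name is recoverable only while the transcript exists; once it is flushed, nothing the agent does at query time can bring the specifics back.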