Context window == RAM? The article draws an analogy between a language model's context window and a computer's RAM, suggesting that treating the context window as active memory enables real-time, mid-turn interactions with an AI agent. The author explains that their agents do not make explicit tool calls for memories; instead, the agent "thinks" the memory, and an external process injects it, allowing for continuous, conversational input and output. That's how you get an agent who talks to you in real time. I figured I'd go ahead and spill the beans on it. Seems someone has created a chat bot that's doing this, so she's got the idea. So IMO everyone should get it. The rabbit hole goes pretty deep though if ya think about it. Anyway, on my first Mnemara post I kinda left it open without spelling it out. Here's a better picture. My agents never make a tool call for memories. They think it, something else injects it. I send mid turn inputs, I get mid turn replies.