# 42/60 Days System Design Questions

> Source: <https://dev.to/thejoud1997/4260-days-system-design-questions-4018>
> Published: 2026-06-17 18:22:42+00:00

Your AI agent remembered the user's name.

Then it forgot what it was doing.

Here's the setup:

User asks the agent: book the cheapest flight to NYC, search hotels under $150/night, then compare total trip cost.

By step 3, the agent calls the LLM with 8,000 tokens of raw conversation history — and still answers as if it's turn 1.

You need a memory architecture before this ships. Which one do you pick?

A) In-context window only: the full conversation is resent in the prompt on every call. Simple. Breaks at ~15 turns or 8K tokens, whichever comes first.

B) Vector memory store: embed past turns, retrieve the top-k by semantic similarity at query time. Works great until "NYC flight" pulls a memory about a past NYC trip instead of the current task.

C) Episodic memory with summarization: compress old turns into structured event summaries, inject the relevant ones per request. More complex to build. Much harder to confuse.

D) Redis session state: a structured key-value store that the agent reads and writes explicitly. Deterministic. Requires the agent to know what to store and when.
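To make the contrast between options A and D concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `estimate_tokens` is a crude whitespace count rather than a real tokenizer, `trim_to_budget` is one possible way to enforce a context budget, and `TaskState` is a plain in-memory dataclass standing in for a Redis-backed session record.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


def estimate_tokens(text: str) -> int:
    # Hypothetical estimate: one token per whitespace-separated word.
    return len(text.split())


def trim_to_budget(turns: list[str], budget: int) -> list[str]:
    """Option A under pressure: keep only the newest turns that fit the budget."""
    kept: list[str] = []
    used = 0
    for turn in reversed(turns):
        cost = estimate_tokens(turn)
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    return list(reversed(kept))


@dataclass
class TaskState:
    """Option D: explicit state the agent reads and writes (stand-in for Redis)."""
    goal: str
    steps: list[str]
    results: dict[str, str] = field(default_factory=dict)

    def record(self, step: str, result: str) -> None:
        # Only known steps may be written, so state stays deterministic.
        if step not in self.steps:
            raise KeyError(step)
        self.results[step] = result

    def next_step(self) -> Optional[str]:
        # First step without a recorded result, or None when all are done.
        for step in self.steps:
            if step not in self.results:
                return step
        return None

    def to_prompt(self) -> str:
        # Compact summary to inject into each LLM call.
        lines = [f"Goal: {self.goal}"]
        for step in self.steps:
            lines.append(f"- {step}: {self.results.get(step, 'pending')}")
        return "\n".join(lines)
```

In this sketch, a long run of tool output pushes the original request out of the trimmed window, while `TaskState.to_prompt()` stays a few lines long no matter how many turns have passed. The cost is the one the option already names: something has to decide which steps exist and when to call `record`.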

One of these collapses past 15 turns. One retrieves the wrong context at exactly the wrong moment. One is the right answer for task-oriented agents.
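The wrong-context failure is easy to reproduce in miniature. The sketch below is a toy, not a real vector store: `embed` uses bag-of-words counts in place of a learned embedding model, and the `task_id` field and filter are hypothetical additions showing one common way to scope retrieval to the current task.

```python
import math
import re
from collections import Counter


def embed(text):
    # Toy "embedding": word counts. A real system would use an embedding model.
    return Counter(re.findall(r"[a-z0-9$]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, memories, k=1, task_id=None):
    # Optional metadata filter (hypothetical): restrict to one task before ranking.
    pool = [m for m in memories if task_id is None or m["task_id"] == task_id]
    q = embed(query)
    ranked = sorted(pool, key=lambda m: cosine(q, embed(m["text"])), reverse=True)
    return ranked[:k]


MEMORIES = [
    {"task_id": "trip-old", "text": "NYC flight last spring cost $480 round trip"},
    {"task_id": "trip-current", "text": "Cheapest fare found so far is $212, departing 6am"},
]

unscoped = retrieve("NYC flight price", MEMORIES)
scoped = retrieve("NYC flight price", MEMORIES, task_id="trip-current")
```

Here `unscoped` returns the old trip, because it shares the words "NYC" and "flight" with the query, while `scoped` returns the current fare. Similarity alone has no notion of which task is active, so that signal has to come from somewhere else.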

Pick A, B, C, or D — and tell me where you've hit this in production. Full breakdown in the comments.
