I stopped treating AI memory as summaries. I now think in handoffs.

In my last post, I wrote that AI memory should be treated as product state, not as a prompt trick.

After publishing that post, I kept running into the next problem while working on the memory layer:

What should actually be carried into the next session?

That is a different question from what should be stored.

A summary compresses the past. A handoff prepares the next interaction.

That difference matters.

A lot of memory systems start from summarization.

At the end of a session, compress the conversation. Save the important parts. Use them later.

That sounds reasonable, but the word "summary" hides a product problem.

A summary does not have a clear reader.

It tries to compress everything for nobody in particular. It often becomes a smaller version of the conversation, with some details removed and some confidence added. But it does not answer the question that matters:

Who is this note for?

If the answer is "the next session," then the artifact changes.

It is no longer just a summary. It is a handoff.

A handoff artifact is written for a future interaction that has to pick up the work without pretending it was there the whole time.

That constraint changes what is worth writing down.

For reflective AI, I do not think the best memory is always the most detailed memory.

The goal is not to preserve everything. The goal is to preserve the few things that would make the next session more respectful, more accurate, or less repetitive.

I currently think there are at least four classes worth carrying forward: framing changes, boundaries, unresolved tensions, and user corrections.

Sometimes the important thing is not a fact about the user.

It is a shift in how the user understands the problem.

Maybe they came in thinking they needed a feature, and left realizing they needed a boundary. Maybe they started with a vague feeling and ended with a clearer question. Maybe the session changed the shape of the work.

That kind of moment can be worth carrying forward because it helps the next session avoid dragging the user back to an older frame.

But it has to be handled carefully.

A framing shift is not a permanent identity claim. It should not become "the user is this kind of person." It is closer to:

In this context, the user moved from this frame to that frame.

That is a safer memory.
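A minimal sketch of how that scoping could be recorded. The `FramingShift` type and `describeShift` helper are hypothetical names, and the example values are made up for illustration. The point is only that the record carries a context and a from/to pair, with no field for a trait.

```typescript
// Hypothetical shape: a framing shift is tied to a context, not to the user's identity.
type FramingShift = {
  context: string; // where the shift happened; keeps the note scoped
  from: string;    // the frame the user started with
  to: string;      // the frame the user ended with
};

// Renders the shift as a scoped sentence instead of a claim about who the user is.
function describeShift(shift: FramingShift): string {
  return `In ${shift.context}, the user moved from "${shift.from}" to "${shift.to}".`;
}

// Illustrative values only.
const shift: FramingShift = {
  context: "planning the memory layer",
  from: "needing a feature",
  to: "needing a boundary",
};

console.log(describeShift(shift));
```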

Boundaries are different from preferences.

A preference might be:

I like shorter replies.

A boundary is more like:

Do not make this decision for me.

Or:

Do not reopen this unless I ask.

Or:

This topic needs explicit consent.

These should not be buried inside a general summary. They deserve their own status.

If a user sets a boundary, the next session should not have to rediscover it through trial and error.

This is where memory starts to look less like personalization and more like trust infrastructure.
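One way to give boundaries their own status is to keep them in a separate list that is checked before the next session acts. This is a sketch under assumptions: `Boundary`, `ProposedAction`, and `checkBoundaries` are names I made up, the three `blocks` values only mirror the examples above, and the topic is invented.

```typescript
// Hypothetical types: a boundary is a rule the user set, not a stylistic preference.
type Boundary = {
  text: string; // the user's own words
  blocks: "decide-for-user" | "reopen-topic" | "discuss-without-consent";
  topic?: string; // optional scope; undefined means the boundary applies everywhere
};

type ProposedAction = {
  kind: Boundary["blocks"] | "reply";
  topic?: string;
};

// Returns every boundary the proposed action would cross.
function checkBoundaries(action: ProposedAction, boundaries: Boundary[]): Boundary[] {
  return boundaries.filter(
    (b) => b.blocks === action.kind && (b.topic === undefined || b.topic === action.topic)
  );
}

// Illustrative data only.
const boundaries: Boundary[] = [
  { text: "Do not reopen this unless I ask.", blocks: "reopen-topic", topic: "career change" },
  { text: "Do not make this decision for me.", blocks: "decide-for-user" },
];

const crossed = checkBoundaries({ kind: "reopen-topic", topic: "career change" }, boundaries);
console.log(crossed.map((b) => b.text));
```

Because the check runs on structured fields, the next session does not have to infer the boundary from a paragraph of summary text.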

Not every useful session ends with a conclusion.

Sometimes the user chooses to keep something unresolved.

That can be an important state.

The system should not rush to resolve it later just because it has a saved note. It should also not erase it and force the user to rebuild the context from scratch.

A good handoff artifact can say:

This was left open on purpose.

That is different from saying:

This is unfinished.
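In data terms, the difference can be as small as one field. A rough sketch, with hypothetical names (`OpenItem`, `Stance`, `stanceFor`) and stance labels I chose for illustration:

```typescript
// Hypothetical shape: the status records why the item is open, not just that it is.
type OpenItem = {
  text: string;
  status: "left-open-on-purpose" | "unfinished";
};

type Stance = "acknowledge-only" | "offer-to-continue";

// An item left open on purpose is acknowledged but not pushed toward resolution.
function stanceFor(item: OpenItem): Stance {
  return item.status === "left-open-on-purpose" ? "acknowledge-only" : "offer-to-continue";
}

const heldOpen: OpenItem = { text: "Whether to act on this at all.", status: "left-open-on-purpose" };
console.log(stanceFor(heldOpen));
```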

User corrections feel especially important.

When the user corrects the system, that is not just a preference signal. It is a repair event.

The system got something wrong, and the user supplied the fix.

That should be one of the most valuable things to carry forward, because it is testable.

A framing-change moment can be hard to verify in the next session. A correction is easier to test. When a similar input appears again, the system can show whether it actually learned by not making the same mistake.

This is also a useful guardrail against vague personalization.

The system should not only remember a polished version of who it thinks the user is. It should remember where its own model of the user failed.
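Here is a rough sketch of what "testable" could mean in practice. Everything in it is an assumption: the `Correction` shape, the `repeatsMistake` helper, the example data, and especially the substring matching, which stands in for whatever similarity check a real system would use.

```typescript
// Hypothetical shape: a correction stores the system's mistake next to the user's fix.
type Correction = {
  trigger: string;       // naive marker for "a similar input appeared again"
  mistakenClaim: string; // what the system said
  userFix: string;       // what the user said instead
};

// Case-insensitive substring check; a placeholder for real similarity matching.
function containsText(haystack: string, needle: string): boolean {
  return haystack.toLowerCase().indexOf(needle.toLowerCase()) !== -1;
}

// Returns the corrections that a draft reply would violate for this input.
function repeatsMistake(input: string, draftReply: string, corrections: Correction[]): Correction[] {
  return corrections.filter(
    (c) => containsText(input, c.trigger) && containsText(draftReply, c.mistakenClaim)
  );
}

// Illustrative data only.
const corrections: Correction[] = [
  { trigger: "falling dream", mistakenClaim: "anxiety about work", userFix: "This is about grief, not work." },
];

const violations = repeatsMistake(
  "I had the falling dream again.",
  "This may reflect anxiety about work.",
  corrections
);
console.log(violations.map((c) => c.userFix));
```

A check like this can run before a reply is shown, which turns a stored correction into a regression test on the system's own behavior.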

The practical difference between a summary and a handoff is that a handoff needs a schema.

A loose summary can be a paragraph.

A handoff artifact needs to make the next reader explicit.

A simplified version might look like this:

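type HandoffArtifact = {
  id: string;
  sessionId: string;
  // Who this artifact is written for.
  audience: "next-session" | "retrieval" | "user-review";

  // Items grouped by what the next session should do with them.
  carryForward: HandoffItem[];
  doNotCarryForward: HandoffItem[];
  unresolvedTensions: HandoffItem[];
  userCorrections: HandoffItem[];

  // Where the artifact stands in user review.
  consentState: "draft" | "approved" | "edited" | "rejected";
  createdAt: string;
  expiresAt?: string; // optional expiry for the artifact
};

type HandoffItem = {
  kind:
    | "framing-change"
    | "boundary"
    | "unresolved-tension"
    | "correction";

  text: string;           // the note itself
  sourceExcerpt?: string; // optional excerpt from the session that supports the note
  reason: string;         // why the item was placed where it was
  confidence: "low" | "medium" | "high";
};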

The important part is not the exact schema.

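The important part is that the artifact separates several things that summaries usually flatten: what to carry forward, what to deliberately leave behind, what is still unresolved, and what the user corrected.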

This gives the next session a cleaner contract.

It also gives the product a better failure mode.

If the system gets the handoff wrong, the user can reject or edit a specific item instead of fighting an invisible memory layer.
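A sketch of item-level review, under assumptions: the `Decision` type and `applyReview` function are hypothetical, the item type is a trimmed copy of the schema above, and treating an undecided item as rejected is my own default, not a settled rule.

```typescript
// Trimmed copy of the item type from the schema above.
type HandoffItem = {
  kind: "framing-change" | "boundary" | "unresolved-tension" | "correction";
  text: string;
  reason: string;
  confidence: "low" | "medium" | "high";
};

// Hypothetical per-item decision supplied by the user.
type Decision =
  | { action: "approve" }
  | { action: "edit"; text: string }
  | { action: "reject" };

// Applies one decision per drafted item (matched by index) and returns what survives.
function applyReview(draft: HandoffItem[], decisions: Decision[]): HandoffItem[] {
  const kept: HandoffItem[] = [];
  draft.forEach((item, i) => {
    // Assumption: an item with no decision counts as rejected, so nothing is inherited by default.
    const decision: Decision = decisions[i] ?? { action: "reject" };
    if (decision.action === "approve") {
      kept.push(item);
    } else if (decision.action === "edit") {
      kept.push({ ...item, text: decision.text });
    }
    // "reject" drops the item.
  });
  return kept;
}

// Illustrative draft, as a model might produce it.
const draft: HandoffItem[] = [
  { kind: "boundary", text: "Do not reopen this unless I ask.", reason: "Stated by the user.", confidence: "high" },
  { kind: "framing-change", text: "The user needs a feature.", reason: "Inferred from early messages.", confidence: "low" },
  { kind: "unresolved-tension", text: "The user is indecisive.", reason: "Model inference.", confidence: "low" },
];

const reviewed = applyReview(draft, [
  { action: "approve" },
  { action: "edit", text: "The user moved from wanting a feature to wanting a boundary." },
  { action: "reject" },
]);
console.log(reviewed.map((item) => item.text));
```

The third item is the kind of thing a summary would silently keep. Here the user can drop it without touching the rest.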

A basic lifecycle might look like this:

session
-> draft handoff
-> user review
-> approved / edited / rejected
-> retrieval candidate
-> prompt context

That lifecycle is the part I would not skip.

Without it, the schema is just another hidden memory object. The user still has no real authority over what gets inherited.
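The lifecycle above can be sketched as a small state machine. The `Stage` names follow the steps listed, but the transition table is my reading of them. In particular, I am assuming a rejected handoff is terminal and never becomes a retrieval candidate. `advance` is a hypothetical helper.

```typescript
type Stage =
  | "session"
  | "draft"
  | "user-review"
  | "approved"
  | "edited"
  | "rejected"
  | "retrieval-candidate"
  | "prompt-context";

// Allowed next stages for each stage. Assumption: "rejected" leads nowhere.
const transitions: Record<Stage, Stage[]> = {
  session: ["draft"],
  draft: ["user-review"],
  "user-review": ["approved", "edited", "rejected"],
  approved: ["retrieval-candidate"],
  edited: ["retrieval-candidate"],
  rejected: [],
  "retrieval-candidate": ["prompt-context"],
  "prompt-context": [],
};

// Moves to the next stage, or throws if the step would skip part of the lifecycle.
function advance(from: Stage, to: Stage): Stage {
  if (transitions[from].indexOf(to) === -1) {
    throw new Error(`Illegal transition: ${from} -> ${to}`);
  }
  return to;
}

let stage: Stage = "session";
stage = advance(stage, "draft");
stage = advance(stage, "user-review");
stage = advance(stage, "approved");
console.log(stage);
```

With a table like this, a draft cannot reach prompt context without passing through user review, which is the property the lifecycle is meant to protect.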

Who writes the handoff may be the hardest product question.

If the model writes the handoff artifact by itself, it may quietly preserve the version of the user it finds easiest to understand.

That is dangerous.

The model might turn a temporary mood into a stable trait. It might compress a complicated tension into a simple label. It might save the cleanest story, not the truest one.

If the user has to write the handoff artifact, the cost is too high.

Most users will not do it. And honestly, they should not have to. Asking users to manage their own memory system like a database is not a real product solution.

The middle ground seems more promising:

The model drafts. The user ratifies.

But that only works if ratification is light.

It cannot feel like reviewing a document after every session. It probably has to be closer to a quick approve, edit, or reject on individual items.

The user should not have to become the system administrator of their own memory. But they should have real authority over what gets inherited.

The cost of continuity is under-discussed.

Most memory products assume continuity is good.

More continuity, more personalization, more context.

But unasked-for continuity can create a burden for the user.

If the system carries forward the wrong thing, the user has to spend the next session undoing it. They have to refute an inheritance they never asked for.

That is backwards.

Sometimes carrying nothing forward is the more respectful choice.

It means the next session has to earn the context again. The user does not have to dismantle a stale version of themselves before the conversation can begin.

For reflective AI, this matters a lot.

A system that remembers too aggressively can start to sound intimate before it has earned the right to be. It can act as if it understands the user more than it does.

That is not continuity. That is overconfidence.
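One way to make "carry nothing" the default is to have the selection step return an empty list unless the handoff was ratified and has not expired. This is a sketch, not a definitive rule: `StoredHandoff` is a trimmed, hypothetical version of the schema above, `selectForNextSession` is a made-up name, and I am assuming `expiresAt` holds an ISO 8601 timestamp.

```typescript
type ConsentState = "draft" | "approved" | "edited" | "rejected";

// Trimmed, hypothetical version of the handoff artifact.
type StoredHandoff = {
  consentState: ConsentState;
  expiresAt?: string; // assumed to be an ISO 8601 timestamp
  carryForward: string[];
};

// Returns the notes the next session may use. The default answer is "nothing".
function selectForNextSession(handoff: StoredHandoff | undefined, now: Date): string[] {
  if (handoff === undefined) {
    return [];
  }
  const ratified = handoff.consentState === "approved" || handoff.consentState === "edited";
  const expired =
    handoff.expiresAt !== undefined && Date.parse(handoff.expiresAt) <= now.getTime();
  return ratified && !expired ? handoff.carryForward : [];
}

// Illustrative data only.
const stored: StoredHandoff = {
  consentState: "approved",
  expiresAt: "2025-01-01T00:00:00Z",
  carryForward: ["Do not reopen this unless I ask."],
};

console.log(selectForNextSession(stored, new Date("2024-12-01T00:00:00Z")));
console.log(selectForNextSession(stored, new Date("2025-02-01T00:00:00Z")));
```

An unreviewed draft, a rejected artifact, and an expired one all produce the same result, so the next session starts clean instead of starting from a stale version of the user.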

I have been thinking about this while building Jung Room, a non-clinical AI self-exploration room for dreams, moods, symbols, and recurring patterns.

In that kind of product, memory cannot just be a hidden personalization layer.

It has to be something the user can inspect, correct, and choose not to carry forward.

That changes the implementation.

Memory is no longer just a retrieval feature. It becomes part of the product contract.

The more I think about this, the less I see AI memory as a storage problem.

Storage is the easy word.

The harder questions are governance questions.

Who decides what gets remembered?

Who decides what gets used?

Who can correct it?

Who can delete it?

When should the system choose not to remember?

And when memory moves from one session to the next, who is responsible for that inheritance?

A handoff artifact makes these questions more visible because it names the next reader.

It forces the system to ask what the next session actually needs, instead of simply compressing the last one.

That is the product shift I care about:

AI memory should not be a pile of summaries. It should be a set of governed handoffs, with the user still in charge of what gets carried forward.

Source: original article on dev.to