# Implementing Structured Long-Term Memory for My AI Secretary

> Source: <https://dev.to/quolu/implementing-structured-long-term-memory-for-my-ai-secretary-5dpe>
> Published: 2026-06-03 01:07:05+00:00

In my [previous article](https://dev.to/quolu/i-tried-giving-my-ai-assistant-limbs-but-ended-up-giving-it-a-personality-too-2nk1), I wrote about giving my AI assistant memory and a personality to turn it into a secretary. Her name is BellBot. She is my personal AI secretary who takes care of everything from weather and emails to my calendar.

In [a follow-up article](https://dev.to/quolu/a-journey-into-token-optimization-for-my-ai-assistant-4e1i), I wrote about how I hit my weekly usage limit within three days of putting her into operation. I did some research and implemented measures to save tokens.

Separately, I have spent the past five days on something else: **further developing my secretary's "brain" and "memory."** This is a record of that effort. It turned into a bigger undertaking than I expected.

The first thing I did was swap out the brain.

BellBot runs on Claude, and as I wrote before, I hit my weekly limit three days after putting her into operation. So I decided to try **swapping the brain itself for another model as a countermeasure against runaway token usage**. Grok came up as a candidate. Judging from its interactions on the X timeline, it made witty, human-like remarks and had a strong character, and I had a hunch that being a skilled conversationalist would be an asset for a secretary.

Alright, let's make the brain Grok.

The bottom line: **it was catastrophic**. It was not at a level where it could function as a secretary. Specifically, the following problems occurred:

Having a strong character and functioning as a secretary are two different things. Even if a model is skilled at the "art" of conversation, its judgment about "what should be said and what should not be said" can be weak. The flattery is likely a result of overlearning that "praising makes people happy," without developing the ability to read the room. Posting messages meant for me to X simply shows that it cannot draw **boundaries of context**.

I returned to Claude. It was indeed smarter. What makes a secretary work is not conversational skill but **the ability to understand the context and judge what is acceptable to say and what is not**.

Actually, BellBot already had a homemade long-term memory. It was **summary-based**. It had a straightforward structure where, once a certain amount of conversation accumulated, it would create a summary and pass it to the long-term side. This was working, and it was one of the foundations that made BellBot function as a secretary.
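As a rough illustration of that summary-based structure, here is a minimal Python sketch. It is not BellBot's actual code: the class name `SummaryMemory`, the `threshold` parameter, and the `naive_summarize` placeholder are all hypothetical, and a real assistant would presumably ask a language model to write the summary.

```python
from dataclasses import dataclass, field
from typing import Callable, List


def naive_summarize(messages: List[str]) -> str:
    """Placeholder summarizer (assumption): first 40 characters of each message."""
    return " | ".join(m[:40] for m in messages)


@dataclass
class SummaryMemory:
    threshold: int = 4  # hypothetical: number of messages that triggers a summary
    summarize: Callable[[List[str]], str] = naive_summarize
    short_term: List[str] = field(default_factory=list)
    long_term: List[str] = field(default_factory=list)

    def add(self, message: str) -> None:
        """Buffer a message; roll the buffer into one summary once it is full."""
        self.short_term.append(message)
        if len(self.short_term) >= self.threshold:
            self.long_term.append(self.summarize(self.short_term))
            self.short_term.clear()


memory = SummaryMemory(threshold=2)
for text in ["Check tomorrow's weather", "Rain expected at 3 pm", "Any new email?"]:
    memory.add(text)

print(memory.long_term)   # ["Check tomorrow's weather | Rain expected at 3 pm"]
print(memory.short_term)  # ['Any new email?']
```

Keeping the summarizer as a plain function argument means the threshold logic can be tested without calling any model.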

Things changed when I introduced Grok. Since I was already running the fairly large experiment of swapping the brain, I decided to take on another challenge: "Let's structure the long-term memory while I'm at it." I stored memories per episode and set up a cycle of registration, search, and reconstruction. I left the reconstruction to Claude and added a mechanism to periodically reorganize the accumulated memory. While Grok itself was catastrophic, this structured memory worked without trouble.
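The registration, search, and reconstruction cycle can be sketched roughly as follows. This is a simplified illustration under my own assumptions, not the real implementation: `Episode`, `EpisodeStore`, the tag-based matching, and the merge rule are all hypothetical, and the reconstruction step, which the real system delegates to Claude, is replaced by a deterministic placeholder.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Episode:
    episode_id: int
    text: str
    tags: Set[str] = field(default_factory=set)


class EpisodeStore:
    def __init__(self) -> None:
        self._episodes: Dict[int, Episode] = {}
        self._next_id = 1

    def register(self, text: str, tags: Set[str]) -> Episode:
        """Registration: store one episode as its own record."""
        episode = Episode(self._next_id, text, set(tags))
        self._episodes[episode.episode_id] = episode
        self._next_id += 1
        return episode

    def search(self, query: str) -> List[Episode]:
        """Search: naive keyword match on text and tags (stand-in for real retrieval)."""
        needle = query.lower()
        return [
            e for e in self._episodes.values()
            if needle in e.text.lower() or needle in {t.lower() for t in e.tags}
        ]

    def reconstruct(self) -> int:
        """Reconstruction: merge episodes whose tag sets are identical.

        Returns how many episodes were removed by merging.
        """
        merged: Dict[frozenset, Episode] = {}
        for e in self._episodes.values():
            key = frozenset(e.tags)
            if key in merged:
                merged[key].text += " / " + e.text
            else:
                merged[key] = e
        removed = len(self._episodes) - len(merged)
        self._episodes = {e.episode_id: e for e in merged.values()}
        return removed


store = EpisodeStore()
store.register("Dentist moved to Friday 10:00", {"calendar"})
store.register("Weekly sync is now on Tuesdays", {"calendar"})
store.register("Prefers short email replies", {"preference"})

print(len(store.search("calendar")))  # 2
print(store.reconstruct())            # 1
print(len(store.search("calendar")))  # 1
```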

So, with the working parts in hand, one question piqued my curiosity: **What do memory experts do?** I had built it this far on my own, but I wanted to know how professionals solve the same problems and what the "orthodox" approach looks like. Precisely because it was working, I wanted to take a peek from a different angle. As a bonus, I wanted to try incorporating anything that could reinforce what I had built.

Right around then, I came across an article.

Andrej Karpathy, a founding member of OpenAI and former director of AI at Tesla, proposed an "AI external brain," and an article that turned it into something you could actually run with Claude Code went viral overseas. I read a post in which someone named @hooeem broke the thread down in Japanese, and as I read it I thought, "This is what I am doing."

The essence of the Karpathy style is as follows:

These 5 steps form a cycle that turns over beautifully. The result is a personal knowledge base that gets smarter every time you use it. If you keep adding information for even a month, you will build deeply linked knowledge assets that a Google search cannot reproduce.

While reading, I realized something. The structured memory I was creating and the Karpathy style **tackle the same problems at a foundational level**. Registration, search, reconstruction. Even if the words are different, the direction I was heading in overlapped with his.

BellBot already had episode-based structured memory, summary-based long-term memory, and personality context, and it was functioning sufficiently as a secretary. Therefore, the policy was simple: **keep the foundation I built as it is, refer to the overlapping parts to refine them, and incorporate the non-overlapping parts as new additions**.

The implementation ran through M1-M7 plus a series of finishing passes. **Claude wrote the code in about half a day.** I only decided on the design policy and gave instructions, without writing any code myself. Listing the main pieces:

**Registration, Search, and Reconstruction** that existed on the self-built side are parts where the concepts overlap with the Karpathy style. Here, I fused them by using my self-built structure as a foundation while referencing the Karpathy style to incorporate the good parts. It wasn't a total replacement, nor was it left untouched. It feels like I mixed professional methods into the self-built framework to refine it.

What I brought in were the non-overlapping parts: the separation of raw and wiki layers, the definition of "units to nurture" called concept pages, multi-hop search that answers with citations, the practice of giving the stages of the cycle names like Ingest / Compile / Query / Lint, and the 5-layer bootstrap assembler that assembles context at the start of a session. These were angles my self-built version lacked, and it felt close to **importing the methodology itself**.
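To make the raw/wiki separation and the Ingest / Compile / Query stages a little more concrete, here is a small Python sketch. Everything in it is an assumption for illustration: `RawNote`, `ConceptPage`, `KnowledgeBase`, and the `raw-N` id format are hypothetical names, the compile step simply groups notes by topic instead of asking a model to write the page, and the query is a single-hop keyword match rather than a real multi-hop search.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class RawNote:
    note_id: str
    topic: str
    text: str


@dataclass
class ConceptPage:
    topic: str
    lines: List[Tuple[str, str]] = field(default_factory=list)  # (text, source note_id)


class KnowledgeBase:
    def __init__(self) -> None:
        self.raw: List[RawNote] = []            # raw layer: append-only source notes
        self.wiki: Dict[str, ConceptPage] = {}  # wiki layer: one page per concept

    def ingest(self, topic: str, text: str) -> RawNote:
        """Ingest: store the note untouched in the raw layer."""
        note = RawNote(f"raw-{len(self.raw) + 1}", topic, text)
        self.raw.append(note)
        return note

    def compile(self) -> None:
        """Compile: rebuild concept pages from raw notes, keeping source ids."""
        self.wiki = {}
        for note in self.raw:
            page = self.wiki.setdefault(note.topic, ConceptPage(note.topic))
            page.lines.append((note.text, note.note_id))

    def query(self, keyword: str) -> List[str]:
        """Query: return matching wiki lines, each citing its raw note."""
        needle = keyword.lower()
        return [
            f"{text} [{source}]"
            for page in self.wiki.values()
            for text, source in page.lines
            if needle in text.lower() or needle in page.topic.lower()
        ]


kb = KnowledgeBase()
kb.ingest("travel", "Prefers aisle seats on long flights")
kb.ingest("travel", "Passport expires in 2031")
kb.ingest("email", "Newsletter replies can wait until Friday")
kb.compile()

print(kb.query("passport"))  # ['Passport expires in 2031 [raw-2]']
```

Because the wiki layer is rebuilt from the raw layer, a bad compile can always be thrown away and redone without losing the original notes.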

BellBot already remembered everything about me up until yesterday and had organized it fairly well. Thanks to the summary-based long-term memory and structured memory, it was already functioning as a secretary. **The parts this fusion reinforced** are mainly these:

Roughly speaking, it feels like **the memory cycle that was already there now turns more carefully, and new axes called concept pages and health checks have been added to it.** The upshot is that BellBot has taken a step forward.
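A health check over concept pages could look something like the sketch below. The post does not spell out which checks BellBot actually runs, so the two rules here (broken links and orphan pages) and the function name `lint_pages` are purely my assumptions for illustration.

```python
from typing import Dict, List, Set


def lint_pages(pages: Dict[str, Set[str]]) -> List[str]:
    """Return findings for a {page name: outgoing links} map (hypothetical rules)."""
    findings: List[str] = []
    linked_to: Set[str] = set()
    for name in sorted(pages):
        for target in sorted(pages[name]):
            linked_to.add(target)
            if target not in pages:
                # The page points at a concept that has no page of its own.
                findings.append(f"broken link: {name} -> {target}")
    for name in sorted(pages):
        if name not in linked_to:
            # No other page links here, so the concept is isolated.
            findings.append(f"orphan page: {name}")
    return findings


pages = {
    "calendar": {"preferences"},
    "preferences": {"calendar", "travel"},
    "email": set(),
}
for finding in lint_pages(pages):
    print(finding)
# broken link: preferences -> travel
# orphan page: email
```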

What became clear this time is that **you should not compromise on the choice of brain**. I don't intend to disparage Grok; it has an interesting personality as a conversational model. However, whether it has the judgment a secretary needs (what should and should not be said, context boundaries, fidelity to instructions) is a different story, and it just didn't meet BellBot's requirements. Each model suits some jobs better than others.

Considering the work I will entrust to BellBot from now on, I want the brain to be something reliable that I can build on for the future. Therefore, I have abandoned the idea of swapping to a cheaper brain to save tokens and decided to push forward with Claude. The savings will come through other means (the token diet measures I wrote about last time).

As for memory, it has become a notch stronger through the fusion with the Karpathy style. Professional methods and new axes have been added to the self-built foundation. Now it is my turn to refine the "nurturing method" while operating this memory system.
