After months of operation, an AI agent's memory store grows to tens of thousands of items. Retrieval slows down. Irrelevant memories crowd out relevant ones. The agent starts "forgetting" recent context because old memories dilute the signal.
Humans solve this with sleep: the brain consolidates memories offline, strengthening important ones and pruning irrelevant ones. AI agents can do the same.
Without consolidation:
During idle periods (no user activity for 30+ minutes), the agent runs a consolidation cycle:
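As a rough illustration of the idle trigger, here is a minimal sketch. The class name `IdleConsolidationTrigger` and its methods are hypothetical, and the only value taken from the text is the 30-minute threshold. It assumes the agent calls `record_activity()` on every user interaction and polls `should_consolidate()` from a background loop.

```python
import time

IDLE_THRESHOLD_SECONDS = 30 * 60  # "30+ minutes" of no user activity


class IdleConsolidationTrigger:
    """Hypothetical helper: reports when a consolidation cycle is due."""

    def __init__(self, idle_threshold=IDLE_THRESHOLD_SECONDS, clock=time.monotonic):
        self._idle_threshold = idle_threshold
        self._clock = clock  # injectable so the logic can be tested without waiting
        self._last_activity = clock()
        self._ran_this_idle_period = False

    def record_activity(self):
        """Call on every user interaction; resets the idle timer."""
        self._last_activity = self._clock()
        self._ran_this_idle_period = False

    def should_consolidate(self):
        """True once the idle threshold is reached, at most once per idle period."""
        idle_for = self._clock() - self._last_activity
        return idle_for >= self._idle_threshold and not self._ran_this_idle_period

    def mark_consolidated(self):
        """Call after a cycle finishes so it is not re-run while still idle."""
        self._ran_this_idle_period = True
```

Running at most one cycle per idle period is a design assumption here; an agent that stays idle for days might instead want to re-run the cycle on a schedule.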
Find memories whose embeddings have a similarity above 0.85. Merge each group of near-duplicates into a single canonical memory that keeps the most recent timestamp and the combined metadata.
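The merge step might look something like the sketch below. Several things here are assumptions rather than part of the described system: the `Memory` dataclass and its fields, the use of cosine similarity as the similarity measure, keeping the newer memory's text and embedding, and the greedy pairwise comparison (quadratic in the number of memories, so a real store with tens of thousands of items would likely need an approximate nearest-neighbor index instead).

```python
import math
from dataclasses import dataclass, field


@dataclass
class Memory:  # hypothetical record shape
    text: str
    embedding: list
    timestamp: float
    metadata: dict = field(default_factory=dict)


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def merge_duplicates(memories, threshold=0.85):
    """Greedily fold each memory into the first canonical memory it resembles."""
    canonical = []
    for mem in memories:
        for i, kept in enumerate(canonical):
            if cosine_similarity(mem.embedding, kept.embedding) > threshold:
                newer, older = (mem, kept) if mem.timestamp > kept.timestamp else (kept, mem)
                canonical[i] = Memory(
                    text=newer.text,            # assumption: newer wording wins
                    embedding=newer.embedding,
                    timestamp=newer.timestamp,  # most recent timestamp is preserved
                    # combined metadata; newer values win on key conflicts
                    metadata={**older.metadata, **newer.metadata},
                )
                break
        else:
            # No near-duplicate found: this memory becomes its own canonical entry.
            canonical.append(mem)
    return canonical
```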
Score each memory by:
Remove the bottom 10% by importance score. But never remove:
Group low-importance memories by topic and generate a summary memory. The individual memories are pruned; the summary preserves the knowledge.
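The pruning and summarization steps above can be sketched together. This is only an illustration under stated assumptions: `ScoredMemory`, its `protected` flag (standing in for whatever exemptions apply), and the `summarize` callback (for example, a call to a language model) are all hypothetical, and the importance score is taken as already computed. The sketch takes the bottom 10% of all memories and then skips any protected ones, so fewer than 10% may actually be removed.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ScoredMemory:  # hypothetical record shape
    text: str
    topic: str
    importance: float
    protected: bool = False  # stand-in for the "never remove" exemptions


def prune_and_summarize(memories, summarize, prune_fraction=0.10):
    """Return (kept_memories, summary_memories).

    `summarize(topic, texts)` is a caller-supplied function returning a string.
    """
    n_bottom = int(len(memories) * prune_fraction)
    bottom = sorted(memories, key=lambda m: m.importance)[:n_bottom]
    to_prune = [m for m in bottom if not m.protected]
    pruned_ids = {id(m) for m in to_prune}

    # Group the memories about to be pruned by topic.
    by_topic = defaultdict(list)
    for m in to_prune:
        by_topic[m.topic].append(m)

    # One summary memory per topic replaces the individual memories.
    summaries = [
        ScoredMemory(
            text=summarize(topic, [m.text for m in group]),
            topic=topic,
            # assumption: the summary inherits the highest score in its group
            importance=max(m.importance for m in group),
        )
        for topic, group in by_topic.items()
    ]
    kept = [m for m in memories if id(m) not in pruned_ids]
    return kept, summaries
```

The caller would then write `kept + summaries` back to the store. Generating the summaries before discarding the originals matters: if summarization fails, nothing has been lost yet.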
After pruning, rebuild the entity graph from the remaining memories. This ensures the graph reflects the current memory store.
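The structure of the entity graph is not specified here, so the following is a sketch under one plausible assumption: each memory carries a list of extracted entity names (the `"entities"` key is hypothetical), and two entities are linked when they appear in the same memory, with the edge weight counting co-occurrences. Rebuilding from scratch, rather than patching the old graph, guarantees that no edge survives from a pruned memory.

```python
from collections import defaultdict
from itertools import combinations


def rebuild_entity_graph(memories):
    """Build an undirected co-occurrence graph: entity -> {neighbor: weight}."""
    graph = defaultdict(lambda: defaultdict(int))
    for mem in memories:
        entities = sorted(set(mem["entities"]))
        for entity in entities:
            graph[entity]  # touch the key so isolated entities still appear as nodes
        for a, b in combinations(entities, 2):
            graph[a][b] += 1
            graph[b][a] += 1
    # Convert to plain dicts so lookups of missing keys no longer create entries.
    return {node: dict(neighbors) for node, neighbors in graph.items()}
```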
After implementing sleep consolidation:
Sleep consolidation is essential for long-running agents. Without it, memory degrades over time. With it, the agent maintains a lean, relevant memory store that supports fast, accurate retrieval indefinitely.