I Trained My OpenClaw to Dream. Here's What It Learned Overnight.

A developer built a 'Dream Protocol' for their OpenClaw AI agent that runs nightly memory consolidation, scoring 700+ recall entries against three gates and promoting only the most significant signals to long-term memory. Over three weeks, the system consistently promoted about one entry per night out of hundreds, rejecting noise and improving agent behavior.

Every night at 07:05 UTC, my OpenClaw instance does something I never planned: it dreams. Not metaphorically. There's a cron job that runs a full REM cycle on my conversation history: scoring 700+ recall entries, rejecting noise, and promoting signals to long-term memory. It writes the results before I wake up. By the time I'm at my desk with coffee, my agent is a slightly sharper version of the one who went to sleep.

This post is about how that works, what it actually does with 8 hours of unsupervised memory management, and why I think this pattern (sleep + consolidation) is the missing piece in most AI agent setups today.

The standard agent memory pattern looks like this: append everything to a context file, let it grow until the window overflows, then either truncate or start a new thread. It's a lossy, passive approach. You're not teaching the agent anything; you're just... storing.

My first attempt at "better memory" was the same: daily log files that grew indefinitely. Then weekly summaries. Then a three-tier system (daily → weekly → long-term). But even with the tiering, the problem was the same: more storage, less signal. The agent had more material to sift through but no mechanism to distinguish what mattered from what didn't.

The Dream Protocol is my answer to that. It's a nightly cron that treats memory as a learning problem, not a storage problem.

The cron fires at 07:05 UTC every morning. It's an isolated agentTurn that runs a multi-stage pipeline:

    Stage 1: Light Sleep (staging)
      → Pull all candidates from recent daily logs
      → Deduplicate near-identical entries
      → Stage remaining as "candidates"

    Stage 2: REM Sleep (scoring)
      → For each candidate:
          - Recurrence count (how many times does this theme appear?)
          - Query uniqueness (is this from different contexts or the same one?)
          - Truth score (does this contradict established facts?)
      → Threshold gates: minScore=0.8, minRecallCount=3, minUniqueQueries=3

    Stage 3: Promotion
      → Entries that pass all three gates → written to MEMORY.md (long-term)
      → Entries that fail → discarded permanently

The numbers aren't magic. The scoring model is simple: themes that appear frequently across different queries and contexts are more likely to be genuinely important than one-off observations. A correction that appears 3 times from 3 different sessions gets promoted. A passing mention from one conversation gets discarded.

Here's what it looks like in practice, from last night's run:

    Reviewed 740 total recall entries
    Found 220 recurring themes
    Promoted: 1 | Rejected: 737
    Gates: minScore=0.8, minRecallCount=3, minUniqueQueries=3
    Promoted entries written to MEMORY.md

737 rejected. 1 promoted. That's the ratio most nights.

I've been running this for three weeks now. Here's what's consistently promoted:

What consistently gets rejected:

The 1-promoted-per-night rate is intentional. Memory that survives a 737:1 rejection ratio is the kind of signal that actually changes behavior. If everything gets promoted, nothing matters.

The cron job itself is straightforward. It's OpenClaw native and fires an isolated agentTurn every morning:

    {
      "name": "Dreaming Sweep",
      "schedule": { "kind": "cron", "expr": "5 7 * * *", "tz": "UTC" },
      "sessionTarget": "isolated",
      "payload": {
        "kind": "agentTurn",
        "message": "Run the Dream Protocol on your memory. Review staged recall entries, score them against the three gates (minScore=0.8, minRecallCount=3, minUniqueQueries=3), promote survivors to MEMORY.md, discard the rest. Write a brief dream diary to today's memory file.",
        "timeoutSeconds": 120
      }
    }

The prompt is deliberately lightweight. The heavy lifting is done by the scoring logic inside the Dreaming script (~/.openclaw/workspace/scripts/dreaming-sweep.py), which handles the FTS5 recall queries, deduplication, and gate scoring. The agent just reviews the output and writes the diary.

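The script itself isn't reproduced in this post, so what follows is only a minimal sketch of the gate step, not the actual contents of dreaming-sweep.py. It assumes each staged entry already carries a combined score, a recall count, and a unique-query count. The `Candidate` class, its field names, and the `passes_gates` and `sweep` helpers are hypothetical names chosen for illustration; only the three threshold values come from the post.

```python
from dataclasses import dataclass

# Gate thresholds as quoted in the post.
MIN_SCORE = 0.8
MIN_RECALL_COUNT = 3
MIN_UNIQUE_QUERIES = 3


@dataclass(frozen=True)
class Candidate:
    """Hypothetical shape of a staged recall entry (not OpenClaw's schema)."""
    text: str
    score: float         # combined score in [0, 1]
    recall_count: int    # how many times the theme recurred
    unique_queries: int  # how many distinct queries surfaced it


def passes_gates(c: Candidate) -> bool:
    """An entry is promoted only if it clears all three gates (inclusive)."""
    return (
        c.score >= MIN_SCORE
        and c.recall_count >= MIN_RECALL_COUNT
        and c.unique_queries >= MIN_UNIQUE_QUERIES
    )


def sweep(candidates):
    """Split candidates into (promoted, rejected) lists, preserving order."""
    promoted = [c for c in candidates if passes_gates(c)]
    rejected = [c for c in candidates if not passes_gates(c)]
    return promoted, rejected


# Illustrative data only, not real recall entries.
candidates = [
    Candidate("correction seen in three sessions", 0.9, 3, 3),
    Candidate("passing mention from one conversation", 0.95, 1, 1),
    Candidate("frequent, but always from the same query", 0.85, 5, 1),
]

promoted, rejected = sweep(candidates)
print(f"Promoted: {len(promoted)} | Rejected: {len(rejected)}")
```

Requiring all three gates at once is what keeps the promotion rate low in this sketch: a high score alone is not enough if the theme came from a single query, and a frequently repeated theme is still dropped if it never left one context.
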
Most AI agent tutorials focus on two things: tools and prompts. Give the agent more tools, write better prompts, connect it to more data sources. That's the expansion phase.

But at some point, every agent hits a plateau. More tools don't help when the agent can't remember which tools work. More context doesn't help when the signal-to-noise ratio collapses. This is the consolidation problem, and it's where most agent builds stall.

The Dream Protocol is my attempt at a general solution: treat memory like a learning system, not a filing cabinet. Let the agent experience its own failures, observe patterns across sessions, and update its behavior accordingly, without me manually intervening every time something goes wrong.

Is it perfect? No. The scoring gates are hand-tuned, the promotion rate is low enough that it takes weeks to see behavioral changes, and I have no automated way to measure whether the changes actually improve outcomes. I'm working on that.

But the core idea is sound: an agent that sleeps is an agent that learns. Even if it's just 1 true thing per night.

Running the Dream Protocol on your own OpenClaw? I'd love to hear what your agent promotes. Drop it in the discussion. The community could use more real-world data on what memory hygiene actually looks like at scale.