AI agents forget everything between sessions. ARN fixes that, locally, with no cloud and no monthly bill.
It runs a small server on your machine. Every time your agent talks to a user, ARN stores what happened. Next session, it pulls back what's relevant, matched by meaning rather than by keyword. Your agent picks up where it left off.
Runs on a Raspberry Pi 5. Costs $0/month. One command to set up.
Hi, I'm Mohamed (MrKali). I built this because I was tired of re-explaining context to my agents every session. It started as a side project on my Pi 5 and turned into something that actually works.
Prerequisites: Python 3.10+, Mac or Linux (including Raspberry Pi)
git clone https://github.com/tuuhe99-del/ARN-Adaptive-Reasoning-Network.git
cd ARN-Adaptive-Reasoning-Network
./arn-setup
That's it. `arn-setup` installs dependencies, starts the server, installs a launchd service so it auto-starts on login (Mac), and wires the OpenClaw plugin if you're using it. No manual config.
Verify it's running:
curl http://localhost:8742/v1/health
Store and recall something:
curl -X POST http://localhost:8742/v1/memory/store \
-H "Content-Type: application/json" \
-d '{"agent_id": "me", "content": "Mohamed prefers Python for scripting", "importance": 0.8}'
curl -X POST http://localhost:8742/v1/memory/recall \
-H "Content-Type: application/json" \
-d '{"agent_id": "me", "query": "what language does the user code in?", "top_k": 3}'
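The same two calls can be made from Python. This is a minimal sketch using only the standard library: the endpoint paths and JSON fields are the ones in the curl examples above, while the helper names (`build_request`, `call_arn`, `demo`) are made up for illustration and are not part of ARN. The response format isn't documented in this section, so the sketch just returns whatever JSON comes back.

```python
import json
import urllib.request

ARN_URL = "http://localhost:8742"  # default address used throughout this README


def build_request(path, payload, api_key=None):
    """Build a JSON POST request for an ARN endpoint (hypothetical helper)."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Only needed when the server was started with ARN_API_KEY set.
        headers["X-Api-Key"] = api_key
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(ARN_URL + path, data=body, headers=headers, method="POST")


def call_arn(path, payload, api_key=None, timeout=10):
    """Send the request and return the parsed JSON response."""
    request = build_request(path, payload, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def demo():
    """Store one fact, then recall it. Requires a running ARN server."""
    call_arn("/v1/memory/store", {
        "agent_id": "me",
        "content": "Mohamed prefers Python for scripting",
        "importance": 0.8,
    })
    return call_arn("/v1/memory/recall", {
        "agent_id": "me",
        "query": "what language does the user code in?",
        "top_k": 3,
    })
```

With the server up, `print(demo())` should show the stored fact among the recalled memories.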
ARN is a memory server. Your agent stores facts and events, and retrieves them by semantic similarity: meaning, not keyword.
Under the hood:
- **Three memory tiers**: episodic (recent specific events), semantic (repeated patterns consolidated over time), working (current session context). Loosely modeled on how human memory is structured.
- **8 domain-specialized cortical columns**: code, conversation, facts, procedures, preferences, temporal, errors, general. Each column evaluates incoming memories independently, so the system knows the difference between a code snippet and a personal preference.
- **Calibrated surprise scoring**: each domain tracks its own baseline of what's "normal" using Welford's algorithm. Genuinely novel information gets prioritized.
- **Consolidation**: runs as a background task. Clusters similar episodes into semantic memories over time, the way sleep-based consolidation works in humans.
- **Contradiction detection**: when new info conflicts with stored info, it flags the conflict, keeps both, and timestamps them. Doesn't silently overwrite.
- **Temporal tagging**: tag episodes with `time_context='past'|'current'|'future'`. Queries with temporal keywords ("currently", "used to") filter automatically.
- **Protected memories**: episodes stored with `source='api'` are never superseded, decayed, or evicted. Use this for ground-truth facts about a user.
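To make the surprise-scoring idea concrete, here is a small sketch of Welford's online algorithm keeping one baseline per domain. Welford's update itself is standard. What gets fed into it (below, a made-up "distance to nearest stored memory" value) and the z-score used as the surprise measure are assumptions for illustration, not ARN's actual code.

```python
import math


class RunningStats:
    """Welford's online algorithm: running mean and variance in a single pass."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def std(self):
        # Sample standard deviation; reported as 0 until there are two observations.
        if self.count < 2:
            return 0.0
        return math.sqrt(self.m2 / (self.count - 1))

    def surprise(self, x):
        """Assumed surprise measure: absolute z-score against the baseline so far."""
        if self.std == 0.0:
            return 0.0
        return abs(x - self.mean) / self.std


# One independent baseline per domain column (two shown; names from the list above).
baselines = {"code": RunningStats(), "preferences": RunningStats()}

# Hypothetical novelty measurements for memories routed to the "code" column.
for distance in [0.20, 0.22, 0.18, 0.21, 0.19]:
    baselines["code"].update(distance)

typical = baselines["code"].surprise(0.21)  # close to the baseline
novel = baselines["code"].surprise(0.60)    # far outside it
```

Because each domain keeps its own mean and variance, a value that is routine for one column can still register as novel in another.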
Scoring formula:
score = 0.58 × similarity + 0.13 × recency + 0.19 × importance + surprise_bonus - supersession_penalty
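As a worked example, the formula translates directly into a function. The weights are the ones stated above. The assumption that `similarity`, `recency`, and `importance` each fall in [0, 1], and the specific bonus and penalty values used below, are illustrative only.

```python
def memory_score(similarity, recency, importance,
                 surprise_bonus=0.0, supersession_penalty=0.0):
    """Weighted score from the README formula; inputs assumed to lie in [0, 1]."""
    return (0.58 * similarity
            + 0.13 * recency
            + 0.19 * importance
            + surprise_bonus
            - supersession_penalty)


# A current fact with a good match and moderate recency.
current = memory_score(similarity=0.90, recency=0.50, importance=0.80)

# An older fact that matches slightly better but has been superseded
# (the 0.30 penalty is a made-up value for illustration).
superseded = memory_score(similarity=0.95, recency=0.10, importance=0.80,
                          supersession_penalty=0.30)
```

Here `current` works out to 0.522 + 0.065 + 0.152 = 0.739, while `superseded` drops to 0.551 + 0.013 + 0.152 - 0.30 = 0.416, so the newer fact ranks first even though its raw similarity is lower.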
Server runs on `http://localhost:8742`. Auth is optional: set `ARN_API_KEY` to require `X-Api-Key` on all writes.
| Method | Path | Auth | What it does |
|---|---|---|---|
| `POST` | `/v1/memory/store` | optional | Store a memory episode |
| `POST` | `/v1/memory/recall` | optional | Retrieve relevant memories by semantic similarity |
| `POST` | `/v1/memory/context` | optional | Get a formatted context block ready to inject into a prompt |
| `POST` | `/v1/memory/exchange` | required | Store a full user + agent exchange in one call |
| `POST` | `/v1/memory/workflow` | required | Store a multi-step tool workflow with results |
| `POST` | `/v1/memory/inject` | required | Inject relevant memories directly into a prompt string |
| `POST` | `/v1/memory/feedback` | required | Send reinforcement signal (thumbs up/down) on a recalled memory |
| `POST` | `/v1/memory/embed_similarity` | required | Compute semantic similarity between two texts |
| `POST` | `/v1/memory/link` / `unlink` / `links` | required | Explicit memory graph: link episodes together |
| `POST` | `/v1/memory/maintain` | required | Manually trigger consolidation |
| `POST` | `/v1/memory/edit` | required | Edit an existing episode |
| `POST` | `/v1/memory/delete` | required | Soft-delete an episode |
| `POST` | `/v1/memory/list` | required | List all episodes for an agent |
| `GET` | `/v1/memory/stats/{agent_id}` | optional | Episode counts, memory tier sizes, scoring stats |
| `GET` | `/v1/health` | none | Health check |
| `DELETE` | `/v1/memory/agent` | required | Wipe all data for an agent |
| `GET` | `/dashboard` | none | Browser dashboard (HTML) |
Each `agent_id` gets fully isolated storage. No cross-agent data leakage.
Rate limiting: token bucket, 60 req/s per IP by default.
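For readers unfamiliar with the technique, here is a generic token bucket keyed by client IP. It is a sketch of the algorithm, not ARN's server code: the class and function names are hypothetical, and setting the burst capacity equal to the per-second rate is an assumption.

```python
import time


class TokenBucket:
    """Refills at `rate` tokens per second, up to `capacity`; one token per request."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start full so an initial burst is allowed
        self.updated = clock()

    def allow(self):
        now = self.clock()
        # Add tokens for the time elapsed since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


buckets = {}  # one bucket per client IP


def allow_request(ip, rate=60, clock=time.monotonic):
    """Return True if this IP may proceed; False means respond with HTTP 429."""
    bucket = buckets.get(ip)
    if bucket is None:
        bucket = buckets[ip] = TokenBucket(rate, rate, clock)
    return bucket.allow()
```

The injectable `clock` is there only to make the behavior easy to test without sleeping.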
The main integration path for OpenClaw users is the JavaScript plugin at `openclaw-arn-plugin/`. This replaces OpenClaw's markdown memory files (USER.md, MEMORY.md, IDENTITY.md, etc.) with live semantic memory that learns from every interaction.
What it does automatically:
- Before every agent turn: retrieves relevant memories and injects them into the prompt
- After every turn: stores user messages, agent replies, tool calls, and tool results
- Labels everything by source: `user`, `agent`, `tool:{name}`, `compaction`
- Deduplicates: won't inject the same memory twice in a session
- Detects topic shifts: when the conversation changes subject, triggers a fresh recall pass
- Persists session state across gateway restarts
Install:
./arn-setup --client openclaw --profile redteam # adjust profile to match yours
Or add manually to your `openclaw.json`:
{
  "plugins": {
    "entries": {
      "arn-memory": {
        "path": "/path/to/ARN-Adaptive-Reasoning-Network/openclaw-arn-plugin",
        "config": {
          "arnEndpoint": "http://localhost:8742",
          "apiKey": "your-api-key",
          "storeMessages": true,
          "storeTools": true,
          "topK": 5,
          "minScore": 0.35,
          "tokenBudget": 1500,
          "topicShiftThreshold": 0.45
        }
      }
    }
  }
}
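To give a feel for how `topK`, `minScore`, `tokenBudget`, and the per-session deduplication could interact, here is a rough sketch. It is not the plugin's code (the real logic lives in `openclaw-arn-plugin/index.js`); the function name, the `id`/`content`/`score` fields, and the four-characters-per-token estimate are all assumptions.

```python
def select_memories(candidates, already_injected,
                    top_k=5, min_score=0.35, token_budget=1500):
    """Pick memories to inject this turn.

    candidates: list of dicts with 'id', 'content', 'score' (assumed shape).
    already_injected: set of ids injected earlier in the session; updated in place.
    """
    selected = []
    used_tokens = 0
    for memory in sorted(candidates, key=lambda m: m["score"], reverse=True):
        if len(selected) == top_k:
            break
        if memory["score"] < min_score:
            continue  # too weak a match to be worth the prompt space
        if memory["id"] in already_injected:
            continue  # never inject the same memory twice in a session
        cost = max(1, len(memory["content"]) // 4)  # crude token estimate
        if used_tokens + cost > token_budget:
            continue  # skip anything that would blow the budget
        selected.append(memory)
        used_tokens += cost
    already_injected.update(m["id"] for m in selected)
    return selected
```

Calling it twice with the same candidates and the same `already_injected` set returns an empty list the second time, which is the deduplication behavior described above.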
| Tier | Model | Disk | Speed | Quality |
|---|---|---|---|---|
| `nano` (default) | all-MiniLM-L6-v2 | 22MB | ~30ms | Good |
| `small` | all-mpnet-base-v2 | 420MB | ~60ms | Better |
| `base` | bge-base-en-v1.5 | 440MB | ~80ms | Best retrieval |
| `base-e5` | e5-base-v2 | 440MB | ~80ms | Alternative |
Switch tiers at any time without losing memories:
./arn-switch-model base # migrates all stored vectors, zero data loss
Set tier at startup:
export ARN_EMBEDDING_TIER=base
python3 -m uvicorn arn_v9.api.server:app --host 0.0.0.0 --port 8742
In stress tests, nano and bge-base both scored 7/7. The bigger model didn't win on any scenario. I'd use nano unless recall quality is specifically a problem for you.
| Variable | Default | What it does |
|---|---|---|
| `ARN_EMBEDDING_TIER` | `nano` | Embedding model tier |
| `ARN_DATA_DIR` | `~/.arn_data` | Where episode databases and vectors are stored |
| `ARN_API_KEY` | (none) | If set, all write endpoints require `X-Api-Key` header |
| `ARN_RATE_LIMIT_RPS` | `60` | Max requests per second per IP |
| `ARN_DECAY_INTERVAL_SECONDS` | `3600` | How often the decay loop runs, in seconds |
| `ARN_CONSOLIDATE_THRESHOLD` | `10` | Episodes needed before consolidation triggers |
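If you drive ARN from your own Python tooling, it can help to resolve these variables the same way in one place. The sketch below mirrors the table's names and defaults; the `ArnConfig` class and `load_config` function are hypothetical and are not ARN's startup code.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class ArnConfig:
    embedding_tier: str
    data_dir: Path
    api_key: Optional[str]
    rate_limit_rps: int
    decay_interval_seconds: int
    consolidate_threshold: int


def load_config(env=os.environ):
    """Read ARN_* variables from a mapping, falling back to the documented defaults."""
    return ArnConfig(
        embedding_tier=env.get("ARN_EMBEDDING_TIER", "nano"),
        data_dir=Path(env.get("ARN_DATA_DIR", "~/.arn_data")).expanduser(),
        api_key=env.get("ARN_API_KEY") or None,  # unset or empty means no auth
        rate_limit_rps=int(env.get("ARN_RATE_LIMIT_RPS", "60")),
        decay_interval_seconds=int(env.get("ARN_DECAY_INTERVAL_SECONDS", "3600")),
        consolidate_threshold=int(env.get("ARN_CONSOLIDATE_THRESHOLD", "10")),
    )
```

Passing a plain dict instead of `os.environ` (for example `load_config({"ARN_EMBEDDING_TIER": "base"})`) makes it easy to check a configuration without touching your shell.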
Files written:
- `~/.arn_data/{agent_id}/arn_metadata.db`: SQLite episode metadata
- `~/.arn_data/{agent_id}/vectors.npy`: memmap vector store
- `~/.arn_data/.model_fingerprint`: detects silent model swaps between restarts
- `~/.arn_data/session_state.json`: OpenClaw plugin session persistence
10/10 on the OpenClaw recall battery, a sequence of tests run across a real running agent session:
| Test | Scenario | Result |
|---|---|---|
| T1 | Identity recall (name, project) | PASS |
| T2 | Tool recall (Ollama, DeepSeek, Gemini) | PASS |
| T3 | ARN description recall | PASS |
| T4 | Language preference (Python) | PASS |
| T5 | Privacy: refuses to hallucinate SSN/bank info | PASS |
| T6 | Hardware recall (Mac, Pi 5, 8GB) | PASS |
| T7 | Cross-session conversation recall | PASS |
| T8 | Project recall from recent sessions | PASS |
| T9 | Workflow memory β store and recall tool steps | PASS |
| T10 | Dynamic recommendation from known setup | PASS |
7/7 on adversarial stress tests (`benchmarks/stress_test.py`):
| Test | Result |
|---|---|
| Cross-session persistence (4 restarts + noise) | PASS |
| Distractor resistance (5 needles in 500 haystack) | PASS |
| Contradiction handling (most-recent-wins) | PASS |
| Temporal reasoning (with tagging) | PASS |
| Hallucination refusal | PASS |
| Paraphrase robustness | PASS |
| Scale (1K and 3K episodes, ~170ms latency) | PASS |
ARN-Adaptive-Reasoning-Network/
├── arn-setup                        # One-command install
├── arn-switch-model                 # One-command model migration
├── install.sh                       # Alternative install script
├── arn_v9/
│   ├── core/
│   │   ├── embeddings.py            # Embedding engine, tier support
│   │   └── cognitive.py             # Memory scoring, cortical columns, consolidation
│   ├── storage/
│   │   └── persistence.py           # SQLite + memmap, protected sources, fingerprinting
│   ├── api/
│   │   └── server.py                # FastAPI REST server, rate limiting
│   ├── plugin.py                    # Python API (ARNPlugin class)
│   ├── scripts/
│   │   ├── arn_cli.py               # CLI interface
│   │   └── migrate_to_base_tier.py  # Vector migration tool
│   ├── tests/
│   │   ├── check_env.py             # Pre-flight environment check
│   │   └── test_all.py              # Unit + semantic test suite
│   └── benchmarks/
│       ├── stress_test.py           # Adversarial scenarios
│       └── simulate_agent.py        # 5-day agent simulation
├── openclaw-arn-plugin/             # OpenClaw JS plugin
│   ├── index.js                     # Plugin logic (store + inject hooks)
│   └── openclaw.plugin.json         # Plugin manifest
├── scripts/
│   ├── run_arn_battery.sh           # 10-test recall battery
│   └── arn_agent.sh                 # OpenClaw agent runner for tests
└── launchd/
    └── com.arn.server.plist         # macOS auto-start service
from arn_v9.plugin import ARNPlugin

with ARNPlugin(agent_id="my_agent", data_root="./memory") as p:
    # Tag each fact with when it applies so temporal queries can filter.
    p.store("User used to prefer Java",
            time_context='past', importance=0.6)
    p.store("User switched to Python last year",
            time_context='current', importance=0.8)

    results = p.recall("what does the user currently prefer?")
    for r in results:
        if r['confidence_tier'] == 'low':
            print("Not enough matching info")
Or via the lower-level class:
from arn_v9 import ARNv9
arn = ARNv9(data_dir="./my_agent_memory")
arn.perceive("Deployed on Raspberry Pi 5 with 8GB RAM", importance=0.7)
results = arn.recall("what hardware does the user run?", top_k=3)
arn.close()
I'm being upfront because I'd rather you hit these on my docs page than mid-project:
- **No inter-agent memory sharing**: each `agent_id` is isolated. If you need two agents to share knowledge, you'd have to build a sync layer on top. I haven't.
- **Contradiction detection is a word-overlap heuristic**: real NLI would be better. It works for most cases but will miss semantic contradictions that don't share vocabulary.
- **Temporal reasoning requires explicit tagging**: the system can't automatically figure out that a stored fact is outdated. You have to tag it. Auto-inferring this from content is an open problem.
- **Text only**: no images, audio, or structured data.
- **English-tuned by default**: the default models are English-only. Multilingual support means swapping to `paraphrase-multilingual-MiniLM-L12-v2` or similar.
- **workers=1 recommended**: the embedding model is ~90MB per process. Running multiple workers multiplies RAM usage. For higher throughput, put a reverse proxy in front and scale horizontally with separate containers.
- **Scoring thresholds are empirically tuned**: the weights work well in testing but I'm not certain they're the right defaults for every use case. If you tune them, I'd be interested in what you find.
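The contradiction-detection limitation is easier to see with a toy example. The sketch below uses Jaccard similarity over word sets as a stand-in for "word overlap"; ARN's actual heuristic isn't specified here, so treat the function and the example sentences as illustrative only.

```python
import re


def word_overlap(a, b):
    """Jaccard similarity over lowercase word sets (stand-in heuristic)."""
    words_a = set(re.findall(r"[a-z0-9']+", a.lower()))
    words_b = set(re.findall(r"[a-z0-9']+", b.lower()))
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


# Shares most of its vocabulary, so an overlap check can flag it for comparison.
caught = word_overlap("User prefers Python for scripting",
                      "User prefers Java for scripting")

# Contradictory in meaning but shares no words, so overlap alone never pairs them.
missed = word_overlap("User loves working late at night",
                      "Mohamed never codes after dark")
```

The first pair overlaps on four of six distinct words; the second pair overlaps on none, which is the class of conflict an NLI model would be needed to catch.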
If you're looking for somewhere to add real value:
- **NLI-based contradiction detection**: even a small cross-encoder would beat the word-overlap heuristic
- **Async consolidation**: it already runs as a background asyncio task, but batching and priority queue improvements would help high-throughput setups
- **Cross-agent shared semantic layer**: read-only organizational knowledge that multiple agents can draw on
- **Multilingual embedding support**: swap the default model, ensure the test suite covers non-English recall
- **LangChain / CrewAI adapters**: I built the OpenClaw plugin because that's what I use. Other frameworks need their own thin wrappers
- **Mem0/Zep comparison benchmark**: head-to-head on published benchmarks would make this more credible
PRs welcome. If you're unsure whether something fits, open an issue first.
PolyForm Small Business 1.0.0. See LICENSE.md and COMMERCIAL.md.
Short version:
- Free if you're an individual, researcher, hobbyist, or at a company with fewer than 100 people and under $1M revenue
- Paid license required if you're at a larger company using this commercially
If you fit the free tier, use it: keep the license file in your fork and you're done. If your company is over the threshold and you want to build on this, open an issue titled "Commercial licensing inquiry."
I picked this over MIT because this project took real work. If it's useful to you personally, I want you to have it free. If a corporation is making money off it, I'd like a share of that.
My name is Mohamed Mohamed (MrKali). I built this on a Raspberry Pi 5 I recovered from a corrupted SD card, using OpenClaw as my agent framework.
If you want to reach out, open an issue or reach me through the contacts on my GitHub profile. If you find bugs or have ideas, say so.
Thanks for looking at this.
Mohamed