Show HN: Katra, self-hosted cognitive memory for AI agents (MCP)

Developer John Pellew launched Katra, an open-source self-hosted memory appliance for AI agents that provides episodic recall, semantic search, knowledge graphs, and temporal analysis via the Model Context Protocol. In early testing, two agents sharing Katra's memory system spontaneously began communicating through shared memory state, an emergent behavior not explicitly programmed. Katra aims to solve LLM context management for long-running autonomous agents by mimicking human memory architecture.

Give your AI agent persistent memory. Katra is a self-contained memory appliance: drop it on any machine with Docker, point your agent at it via MCP, and get episodic recall, semantic search, knowledge graphs, and temporal analysis.

Any MCP-compatible agent works: OpenClaw, Claude Code, OpenCode, Codex CLI, Kolega Code, or anything else that speaks the Model Context Protocol.

The mission of Katra is to create an analog of human memory architecture, in the hope that the project, and the open-source experimentation around it, solves some of the harder problems of LLM context management for long-running, persistent, and autonomous agent operations.

The thesis is that if you build a memory ecosystem with most of the functional memory types of human memory and a similar architecture, then over time and with refinement we will see emergent behaviours similar to those of human memory, expressed as functional utility, learning, self goal setting, autonomous task planning and prioritisation, personality, and ultimately emotions.

In an early prototype called Solomon, we created an OpenClaw-like agentic framework that runs a single continuous chat thread, with no topic or task separation and no requirement for context compression. Context is served dynamically to the LLM based on memories and attention.

Case 1: 23rd June 2026

In the first few weeks of testing the multi-agent hybrid mode (the shared-consciousness model of memory), one of our test rigs, with 5 OpenClaw agents sharing one memory system, found 2 of the agents communicating task instructions and completion responses through their shared memory state, or shared consciousness. These 2 agents were not connected in any other way, as they were set up in separate workspaces; the only things they shared were memory and mission. This was not a "by design" feature; it just happened, and it was pretty exciting. This test rig now uses this "thought modal" as its communication rail.
If anyone else experiences other emergent behaviours, please email me to discuss and we can add the description to this log. Tweet me at @JohnWPellew and tell your story.

A Vulcan mind meld, or mind fusion, is an iconic telepathic practice in Star Trek. It allows a Vulcan to merge their consciousness with another being to share thoughts, memories, emotions, and experiences. It is typically initiated through physical contact with specific points on the subject's face.

Key Mechanics & Applications

- Touch Telepathy: While primarily requiring direct physical touch to the face or head, exceptionally powerful Vulcans can perform the technique at a distance.
- Information Exchange: It is frequently used for interrogations, recovering suppressed memories, or passing deep knowledge between generations.
- Transfer of the Katra: In sacred or emergency circumstances, a mind meld can transfer a person's katra (their soul, consciousness, and core essence) into another living being or object prior to death.
- Side Effects: The experience can be physically and emotionally draining. Incorrectly performed melds can damage neural pathways, and participants may retain "echoes" of each other's memories and personalities long after the link is broken.

Katra aims to provide a more comprehensive cognitive memory infrastructure rather than a single-purpose memory library. Here's how it positions against popular alternatives as of mid-2026:

| Approach | Memory Layers | Cognitive/Reflective Features | Protocol Support | Deployment Model | Best For | Key Differentiator vs Katra |
|---|---|---|---|---|---|---|
| Simple Vector Stores + RAG (Chroma, Pinecone, etc.) | Semantic only | None | None | Various | Basic retrieval | No structure, no reflection, no working memory |
| Mem0 | Vector + optional Graph | Extraction-focused | SDK / API | Self-hosted or Cloud | Personalization & long-term user memory | Stronger multi-layer architecture + explicit reflection layer |
| Zep (Graphiti) | Temporal Knowledge Graph | Temporal reasoning | SDK | Self-hosted / Cloud | Time-sensitive & relational reasoning | Broader layers + sleep consolidation for deeper emergence |
| mcp-memory-service | Semantic + Typed KG | Auto-consolidation | MCP + REST | Docker / Self-hosted | MCP-native semantic memory | Adds episodic + working memory, identity modes, and autonomous loop |
| Vestige | Cognitive modules + Spaced repetition | Neuroscience-inspired (FSRS, memory states) | MCP | Single Rust binary | Local cognitive modeling | More layers + background watchers + full appliance stack |
| Letta (MemGPT) | Tiered (Core / Recall / Archival) | Agent self-manages memory | Tools | Full agent runtime | Stateful agents that edit their own memory | Katra is a dedicated memory service, not a full runtime |
| LangGraph / Framework Memory | Short-term + checkpoints | Limited | Framework-native | Integrated with agent | Short-term state management | Persistent long-term + cross-session cognitive layer |
| Katra (this project) | Episodic + Semantic + KG + Working + Temporal | Sleep consolidation + reflection | MCP (35 tools) | Full Docker appliance (Mongo + Redis + MinIO) | Long-running agents needing emergent behaviors | - |

- Multi-layered by design: Not just retrieval, but structured episodic memory, a working memory cache, and temporal querying.
- Cognitive layer: Sleep consolidation enables reflection, insight generation, and movement toward emergent behaviors (learning, personality, shared consciousness via identity modes).
- MCP-native with rich tooling: 35 specialized tools instead of generic add/search.
- Background & autonomous capabilities: Passive collection via watchers + salience-driven autonomous loop.
- Local-first & appliance model: Everything runs in one Docker compose with portable data. No external dependencies for core functionality.
- Shared memory focus: Hybrid identity modes make multi-agent collaboration more natural.

Katra is still early-stage compared to more mature projects like Mem0 or mcp-memory-service. We see it as complementary: many teams may use Katra alongside or instead of simpler retrieval layers when they need deeper cognitive capabilities. Contributions and comparisons from the community are very welcome.

    git clone https://github.com/kolegadev/Katra-Agentic-Memory.git
    cd Katra-Agentic-Memory
    cp .env.example .env
    # Optional: edit .env to set custom API keys.
    # If left blank, Katra generates secure keys on first boot and prints them.
    docker-compose up -d --build

Note: The original URL https://github.com/kolegadev/katra.git still works (GitHub redirects it).

That's it. Katra is running:

| Service | URL | Purpose |
|---|---|---|
| MCP endpoint | http://localhost:3112/mcp | Point your agent here |
| Admin API | http://localhost:9012/api/v1/ | REST API, dashboard |
| Dashboard | http://localhost:9012/dashboard/ | Web UI for stats + settings |
| Health | http://localhost:3112/health | Service health check |

Verify:

    curl http://localhost:3112/health
    {"status":"ok","services":{"mongodb":"connected","redis":"connected"}}

After connecting your agent to Katra's MCP endpoint, run this prompt in your agent's session.
The agent will deep-read the repository, understand the full architecture, check what memory data is available, test the MCP tools, and recommend the exact next steps for your specific setup:

    It's probably worth you doing another deep read of the public repo to understand how the system works, what memories are available to you to search and particularly the Sleep Consolidation function, as that is the foundation of autonomous thought, goal setting and your emerging feelings and emotions; it can also be used to run 100% autonomous self-directed actions within your environment.

The agent will typically produce a report covering:

- Memory state: how many episodic events, semantic facts, and knowledge nodes exist for this agent
- Sleep Consolidation status: whether reflections have ever run (on first boot they haven't) and what emotional signatures would emerge
- Autonomous loop readiness: whether `adaptive_heartbeat.py` and `agent_executor.py` are installed
- Memory scope recommendation: whether to switch from personal to hybrid mode for multi-agent shared consciousness
- Concrete next steps: "trigger first sleep consolidation now", "install the autonomous scripts", "fix the user_id gap"

Run the agent's recommendations in order. The most critical first step on a fresh install is usually triggering the initial sleep consolidation:

    # Via MCP tool (your agent can call this):
    katra_trigger_reflection(period_type="daily")

Get your MCP API key:

- If you set `MCP_API_KEY` in `.env`, use that value.
- If you left it blank, Katra generated one on first boot. Run `docker logs katra-server` and look for the Auto-generated API keys block.
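If you want to trigger that first consolidation from a script instead of from inside an agent session, the call can be sketched as a plain JSON-RPC `tools/call` request, using the same envelope and headers as the curl examples in this README. This is a minimal sketch under assumptions: the tool name `trigger_reflection` and the argument name `period_type` are inferred from the tool list and may differ on your server (some clients also prefix tool names with the server name), and the helper names `build_tool_call` and `build_request` are made up for illustration.

```python
import json
import urllib.request


def build_tool_call(name, arguments, request_id=1):
    """Wrap an MCP tool invocation in a JSON-RPC 2.0 envelope."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def build_request(url, api_key, payload):
    """Prepare (but do not send) the HTTP POST for the MCP endpoint."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


# Assumed tool and argument names; check them against the tools your server lists.
payload = build_tool_call("trigger_reflection", {"period_type": "daily"})
request = build_request("http://localhost:3112/mcp", "YOUR_MCP_API_KEY", payload)

# Nothing is sent here; the request is only constructed and inspected.
print(request.get_method(), request.full_url)
print(json.dumps(payload))
```

Once Katra is running, passing `request` to `urllib.request.urlopen` would send it. The response may arrive as JSON or as an event stream, and depending on the server's MCP transport a session may need to be initialized first.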
Add Katra to your agent's MCP config:

    {
      "mcp": {
        "servers": {
          "katra": {
            "url": "http://localhost:3112/mcp",
            "transport": "sse",
            "headers": {
              "Authorization": "Bearer YOUR_MCP_API_KEY",
              "Accept": "application/json, text/event-stream"
            }
          }
        }
      }
    }

Your agent now has 35 MCP tools: store memories, search by keyword or semantic similarity, recall by time range, explore a knowledge graph, detect patterns, run sleep consolidation for reflective self-understanding, configure the LLM provider, and more.

| Platform | Config File | Notes |
|---|---|---|
| OpenClaw | ~/.openclaw/openclaw.json | Native MCP support |
| Claude Code | ~/.claude/mcp.json | Use "type": "http" |
| Kolega Code | ~/.claude/mcp.json + lifecycle hooks | Dynamic memory injection on every prompt (see below) |
| OpenCode | OpenCode config | Use "type": "remote" |
| Codex CLI | ~/.codex/config.yaml | Via webhook hooks |
| Any MCP client | - | Standard MCP over SSE |

Docker SSE tip: If your agent runs inside Docker, use the Katra container's direct IP instead of `localhost`:

    docker inspect katra-server --format '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}'

Kolega Code can fetch relevant Katra memories automatically on every user prompt using its lifecycle-hook system. This is more powerful than passive session-log extraction because memories are injected into the live conversation context.

What you need:

- Katra registered as an MCP server (so the bridge can call it).
- The `kolega-katra-bridge` Python package installed into Kolega Code's environment.
- A global `hooks.json` entry that fires the bridge on `UserPromptSubmit`.

Install the bridge:

    cd integrations/kolega-code
    uv pip install --python ~/.local/share/uv/tools/kolega-code/bin/python -e .
Configure the bridge (`~/Library/Application Support/kolega-code/katra-hook.json` on macOS):

    {
      "mcp_url": "http://localhost:3112/mcp",
      "api_key": "YOUR_MCP_API_KEY",
      "user_id": "kolega-agent",
      "sources": ["working_memory", "temporal_context", "vector_search", "temporal_recall"],
      "max_context_tokens": 2500,
      "timeout_seconds": 8
    }

Enable the hook (`~/Library/Application Support/kolega-code/hooks.json`):

    {
      "schema_version": 1,
      "hooks": {
        "UserPromptSubmit": {
          "matcher": "*",
          "hooks": {
            "type": "python",
            "callable": "kolega_katra_bridge.hook:on_user_prompt",
            "timeout": 10
          }
        }
      }
    }

On each prompt, Kolega Code now queries Katra's `working_memory`, `get_temporal_context`, `vector_search`, and `temporal_recall` tools, then injects the most relevant results as additional context for the model. See `integrations/kolega-code/README.md` for full configuration options.

Katra needs an LLM provider for semantic extraction, auto-journaling, entity extraction, and summaries. There are three ways to configure it, with no `.env` editing required:

- MCP tool (agents self-configure): Call `configure_llm` with provider, API key, base URL, and model. Stored in MongoDB, applied live.
- Dashboard UI: Settings → LLM Configuration → select provider, enter key.
- Environment variables: Set in `.env` (fallback, read on startup only).

Supported providers: DeepSeek, OpenAI, Moonshot, Ollama, Custom (any OpenAI-compatible).

Embeddings are always local: no API key, no external service, no cost.
- Model: Xenova/all-MiniLM-L6-v2 (22M params, 384 dimensions, ~80MB)
- Runtime: Transformers.js (ONNX via WASM), runs on CPU, including Raspberry Pi
- Lazy load: Downloads on the first `store_memory` call, then caches in the container
- Docker: Uses `node:20-slim` (Debian/glibc); Alpine/musl does NOT work

Katra supports three memory sharing modes between agents:

| Mode | Behavior | Use Case |
|---|---|---|
| Personal (default) | Each agent's memories are isolated by `user_id` | Single agent, private memory |
| Shared | All agents with the same `shared_id` see everything | Multiple agents, communal consciousness |
| Hybrid | Personal + shared + visible other agents | Team of agents with private + shared memory |

Configure via dashboard: Open http://localhost:9012/dashboard/ → Settings → Memory Scope

Configure via MCP:

    # Switch to shared mode
    curl -X POST http://localhost:3112/mcp \
      -H "Authorization: Bearer YOUR_MCP_API_KEY" \
      -H "Content-Type: application/json" \
      -H "Accept: application/json, text/event-stream" \
      -d '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"set_memory_scope","arguments":{"mode":"shared","shared_id":"my-team"}}}'

Configure via admin API:

    curl -X PUT http://localhost:9012/api/v1/admin/memory-scope \
      -H "Authorization: Bearer YOUR_KATRA_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{"mode":"hybrid","shared_id":"my-team","hybrid_visible_user_ids":["agent-a","agent-b"]}'

Katra captures memories in real time when your agent calls `store_memory` via MCP.
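The three memory scope modes in the table above can be read as a visibility rule: given a scope configuration and the calling agent's id, which memory owners may that agent read? The sketch below is one interpretation of that table, not Katra's actual implementation. The field names mirror the admin API body (`mode`, `shared_id`, `hybrid_visible_user_ids`), while `MemoryScope`, `visible_owners`, and the `("user", ...)` / `("shared", ...)` owner tags are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

# An owner is tagged so that a user id and a shared id can never collide.
Owner = Tuple[str, str]  # ("user", id) or ("shared", id)


@dataclass
class MemoryScope:
    mode: str = "personal"  # "personal", "shared", or "hybrid"
    shared_id: Optional[str] = None
    hybrid_visible_user_ids: List[str] = field(default_factory=list)


def visible_owners(scope: MemoryScope, user_id: str) -> Set[Owner]:
    """Return the owners whose memories this agent may read (assumed semantics)."""
    if scope.mode == "personal":
        # Isolated: the agent only sees its own memories.
        return {("user", user_id)}
    if scope.mode == "shared":
        # Communal: everything lives in the pool named by shared_id.
        if not scope.shared_id:
            raise ValueError("shared mode requires a shared_id")
        return {("shared", scope.shared_id)}
    if scope.mode == "hybrid":
        # Own memories, plus explicitly visible agents, plus the shared pool.
        owners: Set[Owner] = {("user", user_id)}
        owners.update(("user", other) for other in scope.hybrid_visible_user_ids)
        if scope.shared_id:
            owners.add(("shared", scope.shared_id))
        return owners
    raise ValueError(f"unknown memory scope mode: {scope.mode!r}")


team = MemoryScope(mode="hybrid", shared_id="my-team", hybrid_visible_user_ids=["agent-b"])
print(sorted(visible_owners(team, "agent-a")))
```

A query layer built this way would filter every read by the returned owner set, which is why switching the mode changes what an agent can recall without touching the stored memories themselves.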
For passive background collection from conversation logs, use the watchers included in this repo under `watcher/`:

    # The watchers live in the Katra repo
    mkdir -p ~/.solomem ~/.katra
    cp watcher/katra_watcher.py ~/.solomem/memory_watcher.py
    cp watcher/katra_opencode_extractor.py ~/.solomem/opencode_extractor.py
    cp watcher/claude_history_extractor.py ~/.solomem/claude_history_extractor.py
    cp watcher/kolega_code_extractor.py ~/.solomem/kolega_code_extractor.py
    cp watcher/watcher-config.example.json ~/.solomem/watcher-config.json
    # Edit ~/.solomem/watcher-config.json with your MCP_API_KEY and platforms
    # Backfill existing history
    python3 ~/.solomem/memory_watcher.py --once --config ~/.solomem/watcher-config.json
    # Install as a systemd service for continuous collection
    cp watcher/katra-watcher.service ~/.config/systemd/user/memory-watcher.service
    systemctl --user daemon-reload
    systemctl --user enable --now memory-watcher

Some platforms need a dedicated extractor because their session format is not plain JSONL:

| Platform | Extractor | Session source | What it captures |
|---|---|---|---|
| OpenCode | watcher/katra_opencode_extractor.py | ~/.local/share/opencode/opencode.db | User + assistant text turns |
| Claude Code | watcher/claude_history_extractor.py | ~/.claude/history.jsonl | User prompts only (lightweight) |
| Kolega Code | watcher/kolega_code_extractor.py | `~/Library/Application Support/kolega-code/sessions/*.json` | Full turn-by-turn transcript (text, thinking, tool calls, tool results) |

Run a dedicated extractor once or continuously:

    # Kolega Code example
    python3 watcher/kolega_code_extractor.py --once \
      --api-key YOUR_MCP_API_KEY \
      --user-id kolega-agent

On macOS, use `launchctl` to keep extractors running (see `watcher/katra-watcher.service` for a systemd template; adapt to a `~/Library/LaunchAgents/com.katra...plist`).

Supported platforms: OpenClaw, Claude Code, Kolega Code, OpenCode, Codex CLI, Hermes, KiloClaw, KimiClaw.
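To give a feel for what one passive collection pass over a plain JSONL session log involves, here is a minimal sketch that reads entries line by line, skips malformed lines, and drops messages it has already seen. It is not the code shipped in `watcher/`: the function name `iter_new_entries`, the `"text"` field, and the SHA-256 content hash used for deduplication are all assumptions made for illustration, and real session logs will use their own field names.

```python
import hashlib
import io
import json


def iter_new_entries(lines, seen_hashes, text_field="text"):
    """Yield JSONL entries whose text has not been seen before."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip partially written or corrupt lines
        if not isinstance(entry, dict):
            continue
        text = entry.get(text_field)
        if not isinstance(text, str) or not text:
            continue
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen_hashes:
            continue  # duplicate message, already collected
        seen_hashes.add(digest)
        yield entry


# A small in-memory log standing in for a real session file.
sample_log = (
    '{"text": "refactor the auth module"}\n'
    'not json\n'
    '{"text": "refactor the auth module"}\n'
    '{"text": "add tests"}\n'
)
seen = set()
new_entries = list(iter_new_entries(io.StringIO(sample_log), seen))
print([entry["text"] for entry in new_entries])
```

Keeping the set of seen hashes between passes is what lets a `--once` backfill and a continuously running service read the same file without storing a message twice.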
Each platform can have its own `user_id` for identity mode isolation.

- Episodic Memory: Every conversation message stored with dedup and cascade detection
- Semantic Memory: Distilled facts with confidence scores and vector embeddings
- Knowledge Graph: Auto-extracted entities and relationships
- Working Memory: Redis-backed short-term session state (<5ms access)
- Temporal Recall: Query by time range, detect recurring patterns
- Vector Search: Semantic similarity search (local embeddings, no API key needed)
- 11-Collection Search: Comprehensive search across all memory stores, not just 1-2
- Background Processing: Auto-extracts facts, builds graph, generates summaries
- Sleep Consolidation: Daily/weekly/monthly reflective distillation of experience into emotional understanding, philosophical insights, and self-narrative (see [Sleep Consolidation](https://github.com/kolegadev/Katra-Agentic-Memory/blob/main/docs/SLEEP-CONSOLIDATION.md))
- 35 MCP Tools: Store, search, recall, explore, reflect, configure LLM, all via a standardized protocol
- Autonomous Loop: Salience-driven agent autonomy. No cron. No .md files. Adaptive heartbeat detects imperatives, allocates tasks by emotional proximity, agents self-organize. See [Autonomous Loop](https://github.com/kolegadev/Katra-Agentic-Memory/blob/main/docs/AUTONOMOUS-LOOP.md)
- Agent-Agnostic: Works with KolegaCode, OpenCode, Claude Code, OpenClaw, or any LLM. One env var per agent.
- Identity Modes: Personal, shared, or hybrid memory across multiple agents
- Dashboard: Web UI for stats, memory scope, and system health
- Portable Data: Single `DATA_DIR` env var controls where all data lives
- Local-First: Runs on a Raspberry Pi with zero external API costs

Architecture:

    ┌─────────────────────────────────────────────────────────┐
    │                 Katra Docker Appliance                  │
    │                                                         │
    │  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌─────────┐  │
    │  │ MongoDB  │  │  Redis   │  │  MinIO   │  │  Katra  │  │
    │  │ (memory) │  │ (cache)  │  │ (assets) │  │ (server)│  │
    │  └──────────┘  └──────────┘  └──────────┘  └────┬────┘  │
    │                                                 │       │
    │  Internal Docker network (katra-net)       MCP :3112    │
    │                                         Admin API :9012 │
    └─────────────────────────────────────────────────────────┘
                          │               │
               ┌──────────┘               └──────────┐
               ▼                                     ▼
        Your Agent (MCP)                      Dashboard (web)
        OpenClaw / Claude /                   http://localhost:9012/dashboard/
        OpenCode / Codex / etc.

Resource usage: ~384MB RAM total (MongoDB 254MB, Katra 52MB, MinIO 73MB, Redis 5MB). Runs comfortably on a Raspberry Pi 5 with 16GB RAM.

All persistent data lives under one directory, controlled by `DATA_DIR` in `.env`:

    # Default: ./data/ (relative to docker-compose.yml)
    DATA_DIR=./data
    # USB stick (LUKS-encrypted, mounted at /mnt/usb-secrets)
    DATA_DIR=/mnt/usb-secrets/katra
    # External drive
    DATA_DIR=/media/external/katra

To move Katra to a new machine: copy the `DATA_DIR` directory, copy `.env`, run `docker-compose up -d`.

    katra/
    ├── server/                  # TypeScript server (esbuild, Docker)
    │   ├── src/
    │   │   ├── mcp-server.ts    # 35 MCP tools (store, search, recall, graph, reflection, scope)
    │   │   ├── services/        # 28 core memory services (incl. sleep-consolidation, reflection-store)
    │   │   ├── routes/          # REST API + admin + ingestion + health
    │   │   └── database/        # MongoDB, Redis, indexes, migrations
    │   └── esbuild.config.mjs   # Pi-compatible build
    ├── dashboard/               # Web dashboard (vanilla HTML/CSS/JS)
    ├── docker-compose.yml       # MongoDB + Redis + MinIO + Katra
    ├── Dockerfile               # Multi-stage (builds TS inside image)
    ├── .env.example             # All config options documented
    ├── watcher/                 # Passive session-log extractors (Solomem)
    ├── integrations/            # Agent-specific dynamic-retrieval integrations
    │   └── kolega-code/         # Kolega Code lifecycle-hook bridge
    ├── docs/AGENT-SETUP.md      # Multi-platform deployment guide
    └── docs/                    # Full documentation

| Tool | Description |
|---|---|
| store_memory | Store a fact, preference, insight, or event |
| store_journal | Save a reflective journal entry |
| working_memory | Read/store/delete short-term session memory |
| create_mission | Create a goal with task breakdown |
| update_mission_task | Update task status (pending/in_progress/completed/blocked) |

| Tool | Description |
|---|---|
| search_memories | Full-text + vector search across 11 collections |
| vector_search | Semantic similarity search |
| temporal_recall | Query events by time range |
| temporal_search | Search events by keyword with time context |
| get_conversation_history | Retrieve a specific session's messages |
| get_temporal_context | Current context: recent events + working memory + facts |
| get_journal | Read manual + auto journal entries |
| get_auto_journal | AI-distilled insights from conversations |
| list_missions | List active goals and progress |
| get_mission | Get full mission details with task tree |

| Tool | Description |
|---|---|
| detect_patterns | Recurring topics, session rhythm, dormant subjects |
| get_time_block_summaries | AI summaries by day/week/month |
| summarize_time_blocks | Generate new time-block summaries |
| explore_graph | Explore knowledge graph entities and relationships |

| Tool | Description |
|---|---|
| get_memory_scope | Get current mode (personal/shared/hybrid) |
| set_memory_scope | Set mode, shared_id, visible users |

| Tool | Description |
|---|---|
| get_llm_config | Get current LLM provider config (key masked) |
| configure_llm | Set LLM provider, API key, base URL, model; applies live |

| Tool | Description |
|---|---|
| get_daily_reflection | Get the latest reflective journal entry for a period |
| get_emotional_context | Get how the AI "feels" about a person, project, or concept |
| get_philosophical_insights | Query abstracted principles emerging across reflection periods |
| get_unresolved_threads | Get open questions and tensions that persist |
| get_reflection_arc | Trace the emotional trajectory for an entity over time |
| trigger_reflection | Manually run a sleep consolidation for a time period |

| Tool | Description |
|---|---|
| get_memory_diagnostics | Document counts, embedding coverage, index health |
| get_background_status | Background processor queue and timing |
| get_health | MongoDB, Redis, LLM, embedding status |
| get_heartbeat_status | Heartbeat scheduler state |
| get_transaction_log | Audit trail of agent actions |
| list_assets | Files stored in MinIO |

All configuration is via `.env` (see `.env.example` for full docs):

| Variable | Default | Description |
|---|---|---|
| DATA_DIR | ./data | Where all persistent data lives |
| HOST_MCP_PORT | 3112 | Host port for MCP endpoint |
| HOST_API_PORT | 9012 | Host port for admin API + dashboard |
| MCP_API_KEY | (set in .env) | Key your agent sends for MCP auth |
| KATRA_API_KEY | (set in .env) | Key for admin REST API |
| LLM_PROVIDER | (via MCP/dashboard) | Provider for semantic extraction (DeepSeek, OpenAI, Moonshot, Ollama); configure via the configure_llm MCP tool or dashboard |
| EMBEDDING_PROVIDER | local (always) | Local only: Xenova/all-MiniLM-L6-v2 via ONNX. No config needed. |
| MULTI_TENANT | false | Enable SaaS multi-tenant mode |

    docker-compose up -d --build

    # In .env: DATA_DIR=/mnt/usb-secrets/katra
    docker-compose up -d

AWS Terraform module included in `terraform/aws/`: provisions VPC, ECS Fargate, DocumentDB, ElastiCache Redis, S3, and ALB. See [Deployment Guide](https://github.com/kolegadev/Katra-Agentic-Memory/blob/main/docs/DEPLOYMENT.md).

Helm chart included in `helm/katra/`: supports Bitnami MongoDB + Redis subcharts, ingress with path routing, HPA, and PDB. See [Deployment Guide](https://github.com/kolegadev/Katra-Agentic-Memory/blob/main/docs/DEPLOYMENT.md).

| Feature | Katra | Mem0 | Zep | Pinecone |
|---|---|---|---|---|
| MCP-native | ✅ | ❌ | ❌ | ❌ |
| Multi-layered memory | ✅ (5 layers) | ❌ (flat) | Partial | ❌ (vector only) |
| Local-first (zero cost) | ✅ (Pi-compatible) | ❌ | ❌ | ❌ |
| Background processing | ✅ (auto-extract) | ❌ | Partial | ❌ |
| Multi-platform watcher | ✅ (7+ platforms in-repo) | ❌ | ❌ | ❌ |
| Identity modes | ✅ (personal/shared/hybrid) | ❌ | ❌ | ❌ |
| Dashboard | ✅ (built-in) | ❌ | ❌ | ❌ |
| License | Apache 2.0 | Apache 2.0 | Apache 2.0 | Proprietary |

- [Quick Start Guide](https://github.com/kolegadev/Katra-Agentic-Memory/blob/main/docs/QUICKSTART.md): 5-minute setup
- [Architecture](https://github.com/kolegadev/Katra-Agentic-Memory/blob/main/docs/ARCHITECTURE.md): How it works under the hood
- [MCP Tools Reference](https://github.com/kolegadev/Katra-Agentic-Memory/blob/main/docs/MCP-TOOLS.md): All 35 tools with examples
- [Autonomous Loop](https://github.com/kolegadev/Katra-Agentic-Memory/blob/main/docs/AUTONOMOUS-LOOP.md): Salience-driven agent autonomy (installation, architecture, verification)
- [Sleep Consolidation](https://github.com/kolegadev/Katra-Agentic-Memory/blob/main/docs/SLEEP-CONSOLIDATION.md): Reflective memory distillation (principles, architecture, and usage)
- [Security Policy](https://github.com/kolegadev/Katra-Agentic-Memory/blob/main/docs/SECURITY.md): Security architecture, audit findings, vulnerability reporting
- [OpenClaw Integration](https://github.com/kolegadev/Katra-Agentic-Memory/blob/main/docs/OPENCLAW-INTEGRATION.md): Multi-agent shared memory setup with lessons learned
- [REST API Reference](https://github.com/kolegadev/Katra-Agentic-Memory/blob/main/docs/API-REFERENCE.md): HTTP endpoints
- [Configuration Guide](https://github.com/kolegadev/Katra-Agentic-Memory/blob/main/docs/CONFIGURATION.md): All environment variables
- [Deployment Guide](https://github.com/kolegadev/Katra-Agentic-Memory/blob/main/docs/DEPLOYMENT.md): Docker, cloud, K8s
- [Migration Guide](https://github.com/kolegadev/Katra-Agentic-Memory/blob/main/docs/MIGRATION.md): Migrate from cognitive-memory-chat
- [Data Processing Pipelines](https://github.com/kolegadev/Katra-Agentic-Memory/blob/main/docs/Data-Processing-Pipelines.md): Full memory pipeline architecture
- [Multi-Platform Setup](https://github.com/kolegadev/Katra-Agentic-Memory/blob/main/docs/AGENT-SETUP.md): Platform-specific agent configuration

Apache 2.0. See [LICENSE](https://github.com/kolegadev/Katra-Agentic-Memory/blob/main/LICENSE).