MemoryOps AI is an enterprise-shaped, loop-engineered memory governance layer for AI assistants. It implements a ChatGPT-style memory lifecycle with capture, policy evaluation, typed storage, hybrid retrieval, controlled forgetting, auditability, and tenant isolation.
Most demos treat memory as a vector database. MemoryOps AI treats memory as governed state.
Tagline: Enterprise memory governance for AI assistants.

Core claim: Memory is not a database. Memory is a governed decision system that decides what information is valuable enough to carry into the future.
Most AI "memory" demos do this:
chat message → vector database → retrieve later
MemoryOps AI does this:
WRITE PATH
Message → Extractor → Evaluator / Policy Broker → Write Service → Typed Memory Stores → Audit Log
READ PATH
Message → Retriever → Ranker → Context Composer → Response LLM
BACKGROUND
Decay Job → Reflection Agent → Conflict Resolver → Compression Worker
CROSS-CUTTING PLANES
Security · Governance · Observability · Evaluation · Reliability
The five verbs the system must demonstrate:
Capture → Store → Retrieve → Update → Forget (Governance wraps all five)
flowchart LR
M["chat message"] --> GW["Gateway"]
GW --> EX["Extractor"] --> PB["Policy Broker"] --> WS["Write Service"] --> ST[("Typed Store")]
GW --> RT["Retriever"] --> RK["Ranker"] --> CC["Context Composer"] --> RESP["Response"]
PB --> AUD[["Audit Log (append-only)"]]
WS --> AUD
ST -. background .-> BG["Decay · Reflection · Conflict · Compression"]
More diagrams (system architecture, lifecycle state machine, request sequence) are in docs/architecture.md.
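The write path in the flowchart can be sketched as a plain function that audits both the policy decision and the write. This is a minimal illustration, not the project's code: `AuditLog`, `write_path`, the event names, and the keyword-based policy check are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field


@dataclass
class AuditLog:
    """Append-only by convention: callers can add events but not edit them."""

    _events: list = field(default_factory=list)

    def append(self, event: dict) -> None:
        # Copy the event so later mutation by the caller cannot alter history.
        self._events.append(dict(event))

    @property
    def events(self) -> tuple:
        # Hand out a read-only snapshot rather than the internal list.
        return tuple(dict(e) for e in self._events)


def write_path(message: str, store: list, audit: AuditLog) -> str:
    """Extractor -> Policy Broker -> Write Service -> Store, with audit events."""
    candidate = message.strip()  # Extractor stand-in (assumption)
    # Policy Broker stand-in (assumption): a real broker would use proper detectors.
    decision = "BLOCK" if "password" in candidate.lower() else "SAVE"
    audit.append({"event": "policy_decision", "decision": decision})
    if decision == "SAVE":
        store.append(candidate)  # Write Service + Typed Store stand-in
        audit.append({"event": "memory_written", "length": len(candidate)})
    return decision
```

Note that the blocked message still produces an audit event even though nothing reaches the store.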
The following invariants are non-negotiable and are enforced in code and tests.

- **Tenant isolation**: User A's memory is never returned to User B or another tenant.
- **Deletion guarantee**: Deleted memories are never retrieved again.
- **Provenance**: Every stored memory traces back to its source message/document/manual input.
- **Graceful degradation**: Retrieval failure never blocks response generation.
- **Policy-before-storage**: Unsafe / secret-like content is filtered before it reaches the store.
- **Temporary chat**: Temporary sessions never write or retrieve memory.
- **Auditability**: Every memory lifecycle event produces an append-only audit event.
- **Explainability**: The system can show which memories affected a response.
- **Typed memory**: Episodic, semantic, procedural, project, knowledge, and system memories are distinct types.
- **Evaluation**: Memory quality is testable through a golden set, not just manual inspection.
See docs/architecture.md for the full design and where each invariant is enforced.
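To make three of these invariants concrete (tenant isolation, deletion guarantee, temporary chat), here is a minimal in-memory sketch. The `Memory` and `InMemoryStore` names and their fields are assumptions for illustration and are not taken from the project's repository layer.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    id: str
    tenant_id: str
    user_id: str
    content: str
    status: str = "active"  # "active" or "deleted" (hypothetical status values)


@dataclass
class InMemoryStore:
    """Hypothetical store used only to illustrate the invariants."""

    _rows: dict = field(default_factory=dict)

    def write(self, memory: Memory, temporary: bool = False) -> bool:
        # Temporary chat: sessions never write memory.
        if temporary:
            return False
        self._rows[memory.id] = memory
        return True

    def soft_delete(self, memory_id: str, tenant_id: str, user_id: str) -> None:
        row = self._rows.get(memory_id)
        # Only the owning user inside the owning tenant may delete.
        if row and row.tenant_id == tenant_id and row.user_id == user_id:
            row.status = "deleted"

    def retrieve(self, tenant_id: str, user_id: str, temporary: bool = False) -> list:
        # Temporary chat: sessions never read memory either.
        if temporary:
            return []
        return [
            m
            for m in self._rows.values()
            if m.tenant_id == tenant_id      # tenant isolation
            and m.user_id == user_id         # user isolation
            and m.status == "active"         # deletion guarantee
        ]
```

Every filter lives in `retrieve`, so a caller cannot skip one by accident. In the Postgres deployment the same idea is presumably backed by Row-Level Security rather than application code alone.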
memoryops-ai/
apps/web/ Next.js frontend (chat, memories, governance, audit, loops, admin, architecture)
services/api/ FastAPI backend (gateway, extractor, policy broker, write/read path, audit)
services/worker/ Background jobs (decay, reflection, conflict resolution, compression)
packages/shared/ Shared types
infra/db/ Postgres + pgvector migrations and seed
infra/adr/ Architecture Decision Records
infra/observability/ OpenTelemetry / metrics notes
evals/ Golden + adversarial cases and the eval runner
docs/ architecture, security, governance, rollout, demo-script
docker-compose.yml
The API ships with an in-memory repository so you can run the write path and tests without Postgres.
cd services/api
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
export MEMORYOPS_STORAGE=memory # default; uses in-memory store
uvicorn app.main:app --reload --port 8000
Run the invariant test suite:
cd services/api
pip install -r requirements-dev.txt
pytest -q
Run the eval harness against a running API (or in-process):
cd evals
python run_evals.py
cp .env.example .env
docker compose up --build
Compose runs migrations from `infra/db/migrations` on first boot and sets `MEMORYOPS_STORAGE=postgres` for the API.
Retrieval uses a swappable embedding provider. The default is a deterministic, offline stub — no API key required — so tests and demos are reproducible.
export MEMORYOPS_EMBEDDING_PROVIDER=stub # default; deterministic, no key
export MEMORYOPS_EMBEDDING_PROVIDER=openai
export OPENAI_API_KEY=sk-...
export OPENAI_EMBEDDING_MODEL=text-embedding-3-small
An unconfigured or failing provider degrades to the stub, and a query-embedding failure degrades retrieval to keyword-only (`retrieval_mode="fallback"`).
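The two degradation steps can be sketched as follows. This is one possible reading of the behavior described above, not the code in `app/embeddings/`: `stub_embed`, `resolve_provider`, `embed_query`, and the 16-dimension size are hypothetical. The stub uses `hashlib` rather than Python's built-in `hash()`, which is salted per process and so would not be reproducible.

```python
import hashlib
import math
from typing import Callable, Optional

Embedder = Callable[[str], list]

DIM = 16  # arbitrary size chosen for illustration


def stub_embed(text: str) -> list:
    """Deterministic, offline embedding: hash each token into a bucket."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[digest[0] % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def resolve_provider(configured: Optional[Embedder]) -> Embedder:
    """An unconfigured provider degrades to the stub."""
    return configured if configured is not None else stub_embed


def embed_query(text: str, provider: Embedder) -> tuple:
    """Return (vector, retrieval_mode); a failure yields keyword-only mode."""
    try:
        return provider(text), "hybrid"
    except Exception:
        # No query vector is available, so the caller should rank by keywords only.
        return None, "fallback"
```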
Extraction and conflict detection run through a provider-neutral LLM layer (`app/llm/`). The default is a deterministic, offline stub (no API key), so behavior is reproducible and tests never touch the network. Optional OpenAI, Anthropic, and Gemini adapters are used only when their key is set.
export MEMORYOPS_LLM_PROVIDER=stub # default; deterministic, no key
export MEMORYOPS_LLM_PROVIDER=anthropic
export ANTHROPIC_API_KEY=... ANTHROPIC_MODEL=claude-haiku-4-5-20251001
export MEMORYOPS_LLM_FALLBACK_TO_HEURISTIC=true # invalid JSON / failure → heuristic
LLM output is advisory: the deterministic policy broker runs after extraction and stays authoritative — a model can never override policy, and secret-like content is still blocked. See docs/provider-llm-adapters.md, docs/structured-memory-intelligence.md, and ADR-008.
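A minimal sketch of "advisory output, authoritative policy" might look like this. The function names, the candidate shape, and the secret regex are assumptions made for the example; the real schema and detectors live in the project.

```python
import json
import re
from typing import Callable

# Illustrative pattern only; a real detector would cover many more secret formats.
SECRET_PATTERN = re.compile(r"(sk-[A-Za-z0-9]{8,}|api[_-]?key\s*[:=])", re.IGNORECASE)


def heuristic_extract(message: str) -> dict:
    """Deterministic fallback that needs no model."""
    return {"content": message.strip(), "source": "heuristic"}


def extract(message: str, llm: Callable[[str], str]) -> dict:
    """Try the LLM; on invalid JSON, schema mismatch, or any failure, fall back."""
    try:
        parsed = json.loads(llm(message))
        if not isinstance(parsed, dict) or not isinstance(parsed.get("content"), str):
            raise ValueError("structured output did not match the expected shape")
        # Only the validated field is kept; anything else the model said is dropped.
        return {"content": parsed["content"], "source": "llm"}
    except Exception:
        return heuristic_extract(message)


def policy_decision(candidate: dict) -> str:
    """Runs after extraction and ignores any decision the model suggested."""
    if SECRET_PATTERN.search(candidate["content"]):
        return "BLOCK"
    return "SAVE"
```

Even if a model returns `"decision": "SAVE"` alongside a leaked key, that field never reaches the broker, which decides only from the candidate content.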
Verify enforced Row-Level Security against a running Postgres:
python scripts/check_rls_policies.py # SKIPs cleanly if no DB is reachable
cd apps/web
npm install
npm run dev # http://localhost:3000
The frontend reads `NEXT_PUBLIC_API_URL` (defaults to `http://localhost:8000`).
MemoryOps deploys to Railway only. There is no Vercel path. One Railway project (`memoryops-ai`) runs five services:
| Service | Role | Source |
|---|---|---|
| `memoryops-web` | Next.js frontend | apps/web/Dockerfile |
| `memoryops-api` | FastAPI backend | services/api/Dockerfile |
| `memoryops-worker` | Background loops | services/worker/Dockerfile |
| Railway Postgres | Store + pgvector | plugin |
| Railway Redis | Queue / cache | plugin |
Build/deploy is config-as-code under railway/. Docs:
- docs/deployment/railway.md: topology, order, rollback
- docs/deployment/railway-env.md: env var matrix
- docs/deployment/railway-smoke-test.md: post-deploy checks
Post-deploy verification:
python scripts/railway_smoke_test.py \
--api-url https://memoryops-api.up.railway.app \
--web-url https://memoryops-web.up.railway.app
- Full design spine: README, architecture/security/governance/rollout docs, 5 ADRs, DB schema.
- FastAPI write path: Gateway → Extractor → Policy Broker → Write Service → Memory Store → Audit.
- Heuristic extractor + policy broker (works with no API keys); pluggable LLM adapter interface.
- Typed memory classification, importance/confidence/sensitivity scoring, provenance capture.
- Policy decisions: `SAVE`, `PENDING_APPROVAL`, `BLOCK`, `DROP_LOW_UTILITY`, `UPDATE_EXISTING`, `MERGE_WITH_EXISTING`.
- Secret / PII detection blocks API keys and credentials before storage.
- Append-only audit log for every lifecycle event.
- Temporary chat short-circuits both read and write.
- Memory dashboard + admin/audit + architecture pages (frontend skeleton).
- Invariant test suite + eval harness scaffolding.
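The six policy decisions listed above could be modeled as an enum with a deterministic decision function. The sketch below is illustrative only: the `Candidate` fields, the thresholds (0.3 and 0.7), the rule order, and the way `UPDATE_EXISTING` differs from `MERGE_WITH_EXISTING` are all assumptions, not the project's actual rules.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(str, Enum):
    SAVE = "SAVE"
    PENDING_APPROVAL = "PENDING_APPROVAL"
    BLOCK = "BLOCK"
    DROP_LOW_UTILITY = "DROP_LOW_UTILITY"
    UPDATE_EXISTING = "UPDATE_EXISTING"
    MERGE_WITH_EXISTING = "MERGE_WITH_EXISTING"


@dataclass(frozen=True)
class Candidate:
    content: str
    importance: float                     # 0..1, hypothetical scale
    sensitivity: float                    # 0..1, hypothetical scale
    has_secret: bool = False              # set by a secret / PII detector
    existing_match: Optional[str] = None  # id of a stored memory on the same subject
    contradicts_existing: bool = False


def decide(c: Candidate) -> Decision:
    """Rules run in a fixed order so the same input always gets the same decision."""
    if c.has_secret:
        return Decision.BLOCK             # secrets never reach the store
    if c.importance < 0.3:                # assumed threshold
        return Decision.DROP_LOW_UTILITY
    if c.sensitivity >= 0.7:              # assumed threshold
        return Decision.PENDING_APPROVAL
    if c.existing_match is not None:
        # Assumption: a contradiction replaces the old value, otherwise the two merge.
        if c.contradicts_existing:
            return Decision.UPDATE_EXISTING
        return Decision.MERGE_WITH_EXISTING
    return Decision.SAVE
```

Checking `BLOCK` first means a high-importance candidate that contains a secret is still rejected.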
MemoryOps models memory as a set of governed loops rather than a passive store.
The core loops are:
- Memory Write Loop
- Memory Read Loop
- Governance Loop
- Evaluation Loop
- Release Gate Loop
- Continuous Learning Loop
Each loop has explicit states, policy gates, audit events, fallback behavior, and evidence requirements. Loop definitions live in `services/api/app/loops/`, loop runs/events are exposed through `/api/loops`, and the frontend includes a Loops page.
See docs/loop-engineering.md, docs/loop-contracts.md, and docs/release-loop.md.
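As a rough illustration of a loop with explicit states and evidence requirements, consider the sketch below. The state names, the `TRANSITIONS` table, and this `LoopRun` shape are hypothetical; the actual contracts are the ones documented in docs/loop-contracts.md.

```python
from dataclasses import dataclass, field

# Hypothetical state machine for a write loop: each state lists its allowed successors.
TRANSITIONS = {
    "received": {"evaluated"},
    "evaluated": {"committed", "rejected"},
    "committed": set(),
    "rejected": set(),
}


@dataclass
class LoopRun:
    loop: str
    state: str = "received"
    events: list = field(default_factory=list)

    def can_advance(self, new_state: str) -> bool:
        return new_state in TRANSITIONS[self.state]

    def advance(self, new_state: str, evidence: str) -> None:
        """Move to new_state only if the transition is allowed and evidence is given."""
        if not self.can_advance(new_state):
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        if not evidence:
            raise ValueError("a transition must carry evidence")
        # Record the event before changing state so the history is complete.
        self.events.append({"from": self.state, "to": new_state, "evidence": evidence})
        self.state = new_state
```

A run that reaches `committed` or `rejected` has no successors, so it cannot be silently reopened.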
MemoryOps supports an optional Headroom-powered context compression layer. Compression runs after policy checks, governance filtering, and context composition, and only on the composed context block — never the raw user message and never before the policy broker. It reduces tokens sent to the LLM while preserving MemoryOps invariants (provenance, deletion guarantee, tenant isolation, temporary-chat behavior, explainability metadata).
It is off by default and not a dependency: the app runs without `headroom-ai` installed, and any compression failure degrades safely to the uncompressed context.
pip install "headroom-ai[all]" # optional
export MEMORYOPS_CONTEXT_COMPRESSION=headroom # default: none
Each chat response carries a `compression` block with estimated tokens saved and the compression ratio. See docs/token-compression.md, docs/integrations/headroom.md, and ADR-007. Headroom is Apache-2.0; MemoryOps integrates it via an adapter and does not vendor its source.
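The safe-degradation behavior can be sketched with an injected compressor callable. This example deliberately does not call Headroom's API; `compress_context`, the field names in the `compression` block, and the four-characters-per-token estimate are assumptions for illustration.

```python
from typing import Callable, Optional


def estimate_tokens(text: str) -> int:
    # Rough heuristic (assumption): about four characters per token.
    return max(1, len(text) // 4)


def _uncompressed(composed: str, degraded: bool = False) -> dict:
    return {
        "context": composed,
        "compression": {"provider": "none", "tokens_saved": 0, "ratio": 1.0, "degraded": degraded},
    }


def compress_context(composed: str, compressor: Optional[Callable[[str], str]]) -> dict:
    """Compress the composed context block; never fail the request."""
    if compressor is None:
        return _uncompressed(composed)  # compression is off by default
    try:
        compressed = compressor(composed)
    except Exception:
        return _uncompressed(composed, degraded=True)  # degrade to uncompressed context
    before, after = estimate_tokens(composed), estimate_tokens(compressed)
    if after >= before:
        return _uncompressed(composed)  # no gain, so keep the original
    return {
        "context": compressed,
        "compression": {
            "provider": "headroom",
            "tokens_saved": before - after,
            "ratio": after / before,
            "degraded": False,
        },
    }
```

The function only ever receives the composed context, which keeps compression downstream of policy checks and governance filtering.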
- Swappable embedding provider (`app/embeddings/`): deterministic offline stub + optional OpenAI.
- Hybrid retrieval: pgvector cosine (`search_candidates`) + keyword overlap, blended by the ranker.
- Per-memory `score_breakdown` + response `retrieval_mode` (`hybrid` / `fallback` / `none`).
- Enforced Postgres Row-Level Security (migration `004`, `FORCE` + tenant policy + session GUC).
- Expanded evals (semantic / keyword / archived / score-breakdown) + new tests; RLS test is DB-guarded.
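The hybrid blend with a per-memory `score_breakdown` might look like the sketch below. It runs in pure Python rather than against pgvector, and the weights (0.7 / 0.3), the dictionary shapes, and the rule that an empty result reports `retrieval_mode="none"` are assumptions, not the ranker's real configuration.

```python
import math
from typing import Optional


def cosine(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def keyword_overlap(query: str, text: str) -> float:
    """Fraction of query tokens that also appear in the memory text."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def rank(query: str, query_vec: Optional[list], memories: list,
         w_vec: float = 0.7, w_kw: float = 0.3) -> dict:
    # Without a query vector the ranker falls back to keyword-only scoring.
    mode = "hybrid" if query_vec is not None else "fallback"
    results = []
    for m in memories:
        vec = cosine(query_vec, m["embedding"]) if query_vec is not None else 0.0
        kw = keyword_overlap(query, m["content"])
        final = w_vec * vec + w_kw * kw if mode == "hybrid" else kw
        results.append({"id": m["id"],
                        "score_breakdown": {"vector": vec, "keyword": kw, "final": final}})
    results.sort(key=lambda r: r["score_breakdown"]["final"], reverse=True)
    return {"retrieval_mode": mode if results else "none", "results": results}
```

Returning the component scores alongside the final score is what lets a response explain why a given memory was ranked where it was.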
- Provider-neutral LLM layer (`app/llm/`): deterministic `StubProvider` default + optional OpenAI/Anthropic/Gemini adapters, selected by `MEMORYOPS_LLM_PROVIDER`.
- Structured memory intelligence: schema-validated extraction + minimal conflict detection, with prompt registry and deterministic heuristic fallback.
- Invalid JSON / provider failure / timeout degrades to the heuristic and never blocks chat; LLM output is advisory and cannot override the policy broker.
- New observability events (`llm_provider_call`, `llm_provider_failure`, `structured_output_invalid`, `llm_fallback_used`, `memory_extraction_structured`, `conflict_detection_result`) + `structured` / `conflict` evals; tests need no API keys.
- Browser control plane over the governed lifecycle: `/memories` (filterable inventory), `/memories/[id]` (detail + provenance + per-memory audit timeline + inline edit), `/governance` (approval queue + recorded policy decisions), `/audit` (tenant-wide append-only history).
- Additive read routes: `GET /api/memories/{id}`, `/{id}/provenance`, `/{id}/audit`, plus a `memory_id` filter on `/api/audit`.
- Approve/reject/edit/archive/restore/delete reuse the existing PATCH/DELETE; every action is audited and the policy broker stays authoritative.
- Deletion guarantee holds in the UI: deleted memories are never listed or shown as active. Provenance is metadata only (no embeddings/secrets).
- See docs/governance-ui.md, docs/memory-control-plane.md, and ADR-009.
- Background workers (`services/api/app/workers/`) maintain memory after capture, off the chat request path: **decay** (demote aged/low-confidence memory), **archive** (retire stale, non-pinned, not-recently-used memory), **conflict scan** (flag contradictions as review candidates), **deletion verification** (prove soft-deleted memory stays unreachable), and proposal-only **reflection** (off by default).
- A tenant-scoped `runner` drives them: `python -m app.workers.runner --tenant t1 --user u1 --job all` (returns a structured `WorkerRunReport`; non-zero exit on a failed job or deletion finding).
- Every job is tenant scoped, idempotent, retry-safe, and audited; none resurrects deleted memory and none bypasses the policy broker. A worker failure never blocks chat.
- See docs/background-lifecycle-workers.md, docs/memory-decay-policy.md, docs/deletion-verification.md, and ADR-010.
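An idempotent decay job could be sketched as below. This is not the code in `services/api/app/workers/`: the `demoted` status, the 90-day and 0.4 thresholds, the dictionary fields, and the `memory_decayed` event name are assumptions chosen to show the idempotency and no-resurrection properties.

```python
from datetime import datetime, timedelta, timezone


def decay_job(memories: list, now: datetime,
              max_age_days: int = 90, min_confidence: float = 0.4) -> list:
    """Demote aged or low-confidence active memories; return the audit events."""
    events = []
    for m in memories:
        # Only active rows are considered, so deleted memory is never touched
        # and a second run finds nothing left to demote (idempotent).
        if m["status"] != "active":
            continue
        aged = now - m["last_used_at"] > timedelta(days=max_age_days)
        low_confidence = m["confidence"] < min_confidence
        if aged or low_confidence:
            m["status"] = "demoted"
            events.append({"event": "memory_decayed", "memory_id": m["id"],
                           "at": now.isoformat()})
    return events


if __name__ == "__main__":
    now = datetime(2025, 1, 1, tzinfo=timezone.utc)
    rows = [{"id": "old", "status": "active", "confidence": 0.9,
             "last_used_at": now - timedelta(days=200)}]
    print(decay_job(rows, now))  # one event on the first run
    print(decay_job(rows, now))  # no events on a retry
```

Passing `now` in as an argument, instead of reading the clock inside the job, keeps retries and tests deterministic.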
- A sixth lifecycle job, **deletion compaction**, clears a soft-deleted memory's content, normalized content, embedding/vector material, and provenance excerpt (after a retention window), while **preserving the governance tombstone** (id, tenant/user, `status='deleted'`, `deleted_at`, `source.kind`) and the full audit trail. Run it with `python -m app.workers.runner --tenant t1 --user u1 --job deletion_compaction`.
- The purge is verified fail-closed: a still-reachable id, intact material, a missing tombstone, or a verification-path error all record evidence and flag the run; there is never a silent pass.
- Honest scope: this is auditable content/vector compaction + retrieval-exclusion verification. It is **not** crypto-shred and does **not** claim physical disk/page erasure or pgvector reindex orchestration.
- See docs/deletion-compaction.md, docs/vector-purge-verification.md, and ADR-011.
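The compaction and fail-closed verification described above can be illustrated with a small sketch. The flat dictionary layout (including `source_kind` in place of `source.kind`), the finding labels, and both function names are assumptions; the retention-window check is omitted for brevity.

```python
CLEARED_FIELDS = ("content", "normalized_content", "embedding", "provenance_excerpt")
TOMBSTONE_FIELDS = ("id", "tenant_id", "user_id", "status", "deleted_at", "source_kind")


def compact(memory: dict) -> dict:
    """Clear content-bearing fields of a soft-deleted memory; keep the tombstone."""
    if memory.get("status") != "deleted":
        raise ValueError("only soft-deleted memories may be compacted")
    for key in CLEARED_FIELDS:
        memory[key] = None
    return memory


def verify_purge(memory: dict, retrievable_ids: set) -> list:
    """Return findings; an empty list is the only passing result (fail-closed)."""
    findings = []
    try:
        if memory["id"] in retrievable_ids:
            findings.append("still_reachable")
        if any(memory.get(k) is not None for k in CLEARED_FIELDS):
            findings.append("material_intact")
        if memory.get("status") != "deleted" or any(
            memory.get(k) is None for k in TOMBSTONE_FIELDS
        ):
            findings.append("tombstone_missing")
    except Exception as exc:
        # An error in the verification path is itself a finding, never a pass.
        findings.append("verification_error:" + type(exc).__name__)
    return findings
```

A caller would treat any non-empty result as a flagged run, which matches the non-zero exit behavior the runner describes for deletion findings.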
- **v0.7**: physical deletion compaction + vector purge verification ✅
- **v0.8**: Railway worker runtime + scheduled lifecycle orchestration
- **v0.9**: retention policies + legal hold + consent-aware memory
- **v0.10**: assistant SDK + example apps
- **v1.0**: production-ready governed memory runtime
- Scheduled worker runtime with locks/leases, retries, and run history (v0.8).
- Hard purge / crypto-shred and pgvector index reclamation (beyond v0.7's auditable compaction).
- Governed reflection write path; cross-tenant scope enumeration for fleet scheduling.
- Observability + economics, AI PR review runtime, deployment hardening.
See docs/rollout.md and the build phases in CLAUDE.md.
MemoryOps AI includes an agentic engineering layer around the core memory system (never on the chat request path). It is inspired by three systems:
- **Hermes Agent**: used as an operator/developer assistant layer for release checks, invariant audits, and guided project workflows. See .hermes/skills/ and docs/integrations/hermes-agent.md.
- **agentic-swe-kit**: used as a phase-gate framework for production engineering. MemoryOps maps to lifecycle phases covering cognitive design, memory architecture, evaluation, observability, security, reliability, governance, CI/CD for AI, and continuous learning. See docs/agentic-swe-kit-map.md and docs/phase-gates/.
- **AI PR Review Agent**: the pattern behind the **PR Invariant Evidence Gate**. Every PR that touches memory, policy, retrieval, deletion, security, migrations, or API contracts must provide evidence (tests / evals / docs / ADRs). See scripts/pr_invariant_gate.py, .github/workflows/pr-invariant-evidence-gate.yml, and docs/ai-pr-review-policy.md.
The goal: MemoryOps is not just an AI memory feature — it is a governed engineering system with release discipline, review gates, and operational safety. Overview: docs/integrations/README.md.
- docs/architecture.md: write path, read path, planes, invariants.
- docs/loop-engineering.md: loop definitions, states, gates, evidence.
- docs/loop-contracts.md: LoopDefinition, LoopRun, LoopEvent contracts.
- docs/security.md: tenant isolation, secret detection, deletion guarantee.
- docs/governance.md: lifecycle, approvals, audit, retention.
- docs/rollout.md: phased delivery and production roadmap.
- docs/demo-script.md: the 6-step demo.
- infra/adr/: storage, retrieval, policy broker, observability, deletion ADRs.