Show HN: MemoryOps – governed memory infrastructure for AI assistants

MemoryOps AI is an enterprise-shaped, loop-engineered memory governance layer for AI assistants. It implements a ChatGPT-style memory lifecycle with capture, policy evaluation, typed storage, hybrid retrieval, controlled forgetting, auditability, and tenant isolation.

Most demos treat memory as a vector database. MemoryOps AI treats memory as governed state.

Tagline: Enterprise memory governance for AI assistants.

Core claim: Memory is not a database. Memory is a governed decision system that decides what information is valuable enough to carry into the future.

Most AI "memory" demos do this:

chat message → vector database → retrieve later

MemoryOps AI does this:

WRITE PATH
Message → Extractor → Evaluator / Policy Broker → Write Service → Typed Memory Stores → Audit Log

READ PATH
Message → Retriever → Ranker → Context Composer → Response LLM

BACKGROUND
Decay Job → Reflection Agent → Conflict Resolver → Compression Worker

CROSS-CUTTING PLANES
Security · Governance · Observability · Evaluation · Reliability

The five verbs the system must demonstrate:

Capture → Store → Retrieve → Update → Forget   (Governance wraps all five)
flowchart LR
    M["chat message"] --> GW["Gateway"]
    GW --> EX["Extractor"] --> PB["Policy Broker"] --> WS["Write Service"] --> ST[("Typed Store")]
    GW --> RT["Retriever"] --> RK["Ranker"] --> CC["Context Composer"] --> RESP["Response"]
    PB --> AUD[["Audit Log (append-only)"]]
    WS --> AUD
    ST -. background .-> BG["Decay · Reflection · Conflict · Compression"]

More diagrams (system architecture, lifecycle state machine, request sequence) are in docs/architecture.md.
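
The write path above can be sketched as a small pipeline. This is a minimal illustration, not the project's code: the `Pipeline` class, its method names, the toy extraction rule, and the toy policy rules are all hypothetical, and only three of the policy decisions are modeled.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    SAVE = "SAVE"
    BLOCK = "BLOCK"
    DROP_LOW_UTILITY = "DROP_LOW_UTILITY"


@dataclass
class Pipeline:
    store: list = field(default_factory=list)  # stand-in for the typed store
    audit: list = field(default_factory=list)  # append-only: entries are only added

    def extract(self, message: str) -> list:
        # Toy extractor (assumption): sentences starting with "I " are candidates.
        parts = [s.strip() for s in message.split(".")]
        return [s for s in parts if s.startswith("I ")]

    def evaluate(self, candidate: str) -> Decision:
        # Toy policy broker (assumption): block secret-like text, drop tiny facts.
        if "password" in candidate.lower() or "sk-" in candidate:
            return Decision.BLOCK
        if len(candidate.split()) < 3:
            return Decision.DROP_LOW_UTILITY
        return Decision.SAVE

    def handle_message(self, tenant: str, message: str, temporary: bool = False) -> list:
        if temporary:
            return []  # temporary chat never writes memory
        decisions = []
        for candidate in self.extract(message):
            decision = self.evaluate(candidate)
            if decision is Decision.SAVE:
                # Provenance: keep a pointer back to the source message.
                self.store.append({"tenant": tenant, "content": candidate, "source": message})
            # Every decision is audited, including blocks and drops.
            self.audit.append({"tenant": tenant, "event": decision.value})
            decisions.append(decision)
        return decisions
```

Note that the policy check runs before anything reaches the store, and the audit entry records the decision without copying blocked content.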

The following invariants are non-negotiable and are enforced in code and tests.

Tenant isolation: User A's memory is never returned to User B or another tenant.
Deletion guarantee: Deleted memories are never retrieved again.
Provenance: Every stored memory traces back to its source message/document/manual input.
Graceful degradation: Retrieval failure never blocks response generation.
Policy-before-storage: Unsafe / secret-like content is filtered before it reaches the store.
Temporary chat: Temporary sessions never write or retrieve memory.
Auditability: Every memory lifecycle event produces an append-only audit event.
Explainability: The system can show which memories affected a response.
Typed memory: Episodic, semantic, procedural, project, knowledge, and system memories differ.
Evaluation: Memory quality is testable through a golden set, not just manual inspection.

See docs/architecture.md for the full design and where each invariant is enforced.
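
As an illustration of how the tenant-isolation and deletion-guarantee invariants can be pushed into the repository layer, here is a minimal sketch. The `Memory` and `InMemoryRepo` names and fields are hypothetical and are not taken from the project; the point is only that every read is scoped by tenant and user and excludes deleted rows.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    id: str
    tenant_id: str
    user_id: str
    content: str
    status: str = "active"  # "active" or "deleted"


class InMemoryRepo:
    """Hypothetical repository: callers cannot search without a tenant and user."""

    def __init__(self) -> None:
        self._rows = {}

    def add(self, memory: Memory) -> None:
        self._rows[memory.id] = memory

    def soft_delete(self, tenant_id: str, memory_id: str) -> bool:
        row = self._rows.get(memory_id)
        # A tenant can only delete its own rows.
        if row is None or row.tenant_id != tenant_id:
            return False
        row.status = "deleted"
        return True

    def search(self, tenant_id: str, user_id: str, term: str) -> list:
        return [
            m for m in self._rows.values()
            if m.tenant_id == tenant_id          # tenant isolation
            and m.user_id == user_id             # user isolation
            and m.status == "active"             # deletion guarantee
            and term.lower() in m.content.lower()
        ]
```

Because the filters live in the only read method, a test suite can assert the invariants directly against the repository instead of relying on each caller to filter correctly.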

memoryops-ai/
  apps/web/            Next.js frontend (chat, memories, governance, audit, loops, admin, architecture)
  services/api/        FastAPI backend (gateway, extractor, policy broker, write/read path, audit)
  services/worker/     Background jobs (decay, reflection, conflict resolution, compression)
  packages/shared/     Shared types
  infra/db/            Postgres + pgvector migrations and seed
  infra/adr/           Architecture Decision Records
  infra/observability/ OpenTelemetry / metrics notes
  evals/               Golden + adversarial cases and the eval runner
  docs/                architecture, security, governance, rollout, demo-script
  docker-compose.yml

The API ships with an in-memory repository so you can run the write path and tests without Postgres.

cd services/api
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
export MEMORYOPS_STORAGE=memory          # default; uses in-memory store
uvicorn app.main:app --reload --port 8000

Run the invariant test suite:

cd services/api
pip install -r requirements-dev.txt
pytest -q

Run the eval harness against a running API (or in-process):

cd evals
python run_evals.py

To run the full stack with Docker Compose:

cp .env.example .env
docker compose up --build

Compose runs migrations from infra/db/migrations on first boot and sets MEMORYOPS_STORAGE=postgres for the API.

Retrieval uses a swappable embedding provider. The default is a deterministic, offline stub — no API key required — so tests and demos are reproducible.

export MEMORYOPS_EMBEDDING_PROVIDER=stub     # default; deterministic, no key
# optional: use OpenAI embeddings instead of the stub
export MEMORYOPS_EMBEDDING_PROVIDER=openai
export OPENAI_API_KEY=sk-...
export OPENAI_EMBEDDING_MODEL=text-embedding-3-small

An unconfigured or failing provider degrades to the stub, and a query-embedding failure degrades retrieval to keyword-only (retrieval_mode="fallback").
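
The degradation behavior can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: `stub_embed`, `keyword_overlap`, `retrieve`, the SHA-256-based stub, and the 50/50 blend are all hypothetical choices made for the example.

```python
import hashlib
import math


def stub_embed(text: str, dims: int = 8) -> list:
    """Deterministic offline embedding: SHA-256 digest bytes scaled to [0, 1]."""
    digest = hashlib.sha256(text.lower().encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dims]]


def cosine(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_overlap(query: str, doc: str) -> float:
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0


def retrieve(query: str, docs: list, embed=stub_embed) -> dict:
    try:
        qvec = embed(query)
        doc_vecs = [embed(d) for d in docs]
    except Exception:
        # Embedding failed: rank by keyword overlap only and say so.
        ranked = sorted(docs, key=lambda d: keyword_overlap(query, d), reverse=True)
        return {"retrieval_mode": "fallback", "results": ranked}
    scored = [
        (0.5 * cosine(qvec, v) + 0.5 * keyword_overlap(query, d), d)
        for d, v in zip(docs, doc_vecs)
    ]
    ranked = [d for _, d in sorted(scored, key=lambda p: p[0], reverse=True)]
    return {"retrieval_mode": "hybrid", "results": ranked}
```

The caller always gets a result and can read `retrieval_mode` to see which path produced it, so an embedding outage never turns into a failed chat response.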

Extraction and conflict detection run through a provider-neutral LLM layer (app/llm/). The default is a deterministic, offline stub (no API key), so behavior is reproducible and tests never touch the network. Optional OpenAI, Anthropic, and Gemini adapters are used only when their key is set.

export MEMORYOPS_LLM_PROVIDER=stub          # default; deterministic, no key
# optional: use a hosted provider instead of the stub
export MEMORYOPS_LLM_PROVIDER=anthropic
export ANTHROPIC_API_KEY=...   ANTHROPIC_MODEL=claude-haiku-4-5-20251001
export MEMORYOPS_LLM_FALLBACK_TO_HEURISTIC=true   # invalid JSON / failure → heuristic

LLM output is advisory: the deterministic policy broker runs after extraction and stays authoritative — a model can never override policy, and secret-like content is still blocked. See docs/provider-llm-adapters.md, docs/structured-memory-intelligence.md, and ADR-008.
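
A minimal sketch of what "advisory" means in practice is shown below. The regular expressions, the `policy_decision` function, and the rule that unknown suggestions go to PENDING_APPROVAL are assumptions made for illustration; the project's actual detection rules are not described here.

```python
import re

# Hypothetical secret-like patterns; a real broker would use a broader set.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{16,}"),
    re.compile(r"(?i)\b(password|api[_-]?key|secret)\b\s*[:=]\s*\S+"),
]

ALLOWED_SUGGESTIONS = {"SAVE", "PENDING_APPROVAL", "DROP_LOW_UTILITY"}


def policy_decision(content: str, llm_suggestion: str) -> str:
    """Deterministic broker: the model's suggestion is an input, never the verdict."""
    # Secret-like content is blocked regardless of what the model suggested.
    if any(p.search(content) for p in SECRET_PATTERNS):
        return "BLOCK"
    # A recognized suggestion is accepted only after the checks above pass.
    if llm_suggestion in ALLOWED_SUGGESTIONS:
        return llm_suggestion
    # Anything unrecognized is routed to a human (assumption).
    return "PENDING_APPROVAL"
```

For example, `policy_decision("my api_key = abc123", "SAVE")` returns `"BLOCK"` even though the model asked to save.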

Verify enforced Row-Level Security against a running Postgres:

python scripts/check_rls_policies.py        # SKIPs cleanly if no DB is reachable

Run the frontend:

cd apps/web
npm install
npm run dev          # http://localhost:3000

The frontend reads NEXT_PUBLIC_API_URL (defaults to http://localhost:8000).

MemoryOps deploys to Railway only. There is no Vercel path. One Railway project (memoryops-ai) runs five services:

| Service | Role | Source |
| --- | --- | --- |
| `memoryops-web` | Next.js frontend | `apps/web/Dockerfile` |
| `memoryops-api` | FastAPI backend | `services/api/Dockerfile` |
| `memoryops-worker` | Background loops | `services/worker/Dockerfile` |
| Railway Postgres | Store + pgvector | plugin |
| Railway Redis | Queue / cache | plugin |

Build/deploy is config-as-code under railway/. Docs:

docs/deployment/railway.md: topology, order, rollback
docs/deployment/railway-env.md: env var matrix
docs/deployment/railway-smoke-test.md: post-deploy checks

Post-deploy verification:

python scripts/railway_smoke_test.py \
  --api-url https://memoryops-api.up.railway.app \
  --web-url https://memoryops-web.up.railway.app

Full design spine: README, architecture/security/governance/rollout docs, 5 ADRs, DB schema.
FastAPI write path: Gateway → Extractor → Policy Broker → Write Service → Memory Store → Audit.
Heuristic extractor + policy broker (works with no API keys); pluggable LLM adapter interface.
Typed memory classification, importance/confidence/sensitivity scoring, provenance capture.
Policy decisions: SAVE, PENDING_APPROVAL, BLOCK, DROP_LOW_UTILITY, UPDATE_EXISTING, MERGE_WITH_EXISTING.
Secret / PII detection blocks API keys and credentials before storage.

Append-only audit log for every lifecycle event.
Temporary chat short-circuits both read and write.
Memory dashboard + admin/audit + architecture pages (frontend skeleton).
Invariant test suite + eval harness scaffolding.

MemoryOps models memory as a set of governed loops rather than a passive store.

The core loops are:

Memory Write Loop
Memory Read Loop
Governance Loop
Evaluation Loop
Release Gate Loop
Continuous Learning Loop

Each loop has explicit states, policy gates, audit events, fallback behavior, and evidence requirements. Loop definitions live in services/api/app/loops/, loop runs/events are exposed through /api/loops, and the frontend includes a Loops page.

See docs/loop-engineering.md, docs/loop-contracts.md, and docs/release-loop.md.
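
To make the idea of a loop with explicit states and an evidence gate concrete, here is a small sketch. The class names echo the contract names mentioned in the docs list, but every field and the gating rule (evidence is required before entering the final state) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LoopDefinition:
    name: str
    states: tuple                 # ordered states, last one is terminal
    required_evidence: frozenset  # evidence needed before the final state


@dataclass
class LoopRun:
    definition: LoopDefinition
    position: int = 0
    evidence: set = field(default_factory=set)
    events: list = field(default_factory=list)  # audit-style event trail

    @property
    def state(self) -> str:
        return self.definition.states[self.position]

    def advance(self) -> bool:
        last = len(self.definition.states) - 1
        if self.position >= last:
            return False  # already in the terminal state
        missing = self.definition.required_evidence - self.evidence
        if self.position == last - 1 and missing:
            # Gate: record why the run is blocked instead of failing silently.
            self.events.append({"event": "gate_blocked", "missing": sorted(missing)})
            return False
        self.position += 1
        self.events.append({"event": "state_entered", "state": self.state})
        return True
```

A release-gate style run would stay in its second-to-last state, with a `gate_blocked` event listing what is missing, until the required evidence is attached.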

MemoryOps supports an optional Headroom-powered context compression layer. Compression runs after policy checks, governance filtering, and context composition, and only on the composed context block — never the raw user message and never before the policy broker. It reduces tokens sent to the LLM while preserving MemoryOps invariants (provenance, deletion guarantee, tenant isolation, temporary-chat behavior, explainability metadata).

It is off by default and not a dependency: the app runs without headroom-ai installed, and any compression failure degrades safely to the uncompressed context.

pip install "headroom-ai[all]"            # optional
export MEMORYOPS_CONTEXT_COMPRESSION=headroom   # default: none

Each chat response carries a compression block with estimated tokens saved and the compression ratio. See docs/token-compression.md, docs/integrations/headroom.md, and ADR-007. Headroom is Apache-2.0; MemoryOps integrates it via an adapter and does not vendor its source.
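
The safe-degradation behavior can be sketched with a generic adapter. This example deliberately does not call Headroom: the `compressor` argument is a hypothetical callable standing in for whatever adapter is configured, the whitespace token estimate is a crude assumption, and the field names in the returned block are illustrative.

```python
def estimate_tokens(text: str) -> int:
    # Crude estimate (assumption): one token per whitespace-separated word.
    return len(text.split())


def compress_context(context: str, compressor=None) -> dict:
    """Compress an already-composed context block, falling back to the original."""
    compressed = context
    if compressor is not None:
        try:
            compressed = compressor(context)
        except Exception:
            # Any compression failure degrades to the uncompressed context.
            compressed = context
    before = estimate_tokens(context)
    after = estimate_tokens(compressed)
    return {
        "context": compressed,
        "compression": {
            "estimated_tokens_saved": max(before - after, 0),
            "compression_ratio": round(after / before, 3) if before else 1.0,
        },
    }
```

Because the function only ever receives the composed context, compression cannot see content that policy or governance filtering has already removed.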

Swappable embedding provider (app/embeddings/): deterministic offline stub + optional OpenAI.
Hybrid retrieval: pgvector cosine (search_candidates) + keyword overlap, blended by the ranker.
Per-memory response score_breakdown (retrieval_mode: hybrid / fallback / none).
Enforced Postgres Row-Level Security (migration 004, FORCE tenant policy + session GUC).
Expanded evals (semantic / keyword / archived / score-breakdown) + new tests; RLS test is DB-guarded.
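
A blended ranker with a per-memory score breakdown might look like the sketch below. The `Candidate` type, the 0.7/0.3 weighting, and the shape of the returned dictionary are assumptions for illustration only; the source does not state the real weights or response schema.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    memory_id: str
    vector_score: float   # cosine similarity from the vector search
    keyword_score: float  # keyword overlap with the query


def rank(candidates: list, vector_available: bool = True, vector_weight: float = 0.7) -> dict:
    if not candidates:
        return {"retrieval_mode": "none", "memories": []}
    memories = []
    for c in candidates:
        if vector_available:
            score = vector_weight * c.vector_score + (1 - vector_weight) * c.keyword_score
        else:
            score = c.keyword_score  # keyword-only when no query embedding exists
        memories.append({
            "memory_id": c.memory_id,
            "score": round(score, 4),
            # The breakdown lets the UI explain why a memory was selected.
            "score_breakdown": {
                "vector": c.vector_score if vector_available else None,
                "keyword": c.keyword_score,
            },
        })
    memories.sort(key=lambda m: m["score"], reverse=True)
    mode = "hybrid" if vector_available else "fallback"
    return {"retrieval_mode": mode, "memories": memories}
```

Keeping the component scores next to the final score is what makes the explainability invariant cheap to satisfy: the response already carries the evidence for each ranking decision.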

Provider-neutral LLM layer (app/llm/): deterministic StubProvider default + optional OpenAI/Anthropic/Gemini adapters, selected by MEMORYOPS_LLM_PROVIDER.
Structured memory intelligence: schema-validated extraction + minimal conflict detection, with prompt registry and deterministic heuristic fallback.
Invalid JSON / provider failure / timeout degrades to the heuristic and never blocks chat; LLM output is advisory and cannot override the policy broker.
New observability events (llm_provider_call, llm_provider_failure, structured_output_invalid, llm_fallback_used, memory_extraction_structured, conflict_detection_result) + structured / conflict evals; tests need no API keys.

Browser control plane over the governed lifecycle: /memories (filterable inventory), /memories/[id] (detail + provenance + per-memory audit timeline + inline edit), /governance (approval queue + recorded policy decisions), /audit (tenant-wide append-only history).
Additive read routes: GET /api/memories/{id}, /{id}/provenance, /{id}/audit, plus a memory_id filter on /api/audit. Approve/reject/edit/archive/restore/delete reuse the existing PATCH/DELETE; every action is audited and the policy broker stays authoritative.
Deletion guarantee holds in the UI: deleted memories are never listed or shown as active. Provenance is metadata only (no embeddings/secrets).

See docs/governance-ui.md, docs/memory-control-plane.md, and ADR-009.

Background workers (services/api/app/workers/) maintain memory after capture, off the chat request path: decay (demote aged/low-confidence memory), archive (retire stale, non-pinned, not-recently-used memory), conflict scan (flag contradictions as review candidates), deletion verification (prove soft-deleted memory stays unreachable), and proposal-only reflection (off by default).
A tenant-scoped runner drives them: python -m app.workers.runner --tenant t1 --user u1 --job all (returns a structured WorkerRunReport; non-zero exit on a failed job or deletion finding).
Every job is tenant scoped, idempotent, retry-safe, and audited; none resurrects deleted memory and none bypasses the policy broker. A worker failure never blocks chat.

See docs/background-lifecycle-workers.md, docs/memory-decay-policy.md, docs/deletion-verification.md, and ADR-010.

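
One way to make a decay job idempotent and retry-safe is to derive the effective confidence from stored values and the clock instead of decrementing a counter on each run. The sketch below assumes a hypothetical exponential half-life rule, a `demoted` status, and dictionary-shaped rows; none of these details come from the project.

```python
from datetime import datetime, timedelta, timezone


def decay_pass(memories: list, now: datetime,
               half_life_days: float = 30.0, demote_below: float = 0.25) -> list:
    """Demote aged, low-confidence memories; returns the audit events produced."""
    events = []
    for m in memories:
        # Only active, non-pinned rows are touched; deleted rows are never revived.
        if m["status"] != "active" or m.get("pinned"):
            continue
        age_days = (now - m["last_used_at"]).total_seconds() / 86400
        # Effective confidence is recomputed from stored values, not mutated.
        effective = m["confidence"] * 0.5 ** (age_days / half_life_days)
        if effective < demote_below:
            m["status"] = "demoted"
            events.append({"event": "memory_demoted", "memory_id": m["id"]})
    return events
```

Running the pass a second time with the same clock produces no new events, because already-demoted rows are skipped and nothing else was changed.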
A sixth lifecycle job, deletion compaction, clears a soft-deleted memory's content, normalized content, embedding/vector material, and provenance excerpt (after a retention window), while preserving the governance tombstone (id, tenant/user, status='deleted', deleted_at, source.kind) and the full audit trail. Run it with python -m app.workers.runner --tenant t1 --user u1 --job deletion_compaction.
The purge is verified fail-closed: a still-reachable id, intact material, a missing tombstone, or a verification-path error all record evidence and flag the run, never a silent pass.
Honest scope: this is auditable content/vector compaction + retrieval-exclusion verification. It is not crypto-shred and does not claim physical disk/page erasure or pgvector reindex orchestration.
See docs/deletion-compaction.md, docs/vector-purge-verification.md, and ADR-011.
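
The compaction and its fail-closed verification can be sketched as two small functions. The field names, the finding labels, and the `reachable_ids` callable are hypothetical, and the retention-window check is omitted for brevity; this is an illustration of the pattern rather than the project's worker.

```python
TOMBSTONE_FIELDS = ("id", "tenant_id", "user_id", "status", "deleted_at", "source_kind")
PURGED_FIELDS = ("content", "normalized_content", "embedding", "provenance_excerpt")


def compact(row: dict) -> dict:
    """Clear content and vector material while keeping the governance tombstone."""
    if row.get("status") != "deleted":
        raise ValueError("only soft-deleted memories can be compacted")
    compacted = dict(row)
    for name in PURGED_FIELDS:
        compacted[name] = None
    return compacted


def verify(row: dict, reachable_ids) -> list:
    """Return findings; an empty list is the only passing result."""
    findings = []
    try:
        if any(row.get(name) is not None for name in PURGED_FIELDS):
            findings.append("material_intact")
        if any(row.get(name) is None for name in TOMBSTONE_FIELDS):
            findings.append("tombstone_missing")
        if row["id"] in reachable_ids():
            findings.append("still_reachable")
    except Exception as exc:
        # Fail closed: an error in the verification path is itself a finding.
        findings.append("verification_error: " + type(exc).__name__)
    return findings
```

A runner built on this pattern would exit non-zero whenever `verify` returns anything, so a broken check can never be mistaken for a clean purge.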

v0.7: physical deletion compaction + vector purge verification ✅
v0.8: Railway worker runtime + scheduled lifecycle orchestration
v0.9: retention policies + legal hold + consent-aware memory
v0.10: assistant SDK + example apps
v1.0: production-ready governed memory runtime

Scheduled worker runtime with locks/leases, retries, and run history (v0.8).
Hard purge / crypto-shred and pgvector index reclamation (beyond v0.7's auditable compaction).
Governed reflection write path; cross-tenant scope enumeration for fleet scheduling.
Observability + economics, AI PR review runtime, deployment hardening.

See docs/rollout.md and the build phases in CLAUDE.md.

MemoryOps AI includes an agentic engineering layer around the core memory system (never on the chat request path). It is inspired by three systems:

Hermes Agent: used as an operator/developer assistant layer for release checks, invariant audits, and guided project workflows. See .hermes/skills/ and docs/integrations/hermes-agent.md.
agentic-swe-kit: used as a phase-gate framework for production engineering. MemoryOps maps to lifecycle phases covering cognitive design, memory architecture, evaluation, observability, security, reliability, governance, CI/CD for AI, and continuous learning. See docs/agentic-swe-kit-map.md and docs/phase-gates/.
AI PR Review Agent: the pattern behind the PR Invariant Evidence Gate. Every PR that touches memory, policy, retrieval, deletion, security, migrations, or API contracts must provide evidence (tests / evals / docs / ADRs). See scripts/pr_invariant_gate.py, .github/workflows/pr-invariant-evidence-gate.yml, and docs/ai-pr-review-policy.md.

The goal: MemoryOps is not just an AI memory feature — it is a governed engineering system with release discipline, review gates, and operational safety. Overview: docs/integrations/README.md.

docs/architecture.md: write path, read path, planes, invariants.
docs/loop-engineering.md: loop definitions, states, gates, evidence.
docs/loop-contracts.md: LoopDefinition, LoopRun, LoopEvent contracts.
docs/security.md: tenant isolation, secret detection, deletion guarantee.
docs/governance.md: lifecycle, approvals, audit, retention.
docs/rollout.md: phased delivery and production roadmap.
docs/demo-script.md: the 6-step demo.
infra/adr/: storage, retrieval, policy broker, observability, deletion ADRs.
