Enki – memory for AI agents that keeps ~half as much and answers as well

Enki Labs released evaluation results for its closed-source memory engine Enki, showing comparable answer accuracy to mem0 on the LongMemEval-S benchmark while storing roughly half the facts (138 vs 283). In a 25-instance validated slice, Enki achieved 14/25 correct answers versus mem0's 12/25, with a notable advantage in multi-session reasoning (4/5 vs 2/5).
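The headline figures above can be sanity-checked with a few lines of arithmetic. The sketch below recomputes the storage ratio from the reported means and, as an illustration only, attaches a 95% Wilson score interval to each accuracy figure. The Wilson interval is an assumption made here to show how wide the uncertainty is at n = 25; it is not a method described by Enki Labs, and the function and variable names are hypothetical.

```python
from math import sqrt


def wilson_interval(correct: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = correct / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


# Figures as reported in this document.
ENKI_CORRECT, MEM0_CORRECT, N = 14, 12, 25
ENKI_FACTS, MEM0_FACTS = 138, 283  # mean stored facts per system

# Ratio of stored facts: Enki relative to mem0.
storage_ratio = ENKI_FACTS / MEM0_FACTS

# Illustrative uncertainty on each accuracy figure (assumption: Wilson interval).
enki_ci = wilson_interval(ENKI_CORRECT, N)
mem0_ci = wilson_interval(MEM0_CORRECT, N)

# Two closed intervals overlap when each starts before the other ends.
intervals_overlap = enki_ci[0] <= mem0_ci[1] and mem0_ci[0] <= enki_ci[1]

print(f"storage ratio: {storage_ratio:.2f}")
print(f"Enki accuracy: {ENKI_CORRECT / N:.2f}, interval {enki_ci[0]:.2f} to {enki_ci[1]:.2f}")
print(f"mem0 accuracy: {MEM0_CORRECT / N:.2f}, interval {mem0_ci[0]:.2f} to {mem0_ci[1]:.2f}")
print(f"intervals overlap: {intervals_overlap}")
```

The two intervals overlap, which is consistent with reading the 14 vs 12 margin as modest at this sample size, while the storage ratio rounds to the reported 0.49.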

Enki is a memory engine for LLM agents. This repository publishes evaluation results only; the engine is closed-source. No configuration, internals, or methodology beyond what is described below is included here.

Both systems ingest identical conversation histories from LongMemEval-S. Each system's retrieved memories are answered by the same model (Claude Haiku) and graded by the same LLM-as-judge, at equal retrieval depth (K=10). The only variable is the memory layer.

Validated slice: 25 instances (full-benchmark run in progress).

| Question type | Enki | mem0 |
|---|---|---|
| Multi-session reasoning | 4 / 5 | 2 / 5 |
| Knowledge update | 3 / 5 | 3 / 5 |
| Single-session user | 3 / 5 | 3 / 5 |
| Single-session assistant | 2 / 5 | 2 / 5 |
| Single-session preference | 2 / 5 | 2 / 5 |
| Total | 14 / 25 | 12 / 25 |

Storage: Enki answers from 0.49× the stored facts mem0 keeps on the same conversations (mean 138 vs 283).

Standout: multi-session reasoning (4/5 vs 2/5).

Honest framing. This is a small, hand-validated slice; the overall margin (14 vs 12) is modest and within what a 25-item sample can show. The robust, repeatable result is comparable answer accuracy at roughly half the memory footprint, with a clear multi-session advantage. Further evaluation is ongoing.

Measured on a ~139-fact store, CPU-only (no GPU), 240 samples:

| Statistic | Latency (ms) |
|---|---|
| mean | 7.6 |
| p50 | 6.1 |
| p95 | 11.9 |
| p99 | 13.0 |

Full methodology and per-question results are available on request.

Enki Labs UK · 2026