Memory Sidecar v3.5.1 lands as a focused operational hardening release for the public agent-agnostic memory layer. If you’ve been running earlier iterations in production with multi-agent systems, you know the friction: dangling connections from failed sessions, unbounded retries hammering storage backends, and inconsistent transaction windows when multiple agents write simultaneously. This release doesn’t introduce new storage engines or retrieval strategies—it locks down the seams where real-world deployments bleed availability.
The core target is fault isolation and deterministic cleanup. Two concrete changes drive this:
Session-Scoped Graceful Shutdown
Previously, when an agent task died ungracefully, the sidecar held onto memory handles until a timeout forced closure. In v3.5.1, every session lock is tied to a unique ephemeral token, and the sidecar’s heartbeat goroutine now detects stale tokens within half the lease interval. The shutdown path emits a final checkpoint before releasing storage, preventing orphaned write buffers.
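The stale-token check can be sketched roughly as follows. This is a minimal illustration, not the sidecar's actual internals: `SessionLease`, `sweep_period_s`, `find_stale`, and `release` are hypothetical names, and the rule that a token counts as stale once its last heartbeat is older than one lease interval is an assumption. Sweeping twice per lease interval is one way to get detection within half a lease.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class SessionLease:
    """Hypothetical per-session lease record (not the sidecar's real type)."""
    lease_interval_s: float
    last_heartbeat: float
    # Unique ephemeral token tied to this session's lock.
    token: str = field(default_factory=lambda: secrets.token_hex(16))


def sweep_period_s(lease_interval_s: float) -> float:
    # Sweeping twice per lease interval bounds detection delay to half a lease.
    return lease_interval_s / 2.0


def find_stale(leases: list[SessionLease], now: float) -> list[str]:
    """Return tokens whose last heartbeat is older than their lease interval."""
    return [
        lease.token
        for lease in leases
        if now - lease.last_heartbeat > lease.lease_interval_s
    ]


def release(token: str, checkpoint, close) -> None:
    # Checkpoint first, then release storage, so no write buffer is orphaned.
    checkpoint(token)
    close(token)
```

The ordering inside `release` is the part that matters: the checkpoint callback always runs before the close callback.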
Retry with Bounded Jitter
Retry logic now uses a capped exponential backoff (max 30s) with per-request jitter derived from the session ID. This avoids retry storms when a transient storage outage resolves: each agent sees staggered attempts, and the storage side sees smoothed load. The max_retries config defaults to 3 and can be overridden per deployment but is clamped at 5 to prevent tail latency blowups.
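To make the retry behavior concrete, here is a minimal sketch of capped exponential backoff with deterministic jitter. The function names are hypothetical, and seeding the jitter from a hash of the session ID plus the attempt number is an assumption about how "derived from the session ID" might work; the sidecar's real derivation may differ. The parameter names and defaults mirror the retry_policy keys used in this post.

```python
import hashlib
import random

MAX_RETRIES_CEILING = 5  # the release clamps max_retries at 5


def clamp_retries(requested: int) -> int:
    """Clamp a configured max_retries value into the range [0, 5]."""
    return max(0, min(requested, MAX_RETRIES_CEILING))


def backoff_delay_s(session_id: str, attempt: int, base_delay_ms: int = 200,
                    cap_delay_s: float = 30.0, jitter_factor: float = 0.25) -> float:
    """Delay in seconds before retry number `attempt` (1-based)."""
    # Exponential growth: base, 2*base, 4*base, ... capped at cap_delay_s.
    raw = min(cap_delay_s, (base_delay_ms / 1000.0) * 2 ** (attempt - 1))
    # Assumption: seed the jitter from session ID and attempt, so each session
    # gets its own repeatable offset and sessions do not retry in lockstep.
    digest = hashlib.sha256(f"{session_id}:{attempt}".encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    jitter = rng.uniform(-jitter_factor, jitter_factor)
    # Re-apply the cap so jitter can never push the delay past it.
    return min(cap_delay_s, raw * (1.0 + jitter))
```

With the defaults, the first retry waits between 150 ms and 250 ms, and no retry ever waits longer than 30 seconds.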
The most visible improvement is the updated configure API for the memory connector. Here’s how you set up a hardened instance with retry policy and transaction logging:
from memory_sidecar import MemoryClient

client = MemoryClient(
    endpoint="unix:///var/run/mem-sidecar.sock",
    session_config={
        "heartbeat_interval_s": 10,
        "heartbeat_timeout_s": 25,
        # What to do with a session whose agent died ungracefully.
        "on_orphan": "checkpoint_and_close",
    },
    retry_policy={
        "max_retries": 3,       # default; clamped at 5
        "base_delay_ms": 200,
        "cap_delay_s": 30,      # upper bound on backoff
        "jitter_factor": 0.25,
    },
    # Append-only mutation log; off by default.
    transaction_log="/var/log/mem-sidecar-tx.jsonl",
)
Key points from this snippet:
on_orphan replaces the old ambiguous timeout_action; you now explicitly choose between checkpoint_and_close and abort_quietly.
transaction_log enables an append-only log of all memory mutations. This is off by default; enabling it costs ~2% throughput but gives you precise replay for debugging or audit trails.
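As an illustration of what replay from an append-only JSONL log can look like, the sketch below folds mutation records into a key/value state. The record schema (an "op" of "set" or "delete", plus "key" and "value") is a hypothetical one chosen for this example; this post does not specify the sidecar's actual log format, so adapt the field names to what your log contains.

```python
import json
from typing import Iterable


def replay(lines: Iterable[str]) -> dict:
    """Rebuild key/value state from JSONL mutation records.

    Assumed (hypothetical) record shape:
    {"op": "set" | "delete", "key": ..., "value": ...}
    """
    state: dict = {}
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines
        record = json.loads(raw)
        if record["op"] == "set":
            state[record["key"]] = record["value"]
        elif record["op"] == "delete":
            state.pop(record["key"], None)
    return state
```

Because the function accepts any iterable of lines, an open file handle on the transaction log path can be passed in directly.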
Because Memory Sidecar remains agent-agnostic, these improvements automatically protect any adapter implementing the standard gRPC contract. Whether your agents are built on LangChain, custom Python workers, or Node services, the sidecar enforces the same session lifecycle and retry boundaries. The installer (hermes-memory-installer) now includes a --hardening-profile flag that applies these defaults to all new deployments.
If you’re upgrading from v3.4.x, note that the timeout_ms field in session config is deprecated in favor of the two-parameter heartbeat model. The sidecar logs a warning if you use the old field but still respects it during migration. Remove timeout_ms when you can schedule a config cleanup.
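For the cleanup itself, a small helper like the one below can strip the deprecated field from a session config dict once both heartbeat fields are present. This is a sketch for your own config tooling, not part of the sidecar or the installer: `clean_session_config` is a hypothetical name, and it deliberately does not convert a timeout_ms value into heartbeat settings, since this post does not define that mapping.

```python
import warnings

DEPRECATED_FIELDS = ("timeout_ms",)
REQUIRED_HEARTBEAT_FIELDS = ("heartbeat_interval_s", "heartbeat_timeout_s")


def clean_session_config(session_config: dict) -> dict:
    """Return a copy without timeout_ms once both heartbeat fields are set."""
    missing = [k for k in REQUIRED_HEARTBEAT_FIELDS if k not in session_config]
    if "timeout_ms" in session_config and missing:
        # Keep the config as-is: dropping timeout_ms now would leave the
        # session with no timeout configured at all.
        warnings.warn(
            f"timeout_ms is deprecated; set {missing} before removing it",
            DeprecationWarning,
            stacklevel=2,
        )
        return dict(session_config)
    return {k: v for k, v in session_config.items() if k not in DEPRECATED_FIELDS}
```

The input dict is never mutated, so the helper is safe to run as a dry pass over existing configs before committing any change.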
Memory Sidecar v3.5.1 is available now through the hermes-memory-installer package. Run pip install --upgrade hermes-memory-installer and review the new config templates in /etc/memory-sidecar/. No database migrations are required; this is purely a runtime stabilization cut.