The Memory Sidecar has always been the invisible workhorse behind decentralized agent interactions—providing agent-agnostic memory persistence without coupling to any single AI framework. With v3.5.1, the focus shifts from feature velocity to operational maturity. This release, delivered via the hermes-memory-installer, is expressly designed for teams running memory sidecars in production at scale. If you’ve been treating your memory layer as a pet, it’s time to make it cattle.
This is not a feature drop. There are no new memory backends, no fancy compression algorithms, and no API-breaking changes. Instead, v3.5.1 closes long-standing gaps in resource governance, fault isolation, and observability—the three pillars that separate a prototype from a service.
Tighter Resource Governance
Memory sidecars are notoriously hungry when handling large vector embeddings or replay buffers. Earlier versions relied on the host OS to enforce limits, leading to cascading OOM kills. In v3.5.1, the hermes-memory-installer now generates systemd drop-in units that wire cgroup v2 memory and CPU limits directly into the sidecar process. You define the ceiling in the installer config; the installer ensures no single sidecar can starve the host or adjacent containers.
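To make the drop-in idea concrete, here is a minimal sketch of how a profile's limits could be rendered into a systemd drop-in. MemoryMax= and CPUQuota= are standard systemd resource-control directives. The function name render_dropin and the mapping from cpu_quota to a percentage are assumptions for illustration, not the installer's actual output.

```python
def render_dropin(memory_max: str, cpu_quota: float) -> str:
    """Render a [Service] drop-in with cgroup v2 limits (illustrative only)."""
    # Assumption: a cpu_quota of 1.5 means "one and a half CPUs",
    # which systemd expresses as CPUQuota=150%.
    cpu_percent = round(cpu_quota * 100)
    lines = [
        "[Service]",
        f"MemoryMax={memory_max}",
        f"CPUQuota={cpu_percent}%",
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_dropin("2G", 1.5))
```

A drop-in like this would typically live under the unit's .d directory, so the limits can change without touching the base unit file.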
Process-Level Isolation
Each memory sidecar instance runs inside its own MemoryZone, a lightweight sandbox built from separate mount, PID, and network namespaces. This prevents a rogue memory operation from leaking file handles or interfering with other sidecars on the same node. The installer sets up the namespace scaffolding transparently; the sidecar itself sees only its assigned resources and data directory.
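The release notes do not describe how a MemoryZone is built internally. As a rough illustration of the same isolation idea, the sketch below assembles a command line for the util-linux unshare tool, which can start a process in new mount, PID, and network namespaces. The helper name zone_command and the sidecar binary name are hypothetical.

```python
def zone_command(sidecar_argv: list[str]) -> list[str]:
    """Wrap a command so it starts in fresh mount, PID, and network namespaces."""
    return [
        "unshare",
        "--mount",       # private mount namespace
        "--pid",         # private PID namespace
        "--net",         # private network namespace
        "--fork",        # fork so the child becomes PID 1 in the new PID namespace
        "--mount-proc",  # remount /proc so it reflects the new PID namespace
        "--",
        *sidecar_argv,
    ]


if __name__ == "__main__":
    # "memory-sidecar" is a placeholder binary name, not a documented one.
    print(" ".join(zone_command(["memory-sidecar"])))
```

Running the resulting command needs root or equivalent capabilities, which is one reason to leave this scaffolding to the installer.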
Resilient I/O Paths
Memory writes that fail mid-flight were silently dropped in prior versions. v3.5.1 introduces an internal write-ahead log (WAL) with configurable durability. If the backing store (PostgreSQL, S3, or local disk) becomes unavailable, the WAL buffers pending mutations and replays them once the connection is restored. The installer exposes --wal-mode (memory, disk, or sync) during deployment.
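The buffer-and-replay behaviour can be sketched in a few lines. This is a simplified in-memory model, roughly what a "memory" mode might look like, and not the sidecar's actual WAL. The class names BufferedWriter and FlakyStore are invented for the example, and a store outage is modelled as a ConnectionError.

```python
from collections import deque
from typing import Callable, Deque, List


class BufferedWriter:
    """Queue every mutation, then drain the queue in order while the store accepts writes."""

    def __init__(self, apply: Callable[[str], None]) -> None:
        self._apply = apply
        self._pending: Deque[str] = deque()

    def write(self, mutation: str) -> None:
        # Always enqueue first so ordering is preserved behind older pending writes.
        self._pending.append(mutation)
        self.replay()

    def replay(self) -> int:
        """Apply pending mutations oldest-first; stop at the first failure."""
        replayed = 0
        while self._pending:
            try:
                self._apply(self._pending[0])
            except ConnectionError:
                break  # store still unavailable; keep the rest buffered
            self._pending.popleft()
            replayed += 1
        return replayed

    @property
    def pending(self) -> int:
        return len(self._pending)


class FlakyStore:
    """Stand-in backing store that can be switched off to simulate an outage."""

    def __init__(self) -> None:
        self.available = True
        self.rows: List[str] = []

    def apply(self, mutation: str) -> None:
        if not self.available:
            raise ConnectionError("backing store unavailable")
        self.rows.append(mutation)
```

A disk-backed mode would persist the queue before acknowledging the write, trading latency for durability across sidecar restarts.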
Observability Without Bloat
Structured JSON logging is now the default, with optional OpenTelemetry trace propagation for every memory read/write operation. The sidecar exports metrics (memory_sidecar_*) to a dedicated endpoint at /metrics on port 9610. When the built-in service discovery is in use, the installer can automatically configure a Prometheus scrape target for the sidecar.
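If you want to inspect the sidecar's own metrics without a full Prometheus setup, a small filter over the text exposition format is enough. The sketch below assumes simple "name value" sample lines without timestamps. The metric names in the sample are made up for the example, since the release notes only give the memory_sidecar_ prefix.

```python
def sidecar_metrics(exposition: str, prefix: str = "memory_sidecar_") -> dict[str, float]:
    """Return samples whose name starts with the given prefix."""
    samples: dict[str, float] = {}
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name, _, value = line.rpartition(" ")
        if name.startswith(prefix):
            samples[name] = float(value)
    return samples


# Hypothetical scrape output; real metric names may differ.
SAMPLE = """\
# HELP memory_sidecar_writes_total Hypothetical counter of memory writes.
# TYPE memory_sidecar_writes_total counter
memory_sidecar_writes_total 128
memory_sidecar_wal_pending 3
process_cpu_seconds_total 4.2
"""

if __name__ == "__main__":
    print(sidecar_metrics(SAMPLE))
```

In practice the text would come from an HTTP GET against the /metrics endpoint on port 9610.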
Assume you are deploying a sidecar instance for a heavy conversational agent. The hermes-memory-installer reads a YAML profile:
instance:
  name: "chat-agent-mem"
storage:
  type: postgres
  connection_string: "postgresql://user:pass@pg:5432/memory"
limits:
  memory_max: "2G"
  cpu_quota: 1.5
wal:
  mode: "disk"
  flush_interval: "100ms"
observability:
  metrics: true
  tracing: false
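Before handing a profile to the installer, it can help to sanity-check it yourself. The sketch below mirrors the profile as a Python dict (assuming each section is a top-level key) and checks the limits and the WAL mode. The rules here are illustrative guesses and not the installer's actual validation logic. Only the three WAL modes come from the release notes.

```python
WAL_MODES = {"memory", "disk", "sync"}
_UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text: str) -> int:
    """Convert a size such as "2G" to bytes, using base-1024 suffixes."""
    text = text.strip().upper()
    if text and text[-1] in _UNITS:
        return int(text[:-1]) * _UNITS[text[-1]]
    return int(text)


def validate(profile: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the profile looks sane."""
    errors: list[str] = []
    limits = profile.get("limits", {})
    try:
        if parse_size(str(limits.get("memory_max", ""))) <= 0:
            errors.append("limits.memory_max must be positive")
    except ValueError:
        errors.append("limits.memory_max is not a valid size")
    cpu = limits.get("cpu_quota")
    if not isinstance(cpu, (int, float)) or cpu <= 0:
        errors.append("limits.cpu_quota must be a positive number")
    if profile.get("wal", {}).get("mode") not in WAL_MODES:
        errors.append(f"wal.mode must be one of {sorted(WAL_MODES)}")
    return errors


# The example profile, written out as a dict.
PROFILE = {
    "instance": {"name": "chat-agent-mem"},
    "storage": {"type": "postgres"},
    "limits": {"memory_max": "2G", "cpu_quota": 1.5},
    "wal": {"mode": "disk", "flush_interval": "100ms"},
    "observability": {"metrics": True, "tracing": False},
}

if __name__ == "__main__":
    print(validate(PROFILE))
```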
Run the installer:
hermes-memory-installer apply --profile memory-sidecar-profile.yaml
The installer validates the profile, generates the systemd service file with the specified memory and CPU limits, starts the sidecar inside its own MemoryZone, and, if metrics: true is set, exposes the /metrics endpoint. Any attempt by the sidecar to exceed memory_max triggers an immediate cgroup OOM kill; the systemd unit restarts it with a configurable backoff.
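The release notes say only that the restart backoff is configurable, not which shape it takes. Assuming a capped exponential schedule, which is a common choice, the delays between restart attempts could be computed like this. The function name and parameters are hypothetical.

```python
def backoff_delays(base: float, factor: float, cap: float, attempts: int) -> list[float]:
    """Delay in seconds before each restart attempt, growing by `factor` up to `cap`."""
    delays = []
    delay = base
    for _ in range(attempts):
        delays.append(min(delay, cap))
        delay *= factor
    return delays


if __name__ == "__main__":
    print(backoff_delays(base=1.0, factor=2.0, cap=30.0, attempts=6))
```

A cap keeps a sidecar that repeatedly hits its memory ceiling from waiting ever longer between restarts, while still giving the host time to recover.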
Agent-agnostic memory only works if it stays up and stays predictable. v3.5.1 eliminates the three most common failure modes: unconstrained resource consumption, cross-instance contamination, and silent data loss during transient storage outages. The hermes-memory-installer now acts as both a deployment tool and a governance layer—applying the same hardening patterns whether you run one sidecar or one thousand.
If you are migrating from an earlier release, the installer handles upgrades in place: it detects existing systemd units, updates the resource boundaries, and performs a graceful restart. No data migration scripts, no manual cleanup.
v3.5.1 is a boring release by design—and that’s its strength. It hardens the Memory Sidecar for the long haul. If you have been delaying moving your agent-memory layer into production because of stability concerns, now is the time. The hermes-memory-installer will handle the boilerplate. You can focus on building agents that actually remember.
Upgrade via the hermes-memory-installer CLI. Read the full changelog at docs.hermes-memory.dev.