Memory Sidecar v3.5.1 is the operational hardening release for the public agent-agnostic memory system. If you’ve been running memory sidecars in production with multiple agents, you know the friction points: resource contention, recovery time after failures, and configuration drift across deployments. This release directly targets those areas with defaults tuned for stability and explicit controls for experienced operators. No new features were added for the sake of it—every change here had to prove itself under load in multi-tenant setups.
What Changed in This Release
The core improvement is in memory limit enforcement and error feedback loops. Previous versions relied on soft caps that could be exceeded during bursts, leading to cascading failures. v3.5.1 introduces a two-tier allocation model: a hard ceiling set via max_memory and a soft watermark at 80% that triggers preemptive cleanup. The cleanup routine itself was rewritten to avoid holding locks during I/O, which reduces contention when the sidecar serves concurrent requests from different agents.
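To make the two-tier model concrete, here is a minimal Python sketch of how a hard ceiling and a soft watermark might interact. It is an illustration only, not the sidecar's actual code: the MemoryBudget class, its try_allocate method, and the returned status strings are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class MemoryBudget:
    """Hypothetical helper: tracks usage against a hard ceiling and a soft watermark."""
    max_memory: int                   # hard ceiling in bytes
    soft_watermark_percent: int = 80  # cleanup threshold as a percentage of max_memory
    used: int = 0

    @property
    def soft_limit(self) -> int:
        return self.max_memory * self.soft_watermark_percent // 100

    def try_allocate(self, size: int) -> str:
        """Return 'rejected', 'cleanup', or 'ok' for a requested allocation."""
        if self.used + size > self.max_memory:
            return "rejected"   # hard ceiling: never exceeded, even during a burst
        self.used += size
        if self.used >= self.soft_limit:
            return "cleanup"    # allocation succeeds, but preemptive cleanup is due
        return "ok"

    def release(self, size: int) -> None:
        self.used = max(0, self.used - size)


MB = 1024 * 1024
budget = MemoryBudget(max_memory=512 * MB)
first = budget.try_allocate(400 * MB)   # stays below the 80% watermark
second = budget.try_allocate(20 * MB)   # crosses the watermark
third = budget.try_allocate(200 * MB)   # would exceed the hard ceiling
```

In this sketch the first call returns "ok", the second "cleanup", and the third "rejected". The point of the design is that crossing the watermark never fails a request by itself; only the hard ceiling does.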
Agent-agnostic support means the sidecar must handle varying request patterns without assuming a specific payload format or lifecycle. The new async middleware dispatcher decouples memory operations from network I/O, allowing you to plug in agents via standardized WebSocket or gRPC endpoints. The dispatcher uses a bounded channel with backpressure—if the buffer fills, the sidecar rejects new connections gracefully instead of crashing or leaking memory.
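The backpressure behavior described above can be sketched with a bounded asyncio queue. This is an assumption-laden illustration rather than the dispatcher's real implementation: the Dispatcher class, submit, and drain are hypothetical names, and the counter only loosely mirrors what a backpressure metric might record.

```python
import asyncio


class Dispatcher:
    """Hypothetical dispatcher: a bounded queue between network I/O and memory operations."""

    def __init__(self, capacity: int) -> None:
        self.queue: asyncio.Queue = asyncio.Queue(maxsize=capacity)
        self.backpressure_events = 0

    def submit(self, request: str) -> bool:
        """Enqueue without blocking; return False when the buffer is full."""
        try:
            self.queue.put_nowait(request)
            return True
        except asyncio.QueueFull:
            # Reject instead of growing the buffer without bound.
            self.backpressure_events += 1
            return False

    async def drain(self) -> list:
        """Worker side: process everything currently queued."""
        handled = []
        while not self.queue.empty():
            handled.append(await self.queue.get())
            self.queue.task_done()
        return handled


async def demo():
    dispatcher = Dispatcher(capacity=2)
    accepted = [dispatcher.submit(f"req-{i}") for i in range(3)]
    handled = await dispatcher.drain()
    return accepted, handled, dispatcher.backpressure_events


accepted, handled, rejected_count = asyncio.run(demo())
```

With a capacity of two, the third submission is refused and counted, while the first two are processed in order. Memory use stays bounded because the queue can never hold more than its configured capacity.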
Health checking received attention as well. The sidecar now exposes a /livez endpoint that performs a shallow check (are the basic services running?) and a /readyz endpoint that verifies the memory store can accept writes. Both return structured JSON with latency percentiles for the last 100 checks, which simplifies integration with service meshes and external monitoring tools.
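As a rough illustration of reporting latency percentiles over the last 100 checks, the sketch below keeps a fixed-size window and computes nearest-rank percentiles. The JSON field names (status, checks, latency_ms) and the HealthWindow class are assumptions made for this example; the release notes do not specify the actual response schema.

```python
import json
import math
from collections import deque


class HealthWindow:
    """Hypothetical rolling window of health-check latencies."""

    def __init__(self, size: int = 100) -> None:
        # A deque with maxlen silently discards the oldest sample when full.
        self.latencies_ms = deque(maxlen=size)

    def record(self, latency_ms: float) -> None:
        self.latencies_ms.append(latency_ms)

    def percentile(self, p: int) -> float:
        """Nearest-rank percentile over the current window."""
        ordered = sorted(self.latencies_ms)
        if not ordered:
            return 0.0
        rank = max(1, math.ceil(p * len(ordered) / 100))
        return ordered[rank - 1]

    def report(self, healthy: bool) -> str:
        body = {
            "status": "ok" if healthy else "fail",
            "checks": len(self.latencies_ms),
            "latency_ms": {f"p{p}": self.percentile(p) for p in (50, 95, 99)},
        }
        return json.dumps(body)


window = HealthWindow()
for i in range(1, 151):          # 150 samples; only the last 100 are retained
    window.record(float(i))
report = json.loads(window.report(healthy=True))
```

Because the window is capped at 100 entries, the percentiles reflect recent behavior only, which is usually what a readiness probe or dashboard wants.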
Configuration That Survives Production
One pain point in earlier versions was the gap between configuration validation and runtime behavior. v3.5.1 introduces a --strict mode that fails fast on unknown keys or out-of-range values during startup. Combined with a new config validate subcommand in the CLI, you can lint your YAML files before deploying. The configuration file itself now supports environment variable interpolation and secret references for tokens and endpoints, so you can commit templates to version control without exposing credentials.
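The sketch below shows, in Python, the kind of checks a strict mode and environment variable interpolation could perform. Everything specific here is an assumption for illustration: the ${NAME} placeholder syntax, the allowed key set, and the numeric ranges are hypothetical and not taken from the sidecar's schema.

```python
import os
import re

# Hypothetical schema: allowed top-level keys and range checks for a few values.
ALLOWED_KEYS = {"version", "strict", "limits", "recovery", "endpoints", "health"}
RANGES = {
    ("limits", "soft_watermark_percent"): (1, 99),
    ("recovery", "max_retries"): (0, 10),
}

# Assumed placeholder syntax: ${NAME}
_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def interpolate(value: str, env=os.environ) -> str:
    """Replace ${NAME} with the environment value; fail on undefined names."""
    def lookup(match):
        name = match.group(1)
        if name not in env:
            raise ValueError(f"undefined environment variable: {name}")
        return env[name]
    return _VAR.sub(lookup, value)


def validate(config: dict) -> list:
    """Return a list of problems; a strict mode would refuse to start if any exist."""
    problems = [f"unknown key: {k}" for k in sorted(config) if k not in ALLOWED_KEYS]
    for (section, key), (low, high) in RANGES.items():
        value = config.get(section, {}).get(key)
        if value is not None and not low <= value <= high:
            problems.append(f"{section}.{key}={value} outside [{low}, {high}]")
    return problems


config = {
    "version": "3.5.1",
    "strict": True,
    "limits": {"soft_watermark_percent": 120},  # out of range on purpose
    "cache": {},                                 # unknown key on purpose
}
problems = validate(config)
token = interpolate("Bearer ${DEMO_TOKEN}", env={"DEMO_TOKEN": "abc"})
```

Collecting every problem before failing, rather than stopping at the first one, tends to make a lint step more useful in CI because a single run reports all issues in a file.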
Here’s a minimal configuration that uses the new strict mode and recovery settings:
    version: "3.5.1"
    strict: true
    limits:
      max_memory: 512MB
      soft_watermark_percent: 80
      cleanup_batch_size: 100
    recovery:
      max_retries: 3
      initial_backoff: 1s
      max_backoff: 30s
      circuit_breaker_failures: 5
    endpoints:
      - protocol: websocket
        listen: ":8080"
        max_connections: 100
      - protocol: grpc
        listen: ":9090"
    health:
      liveness_path: /livez
      readiness_path: /readyz
      check_interval: 30s
Notice recovery.circuit_breaker_failures: after five consecutive failures from a single endpoint, the sidecar stops sending requests to that path for 60 seconds (configurable). This prevents a slow memory backend from taking down the entire sidecar pool.
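The circuit breaker behavior can be sketched as a small state machine. This is a simplified illustration under stated assumptions, not the sidecar's implementation: the CircuitBreaker class and its method names are hypothetical, time is passed in explicitly to keep the example deterministic, and the half-open probing state that many real breakers use is omitted.

```python
class CircuitBreaker:
    """Sketch: opens after N consecutive failures and stays open for a cooldown."""

    def __init__(self, failure_threshold: int = 5, cooldown_seconds: float = 60.0) -> None:
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.consecutive_failures = 0
        self.opened_at = None  # timestamp when the breaker opened, or None if closed

    def allow(self, now: float) -> bool:
        """Return True if a request may be sent at time `now`."""
        if self.opened_at is None:
            return True
        if now - self.opened_at >= self.cooldown_seconds:
            # Cooldown elapsed: close the breaker and start counting again.
            self.opened_at = None
            self.consecutive_failures = 0
            return True
        return False

    def record(self, success: bool, now: float) -> None:
        """Record the outcome of a request made at time `now`."""
        if success:
            self.consecutive_failures = 0  # any success resets the streak
            return
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold:
            self.opened_at = now


breaker = CircuitBreaker()
for t in range(5):                      # five consecutive failures at t = 0..4
    breaker.record(success=False, now=float(t))
blocked = breaker.allow(now=30.0)       # still inside the 60-second cooldown
resumed = breaker.allow(now=64.0)       # cooldown measured from t = 4 has elapsed
```

Here the breaker opens on the fifth failure, refuses traffic at the 30-second mark, and admits it again once 60 seconds have passed since it opened. Counting only consecutive failures means a single success resets the streak, so intermittent errors do not trip it.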
Performance Notes from the Field
In our benchmark runs with a synthetic payload generator mimicking 10 concurrent agents, v3.5.1 kept p99 write latency below 5 ms while memory usage stayed under 80%. At 95% usage, latency increased to 12 ms as the cleanup routine fired, but no requests were dropped; they were queued in the bounded channel. Memory wastage stayed under 3% across all tests, compared to 15% in v3.5.0 when soft caps were exceeded.
The sidecar’s ability to handle mixed workloads (small vs large payloads, burst vs steady state) benefits from two new metrics exposed via Prometheus: memory_sidecar_cleanup_duration_seconds and memory_sidecar_backpressure_events_total. With these, you can tune the cleanup_batch_size and initial_backoff parameters without guesswork.
Upgrading from v3.5.0
The upgrade path is straightforward. The binary API and configuration schema remain backward compatible with v3.5.0, so existing v3.5.0 settings continue to work. The one exception is deprecated fields from v3.4.0, which were removed. If you’re using the hermes-memory-installer script, run hermes-memory-installer update --version 3.5.1 to pull the new sidecar image. The installer now supports rollback to the previous version with --rollback in case of unforeseen issues in your environment.
Memory Sidecar v3.5.1 won’t solve every memory management problem you have, but it removes the class of failures that stem from imprecise limits and inadequate recovery. For teams already using the sidecar, the upgrade is low-risk and the operational gains are immediate. If you’re new to the tool, the strict configuration and agent-agnostic endpoints make it a solid foundation for building memory-aware systems without locking into a single framework.