@hazeljs/agent 1.0.1: Production Hardening for Real Deployments

We are shipping @hazeljs/agent 1.0.1 — a patch release focused on operational durability, resilience consolidation, and production observability. If you run agents behind a load balancer, need human-in-the-loop tool approvals, or want circuit breakers and traces in production, this release is for you.

1.0.1 is backward compatible. No breaking API changes — only new optional configuration, factories, and exports.

A minimal production bootstrap with Redis-backed state, durable approvals, and strict event handling:

import { HazelApp } from '@hazeljs/core';
import { Agent, Tool, AgentModule, AgentService } from '@hazeljs/agent';
import { createClient } from 'redis';

@Agent({ name: 'ops-agent', description: 'Operations assistant' })
class OpsAgent {
  @Tool({ description: 'Restart a service', requiresApproval: true })
  async restartService(input: { service: string }) {
    return { restarted: input.service, at: new Date().toISOString() };
  }
}

const redis = createClient({ url: process.env.REDIS_URL });
await redis.connect();

await AgentModule.forRootAsync({
  redis: { client: redis },
  useRedisApprovals: true,
  runtime: {
    strictEventHandlers: true,
    enableCircuitBreaker: true,
    observabilityProvider: myObservabilityProvider, // optional
  },
});

const app = new HazelApp({ modules: [AgentModule] });
const agentService = app.get(AgentService);

agentService.on('agent.tool.approval.requested', (event) => {
  // Approve from any replica — request is stored in Redis
  agentService.approveToolExecution(event.data.requestId, 'admin');
});

await agentService.execute('ops-agent', 'Restart the payment worker');

Same agent code as 1.0.0 — only module wiring changes for production.

@hazeljs/agent 1.0.0 shipped a full agent runtime: execution loop, tools, memory/RAG, multi-agent graphs, A2A, streaming, and guardrails hooks. What it did not optimize for was multi-instance production:

Area	1.0.0 default	Production risk
Execution state	In-memory	Lost on process restart
Tool approvals	In-memory `Map`s	Broken across replicas; lost on crash
Retry / rate limit	Local utilities	Drift from `@hazeljs/resilience`
Observability	Local metrics + events	No OTel spans or LLM cost bridge
RAG errors	Silently returned `[]`	Hard to debug in prod

1.0.1 closes these gaps without changing how you define agents or tools.

New factory helpers pick the right persistence backend from config or environment:

import {
  createStateManager,
  createStateManagerFromEnv,
  AgentModule,
} from '@hazeljs/agent';

// Sync: when you already have a connected Redis client (`redisClient` here)
const stateManager = createStateManager({
  backend: 'redis',
  redisClient,
});

// Async alternative: connects from REDIS_URL
const stateManagerFromEnv = await createStateManagerFromEnv({
  redisUrl: process.env.REDIS_URL,
});

Environment variables:

Variable	Values	Behavior
`AGENT_STATE_BACKEND`	`memory`, `redis`, `database`	Explicit backend selection
`REDIS_URL`	Redis connection URL	Auto-selects Redis when set
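The selection rules in the table can be sketched as a small pure function. This is an illustration only, not the library's implementation: `resolveStateBackend` is a hypothetical helper, and the precedence shown (explicit `AGENT_STATE_BACKEND` first, then `REDIS_URL`, then in-memory) is an assumption inferred from the table.

```typescript
type StateBackend = 'memory' | 'redis' | 'database';

// Hypothetical helper: not exported by @hazeljs/agent.
function resolveStateBackend(env: Record<string, string | undefined>): StateBackend {
  const explicit = env.AGENT_STATE_BACKEND;

  // An explicit, recognized value always wins.
  if (explicit === 'memory' || explicit === 'redis' || explicit === 'database') {
    return explicit;
  }

  // Assumption: unknown values are rejected rather than silently ignored.
  if (explicit !== undefined && explicit !== '') {
    throw new Error(`Unsupported AGENT_STATE_BACKEND: ${explicit}`);
  }

  // Assumption: a non-empty REDIS_URL auto-selects Redis, otherwise in-memory.
  return env.REDIS_URL ? 'redis' : 'memory';
}
```

In application code the argument would typically be `process.env`; taking the environment as a parameter keeps the rule easy to unit test.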

`AgentModule.forRoot()` and `AgentModule.forRootAsync()` wire Redis state when a client or URL is provided:

import { AgentModule } from '@hazeljs/agent';
import { createClient } from 'redis';

const redisClient = createClient({ url: process.env.REDIS_URL });
await redisClient.connect();

await AgentModule.forRootAsync({
  redis: { client: redisClient },
  useRedisApprovals: true,
  runtime: {
    strictEventHandlers: process.env.NODE_ENV === 'production',
  },
});

- `forRoot()`: pass a connected `redis.client`
- `forRootAsync()`: connects from `REDIS_URL` or `redis.url` before boot

See PERSISTENCE.md for Redis, Prisma, and hybrid setups.

Tool approvals no longer live only in process memory. A new `IApprovalStore` interface supports:

- `InMemoryApprovalStore`
- `RedisApprovalStore`

When `AgentModule` is configured with Redis, approvals are stored in Redis automatically so human-in-the-loop flows work across replicas.

New exports:

import {
  IApprovalStore,
  InMemoryApprovalStore,
  RedisApprovalStore,
  createApprovalStore,
} from '@hazeljs/agent';
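To illustrate why a shared approval store matters across replicas, here is a minimal, self-contained sketch. Everything in it is hypothetical: `ApprovalStoreSketch`, `MapApprovalStore`, and `ReplicaSketch` are invented names, and the real `IApprovalStore` contract may look quite different (a Redis-backed store would also be Promise-based; this one is synchronous for brevity).

```typescript
type ApprovalStatus = 'pending' | 'approved' | 'rejected';

interface ApprovalRecord {
  requestId: string;
  toolName: string;
  status: ApprovalStatus;
  decidedBy?: string;
}

// Hypothetical store contract; the real IApprovalStore may differ.
interface ApprovalStoreSketch {
  save(record: ApprovalRecord): void;
  get(requestId: string): ApprovalRecord | undefined;
}

// Stand-in for a shared backend such as Redis.
class MapApprovalStore implements ApprovalStoreSketch {
  private readonly records = new Map<string, ApprovalRecord>();

  save(record: ApprovalRecord): void {
    this.records.set(record.requestId, { ...record });
  }

  get(requestId: string): ApprovalRecord | undefined {
    const found = this.records.get(requestId);
    return found ? { ...found } : undefined;
  }
}

// A "replica" only talks to the store, never to another replica's memory.
class ReplicaSketch {
  constructor(private readonly store: ApprovalStoreSketch) {}

  requestApproval(requestId: string, toolName: string): void {
    this.store.save({ requestId, toolName, status: 'pending' });
  }

  // Returns false when the request is unknown or already decided.
  approve(requestId: string, decidedBy: string): boolean {
    const record = this.store.get(requestId);
    if (!record || record.status !== 'pending') return false;
    this.store.save({ ...record, status: 'approved', decidedBy });
    return true;
  }
}
```

When two replicas are built on the same store, one can approve a request the other created. A replica with its own private store cannot see that request, which is the failure mode the 1.0.0 in-memory default had.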

Local retry and rate-limit utilities now delegate to `@hazeljs/resilience`:

- `RetryHandler` → `RetryPolicy`
- `RateLimiter` → `TokenBucketLimiter`

The public API of `RetryHandler` and `RateLimiter` is preserved (both are marked `@deprecated` in favor of direct resilience use in a future minor). The deprecated `circuit-breaker.js` shim was removed.
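For readers unfamiliar with the idea behind a token-bucket limiter, the following is a generic textbook sketch. It is not the `@hazeljs/resilience` implementation; the class name `TokenBucketSketch`, its constructor parameters, and the injectable clock are all assumptions made for illustration.

```typescript
class TokenBucketSketch {
  private tokens: number;
  private lastRefillMs: number;

  constructor(
    private readonly capacity: number,
    private readonly refillPerSecond: number,
    // Injectable clock so the behavior can be tested deterministically.
    private readonly now: () => number = () => Date.now(),
  ) {
    this.tokens = capacity; // start with a full bucket
    this.lastRefillMs = now();
  }

  // Takes one token if available; returns false when the caller should back off.
  tryAcquire(): boolean {
    const current = this.now();
    const elapsedSeconds = (current - this.lastRefillMs) / 1000;

    // Refill in proportion to elapsed time, never above capacity.
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = current;

    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}
```

The bucket allows short bursts up to `capacity` while holding the long-run rate to `refillPerSecond`.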

Circuit breaker behavior is now validated end-to-end: repeated LLM failures through `AgentRuntime.execute()` open the circuit, and subsequent calls fail fast with `CircuitBreakerError`.

Failed agent executions (`AgentState.FAILED`) now propagate as errors through the circuit breaker and retry layers instead of returning silently.
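The fail-fast behavior described above can be sketched with a generic circuit breaker. This is not the library's code: `CircuitBreakerSketch`, `CircuitOpenSketchError`, the threshold and timeout parameters, and the synchronous `call` signature are assumptions chosen to keep the example short (a real breaker around LLM calls would wrap async operations).

```typescript
class CircuitOpenSketchError extends Error {
  constructor() {
    super('Circuit is open; failing fast');
    this.name = 'CircuitOpenSketchError';
  }
}

class CircuitBreakerSketch {
  private consecutiveFailures = 0;
  private openedAtMs: number | null = null;

  constructor(
    private readonly failureThreshold: number,
    private readonly resetTimeoutMs: number,
    private readonly now: () => number = () => Date.now(),
  ) {}

  call<T>(operation: () => T): T {
    if (this.openedAtMs !== null && this.now() - this.openedAtMs < this.resetTimeoutMs) {
      // Open: reject without invoking the operation at all.
      throw new CircuitOpenSketchError();
    }
    // Closed, or the reset timeout elapsed and one trial call is allowed (half-open).
    try {
      const result = operation();
      this.consecutiveFailures = 0;
      this.openedAtMs = null; // success closes the circuit
      return result;
    } catch (error) {
      this.consecutiveFailures += 1;
      if (this.openedAtMs !== null || this.consecutiveFailures >= this.failureThreshold) {
        this.openedAtMs = this.now(); // open, or re-open after a failed trial
      }
      throw error; // failures propagate instead of returning silently
    }
  }
}
```

Note that the operation's own error is always rethrown, which mirrors the point about failed executions propagating through the resilience layers.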

Optional peers were added for production tracing and cost tracking:

- `@hazeljs/observability` (optional)
- `@opentelemetry/api` (optional)

When you pass an `observabilityProvider` in runtime config, the agent emits OpenTelemetry spans:

Span	When
`agent.execute`	Full agent run
`agent.tool.execute`	Tool invocation
`agent.llm`	LLM chat call

Span attributes include `agent.name`, `agent.execution_id`, `agent.tool.name`, and session metadata. LLM usage is bridged to `trackCost()` when the provider is configured.

import { AgentRuntime } from '@hazeljs/agent';

const runtime = new AgentRuntime({
  observabilityProvider: myObservabilityProvider,
  llmProvider,
});

No hard dependency on OTel — spans are no-ops unless a provider is injected.
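One common way to get "no-op unless injected" behavior is a default provider that simply runs the wrapped work. The sketch below shows that pattern under stated assumptions: `SpanProviderSketch`, `noopProvider`, `RecordingProvider`, and `executeAgentSketch` are hypothetical names, and this is neither the `@hazeljs/observability` API nor the OpenTelemetry API.

```typescript
type SpanAttributes = Record<string, string>;

// Hypothetical minimal provider shape, for illustration only.
interface SpanProviderSketch {
  withSpan<T>(name: string, attributes: SpanAttributes, run: () => T): T;
}

// Default used when nothing is injected: runs the work, records nothing.
const noopProvider: SpanProviderSketch = {
  withSpan<T>(_name: string, _attributes: SpanAttributes, run: () => T): T {
    return run();
  },
};

// A provider that records span names, standing in for a real tracer.
class RecordingProvider implements SpanProviderSketch {
  readonly spans: Array<{ name: string; attributes: SpanAttributes }> = [];

  withSpan<T>(name: string, attributes: SpanAttributes, run: () => T): T {
    this.spans.push({ name, attributes });
    return run();
  }
}

// The runtime code path is identical with or without a provider.
function executeAgentSketch(agentName: string, provider: SpanProviderSketch = noopProvider): string {
  const attributes = { 'agent.name': agentName };
  return provider.withSpan('agent.execute', attributes, () =>
    provider.withSpan('agent.llm', attributes, () => `done:${agentName}`),
  );
}
```

Because the call sites never branch on whether tracing is enabled, adding a provider later changes only the wiring.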

RAG search failures no longer silently return an empty context. The runtime now:

- Emits `agent.rag.failed` (`AgentEventType.RAG_QUERY_FAILED`)
- Continues with `ragContext: []` (graceful degradation)

`AgentEventEmitter` accepts `strictEventHandlers: true`. When enabled, errors in event handlers propagate instead of being swallowed, which is recommended for production.
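The difference between swallowing and propagating handler errors can be shown with a tiny emitter. This is a sketch of the general technique, not the `AgentEventEmitter` source: `EmitterSketch` and its `swallowed` list are hypothetical, and only the `strictEventHandlers` flag name comes from the release notes.

```typescript
type Handler = (payload: unknown) => void;

class EmitterSketch {
  private readonly handlers = new Map<string, Handler[]>();
  // Errors caught in lenient mode, kept here so they are at least inspectable.
  readonly swallowed: unknown[] = [];

  constructor(private readonly strictEventHandlers = false) {}

  on(event: string, handler: Handler): void {
    const list = this.handlers.get(event) ?? [];
    list.push(handler);
    this.handlers.set(event, list);
  }

  emit(event: string, payload: unknown): void {
    for (const handler of this.handlers.get(event) ?? []) {
      try {
        handler(payload);
      } catch (error) {
        if (this.strictEventHandlers) throw error; // strict: surface the bug to the caller
        this.swallowed.push(error); // lenient: record it and keep notifying other handlers
      }
    }
  }
}
```

In lenient mode a buggy handler cannot break the emitter's caller, but the bug is easy to miss; strict mode trades that isolation for visibility.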

If `@hazeljs/ai` never registers `__HAZELJS_AI_ENHANCED_SERVICE__`, `AgentService` now logs a clear error after 500ms instead of failing silently. Set `runtime.llmProvider` explicitly or ensure the AI module loads first.

State managers use minimal interfaces instead of `any`:

- `RedisClientLike`
- `PrismaClientLike` for `DatabaseStateManager`

This makes it safer to wire real clients without losing type checking at the boundary.
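As an illustration of the "minimal interface" idea, the sketch below declares only the calls a state manager needs and lets any structurally compatible object satisfy it. The names `RedisLikeSketch`, `StateManagerSketch`, and `FakeRedis`, the method set, and the key prefix are all assumptions; the real `RedisClientLike` may declare different members.

```typescript
// Hypothetical minimal shape: only the calls this state manager needs.
interface RedisLikeSketch {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<unknown>;
}

class StateManagerSketch {
  constructor(
    private readonly client: RedisLikeSketch,
    private readonly prefix = 'agent:state:', // assumed key prefix
  ) {}

  async save(executionId: string, state: Record<string, unknown>): Promise<void> {
    await this.client.set(this.prefix + executionId, JSON.stringify(state));
  }

  async load(executionId: string): Promise<Record<string, unknown> | null> {
    const raw = await this.client.get(this.prefix + executionId);
    return raw === null ? null : (JSON.parse(raw) as Record<string, unknown>);
  }
}

// An in-memory fake satisfies the interface structurally, so tests need no Redis server.
class FakeRedis implements RedisLikeSketch {
  readonly data = new Map<string, string>();

  get(key: string): Promise<string | null> {
    return Promise.resolve(this.data.get(key) ?? null);
  }

  set(key: string, value: string): Promise<unknown> {
    this.data.set(key, value);
    return Promise.resolve('OK');
  }
}
```

Compared with an `any`-typed client, passing an object that lacks `get` or `set` here is a compile-time error rather than a runtime surprise.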

474 tests pass with coverage thresholds enforced. New integration coverage includes:

- `RedisApprovalStore`

Test locations:

- `tests/integration/production-hardening.test.ts`
- `tests/integration/hardening-coverage.test.ts`

Jest uses `tsconfig.jest.json` for monorepo-friendly typechecking; the optional `@hazeljs/eval` peer is stubbed during tests.

npm install @hazeljs/agent@1.0.1

No code changes required for existing apps. To adopt production features incrementally:

- Pass `redis: { client }`, or call `await AgentModule.forRootAsync({ redis: { url } })`
- Set `useRedisApprovals: true` with a Redis client
- Install `@hazeljs/observability` and pass `observabilityProvider`
- Set `runtime.strictEventHandlers: true` in production

Not in 1.0.1, planned for future minors:

- `@hazeljs/flow`
- `__HAZELJS_AI_ENHANCED_SERVICE__`

Questions or feedback? Open an issue or join the discussion on GitHub.

Source: dev.to (original article)