# Graceful Degradation: Circuit Breakers for External API Dependencies

> Source: <https://dev.to/helperx/graceful-degradation-circuit-breakers-for-external-api-dependencies-5em1>
> Published: 2026-06-12 05:00:00+00:00

When your application depends on external APIs that you don't control, failures are not a question of "if" but "when." X's API rate-limits you. Your proxy provider has an outage. The AI model endpoint returns 503s for 20 minutes.

The question is: does one failure cascade into total system failure, or does your system degrade gracefully?

We built a circuit breaker system for HelperX that keeps healthy slots running when unhealthy ones fail. Here's the implementation.

Without circuit breakers, one dead proxy degrades the entire system. With 200 slots, one bad proxy shouldn't affect the 199 healthy ones.

A circuit breaker sits between your application and an external dependency. It has three states:

```
     ┌──────────┐
     │  CLOSED  │ ← Normal operation. Requests pass through.
     └────┬─────┘
          │ failures >= threshold
          ▼
     ┌──────────┐
     │   OPEN   │ ← Requests fail immediately. No network calls.
     └────┬─────┘
          │ after resetTimeout
          ▼
    ┌───────────┐
    │ HALF-OPEN │ ← Allow one test request through.
    └─────┬─────┘
          │
    ┌─────┴──────┐
    │ success?   │
    ├─yes────────┤──► CLOSED (resume normal)
    └─no─────────┘──► OPEN (wait again)
```

``` js
class CircuitBreaker {
  constructor(name, options = {}) {
    this.name = name;
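    this.state = 'closed';
    this.failures = 0;
    this.successes = 0;
    this.lastFailure = null;
    this.lastError = null;       // set by onFailure, reported in onStateChange
    this.lastAttempt = null;
    this.halfOpenAttempts = 0;   // reset on every transition to half-open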

    this.threshold = options.threshold || 5;
    this.resetTimeout = options.resetTimeout || 60_000;
    this.halfOpenMax = options.halfOpenMax || 1;
    this.onStateChange = options.onStateChange || (() => {});
  }

  async execute(fn) {
    if (this.state === 'open') {
      if (Date.now() - this.lastFailure >= this.resetTimeout) {
        this.transition('half-open');
      } else {
        throw new CircuitOpenError(
          `Circuit ${this.name} is open. ` +
          `Resets in ${this.timeUntilReset()}ms`
        );
      }
    }

    if (this.state === 'half-open') {
      // Only allow limited requests through
      if (this.halfOpenAttempts >= this.halfOpenMax) {
        throw new CircuitOpenError(
          `Circuit ${this.name} is half-open, max attempts reached`
        );
      }
      this.halfOpenAttempts++;
    }

    this.lastAttempt = Date.now();

    try {
      const result = await fn();
      this.onSuccess();
      return result;
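    } catch (err) {
      if (err.isCircuitOpen) {
        // A nested breaker rejected the call, so this dependency was never
        // reached: count no failure and give back the half-open slot.
        if (this.state === 'half-open') {
          this.halfOpenAttempts = Math.max(0, this.halfOpenAttempts - 1);
        }
      } else {
        this.onFailure(err);
      }
      throw err;
    }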
  }

  onSuccess() {
    this.failures = 0;
    this.successes++;
    if (this.state === 'half-open') {
      this.transition('closed');
    }
  }

  onFailure(err) {
    this.failures++;
    this.lastFailure = Date.now();
    this.lastError = err;

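    // Skip the transition when late failures arrive after the circuit has
    // already opened, so onStateChange never reports open → open.
    if (this.state !== 'open' && this.failures >= this.threshold) {
      this.transition('open');
    }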
  }

  transition(newState) {
    const oldState = this.state;
    this.state = newState;

    if (newState === 'half-open') {
      this.halfOpenAttempts = 0;
    }

    this.onStateChange({
      name: this.name,
      from: oldState,
      to: newState,
      failures: this.failures,
      lastError: this.lastError
    });
  }

  timeUntilReset() {
    if (this.state !== 'open') return 0;
    return Math.max(0,
      this.resetTimeout - (Date.now() - this.lastFailure)
    );
  }

  getStatus() {
    return {
      name: this.name,
      state: this.state,
      failures: this.failures,
      successes: this.successes,
      lastFailure: this.lastFailure,
      timeUntilReset: this.timeUntilReset()
    };
  }
}

class CircuitOpenError extends Error {
  constructor(message) {
    super(message);
    this.name = 'CircuitOpenError';
    this.isCircuitOpen = true;
  }
}
```
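The state machine is easier to verify with a clock you control than with real timers. The sketch below is a condensed restatement of the breaker, not a replacement for it. The `MiniBreaker` name, the `now` option, and the `probeInFlight` flag are assumptions introduced for illustration, and this version reopens on any half-open failure.

```javascript
// Condensed breaker with an injectable clock (hypothetical `now` option).
class MiniBreaker {
  constructor({ threshold = 5, resetTimeout = 60_000, now = Date.now } = {}) {
    this.state = 'closed';
    this.failures = 0;
    this.lastFailure = null;
    this.threshold = threshold;
    this.resetTimeout = resetTimeout;
    this.now = now;
    this.probeInFlight = false;
  }

  async execute(fn) {
    if (this.state === 'open') {
      if (this.now() - this.lastFailure < this.resetTimeout) {
        throw new Error('circuit open');
      }
      this.state = 'half-open';
      this.probeInFlight = false;
    }
    if (this.state === 'half-open') {
      if (this.probeInFlight) throw new Error('circuit half-open, probe in flight');
      this.probeInFlight = true;
    }
    try {
      const result = await fn();
      this.failures = 0;
      this.state = 'closed';
      return result;
    } catch (err) {
      this.failures++;
      this.lastFailure = this.now();
      if (this.state === 'half-open' || this.failures >= this.threshold) {
        this.state = 'open';
      }
      throw err;
    }
  }
}

// Walk through closed -> open -> half-open -> closed on a fake clock.
async function demo() {
  let clock = 0;
  const breaker = new MiniBreaker({ threshold: 2, resetTimeout: 1_000, now: () => clock });
  const fail = () => Promise.reject(new Error('503'));
  const seen = [breaker.state];

  await breaker.execute(fail).catch(() => {});
  await breaker.execute(fail).catch(() => {});
  seen.push(breaker.state);                 // threshold reached

  clock += 1_000;                           // reset timeout elapses
  await breaker.execute(() => {
    seen.push(breaker.state);               // observed during the probe
    return 'ok';
  });
  seen.push(breaker.state);                 // probe succeeded
  return seen;
}
```

Injecting the clock keeps the test deterministic: advancing `clock` by `resetTimeout` is enough to trigger the half-open probe, with no `setTimeout` involved.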

Each slot gets its own circuit breaker for each external dependency:

``` js
class SlotDependencies {
  constructor(slotId) {
    this.slotId = slotId;

    this.proxy = new CircuitBreaker(`${slotId}:proxy`, {
      threshold: 3,
      resetTimeout: 120_000,  // 2 minutes
      onStateChange: (e) => this.logStateChange(e)
    });

    this.ai = new CircuitBreaker(`${slotId}:ai`, {
      threshold: 5,
      resetTimeout: 60_000,   // 1 minute
      onStateChange: (e) => this.logStateChange(e)
    });

    this.api = new CircuitBreaker(`${slotId}:api`, {
      threshold: 3,
      resetTimeout: 300_000,  // 5 minutes (rate limits are longer)
      onStateChange: (e) => this.logStateChange(e)
    });
  }

  logStateChange(event) {
    const db = getDb(this.slotId);
    db.prepare(`
      INSERT INTO audit_log (id, module, action, status, detail, timestamp)
      VALUES (?, 'system', 'circuit_breaker', ?, ?, datetime('now'))
    `).run(
      crypto.randomUUID(),
      event.to === 'open' ? 'warning' : 'info',
      `${event.name}: ${event.from} → ${event.to} (${event.failures} failures)`
    );
  }
}
```

When Slot A's proxy circuit opens, Slot A stops sending requests through that proxy. Slots B through Z continue normally — they have their own circuit breakers with their own state.
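The article does not show how the per-slot bundles are looked up. One plausible shape is a lazy registry keyed by slot ID, sketched below. `createSlotRegistry`, `createDeps`, and the stub bundle are hypothetical; in the real system the factory would presumably be `(slotId) => new SlotDependencies(slotId)`.

```javascript
// Hypothetical registry: one dependency bundle per slot, created on first
// use and cached so breaker state persists across calls.
function createSlotRegistry(createDeps) {
  const bySlot = new Map();
  return {
    get(slotId) {
      if (!bySlot.has(slotId)) bySlot.set(slotId, createDeps(slotId));
      return bySlot.get(slotId);
    },
    // Drop a slot's breakers when the slot is deactivated.
    remove(slotId) {
      return bySlot.delete(slotId);
    },
    size() {
      return bySlot.size;
    }
  };
}

// Stub bundle for illustration; the real one would hold CircuitBreaker instances.
const registry = createSlotRegistry((slotId) => ({
  slotId,
  proxy: { state: 'closed' }
}));

registry.get('slot-a').proxy.state = 'open';  // slot A's proxy trips
const slotB = registry.get('slot-b');         // slot B has its own, untouched state
```

Because every slot gets a distinct object, nothing that happens to `slot-a` can change what `slot-b` sees.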

``` js
async function executeModuleAction(slotId, module) {
  const deps = getSlotDependencies(slotId);

  // Step 1: Find a tweet to reply to (uses proxy)
  let tweet;
  try {
    tweet = await deps.proxy.execute(() =>
      searchTweets(slotId, module.config.query)
    );
  } catch (err) {
    if (err.isCircuitOpen) {
      logAudit(slotId, module.name, 'skipped',
        `Proxy circuit open, resets in ${deps.proxy.timeUntilReset()}ms`);
      return;
    }
    throw err;
  }

  // Step 2: Generate AI reply (uses AI endpoint)
  let reply;
  try {
    reply = await deps.ai.execute(() =>
      generateReply(slotId, tweet, module.config.persona)
    );
  } catch (err) {
    if (err.isCircuitOpen) {
      logAudit(slotId, module.name, 'skipped',
        `AI circuit open, resets in ${deps.ai.timeUntilReset()}ms`);
      return;
    }
    throw err;
  }

  // Step 3: Send the reply (uses proxy + API)
  try {
    await deps.proxy.execute(() =>
      deps.api.execute(() =>
        sendReply(slotId, tweet.id, reply)
      )
    );
  } catch (err) {
    if (err.isCircuitOpen) {
      logAudit(slotId, module.name, 'skipped',
        `Circuit open: ${err.message}`);
      return;
    }
    throw err;
  }

  logAudit(slotId, module.name, 'success', reply);
}
```

Each step of the action is wrapped in its own circuit breaker. If the AI is down but the proxy is fine, the system skips AI-dependent modules but can still run non-AI modules (scheduled posts, reposts).
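The three try/catch blocks follow the same pattern, so they could be folded into a small helper. This is a sketch only: `runStep` and the two stub breakers are hypothetical, and the helper relies solely on the `isCircuitOpen` flag that `CircuitOpenError` sets.

```javascript
// Hypothetical helper: run one step through a breaker and turn an open
// circuit into a "skipped" result instead of an exception.
async function runStep(breaker, fn) {
  try {
    return { skipped: false, value: await breaker.execute(fn) };
  } catch (err) {
    if (err.isCircuitOpen) return { skipped: true, reason: err.message };
    throw err; // real failures still propagate to the caller
  }
}

// Stub breakers for illustration only.
const openBreaker = {
  async execute() {
    const err = new Error('Circuit slot-a:ai is open');
    err.isCircuitOpen = true;
    throw err;
  }
};
const closedBreaker = {
  async execute(fn) {
    return fn();
  }
};
```

A caller would then check `step.skipped`, log the reason, and return early, which keeps the skip-and-log behavior in one place.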

The dashboard shows circuit breaker state for each slot:

``` js
function getSystemHealth() {
  const slots = getAllActiveSlots();

  return slots.map(slot => {
    const deps = getSlotDependencies(slot.id);
    return {
      slotId: slot.id,
      proxy: deps.proxy.getStatus(),
      ai: deps.ai.getStatus(),
      api: deps.api.getStatus(),
      healthy: ['proxy', 'ai', 'api']
        .every(dep => deps[dep].state === 'closed')
    };
  });
}
```

An operator sees at a glance which slots are healthy, which have open circuits, and when each circuit will attempt recovery.
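For a fleet-level view, the per-slot report can be reduced to a few numbers. The sketch below assumes the shape returned by `getSystemHealth()`; `summarizeHealth`, its output fields, and the sample data are hypothetical.

```javascript
// Reduce the per-slot report to counts, open circuit names, and the
// soonest recovery attempt (in milliseconds).
function summarizeHealth(report) {
  const summary = { healthy: 0, degraded: 0, openCircuits: [], nextProbeInMs: null };
  for (const slot of report) {
    if (slot.healthy) summary.healthy++;
    else summary.degraded++;
    for (const dep of ['proxy', 'ai', 'api']) {
      const status = slot[dep];
      if (status.state !== 'open') continue;
      summary.openCircuits.push(status.name);
      if (summary.nextProbeInMs === null || status.timeUntilReset < summary.nextProbeInMs) {
        summary.nextProbeInMs = status.timeUntilReset;
      }
    }
  }
  return summary;
}

// Made-up sample data in the getStatus() shape.
const report = [
  { slotId: 'a', healthy: true,
    proxy: { name: 'a:proxy', state: 'closed', timeUntilReset: 0 },
    ai:    { name: 'a:ai',    state: 'closed', timeUntilReset: 0 },
    api:   { name: 'a:api',   state: 'closed', timeUntilReset: 0 } },
  { slotId: 'b', healthy: false,
    proxy: { name: 'b:proxy', state: 'open',   timeUntilReset: 90_000 },
    ai:    { name: 'b:ai',    state: 'closed', timeUntilReset: 0 },
    api:   { name: 'b:api',   state: 'open',   timeUntilReset: 240_000 } }
];
const summary = summarizeHealth(report);
```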

Default thresholds aren't universal. We tuned ours based on failure patterns:

| Dependency | Threshold | Reset timeout | Why |
|---|---|---|---|
| Proxy | 3 failures | 2 min | Proxy failures are usually transient. Quick retry. |
| AI model | 5 failures | 1 min | AI endpoints recover fast. Higher threshold to absorb occasional 503s. |
| X API | 3 failures | 5 min | Rate limits last 15 min. Longer reset avoids hammering. |

The key insight: **reset timeout should match the expected recovery time of the dependency**, not an arbitrary number.
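Keeping the tuned values in one place makes them easier to review and adjust. The sketch below restates the table as data. The values come from the table; `BREAKER_DEFAULTS` and `breakerOptions` are hypothetical names.

```javascript
// The tuning table as data. Timeouts are in milliseconds.
const BREAKER_DEFAULTS = Object.freeze({
  proxy: { threshold: 3, resetTimeout: 2 * 60_000 },
  ai:    { threshold: 5, resetTimeout: 1 * 60_000 },
  api:   { threshold: 3, resetTimeout: 5 * 60_000 }
});

// Merge per-dependency defaults with caller overrides (for example, an
// onStateChange callback or a slot-specific threshold).
function breakerOptions(dependency, overrides = {}) {
  const defaults = BREAKER_DEFAULTS[dependency];
  if (!defaults) throw new Error(`Unknown dependency: ${dependency}`);
  return { ...defaults, ...overrides };
}
```

A breaker could then be built with `new CircuitBreaker(name, breakerOptions('api', { onStateChange }))`, so retuning means editing one object.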

**1. One circuit breaker per dependency per tenant.** Global circuit breakers cause healthy tenants to suffer for unhealthy ones. Per-tenant isolation is the whole point.

**2. Log state transitions.** When a circuit opens, the audit log records it. This is the most valuable diagnostic information during incidents.

**3. Graceful skip > hard failure.** When a circuit is open, the action is skipped and logged — not retried, not errored, not queued. The scheduler moves to the next action. Queuing failures leads to thundering herds when the circuit closes.

**4. Nested circuit breakers work.** An action that uses proxy + API goes through both breakers. If either is open, the action is skipped. This handles compound failures cleanly.

**5. Half-open state prevents oscillation.** Without half-open, a circuit that goes straight from open to closed releases a burst of requests that may re-trigger the failure. Half-open lets a single test request through (configurable via `halfOpenMax`), which prevents the open/close/open oscillation.

[HelperX](https://helperx.app) uses per-slot circuit breakers to keep your accounts running independently — one bad proxy doesn't affect the rest. Free 30-day trial.
