{"slug": "graceful-degradation-circuit-breakers-for-external-api-dependencies", "title": "Graceful Degradation: Circuit Breakers for External API Dependencies", "summary": "The HelperX team built a circuit breaker system to prevent cascading failures when external API dependencies go down. The system tracks failures for each dependency per slot, opening a circuit after three to five failures (depending on the dependency) and attempting recovery after a reset timeout of one to five minutes. This ensures that a single failed proxy or API endpoint degrades only the affected slot rather than taking down all 200 slots.", "body_md": "When your application depends on external APIs that you don't control, failures are not a question of \"if\" but \"when.\" X's API rate-limits you. Your proxy provider has an outage. The AI model endpoint returns 503s for 20 minutes.\n\nThe question is: does one failure cascade into total system failure, or does your system degrade gracefully?\n\nWe built a circuit breaker system for HelperX that keeps healthy slots running when unhealthy ones fail. Here's the implementation.\n\nWithout circuit breakers, one dead proxy degrades the entire system. With 200 slots, one bad proxy shouldn't affect the 199 healthy ones.\n\nA circuit breaker sits between your application and an external dependency. It has three states:\n\n```\n     ┌──────────┐\n     │  CLOSED  │ ← Normal operation. Requests pass through.\n     └────┬─────┘\n          │ failures >= threshold\n          ▼\n     ┌──────────┐\n     │   OPEN   │ ← Requests fail immediately. No network calls.\n     └────┬─────┘\n          │ after resetTimeout\n          ▼\n    ┌───────────┐\n    │ HALF-OPEN │ ← Allow one test request through.\n    └─────┬─────┘\n          │\n    ┌─────┴──────┐\n    │ success?   
│\n    ├─yes────────┤──► CLOSED (resume normal)\n    └─no─────────┘──► OPEN (wait longer)\n```\n\n``` js\nclass CircuitBreaker {\n  constructor(name, options = {}) {\n    this.name = name;\n    this.state = 'closed';\n    this.failures = 0;\n    this.successes = 0;\n    this.lastFailure = null;\n    this.lastAttempt = null;\n    this.lastError = null;\n    this.halfOpenAttempts = 0;\n\n    this.threshold = options.threshold || 5;\n    this.resetTimeout = options.resetTimeout || 60_000;\n    this.halfOpenMax = options.halfOpenMax || 1;\n    this.onStateChange = options.onStateChange || (() => {});\n  }\n\n  async execute(fn) {\n    if (this.state === 'open') {\n      // Once resetTimeout has elapsed, let a test request through\n      if (Date.now() - this.lastFailure >= this.resetTimeout) {\n        this.transition('half-open');\n      } else {\n        throw new CircuitOpenError(\n          `Circuit ${this.name} is open. ` +\n          `Resets in ${this.timeUntilReset()}ms`\n        );\n      }\n    }\n\n    if (this.state === 'half-open') {\n      // Only allow limited requests through\n      if (this.halfOpenAttempts >= this.halfOpenMax) {\n        throw new CircuitOpenError(\n          `Circuit ${this.name} is half-open, max attempts reached`\n        );\n      }\n      this.halfOpenAttempts++;\n    }\n\n    this.lastAttempt = Date.now();\n\n    try {\n      const result = await fn();\n      this.onSuccess();\n      return result;\n    } catch (err) {\n      if (err?.isCircuitOpen) {\n        // A nested breaker rejected the call, so nothing reached this\n        // dependency: don't count a failure, and release the test slot.\n        if (this.state === 'half-open') this.halfOpenAttempts--;\n      } else {\n        this.onFailure(err);\n      }\n      throw err;\n    }\n  }\n\n  onSuccess() {\n    this.failures = 0;\n    this.successes++;\n    if (this.state === 'half-open') {\n      this.transition('closed');\n    }\n  }\n\n  onFailure(err) {\n    this.failures++;\n    this.lastFailure = Date.now();\n    this.lastError = err;\n\n    // In half-open, failures is still >= threshold, so one failed\n    // test request reopens the circuit.\n    if (this.failures >= this.threshold) {\n      this.transition('open');\n    }\n  }\n\n  transition(newState) {\n    const oldState = this.state;\n    this.state = newState;\n\n    if (newState === 'half-open') {\n      this.halfOpenAttempts = 0;\n    }\n\n    this.onStateChange({\n      name: this.name,\n      from: oldState,\n      to: 
newState,\n      failures: this.failures,\n      lastError: this.lastError\n    });\n  }\n\n  timeUntilReset() {\n    if (this.state !== 'open') return 0;\n    return Math.max(0,\n      this.resetTimeout - (Date.now() - this.lastFailure)\n    );\n  }\n\n  getStatus() {\n    return {\n      name: this.name,\n      state: this.state,\n      failures: this.failures,\n      successes: this.successes,\n      lastFailure: this.lastFailure,\n      timeUntilReset: this.timeUntilReset()\n    };\n  }\n}\n\nclass CircuitOpenError extends Error {\n  constructor(message) {\n    super(message);\n    this.name = 'CircuitOpenError';\n    this.isCircuitOpen = true;\n  }\n}\n```\n\nEach slot gets its own circuit breaker for each external dependency:\n\n```\nclass SlotDependencies {\n  constructor(slotId) {\n    this.slotId = slotId;\n\n    this.proxy = new CircuitBreaker(`${slotId}:proxy`, {\n      threshold: 3,\n      resetTimeout: 120_000,  // 2 minutes\n      onStateChange: (e) => this.logStateChange(e)\n    });\n\n    this.ai = new CircuitBreaker(`${slotId}:ai`, {\n      threshold: 5,\n      resetTimeout: 60_000,   // 1 minute\n      onStateChange: (e) => this.logStateChange(e)\n    });\n\n    this.api = new CircuitBreaker(`${slotId}:api`, {\n      threshold: 3,\n      resetTimeout: 300_000,  // 5 minutes (rate limits are longer)\n      onStateChange: (e) => this.logStateChange(e)\n    });\n  }\n\n  logStateChange(event) {\n    const db = getDb(this.slotId);\n    db.prepare(`\n      INSERT INTO audit_log (id, module, action, status, detail, timestamp)\n      VALUES (?, 'system', 'circuit_breaker', ?, ?, datetime('now'))\n    `).run(\n      crypto.randomUUID(),\n      event.to === 'open' ? 'warning' : 'info',\n      `${event.name}: ${event.from} → ${event.to} (${event.failures} failures)`\n    );\n  }\n}\n```\n\nWhen Slot A's proxy circuit opens, Slot A stops sending requests through that proxy. 
Slots B through Z continue normally — they have their own circuit breakers with their own state.\n\n``` js\nasync function executeModuleAction(slotId, module) {\n  const deps = getSlotDependencies(slotId);\n\n  // Step 1: Find a tweet to reply to (uses proxy)\n  let tweet;\n  try {\n    tweet = await deps.proxy.execute(() =>\n      searchTweets(slotId, module.config.query)\n    );\n  } catch (err) {\n    if (err.isCircuitOpen) {\n      logAudit(slotId, module.name, 'skipped',\n        `Proxy circuit open, resets in ${deps.proxy.timeUntilReset()}ms`);\n      return;\n    }\n    throw err;\n  }\n\n  // Step 2: Generate AI reply (uses AI endpoint)\n  let reply;\n  try {\n    reply = await deps.ai.execute(() =>\n      generateReply(slotId, tweet, module.config.persona)\n    );\n  } catch (err) {\n    if (err.isCircuitOpen) {\n      logAudit(slotId, module.name, 'skipped',\n        `AI circuit open, resets in ${deps.ai.timeUntilReset()}ms`);\n      return;\n    }\n    throw err;\n  }\n\n  // Step 3: Send the reply (uses proxy + API)\n  try {\n    await deps.proxy.execute(() =>\n      deps.api.execute(() =>\n        sendReply(slotId, tweet.id, reply)\n      )\n    );\n  } catch (err) {\n    if (err.isCircuitOpen) {\n      logAudit(slotId, module.name, 'skipped',\n        `Circuit open: ${err.message}`);\n      return;\n    }\n    throw err;\n  }\n\n  logAudit(slotId, module.name, 'success', reply);\n}\n```\n\nEach step of the action is wrapped in its own circuit breaker. 
If the AI is down but the proxy is fine, the system skips AI-dependent modules but can still run non-AI modules (scheduled posts, reposts).\n\nThe dashboard shows circuit breaker state for each slot:\n\n``` js\nfunction getSystemHealth() {\n  const slots = getAllActiveSlots();\n\n  return slots.map(slot => {\n    const deps = getSlotDependencies(slot.id);\n    return {\n      slotId: slot.id,\n      proxy: deps.proxy.getStatus(),\n      ai: deps.ai.getStatus(),\n      api: deps.api.getStatus(),\n      healthy: ['proxy', 'ai', 'api']\n        .every(dep => deps[dep].state === 'closed')\n    };\n  });\n}\n```\n\nAn operator sees at a glance which slots are healthy, which have open circuits, and when each circuit will attempt recovery.\n\nDefault thresholds aren't universal. We tuned ours based on failure patterns:\n\n| Dependency | Threshold | Reset timeout | Why |\n|---|---|---|---|\n| Proxy | 3 failures | 2 min | Proxy failures are usually transient. Quick retry. |\n| AI model | 5 failures | 1 min | AI endpoints recover fast. Higher threshold to absorb occasional 503s. |\n| X API | 3 failures | 5 min | Rate limits last 15 min. Longer reset avoids hammering. |\n\nThe key insight: **reset timeout should match the expected recovery time of the dependency**, not an arbitrary number.\n\n**1. One circuit breaker per dependency per tenant.** Global circuit breakers cause healthy tenants to suffer for unhealthy ones. Per-tenant isolation is the whole point.\n\n**2. Log state transitions.** When a circuit opens, the audit log records it. This is the most valuable diagnostic information during incidents.\n\n**3. Graceful skip > hard failure.** When a circuit is open, the action is skipped and logged — not retried, not errored, not queued. The scheduler moves to the next action. Queuing failures leads to thundering herds when the circuit closes.\n\n**4. Nested circuit breakers work.** An action that uses proxy + API goes through both breakers. 
If either is open, the action is skipped. This handles compound failures cleanly.\n\n**5. Half-open state prevents oscillation.** Without half-open, a circuit that closes immediately sends a burst of requests that may re-trigger the failure. Half-open allows exactly one test request, preventing the open/close/open oscillation.\n\n[HelperX](https://helperx.app) uses per-slot circuit breakers to keep your accounts running independently — one bad proxy doesn't affect the rest. Free 30-day trial.", "url": "https://wpnews.pro/news/graceful-degradation-circuit-breakers-for-external-api-dependencies", "canonical_source": "https://dev.to/helperx/graceful-degradation-circuit-breakers-for-external-api-dependencies-5em1", "published_at": "2026-06-12 05:00:00+00:00", "updated_at": "2026-06-12 05:42:22.739653+00:00", "lang": "en", "topics": ["ai-infrastructure", "mlops"], "entities": ["HelperX"], "alternates": {"html": "https://wpnews.pro/news/graceful-degradation-circuit-breakers-for-external-api-dependencies", "markdown": "https://wpnews.pro/news/graceful-degradation-circuit-breakers-for-external-api-dependencies.md", "text": "https://wpnews.pro/news/graceful-degradation-circuit-breakers-for-external-api-dependencies.txt", "jsonld": "https://wpnews.pro/news/graceful-degradation-circuit-breakers-for-external-api-dependencies.jsonld"}}