# Designing a Resilient Media Orchestration System: Event-Driven Architecture with Real-Time AI

> Source: <https://dev.to/claudia-ve/designing-a-resilient-media-orchestration-system-event-driven-architecture-with-real-time-ai-1l0j>
> Published: 2026-05-29 10:07:40+00:00

Every content team eventually faces the same wall: you've got six platforms to publish on, a dozen data sources feeding in, and some AI pipeline generating drafts — but none of it talks to each other without duct tape and cron jobs.

What you need isn't more tools. You need an **orchestration layer**.

Over the past few months, I've been designing a system that ingests content from multiple sources, processes it through AI models, and distributes it across platforms — all in real time, with fault tolerance built in from day one. Here's what the architecture looks like and the decisions that mattered.

The naive approach is a linear pipeline: fetch → process → publish. That works until one step fails and the entire chain collapses. Real-world content operations involve:

A linear pipeline can't handle this. You need an event-driven architecture.

The system uses a **publish-subscribe event bus** as the backbone. Every component emits and consumes events without knowing about each other.

```
┌──────────────┐     ┌─────────────────┐     ┌───────────────┐
│  Ingestors   │────▶│   Event Bus     │────▶│  Processors   │
│ (RSS, API,   │     │ (Redis Streams) │     │ (AI, Format,  │
│  Webhook)    │     │                 │     │  Classify)    │
└──────────────┘     └────────┬────────┘     └───────┬───────┘
                              │                       │
                              ▼                       ▼
                     ┌──────────────────────────────────┐
                     │        State Store (Postgres)     │
                     │  + Dead Letter Queue (Redis)      │
                     └──────────────────────────────────┘
```

Let's break down each layer.

Each source gets its own adapter that normalizes into a standard `ContentItem`

schema:

```
interface ContentItem {
  id: string;
  source: 'rss' | 'api' | 'webhook' | 'manual';
  sourceUrl?: string;
  rawContent: string;
  metadata: Record<string, unknown>;
  collectedAt: Date;
}
```

The adapters are stateless and emit a `content.ingested`

event. If an adapter crashes, the event bus doesn't care — it just won't receive events until the adapter restarts.

**Key decision:** We chose **Redis Streams** over Kafka for the event bus. The tradeoff is throughput for operational simplicity. For a media orchestration system handling hundreds (not millions) of items per hour, Redis Streams gives us consumer groups, message acknowledgments, and a dead letter mechanism without the operational overhead of a Kafka cluster.

Processors subscribe to specific event types. Each processor is a chain of **middleware-style transforms**:

```
class Pipeline {
  private transforms: Transform[];

  async execute(item: ContentItem): Promise<ContentItem> {
    let result = item;
    for (const transform of this.transforms) {
      try {
        result = await transform.execute(result);
      } catch (error) {
        await this.errorHandler.handle(error, result);
        return result; // or break, depending on severity
      }
    }
    return result;
  }
}
```

The real trick is **idempotency** — if a processor crashes mid-way and gets restarted, it shouldn't reprocess the same item. Each item carries a processing version hash. If the hash matches the current pipeline version, the processor skips it.

This is where things get interesting. LLM calls are **unreliable by nature** — they time out, return malformed JSON, or take 30 seconds on a simple summarization.

Our approach: **circuit breakers with graduated fallbacks**.

```
class AICircuitBreaker {
  private failures: number = 0;
  private lastFailure: Date | null = null;
  private threshold: number = 3;
  private resetTimeout: number = 60000; // 1 minute

  async call(prompt: string, options: AICallOptions): Promise<string> {
    if (this.isOpen()) {
      return this.fallbackStrategy(options); // template-based instead
    }
    try {
      const result = await this.model.call(prompt, options);
      this.failures = 0;
      return result;
    } catch (err) {
      this.failures++;
      this.lastFailure = new Date();
      if (this.failures >= this.threshold) {
        this.openCircuit();
      }
      throw err;
    }
  }
}
```

When the circuit is open, the system falls back to deterministic templates instead of failing outright. Your 2 PM newsletter still goes out — it just uses the standard intro paragraph instead of an AI-generated one.

Each platform target is a **plugin**. The plugin interface is dead simple:

```
interface PlatformPlugin {
  name: string;
  validate(item: FormattedContent): ValidationResult;
  publish(item: FormattedContent): Promise<PublishReceipt>;
}
```

Plugins can be enabled/disabled at runtime via config. We can route the same content item to Twitter, Telegram, and a blog simultaneously — each handled by its own plugin with its own retry logic and rate limiting.

Three properties make this approach resilient:

This architecture handles the *mechanical* parts well — fetch, process, publish, retry. But what it doesn't solve on its own is the *orchestration* layer: deciding *what* to publish, *when*, and *where*, based on actual performance data.

That's where tools like **Rationale** come in. Rationale is an AI media orchestration engine that sits on top of architectures like this — it ingests from your existing pipelines, uses AI to optimize content strategy, and coordinates multi-channel distribution with built-in resilience patterns.

The architecture I've described is the foundation. Rationale provides the brain.

*Platform architecture patterns evolve fast. The key is starting with something that survives failures gracefully — because in media operations, something *will* break. Build for that, and everything else is just optimization.*

*Check out Rationale for the orchestration layer that ties it all together.*
