{"slug": "temporal-workflow-streams-stream-ai-agent-output-in-real-time", "title": "Temporal Workflow Streams: Stream AI Agent Output in Real Time", "summary": "Temporal Technologies announced Workflow Streams at Replay 2026, now in Public Preview, enabling real-time output from durable Temporal workflows without external infrastructure like Redis or separate SSE servers. The feature uses Temporal's own Signal and Update primitives to provide offset-based resumption, allowing clients to reconnect and continue streaming from where they left off after failures. This solves a key problem for long-running AI agents where crash recovery and mid-run output visibility are critical.", "body_md": "Your AI agent is halfway through a 90-second run — three LLM calls deep, tool results coming in, two sub-agents fanning out. The user sees a spinner. Then the server restarts. The workflow replays correctly from checkpoint, but the user has no idea what’s been happening. No output reached them before the crash.\n\nThis is the problem [Temporal Workflow Streams](https://temporal.io/blog/replay-2026-product-announcements) was built to fix. Announced at Replay 2026 and now in Public Preview, Workflow Streams gives durable Temporal workflows a real-time output channel without requiring Redis, a separate SSE server, or any custom state management. The stream is built on Temporal’s own Signal and Update primitives — which means it inherits the same durability guarantees as the workflow itself.\n\n## The Old Workaround\n\nBefore Workflow Streams, teams streaming progress from inside a Temporal Activity to a frontend did something like this:\n\n- Activity publishes tokens to Redis pub/sub during execution\n- A separate SSE server subscribes to Redis and streams to the client\n- Hope the Redis connection and SSE server survive the duration of the run\n\nIt works — until something breaks. Server restart: Redis connection drops. Client refreshes: the SSE stream dies and is gone. 
You’re back to manually stitching state together. The bitter irony is that Temporal already knows exactly what your workflow is doing, step by step, in durable history, but that knowledge was inaccessible to the outside world mid-run. Workflow Streams changes that.\n\n## How Workflow Streams Works\n\nWorkflow Streams is a contrib library in the Temporal Python SDK (Go and .NET are also supported; Java and TypeScript are in pre-release). The architecture has three roles:\n\n**The Workflow (host):** Owns an append-only, offset-addressed event log\n\n**Publishers:** Append events; a publisher can be the Workflow itself, its Activities, or an external process via `WorkflowStreamClient`\n\n**Subscribers:** Connect to the Workflow ID, optionally filter by topic, and consume events by long-polling from a stored offset\n\nUnder the hood, Temporal’s existing message primitives do the work: Signals carry publishes, Updates serve the long-poll subscriptions, and a Query exposes the current global offset. The stream is the workflow history, so no external pub/sub layer is needed.\n\n```python\n# Workflow: create a stream and let activities publish to it\nfrom datetime import timedelta\n\nfrom temporalio import workflow\nfrom temporalio.contrib.workflow_streams import WorkflowStream\n\n@workflow.defn\nclass AgentWorkflow:\n    def __init__(self):\n        self._stream = WorkflowStream(self)\n\n    @workflow.run\n    async def run(self, prompt: str) -> str:\n        # call_llm_and_stream is an activity defined elsewhere\n        return await workflow.execute_activity(\n            call_llm_and_stream,\n            args=[prompt, self._stream],\n            start_to_close_timeout=timedelta(minutes=5),\n        )\n\n# Client: subscribe and receive events as they arrive\n# (`client` is an already-connected Temporal client; run this inside an async function)\nfrom temporalio.contrib.workflow_streams import WorkflowStreamClient\n\nasync with WorkflowStreamClient(client, workflow_id=\"agent-123\") as sub:\n    async for event in sub.events(topic=\"tokens\"):\n        print(event.data, end=\"\", flush=True)\n```\n\nThe client resumes from its last-seen offset automatically. 
If it disconnects and reconnects, it picks up exactly where it left off, with no tokens dropped.\n\n## The Decisive Advantage: Offset-Based Resumption\n\nThis is where Workflow Streams beats plain SSE and WebSocket for long-running agent scenarios. Both give you real-time output, but neither survives failures without a separate state store:\n\n| | SSE | WebSocket | Temporal Workflow Streams |\n|---|---|---|---|\n| Survives server crash | No | No | Yes |\n| Offset-based resumption | No | Requires Redis | Built-in |\n| Bidirectional | No | Yes | Yes (via Signals) |\n| Observability | DIY | DIY | Built into Temporal UI |\n| Latency | Very low | Very low | ~100ms (tunable) |\n\nFor agents that run for more than a few seconds (LLM chains, multi-step coding agents, data pipelines), crash recovery matters. SSE is the right tool for a 2-second response. It is not the right tool for a 10-minute agentic run.\n\n## Latency, Tuning, and History Cost\n\nThe default configuration targets a slow-moving UI, not real-time token streaming. The key parameter to tune is `batch_interval`, which defaults to 2 seconds:\n\n```python\n# Lower batch_interval from the default 2s for token streaming\nstream = WorkflowStream(self, batch_interval=timedelta(milliseconds=100))\n```\n\nExpected round-trip after tuning: roughly 100ms. That is fine for the typical AI agent UI, but not for voice or sub-50ms interactive scenarios.\n\nOne trade-off to understand: each published batch is one Signal, and each subscriber poll is one Update. Both accumulate against Temporal’s per-run history limit. For agents that run for hours, plan for Continue-As-New from the start; Workflow Streams carries the essential log offset across the boundary automatically.\n\n## Where This Fits in the Temporal AI Stack\n\nWorkflow Streams is the piece that completes Temporal’s answer to production AI agents. 
Combined with the other Replay 2026 announcements — [Serverless Workers on Lambda](https://temporal.io/blog/replay-2026-product-announcements) and Standalone Activities for durable background jobs — and first-class integrations with [OpenAI Agents SDK](https://temporal.io/blog/announcing-openai-agents-sdk-integration) (GA since March 2026), [Pydantic AI](https://temporal.io/blog/build-durable-ai-agents-pydantic-ai-and-temporal), Vercel AI SDK, and Google ADK, the stack now covers the full agent lifecycle: durable orchestration, serverless compute, background jobs, and real-time streaming output.\n\nIf you are building AI agents that run longer than a few seconds and need to show progress to users, Workflow Streams is worth evaluating now. Full documentation is at [docs.temporal.io/develop/python/workflows/workflow-streams](https://docs.temporal.io/develop/python/workflows/workflow-streams). Public Preview means it is production-ready, with the standard caveat that the API may still change before GA.", "url": "https://wpnews.pro/news/temporal-workflow-streams-stream-ai-agent-output-in-real-time", "canonical_source": "https://byteiota.com/temporal-workflow-streams-stream-ai-agent-output-in-real-time/", "published_at": "2026-06-16 23:17:23+00:00", "updated_at": "2026-06-16 23:30:14.275715+00:00", "lang": "en", "topics": ["ai-agents", "developer-tools", "ai-infrastructure"], "entities": ["Temporal Technologies", "Temporal Workflow Streams", "Replay 2026", "Temporal Python SDK", "Redis", "SSE", "WebSocket"], "alternates": {"html": "https://wpnews.pro/news/temporal-workflow-streams-stream-ai-agent-output-in-real-time", "markdown": "https://wpnews.pro/news/temporal-workflow-streams-stream-ai-agent-output-in-real-time.md", "text": "https://wpnews.pro/news/temporal-workflow-streams-stream-ai-agent-output-in-real-time.txt", "jsonld": "https://wpnews.pro/news/temporal-workflow-streams-stream-ai-agent-output-in-real-time.jsonld"}}