Messaging in the Age of AI

A developer has identified that AI agent systems fundamentally break the assumptions of traditional messaging infrastructure, requiring a shift from delivering reliable payloads to managing reasoning context at scale. Messages from AI agents can carry 100K-token context windows, generate bursty traffic patterns uncorrelated with user activity, and lose inherent idempotency as the same logical input can produce different reasoning paths on each retry. The analysis, based on production experience with Spring Boot and Apache Kafka, catalogs new workloads including planning outputs, tool-call results, chain-of-thought traces, and multi-agent consensus broadcasts that stress messaging systems in ways traditional deterministic services never required.

Messaging infrastructure has been boring for a decade. Queues, topics, exchanges: the primitives settled. Then AI agents arrived, and suddenly the assumptions that made messaging boring stopped holding.

Messages are no longer just data. They are context. An agent will read your message, reason over it, call tools because of it, and generate responses whose token count you cannot predict at enqueue time. The transport layer that worked fine for deterministic services needs to be rethought: not replaced, but adapted.

This article is not about which message broker to pick. It is about what changes when the producer and consumer are both potentially non-deterministic reasoning systems, and what patterns actually hold up in production. The examples use Spring Boot and Apache Kafka because that is a stack I have seen work at scale, but the patterns apply across stacks.

Traditional messaging carries structured, bounded payloads. An order-placed event has a known shape: order ID, customer ID, line items, total. A payment-confirmed event carries a transaction reference.
These messages are small (hundreds of bytes), predictable in volume, and idempotent by design: reprocess the same order event, get the same result.

AI-originated messages break all three assumptions. A single agent-to-agent message can carry a 100K-token context window, effectively a small novel's worth of reasoning state. Volume is bursty in ways that do not correlate with user activity: a multi-agent consensus round can generate 50 internal messages for a single user request. And idempotency is no longer free, because the same logical input can produce different reasoning paths on each retry.

Messaging for AI systems shifts from "deliver this payload reliably" to "manage reasoning context at scale." Reliability still matters, and it matters more than before, but it is joined by concerns that traditional messaging never had to address: token budgets, model latency variance, and reasoning trace integrity.

In the traditional model, each arrow is a bounded, schema-validated message. In the AI model, the arrow from Planner to Executor carries an entire reasoning state, and that arrow has a dollar cost measured in tokens. The messaging layer needs to know that.

Agents generate traffic patterns that look nothing like what your messaging infrastructure was designed for. It is worth cataloguing the new workloads explicitly, because each one stresses a different part of the system.

Planning outputs. Before an agent acts, it thinks, and the thinking produces structured output. A planner agent emits a plan object (goal, sub-goals, constraints, assigned agents) that downstream agents consume. These messages are medium-sized (2-8K tokens) and are the highest-leverage messages in the system: get the plan wrong, and everything downstream wastes tokens.

Tool-call results. When an agent invokes a tool (a database query, an API call, a code execution), the result enters the messaging fabric as a first-class message.
These are unpredictable in size (a SQL query can return one row or a million) and must be chunked, summarized, or rejected before they blow out a context window.

Chain-of-thought traces. Some architectures persist the agent's reasoning trace as it streams, not just for debugging but as context shared with other agents. A reasoning trace is verbose by design. Storing and forwarding it as a message requires treating it as a structured artifact, not a log line.

Multi-agent broadcast and consensus. Agents often need to reach agreement: which plan to execute, whether a tool-call result is valid, whether a response meets policy. These consensus rounds generate fan-out message bursts: one agent publishes a proposal, N agents respond with votes or critiques. The messaging layer sees N+1 messages where a traditional system would see one.

In practice, this means your messaging system needs to handle message sizes spanning five orders of magnitude (bytes to megabytes), traffic bursts that do not follow any daily or weekly pattern, and consumers that may take seconds or minutes to process a single message and may retry it aggressively if they are unsure of the result.

Across the agent systems I have observed in production at several teams, a set of patterns has crystallized. These are not speculative. They are what teams end up building after the first production incident.

Every message in an AI system must carry metadata beyond a correlation ID. The envelope should include the token count of the payload, the model that generated it, the trace ID, the sender type (human, agent, tool), and an idempotency key if the sender is an agent. The consumer uses this metadata to make routing, quota, and deduplication decisions without parsing the payload body. The companion project implements this as a Java record (see code/src/main/java/com/messaging/relay/model/MessageEnvelope.java):

public record MessageEnvelope