{"slug": "we-built-a-custom-transport-for-vercel-s-ai-sdk", "title": "We built a Custom Transport for Vercel's AI SDK", "summary": "Ably built a custom transport for the Vercel AI SDK that replaces the default HTTP-based streaming transport with Ably's realtime pub/sub platform. The new transport enables multi-device and multi-user chat, resumable streams, human handoff, history compaction, and barge-in interruption, all features the default SSE transport cannot support. This integration aims to solve open issues in the AI SDK, such as losing partial messages on stream errors and failing to resume streams mid-response.", "body_md": "Ably is a realtime messaging platform: a pub/sub product where you publish messages to channels, and clients subscribed to those channels receive those messages in realtime.\n\nIt turns out that the Ably realtime platform is really well suited to being the transport that sits between your AI models and the clients receiving the generated responses.\n\nWe're trying to meet developers where they currently are, and one of those places is the Vercel AI SDK. So we built a custom transport for the Vercel AI SDK that uses Ably as the transport layer. We want to expose all the features the Ably AI Transport supports to the AI SDK: multi-device, multi-user, resumable streams, human handoff, history compaction, barge-in and interruption, and more.\n\nThis post covers what we managed to support when building against the AI SDK. It was an exercise in trying to make a library do something it wasn't originally designed for.\n\n## AI SDK or AI UI SDK?\n\nThe Vercel AI SDK comes in two flavors: the AI SDK to run on the server and the AI UI SDK to run on the client. 
The UI SDK provides a bunch of React hooks and is where we'd focus most of our efforts.\n\nThe main React hook that you need to know about is `useChat(...)`:\n\n```\nimport { useChat } from '@ai-sdk/react';\n\n// `ablyChatTransport` is assumed to be a ChatTransport instance created elsewhere.\n// Only text parts carry a `text` field, so other part types render as empty strings.\nfunction Chat() {\n  const { messages, sendMessage, status } = useChat({\n    transport: ablyChatTransport,\n  });\n\n  return (\n    <div>\n      {messages.map((m) => (\n        <div key={m.id}>{m.role}: {m.parts.map((p) => (p.type === 'text' ? p.text : '')).join('')}</div>\n      ))}\n      <input onKeyDown={(e) => {\n        if (e.key === 'Enter') sendMessage({ text: e.currentTarget.value });\n      }} />\n    </div>\n  );\n}\n```\n\n`useChat` is the React hook that creates the chatbot interface you'd expect from an AI assistant. It provides a `messages` array that contains the messages in the conversation, and a `sendMessage` function that you can use to send a message to the LLM.\n\n## The default transport over SSE\n\nThe default transport for the UI SDK is based on HTTP. The client makes an HTTP POST request carrying the user prompt and the conversation history. The client holds the connection open, waiting for an SSE response from the server containing the response tokens.\n\nHTTP is an obvious choice when the SDK was built by a team from Vercel, a serverless app platform based predominantly on HTTP.\n\nHTTP streaming over SSE is a simple and common design but it falls down when you try to add more advanced features, because:\n\n- It's not multi-device. If you have the chat open on your phone and your laptop, only one of those devices will receive the response.\n- It's not multi-user. If you have multiple users chatting with the same bot, they won't see each other's messages or responses.\n- It's not really resumable. SSE has `lastEventId`, which technically supports resume, but that only works if your server stores the individual SSE events and can replay them on reconnect. Most don't in practice. And if the user refreshes the page, the connection is gone and there's no way to pick up where you left off.\n- Cancellation sucks. 
The HTTP SSE stream isn't bidirectional, so cancellation means closing the HTTP connection entirely. Even the SDK's own `stop()` function is broken. It fires the abort signal but returns immediately without waiting for the stream to terminate, so buffered chunks keep arriving after you've supposedly stopped. There's also an open issue where `stop()` returns, the UI status stays `streaming`, and the server keeps generating tokens until completion. No barge-in or interruption support either.\n- There's no history; you need to build that separately.\n- There's no automatic compaction of tokens into full responses.\n\nThese are real problems that folks have encountered: the SDK has open issues for [losing partial messages on stream errors](https://github.com/vercel/ai/issues/7562) and [failing to resume streams mid-response](https://github.com/vercel/ai/issues/13160).\n\nThese are all features that are fully supported by the Ably AI Transport, but aren't easily supported in HTTP-based SSE responses.\n\nThe UI SDK exposes this transport using a `ChatTransport` interface, with the methods:\n\n- `sendMessages()` (send a prompt, return a stream of response chunks)\n- `reconnectToStream()` (resume after disconnect)\n\nImplement these and you can swap out the default HTTP transport for anything.\n\n## useChat assumes one request and one response\n\nThe biggest issue we had when building the custom transport was that `useChat` was designed around a single-request, single-response flow. It assumes that for every message you send, you get one response back. 
This is a problem because the Ably AI Transport is designed to support multiple responses for a single message and multiple users participating in a single conversation.\n\n`useChat`'s state machine expects a series of chunks in response to a single user prompt.\n\n```\nUser sends: \"What is pub/sub?\"\n\nuseChat reads these chunks from the stream returned by sendMessages():\n\n  { type: 'step-start' }\n  { type: 'text-start',  id: 'text-1' }\n  { type: 'text-delta',  id: 'text-1', delta: 'Pub/sub is ' }\n  { type: 'text-delta',  id: 'text-1', delta: 'a messaging pattern ' }\n  { type: 'text-delta',  id: 'text-1', delta: 'where publishers send...' }\n  { type: 'text-end',    id: 'text-1' }\n  { type: 'step-finish', finishReason: 'stop' }\n  { type: 'finish' }\n\nstatus:  ready → submitted → streaming → ready\n```\n\nEach chunk is either a control message like start or finish, or a content message like `text-delta` containing tokens from the LLM response.\n\nThe single-request, single-response assumption in `useChat` is obviously an issue if you want to support multiple users in the same conversation, because only one of those users has sent the prompt, but the prompt and response should be fanned out to all the users in the conversation.\n\nInternally, `useChat` tracks one `activeResponse` at a time. If two messages are sent concurrently, the second [overwrites the first](https://github.com/vercel/ai/issues/11693), the `onFinish` lifecycle hook fires once instead of twice, and you can end up [crashing on undefined state](https://github.com/vercel/ai/issues/11024). The community has [asked for multi-message streaming](https://github.com/vercel/ai/discussions/5139) but there's no support for it yet.\n\n## useChat's setMessages(...) 
backdoor\n\nSharing the conversation state between multiple users is easy over Ably channels, but updating that state in `useChat` is hard because of the single-request, single-response design.\n\nBut `useChat` has a secret weapon: a `setMessages(...)` function that you can use to set the messages state directly. This is a backdoor that allows you to bypass the state machine and set the conversation state to whatever you want.\n\nThis is what we ended up doing: we used `setMessages(...)` to set the message state directly, with the full conversation no matter which user sent the prompt. This allowed us to support multi-user conversations.\n\nThe problem with this approach is that `setMessages(...)` completely bypasses the state machine, which immediately breaks a lot of the built-in features of `useChat` like lifecycle hooks and tool-call notifications.\n\n## Building around the limitations\n\nSometimes you just have to do the best with what you've got, and what we've got is a square peg and a round hole. `useChat` was never designed to support the kinds of features we're trying to add to it. So we built around the limitations by tracking 'own-turns' (i.e. prompts submitted by this client, and the LLM response to that prompt) and 'observer-turns' (i.e. prompts submitted by other clients, and the response).\n\nOwn-turns can trigger the full lifecycle: they go through the regular `sendMessages(...)` flow in `useChat` and process lifecycle hooks and tool-calls as normal.\n\nObserver-turns are set directly with `setMessages(...)` and bypass the lifecycle hooks and tool-call notifications, but at least they show up in the conversation for all users. 
We also have to temporarily buffer observer-turns if there's currently an own-turn in progress, because the state machine doesn't support interleaving messages from multiple responses.\n\n## So what extra can you do with useChat and the Ably AI Transport?\n\nActually quite a lot. The Ably AI Transport can add these features to `useChat`:\n\n- Multi-device with automatic fan out.\n- Multi-user conversations, with more than one user submitting prompts and receiving responses in the same conversation.\n- Resumable streams: if you lose your connection you can reconnect and receive the rest of the response automatically.\n- Human handoff: a human can take over the conversation at any time and respond to the user prompts, because we already have multi-user support.\n- Interruptions, cancellation, barge-in: you can interrupt or steer the LLM conversation at any time by sending a new prompt, even if the previous response hasn't finished yet. This is possible because the Ably AI Transport uses channels, and channels are a bi-directional streaming layer.\n- History compaction: the tokens from LLM responses are automatically compacted together into a single message in the conversation history, so live clients receive tokens progressively in realtime, but new clients joining the conversation later receive the full response in one message.\n\nThere's a whole bunch more to the Ably AI Transport than we have talked about here. But even with the limitations of `useChat`, just changing from an HTTP transport to the Ably AI Transport unlocks a whole bunch of extra features like multi-device, multi-user, resumable streams, human handoff, interruptions, and history compaction.\n\nIf you don't want to be constrained by `useChat` at all, the Ably AI Transport SDK also provides `useClientTransport` and `useView`, React hooks that give you direct access to the transport and the conversation tree without going through `useChat`'s state machine. 
You still get the Vercel AI SDK's stream format, but you're not fighting the single-response assumptions. Check out our [AI Transport SDK](https://github.com/ably/ably-ai-transport-js) if you're interested.", "url": "https://wpnews.pro/news/we-built-a-custom-transport-for-vercel-s-ai-sdk", "canonical_source": "https://ably.com/blog/custom-transport-vercel-ai-sdk", "published_at": "2026-05-13 11:22:13+00:00", "updated_at": "2026-05-29 16:38:34.884982+00:00", "lang": "en", "topics": ["ai-tools", "ai-infrastructure", "ai-products", "generative-ai", "large-language-models"], "entities": ["Ably", "Vercel AI SDK", "Vercel", "AI SDK", "AI UI SDK"], "alternates": {"html": "https://wpnews.pro/news/we-built-a-custom-transport-for-vercel-s-ai-sdk", "markdown": "https://wpnews.pro/news/we-built-a-custom-transport-for-vercel-s-ai-sdk.md", "text": "https://wpnews.pro/news/we-built-a-custom-transport-for-vercel-s-ai-sdk.txt", "jsonld": "https://wpnews.pro/news/we-built-a-custom-transport-for-vercel-s-ai-sdk.jsonld"}}