# Building a Streaming AI Chat App with Next.js 16 + Claude API — Complete App Router Guide

> Source: <https://dev.to/jangwook_kim_e31e7291ad98/building-a-streaming-ai-chat-app-with-nextjs-16-claude-api-complete-app-router-guide-594n>
> Published: 2026-05-20 06:40:22+00:00

Search for "Next.js AI chat" in 2026 and Vercel AI SDK still comes up as the de facto standard. Nothing wrong with it, but relying on the SDK means you often don't understand what's happening under the hood — how streaming actually works, what the Route Handler is doing behind the scenes.

I built this from scratch in a sandbox using only the Anthropic SDK. The entire flow: `create-next-app`, adding `@anthropic-ai/sdk`, implementing the Route Handler. It turned out simpler than I expected. About 50 lines gets you a working streaming chat backend that you can deploy.

One thing I noticed while doing this: `create-next-app@latest` now installs **Next.js 16**. Most tutorials you'll find are still targeting Next.js 14 or 15. This post reflects what you actually get in May 2026.

## What We're Building

The app structure:

- **Next.js 16.2.6**+ App Router
- Route Handler (`/api/chat`) calling Claude API server-side
- SSE (Server-Sent Events) delivering streaming responses to the client
- React 19 `"use client"` component rendering the stream in real time

The key point: **the API key is read only on the server and never included in the client bundle.** This is a direct consequence of how Next.js App Router separates server and client code.

Actual build output from the sandbox:

```
▲ Next.js 16.2.6 (Turbopack)
✓ Compiled successfully in 1874ms

Route (app)
┌ ○ /           (Static)  prerendered as static content
└ ƒ /api/chat   (Dynamic) server-rendered on demand
```

## Project Structure

When finished, the structure looks like this. Two files are the core; the rest is generated by `create-next-app`.

```
nextjs-claude-chat/
├── src/
│   └── app/
│       ├── api/
│       │   └── chat/
│       │       └── route.ts    ← Claude API streaming endpoint (core)
│       ├── page.tsx             ← Chat UI (core)
│       ├── layout.tsx           ← Auto-generated
│       └── globals.css          ← Auto-generated
├── .env.local                   ← ANTHROPIC_API_KEY goes here
├── package.json
└── tsconfig.json
```

Two files. `route.ts` is server code; `page.tsx` is client code. `api/chat/route.ts` ends up in the server bundle only, while `page.tsx` with its `"use client"` directive goes to the client bundle. This separation is what makes API key security work.

## Prerequisites

- Node.js 20.9+ (Next.js 16 no longer supports Node.js 18)
- Anthropic API key (`sk-ant-...`), which you can get at [console.anthropic.com](https://console.anthropic.com)
- Basic TypeScript knowledge
- Basic Next.js App Router understanding (you can follow along without it)

## Step 1: Create the Project and Install Dependencies

```
npx create-next-app@latest nextjs-claude-chat \
  --typescript \
  --tailwind \
  --eslint \
  --app \
  --src-dir \
  --import-alias "@/*"

cd nextjs-claude-chat
npm install @anthropic-ai/sdk
```

As of May 2026, `create-next-app@latest` installs **Next.js 16.2.6** with React 19.2.4. Existing tutorials using Next.js 14/15 may have minor differences.

Key dependencies after installation:

```
{
  "dependencies": {
    "@anthropic-ai/sdk": "^0.97.1",
    "next": "16.2.6",
    "react": "19.2.4",
    "react-dom": "19.2.4"
  }
}
```

Anthropic SDK 0.97.x is the current latest. Earlier versions (0.20.x and below) had a different `messages.stream()` API, so pin your version if you're migrating.

## Step 2: Implement the Claude API Route Handler

The core file. Create `src/app/api/chat/route.ts`:

``` ts
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

export async function POST(req: Request) {
  const { messages } = await req.json();

  const stream = await client.messages.stream({
    model: "claude-opus-4-7",
    max_tokens: 1024,
    messages,
  });

  const encoder = new TextEncoder();
  const readable = new ReadableStream({
    async start(controller) {
      for await (const chunk of stream) {
        if (
          chunk.type === "content_block_delta" &&
          chunk.delta.type === "text_delta"
        ) {
          controller.enqueue(
            encoder.encode(
              `data: ${JSON.stringify({ text: chunk.delta.text })}\n\n`
            )
          );
        }
      }
      controller.enqueue(encoder.encode("data: [DONE]\n\n"));
      controller.close();
    },
  });

  return new Response(readable, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
      Connection: "keep-alive",
    },
  });
}
```

Two things worth noting here.

**First, `client.messages.stream()` returns an async-iterable stream of events** (the SDK's `MessageStream`). The `for await...of` loop receives chunks one at a time and pushes them to the client. When the stream ends, a `[DONE]` signal is sent and the controller closes.

**Second, ReadableStream + TextEncoder is Web Streams API standard.** Next.js Route Handlers use Web Streams, not the Node.js `stream` module. This is why the code looks different from [FastAPI streaming](https://dev.to/en/blog/en/fastapi-claude-api-streaming-production-guide-2026) or Express implementations. `new ReadableStream` may feel unfamiliar, but it's the standard across modern JavaScript runtimes: Cloudflare Workers, Deno, and Bun all work the same way.

The filter on `content_block_delta` events: Anthropic's streaming protocol emits multiple event types (`message_start`, `content_block_start`, `content_block_delta`, `content_block_stop`, `message_delta`, `message_stop`). Only `text_delta` typed `content_block_delta` events carry actual text content.
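
To make the filtering rule concrete, here is a minimal sketch that pulls it out of the Route Handler into a pure function. The `StreamEvent` type is a simplified stand-in I made up for illustration, not the SDK's own type definition, and `toSseFrame` is a hypothetical helper name.

```typescript
// Simplified stand-in for the SDK's event union (assumption, for illustration only).
type StreamEvent = {
  type: string;
  delta?: { type: string; text?: string };
};

// Returns an SSE frame for text deltas and null for every other event type.
function toSseFrame(event: StreamEvent): string | null {
  const delta = event.delta;
  if (event.type !== "content_block_delta" || !delta) return null;
  if (delta.type !== "text_delta" || typeof delta.text !== "string") return null;
  return `data: ${JSON.stringify({ text: delta.text })}\n\n`;
}

// A text delta becomes a frame; lifecycle events are skipped.
const sampleEvents: StreamEvent[] = [
  { type: "message_start" },
  { type: "content_block_delta", delta: { type: "text_delta", text: "Hi" } },
  { type: "message_stop" },
];
const frames = sampleEvents
  .map(toSseFrame)
  .filter((frame): frame is string => frame !== null);
console.log(frames.length); // 1
```

Keeping the mapping pure like this makes it easy to unit test without calling the API.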

## Step 3: Environment Variables and Security

Create `.env.local` in the project root (same level as `package.json`):

```
ANTHROPIC_API_KEY=sk-ant-your-actual-key-here
```

**Never use the NEXT_PUBLIC_ prefix.** This is core to Next.js security.

| Variable format | Accessible from | Use for |
|---|---|---|
| `ANTHROPIC_API_KEY` | Server only (Route Handler, Server Component) | ✓ API keys |
| `NEXT_PUBLIC_API_KEY` | Client-public (included in browser bundle) | ✗ Never use for API keys |

`NEXT_PUBLIC_` variables get inlined into the JavaScript bundle at build time. Anyone can see them in browser DevTools. Without the prefix, the variable is server-only: Next.js does not inline it into client code, so a reference to `process.env.ANTHROPIC_API_KEY` from a client component evaluates to `undefined` instead of leaking the key.
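
A missing key otherwise only surfaces as a failed API call, so a small guard can help. The sketch below assumes a hypothetical helper, `requireServerEnv`, that is not part of Next.js or the Anthropic SDK. It takes the environment object as a parameter so it can be tested without touching `process.env`.

```typescript
// Hypothetical helper: read a server-only variable and fail fast if it is unusable.
function requireServerEnv(
  name: string,
  env: Record<string, string | undefined>
): string {
  // Refuse names that Next.js would inline into the browser bundle.
  if (name.startsWith("NEXT_PUBLIC_")) {
    throw new Error(`${name} would be exposed to the browser; use an unprefixed name`);
  }
  const value = env[name];
  if (!value) {
    throw new Error(`Missing environment variable: ${name}`);
  }
  return value;
}

// In route.ts this would be called as:
//   const apiKey = requireServerEnv("ANTHROPIC_API_KEY", process.env);
```

Failing at startup with a clear message is usually easier to debug than an authentication error halfway through a request.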

## Step 4: Client Chat UI

Create `src/app/page.tsx` with streaming state management:

``` tsx
"use client";

import { useState, useRef, useEffect } from "react";

type Message = {
  role: "user" | "assistant";
  content: string;
};

export default function ChatPage() {
  const [messages, setMessages] = useState<Message[]>([]);
  const [input, setInput] = useState("");
  const [isLoading, setIsLoading] = useState(false);
  const bottomRef = useRef<HTMLDivElement>(null);

  useEffect(() => {
    bottomRef.current?.scrollIntoView({ behavior: "smooth" });
  }, [messages]);

  const sendMessage = async () => {
    if (!input.trim() || isLoading) return;

    const userMessage: Message = { role: "user", content: input };
    const updatedMessages = [...messages, userMessage];
    setMessages(updatedMessages);
    setInput("");
    setIsLoading(true);

    // Add placeholder for the assistant response
    const assistantMessage: Message = { role: "assistant", content: "" };
    setMessages([...updatedMessages, assistantMessage]);

    const res = await fetch("/api/chat", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ messages: updatedMessages }),
    });

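    if (!res.ok || !res.body) {
      // Reset the loading flag so the input is not stuck disabled
      setIsLoading(false);
      return;
    }

    const reader = res.body.getReader();
    const decoder = new TextDecoder();
    let buffer = "";

    while (true) {
      const { done, value } = await reader.read();
      if (done) break;

      // stream: true keeps multi-byte characters intact across chunks;
      // the buffer holds any partial line until its newline arrives
      buffer += decoder.decode(value, { stream: true });
      const lines = buffer.split("\n");
      buffer = lines.pop() ?? "";

      for (const line of lines) {
        if (line.startsWith("data: ") && line !== "data: [DONE]") {
          const data = JSON.parse(line.slice(6));
          // Skip payloads without text (e.g. error events)
          if (typeof data.text !== "string") continue;
          setMessages((prev) => {
            const last = prev[prev.length - 1];
            return [
              ...prev.slice(0, -1),
              { ...last, content: last.content + data.text },
            ];
          });
        }
      }
    }

    setIsLoading(false);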
  };

  return (
    <main className="flex flex-col h-screen max-w-2xl mx-auto p-4">
      <h1 className="text-2xl font-bold mb-4">Claude Chat</h1>
      <div className="flex-1 overflow-y-auto space-y-4 mb-4">
        {messages.map((msg, i) => (
          <div key={i} className={`p-3 rounded-lg ${
            msg.role === "user"
              ? "bg-blue-100 ml-auto max-w-xs"
              : "bg-gray-100 mr-auto max-w-md"
          }`}>
            <span className="text-xs text-gray-500 block mb-1">
              {msg.role === "user" ? "You" : "Claude"}
            </span>
            <p className="whitespace-pre-wrap">{msg.content}</p>
          </div>
        ))}
        <div ref={bottomRef} />
      </div>
      <div className="flex gap-2">
        <input
          type="text"
          value={input}
          onChange={(e) => setInput(e.target.value)}
          onKeyDown={(e) => e.key === "Enter" && sendMessage()}
          placeholder="Type a message..."
          className="flex-1 border rounded-lg px-3 py-2 focus:outline-none focus:ring-2 focus:ring-blue-400"
        />
        <button
          onClick={sendMessage}
          disabled={isLoading || !input.trim()}
          className="bg-blue-500 text-white px-4 py-2 rounded-lg disabled:opacity-50"
        >
          Send
        </button>
      </div>
    </main>
  );
}
```

Two implementation details to highlight. First, the `line !== "data: [DONE]"` check: without it, the loop tries to parse `[DONE]` as JSON and throws an error. Second, the functional `setMessages((prev) => ...)` update: inside an async loop, closures capture stale state. Using `prev` ensures you're always appending to the latest message content.
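
The functional update is easier to reason about when pulled out as a pure function. This is a sketch only: `appendToLast` is a hypothetical name, and the `Message` type is repeated here so the snippet stands alone.

```typescript
type Message = { role: "user" | "assistant"; content: string };

// Returns a new array with `text` appended to the last message.
// The input array and its objects are left untouched.
function appendToLast(prev: Message[], text: string): Message[] {
  if (prev.length === 0) return prev;
  const last = prev[prev.length - 1];
  return [...prev.slice(0, -1), { ...last, content: last.content + text }];
}

// Two deltas applied in order build up the assistant message.
const initial: Message[] = [
  { role: "user", content: "Hi" },
  { role: "assistant", content: "" },
];
const afterFirst = appendToLast(initial, "Hello");
const afterSecond = appendToLast(afterFirst, ", World");
console.log(afterSecond[1].content); // "Hello, World"
```

In the component this would be used as `setMessages((prev) => appendToLast(prev, data.text))`, so each delta is applied to whatever state React holds at that moment.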

## Step 5: Build and Run

```
npm run dev
# ▲ Next.js 16.2.6 (Turbopack)
# ✓ Ready in 337ms
# Local: http://localhost:3000

npm run build
# ✓ Compiled successfully in 1874ms
# ƒ /api/chat  (Dynamic)
```

337ms dev server startup is noticeably faster than Webpack-based builds. The production build also runs TypeScript checks automatically — type errors fail the build, which is the right behavior for a typed codebase.

## Limitations of This Implementation

I'll be straight: don't ship this to production as-is.

**No error handling.** When the Claude API fails (rate limit, network error, invalid key), the stream just drops. The user sees nothing. Real services need `try/catch` blocks and error SSE events:

``` ts
// Route Handler with error handling
const encoder = new TextEncoder();

export async function POST(req: Request) {
  try {
    const { messages } = await req.json();
    const stream = await client.messages.stream({ /* ... */ });
    const readable = new ReadableStream({
      async start(controller) {
        try {
          for await (const chunk of stream) { /* ... */ }
        } catch (streamError) {
          // Log server-side; send only a generic message to the client
          console.error(streamError);
          controller.enqueue(
            encoder.encode(`data: ${JSON.stringify({ error: "Stream error" })}\n\n`)
          );
        } finally {
          controller.enqueue(encoder.encode("data: [DONE]\n\n"));
          controller.close();
        }
      },
    });
    return new Response(readable, { /* headers */ });
  } catch (err) {
    // Reached when the request body is not valid JSON or setup fails
    console.error(err);
    return new Response(JSON.stringify({ error: "Request failed" }), {
      status: 500,
      headers: { "Content-Type": "application/json" },
    });
  }
}
```

**No conversation length limit.** The full message history goes to the API on every request. Long conversations will eventually exceed the context window or drive up costs. Production apps need to trim to the last N messages or manage token counts.
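
One simple way to do the "last N messages" trim is sketched below. `trimHistory` is a hypothetical helper, and it counts messages rather than tokens, so it is only a rough cap. It also drops any leading assistant turn, on the assumption that the API expects the conversation to open with a user message.

```typescript
type Message = { role: "user" | "assistant"; content: string };

// Keep at most `maxMessages` recent messages, starting from a user turn.
function trimHistory(messages: Message[], maxMessages: number): Message[] {
  // slice(-0) would return the whole array, so handle non-positive limits explicitly.
  const recent = maxMessages > 0 ? messages.slice(-maxMessages) : [];
  const firstUser = recent.findIndex((m) => m.role === "user");
  return firstUser === -1 ? [] : recent.slice(firstUser);
}

const history: Message[] = [
  { role: "user", content: "u1" },
  { role: "assistant", content: "a1" },
  { role: "user", content: "u2" },
  { role: "assistant", content: "a2" },
  { role: "user", content: "u3" },
];

// The last 4 messages start with "a1", which is dropped, leaving u2, a2, u3.
const trimmed = trimHistory(history, 4);
console.log(trimmed.map((m) => m.content)); // ["u2", "a2", "u3"]
```

The Route Handler could call this on `messages` before passing them to the SDK. A token-based budget would be more precise but needs a token counter.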

**No concurrent request management.** Rapid messages or multiple tabs cause streaming collisions. AbortController logic for canceling previous requests is missing.
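
A minimal sketch of that missing AbortController logic follows. `createLatestOnly` is a hypothetical helper: each call to `next()` aborts the previous request and returns a fresh signal to pass to `fetch`.

```typescript
// Tracks a single in-flight request; starting a new one cancels the old one.
function createLatestOnly() {
  let current: AbortController | null = null;
  return {
    // Abort whatever is in flight and hand back a fresh signal.
    next(): AbortSignal {
      current?.abort();
      current = new AbortController();
      return current.signal;
    },
    // Cancel the in-flight request, e.g. from a Stop button.
    cancel(): void {
      current?.abort();
      current = null;
    },
  };
}

// Intended use inside sendMessage (not executed here):
//   const signal = requests.next();
//   const res = await fetch("/api/chat", { method: "POST", signal, body });
const requests = createLatestOnly();
const firstSignal = requests.next();
const secondSignal = requests.next(); // aborts the first request
console.log(firstSignal.aborted, secondSignal.aborted); // true false
```

When a request is aborted, `fetch` and `reader.read()` reject with an `AbortError`, so the calling code needs a `try/catch` that treats that case as a normal cancellation. This covers a single tab only; coordinating across tabs is a separate problem.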

## Deploying to Vercel

A few things I hit when deploying:

**Environment variables**: Add `ANTHROPIC_API_KEY` in Vercel's Project Settings → Environment Variables. The `.env.local` file stays local.

**Runtime**: Explicitly set Node.js Runtime in your Route Handler to avoid Edge Runtime compatibility issues:

``` js
export const runtime = 'nodejs';
```

**Function timeout**: Vercel Hobby plan has a 10-second limit. For longer responses, add this to `vercel.json`:

```
{
  "functions": {
    "src/app/api/chat/route.ts": {
      "maxDuration": 60
    }
  }
}
```
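
As an alternative to `vercel.json`, Next.js also accepts a `maxDuration` route segment config exported from the route file itself. The snippet below is a sketch of how that might look in `src/app/api/chat/route.ts`. Whether 60 seconds is actually honored depends on the limits of your Vercel plan, which I have not verified here.

```typescript
// src/app/api/chat/route.ts (top of file, next to the POST handler)

// Run on the Node.js runtime rather than Edge.
export const runtime = "nodejs";

// Maximum execution time in seconds; the hosting plan's ceiling still applies.
export const maxDuration = 60;
```

Keeping the setting next to the handler means it travels with the route if the file is moved or renamed.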

## How SSE Works Under the Hood

Server-Sent Events is a one-way streaming protocol running over plain HTTP. Unlike WebSockets, it passes through proxies, CDNs, and firewalls with no special handling.

SSE message format:

```
data: {"text": "Hello"}\n\n
data: {"text": ", World"}\n\n
data: [DONE]\n\n
```

Each message starts with `data:` and ends with two newlines. The `TextEncoder`/`TextDecoder` pair converts between strings and `Uint8Array` (bytes), because the Web Streams API operates at the byte level. This same pattern works across Next.js, Cloudflare Workers, Deno, and Bun.
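
Network chunks do not necessarily line up with message boundaries, so a client usually has to buffer until it sees the blank line that ends a message. Here is a minimal sketch of that idea. `createSseParser` is a hypothetical helper that handles only the `data: ` field with `\n` line endings, which is all this app emits; the full SSE format has more fields.

```typescript
// Buffers incoming text and calls onData once per complete `data:` line.
function createSseParser(onData: (payload: string) => void) {
  let buffer = "";
  return {
    push(chunk: string): void {
      buffer += chunk;
      // A blank line (two newlines in a row) terminates one message.
      let boundary = buffer.indexOf("\n\n");
      while (boundary !== -1) {
        const rawEvent = buffer.slice(0, boundary);
        buffer = buffer.slice(boundary + 2);
        for (const line of rawEvent.split("\n")) {
          if (line.startsWith("data: ")) onData(line.slice(6));
        }
        boundary = buffer.indexOf("\n\n");
      }
    },
  };
}

// Feed the parser chunks that split a message in awkward places.
const received: string[] = [];
const parser = createSseParser((payload) => received.push(payload));
parser.push('data: {"te');
parser.push('xt":"Hello"}\n\ndata: [DONE]\n');
parser.push("\n");
console.log(received); // ['{"text":"Hello"}', '[DONE]']
```

The caller decides what to do with each payload, for example checking for `[DONE]` before calling `JSON.parse`.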

## Raw API vs. Vercel AI SDK

| Aspect | Raw Anthropic SDK | Vercel AI SDK |
|---|---|---|
| Code volume | More (~50-line Route Handler) | Less (`useChat` one-liner) |
| Customization | Completely free | Within SDK abstractions |
| Debuggability | SSE flow is transparent | Internal logic is opaque |
| Learning value | Forces you to understand Web Streams and SSE | Use immediately |
| Best for | Understanding streaming mechanics | Rapid prototyping |

My recommendation: build it the raw way once, then use the SDK. You'll understand what the SDK is actually handling for you. See [Building a Claude Streaming Agent with Vercel AI SDK](https://dev.to/en/blog/en/vercel-ai-sdk-claude-streaming-agent-2026) for the SDK-based comparison.

## Next Steps

- **Add Tool Use**: Give Claude function-calling ability → [Claude Agent SDK Complete Guide](https://dev.to/en/blog/en/claude-agent-sdk-tool-use-complete-guide-2026)
- **Prompt Caching**: Cut API costs up to 90% → [Claude API Prompt Caching in Practice](https://dev.to/en/blog/en/claude-api-prompt-caching-cost-optimization-guide)
- **Stronger error handling**: AbortController, retry logic, error SSE events
- **Stream cancellation**: Cancel button to stop generation mid-stream
- **Vercel deployment**: Apply the notes above and go live

A deeper guide covering those production gaps is coming in a follow-up post.
