How to Build an AI Chat Endpoint in Node.js with the Telnyx AI Assistants API

Telnyx published a code example showing how to build a production-ready AI chat endpoint in Node.js using the Telnyx AI Assistants API. The 80-line Express app exposes a /chat endpoint that sends messages to a configured AI assistant and returns structured JSON responses, with proper error mapping and input validation. The example demonstrates Telnyx's AI Communications Infrastructure pattern, where voice, messaging, and AI are unified under a single SDK.

Most "add an AI assistant to my app" tutorials stop at the demo. They show you how to call an LLM and print the response, then leave the production plumbing (error mapping, input validation, retry handling, observability) as an exercise for the reader. This example takes the opposite path: a small Express app that does one thing well, exposes it as a clean HTTP endpoint, and maps Telnyx SDK errors to the right HTTP status codes from the first request.

The full code is in the telnyx-code-examples repository under chat-with-ai-assistant-nodejs: https://github.com/team-telnyx/telnyx-code-examples/tree/main/chat-with-ai-assistant-nodejs. It is roughly 80 lines of JavaScript including imports, comments, and a /health check.

The app exposes one chat endpoint and one health endpoint:

- POST /chat: sends a message to a Telnyx AI Assistant and returns its response
- GET /health: liveness check

The AI Assistant is selected by the AI_ASSISTANT_ID environment variable, not the request body. This is intentional: the assistant configuration (model, system prompt, tools, knowledge base) lives in the Telnyx Portal, and every request to this endpoint routes to that one assistant. If you want per-tenant assistants, the change is a one-line lookup.

The response is plain JSON:

    {
      "assistant_id": "assistant-1234abcd",
      "user_message": "What are your business hours?",
      "assistant_response": "We are open Monday to Friday, 9am to 5pm.",
      "timestamp": "2026-06-18T14:32:00.000Z"
    }

There is no streaming, no conversation history, and no client-side state. Every request is self-contained.

    POST /chat  { "message": "..." }
               │
               ▼
    ┌──────────────────────┐
    │  Express server.js   │
    │  chatWithAssistant() │
    └──────────┬───────────┘
               │  client.ai.assistants.chat(assistantId, { messages })
               ▼
    ┌──────────────────────┐
    │ Telnyx AI Assistant  │
    └──────────┬───────────┘
               │
               └──► assistant_response (JSON)

The whole thing is one SDK call.
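The per-tenant lookup mentioned above could look something like the following. This is a minimal sketch, not code from the example repository: the table `ASSISTANT_BY_TENANT`, the helper `resolveAssistantId`, and the placeholder IDs are all hypothetical, and in practice the mapping would more likely come from a database or configuration service.

```javascript
// Hypothetical tenant-to-assistant table. The IDs are placeholders,
// not real Telnyx assistant IDs.
const ASSISTANT_BY_TENANT = new Map([
  ["acme", "assistant-acme-placeholder"],
  ["globex", "assistant-globex-placeholder"],
]);

// Resolve the assistant for a tenant, falling back to the single
// default ID (for example, the value of AI_ASSISTANT_ID).
function resolveAssistantId(tenantId, defaultId) {
  return ASSISTANT_BY_TENANT.get(tenantId) ?? defaultId;
}
```

In the route handler, the one-line change would then be something like `resolveAssistantId(req.get("X-Tenant-Id"), process.env.AI_ASSISTANT_ID)`, where the `X-Tenant-Id` header name is an assumption for illustration.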
The Telnyx AI Assistants service handles model selection, conversation state (if you ask for it), tool calling, knowledge base retrieval, and telephony integration (if you later wire the same assistant to a phone number). Your Node.js app stays focused on input validation, error mapping, and response shaping.

This is the AI Communications Infrastructure pattern in miniature: voice, messaging, and AI on one private network, exposed through one SDK. You do not have to stitch together a separate LLM provider, a vector database, a tool-calling framework, and a webhook gateway. Telnyx already runs them.

    git clone https://github.com/team-telnyx/telnyx-code-examples.git
    cd telnyx-code-examples/chat-with-ai-assistant-nodejs
    cp .env.example .env   # fill in TELNYX_API_KEY and AI_ASSISTANT_ID
    npm install
    node server.js         # starts on http://localhost:5000

You need two values in .env:

- TELNYX_API_KEY: your Telnyx API v2 key from the Telnyx Portal
- AI_ASSISTANT_ID: the ID of an AI Assistant you have already created (or use the Portal's no-code assistant builder to create one)

Test it with a single curl:

    curl -X POST http://localhost:5000/chat \
      -H "Content-Type: application/json" \
      -d '{"message": "What can you help with?"}'

You should get back a JSON object with assistant_response populated.

The interesting code is chatWithAssistant(assistantId, message).
It is small enough to read in one screen and deliberate about every line:

    async function chatWithAssistant(assistantId, message) {
      if (!assistantId) {
        throw new Error("AI_ASSISTANT_ID environment variable not set");
      }
      if (!message || message.trim().length === 0) {
        throw new Error("Message cannot be empty");
      }

      const response = await client.ai.assistants.chat(assistantId, {
        messages: [
          {
            role: "user",
            content: message,
          },
        ],
      });

      // Extract serializable data: SDK objects are NOT JSON-serializable
      return {
        assistant_id: assistantId,
        user_message: message,
        assistant_response: response.content,
        timestamp: new Date().toISOString(),
      };
    }

Points to notice:

- messages: [{ role: "user", content: "..." }] is the format every modern LLM SDK uses. If you ever migrate off Telnyx AI Assistants to direct chat completions, the request body does not change.
- Calling JSON.stringify(response) directly throws. Pull out only the fields you need (response.content) and return a plain object.

The route handler maps every Telnyx SDK error class to a meaningful HTTP status:

    if (error instanceof Telnyx.AuthenticationError) {
      return res.status(401).json({ error: "Invalid API key" });
    }
    if (error instanceof Telnyx.RateLimitError) {
      return res.status(429).json({ error: "Rate limit exceeded. Please slow down." });
    }
    if (error instanceof Telnyx.APIConnectionError) {
      return res.status(503).json({ error: "Network error connecting to Telnyx" });
    }
    if (error instanceof Telnyx.APIError) {
      return res.status(error.status || 500).json({
        error: error.message,
        status: error.status,
      });
    }

This matters because clients of your endpoint (frontend, mobile app, another service) need to know whether to retry, refresh credentials, back off, or give up. A flat "500 Internal Server Error" forces them to guess. A precise 401 tells them to re-authenticate. A 429 tells them to slow down.
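To illustrate why precise status codes help callers, here is a minimal sketch of how a client of this endpoint might decide what to do next. It is not part of the example repository: `nextAction`, `backoffDelayMs`, the action names, and the backoff constants are all hypothetical choices.

```javascript
// Hypothetical client-side helper: map the endpoint's HTTP status
// to a next step. The action names are illustrative only.
function nextAction(status) {
  if (status >= 200 && status < 300) return "ok";
  if (status === 401) return "reauthenticate";
  // 429 (rate limited) and 503 (network error upstream) are worth retrying.
  if (status === 429 || status === 503) return "retry_with_backoff";
  // Any other 4xx is a problem with the request itself; retrying will not help.
  if (status >= 400 && status < 500) return "fix_request";
  return "give_up";
}

// Exponential backoff with a cap. `attempt` is zero-based, so the
// delays are baseMs, 2 * baseMs, 4 * baseMs, ... up to capMs.
function backoffDelayMs(attempt, baseMs = 250, capMs = 8000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```

With a flat 500 for everything, a helper like this would have nothing to branch on, which is the guessing problem described above.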
The handler also covers two non-Telnyx error paths:

- error.message.includes("environment variable") → 500 with the literal message
- anything else → 400 with the error message

The validation errors from chatWithAssistant (missing assistant ID, empty message) flow through this last branch. Returning them as 400 instead of 500 is the right call: these are client mistakes, not server failures.

Most AI chat wrappers become complicated because each capability lives in a different service. You manage conversation history in one database, call a model provider for completions, wire a vector store for retrieval, and bolt on a tool-calling framework to handle function calls. Each integration adds latency, failure modes, and credentials to manage.

A Telnyx AI Assistant bundles those pieces. The Assistant you configure in the Portal, or programmatically through create-ai-assistant-nodejs, already has a model, an optional knowledge base, optional tools, and an optional system prompt attached. When you call client.ai.assistants.chat(assistantId, { messages }), you are talking to that entire stack through one SDK call.

This means the Node.js app can stay small. It validates input, calls the SDK, maps errors, and returns JSON. There is no SDK-of-SDKs to keep in sync.

The example is intentionally minimal. Before you ship it to production, consider:

Authentication on the Express side. Right now anyone who can reach /chat can talk to your assistant. Add a JWT check, an API key header, or a session lookup depending on who calls this endpoint. The assistant itself does not authenticate callers; it trusts whatever your code sends it.

Rate limiting. Telnyx enforces per-account rate limits and returns 429 when you cross them. Add per-IP or per-user rate limiting in front of this endpoint if you expose it to untrusted clients, otherwise one bad actor can exhaust your quota.

Conversation history. The example is stateless: every request is a single message with no prior context.
If you want a multi-turn chat, pass the full message array, including the assistant-role turns you have stored, on each request. The Assistant will use that as the conversation context.

Observability. Log the assistant ID, message length, response length, latency, and any error class. Those five fields cover most of what you need to debug a chat endpoint in production.

Streaming. The example returns the full response in one shot. Telnyx AI Assistants support streaming responses for lower time-to-first-token: wrap the SDK call in a stream and pipe chunks to the client. This is useful for chat UIs where perceived latency matters more than total latency.

Keep-alive on the SDK client. The example creates the Telnyx client once at module load and reuses it. Do not re-instantiate it per request; that defeats HTTP keep-alive and adds tens of milliseconds of TLS overhead per call.

The full example is open source: https://github.com/team-telnyx/telnyx-code-examples/tree/main/chat-with-ai-assistant-nodejs

Telnyx is an AI Communications Infrastructure platform: voice, messaging, SIP, AI, and IoT on one private, global network. AI Assistants run on that same network, so you can pair conversational AI with telephony and messaging through a single API and SDK instead of stitching together multiple vendors.
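As a footnote to the conversation-history consideration above, here is a minimal sketch of assembling the messages array for a multi-turn request. It assumes you store prior turns yourself as { role, content } objects. The helper name `buildMessages` and the `maxTurns` cap are hypothetical, and how much history the assistant actually accepts is not covered by the example.

```javascript
// Hypothetical helper: build the messages array for a multi-turn request.
// `history` is an array of { role, content } turns stored by your app.
function buildMessages(history, userMessage, maxTurns = 20) {
  // Same empty-message rule as the single-turn example, applied after trimming.
  const text = typeof userMessage === "string" ? userMessage.trim() : "";
  if (text.length === 0) {
    throw new Error("Message cannot be empty");
  }

  // Keep only well-formed user/assistant turns.
  const valid = history.filter(
    (turn) =>
      (turn.role === "user" || turn.role === "assistant") &&
      typeof turn.content === "string"
  );

  // Keep the most recent maxTurns turns so the payload stays bounded.
  const recent = maxTurns > 0 ? valid.slice(-maxTurns) : [];

  // Copy only role and content, then append the new user message last.
  return [
    ...recent.map((turn) => ({ role: turn.role, content: turn.content })),
    { role: "user", content: text },
  ];
}
```

The returned array would take the place of the single-element messages array in the SDK call, assuming the chat call accepts multiple turns as the text describes.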