Most "add an AI assistant to my app" tutorials stop at the demo. They show you how to call an LLM and print the response, then leave the production plumbing (error mapping, input validation, retry handling, observability) as an exercise for the reader. This example takes the opposite path: a small Express app that does one thing well, exposes it as a clean HTTP endpoint, and maps Telnyx SDK errors to the right HTTP status codes from the first request.
The full code is in the telnyx-code-examples repository under chat-with-ai-assistant-nodejs. It is roughly 80 lines of JavaScript including imports, comments, and a /health check.
The app exposes one chat endpoint and one health endpoint:
POST /chat: sends a message to a Telnyx AI Assistant and returns its response
GET /health: liveness check
The AI Assistant is selected by the AI_ASSISTANT_ID environment variable, not the request body. This is intentional: the assistant configuration (model, system prompt, tools, knowledge base) lives in the Telnyx Portal, and every request to this endpoint routes to that one assistant. If you want per-tenant assistants, the change is a one-line lookup.
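A minimal sketch of what that per-tenant lookup could look like. The tenant IDs, the ASSISTANTS_BY_TENANT table, and the resolveAssistantId helper are hypothetical and not part of the example repository; where the tenant ID comes from (a header, a JWT claim, a subdomain) is up to your app.

```javascript
// Hypothetical tenant-to-assistant table. In practice this might live in a
// database or config service. The IDs below are placeholders.
const ASSISTANTS_BY_TENANT = new Map([
  ["acme", "assistant-acme-0001"],
  ["globex", "assistant-globex-0002"],
]);

// Resolve the assistant for a tenant, falling back to the single
// AI_ASSISTANT_ID that the example already uses.
function resolveAssistantId(tenantId, fallbackId = process.env.AI_ASSISTANT_ID) {
  return ASSISTANTS_BY_TENANT.get(tenantId) ?? fallbackId;
}
```

In the route handler, the call would then become something like chatWithAssistant(resolveAssistantId(tenantId), message), with tenantId taken from wherever your app identifies the caller.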
The response is plain JSON:
{
  "assistant_id": "assistant-1234abcd",
  "user_message": "What are your business hours?",
  "assistant_response": "We are open Monday to Friday, 9am to 5pm.",
  "timestamp": "2026-06-18T14:32:00.000Z"
}
There is no streaming, no conversation history, and no client-side state. Every request is self-contained.
POST /chat { "message": "..." }
           │
           ▼
┌──────────────────────┐
│ Express (server.js)  │
│ chatWithAssistant()  │
└──────────┬───────────┘
           │ client.ai.assistants.chat(assistantId, {messages})
           ▼
┌──────────────────────┐
│ Telnyx AI Assistant  │
└──────────┬───────────┘
           │
           └──► assistant_response (JSON)
The whole thing is one SDK call. The Telnyx AI Assistants service handles model selection, conversation state (if you ask for it), tool calling, knowledge base retrieval, and telephony integration if you later wire the same assistant to a phone number. Your Node.js app stays focused on input validation, error mapping, and response shaping.
This is the AI Communications Infrastructure pattern in miniature: voice, messaging, and AI on one private network, exposed through one SDK. You do not have to stitch together a separate LLM provider, a vector database, a tool-calling framework, and a webhook gateway. Telnyx already runs them.
git clone https://github.com/team-telnyx/telnyx-code-examples.git
cd telnyx-code-examples/chat-with-ai-assistant-nodejs
cp .env.example .env # fill in TELNYX_API_KEY and AI_ASSISTANT_ID
npm install
node server.js # starts on http://localhost:5000
You need two values in .env:
TELNYX_API_KEY: your Telnyx API v2 key
AI_ASSISTANT_ID: the ID of an AI Assistant you have already created (or create one with the Portal's no-code assistant builder)
Test it with a single curl:
curl -X POST http://localhost:5000/chat \
-H "Content-Type: application/json" \
-d '{"message": "What can you help with?"}'
You should get back a JSON object with assistant_response populated.
The interesting code is chatWithAssistant(assistantId, message). It is small enough to read on one screen, and every line is deliberate:
async function chatWithAssistant(assistantId, message) {
  if (!assistantId) {
    throw new Error("AI_ASSISTANT_ID environment variable not set");
  }
  if (!message || message.trim().length === 0) {
    throw new Error("Message cannot be empty");
  }
  const response = await client.ai.assistants.chat(assistantId, {
    messages: [
      {
        role: "user",
        content: message,
      },
    ],
  });
  // Extract serializable data: SDK objects are NOT JSON-serializable
  return {
    assistant_id: assistantId,
    user_message: message,
    assistant_response: response.content,
    timestamp: new Date().toISOString(),
  };
}
Three things to notice:
messages: [{role: "user", content: "..."}] is the role/content message format that most chat-completion APIs use. If you ever migrate off Telnyx AI Assistants to direct chat completions, the messages array does not change.
Calling JSON.stringify(response) directly throws, because SDK response objects are not JSON-serializable. Pull out only the fields you need (response.content) and return a plain object.
The route handler maps every Telnyx SDK error class to a meaningful HTTP status:
if (error instanceof Telnyx.AuthenticationError) {
  return res.status(401).json({ error: "Invalid API key" });
}
if (error instanceof Telnyx.RateLimitError) {
  return res.status(429).json({ error: "Rate limit exceeded. Please slow down." });
}
if (error instanceof Telnyx.APIConnectionError) {
  return res.status(503).json({ error: "Network error connecting to Telnyx" });
}
if (error instanceof Telnyx.APIError) {
  return res.status(error.status || 500).json({
    error: error.message,
    status: error.status,
  });
}
This matters because clients of your endpoint (frontend, mobile app, another service) need to know whether to retry, refresh credentials, back off, or give up. A flat "500 Internal Server Error" forces them to guess. A precise 401 tells them to re-authenticate. A 429 tells them to slow down.
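From the caller's side, that decision can be sketched as a small lookup. The nextAction helper and its action names are illustrative assumptions, not part of the example or of any SDK; a real client would attach its own retry and backoff policy to each outcome.

```javascript
// Decide what a caller should do with a response status from POST /chat.
// The action names are illustrative placeholders.
function nextAction(status) {
  if (status >= 200 && status < 300) return "done";
  if (status === 401) return "reauthenticate"; // credentials problem
  if (status === 429) return "back_off"; // slow down, then retry
  if (status === 503) return "retry"; // network error reaching Telnyx
  if (status >= 400 && status < 500) return "fix_request"; // caller's mistake
  return "give_up"; // anything else: surface the error instead of guessing
}
```

With a flat 500 for everything, every failure would land in the final branch, which is exactly the guessing game the status mapping avoids.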
The handler also covers two non-Telnyx error paths:
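error.message.includes("environment variable") → 500 with the literal message
Validation errors from chatWithAssistant() flow through the handler's last branch and are returned as 400 instead of 500. That is the right call for an empty message: it is a client mistake, not a server failure. One caveat: the missing-assistant-ID error message contains "environment variable", so it matches the check above and returns 500, which fits a server misconfiguration better than a 400 would.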
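The non-Telnyx branches can be sketched as a pure function, which makes them easy to unit test without a running server. mapNonSdkError is a hypothetical helper name, and the branch order is an assumption based on the description above, not a copy of the repository's handler.

```javascript
// Map a non-SDK error to an HTTP status and JSON body.
// Assumes the Telnyx error classes were already checked before this runs.
function mapNonSdkError(error) {
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes("environment variable")) {
    // Server-side configuration problem: report it as a 500.
    return { status: 500, body: { error: message } };
  }
  // Last branch: everything else is treated as a client mistake.
  return { status: 400, body: { error: message } };
}
```

Matching on message text is fragile if the wording ever changes. A dedicated error class for configuration problems would be a sturdier alternative, though that is a departure from what the example does.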
Most AI chat wrappers become complicated because each capability lives in a different service. You manage conversation history in one database, call a model provider for completions, wire a vector store for retrieval, and bolt on a tool-calling framework to handle function calls. Each integration adds latency, failure modes, and credentials to manage.
A Telnyx AI Assistant bundles those pieces. The Assistant you configure in the Portal (or programmatically through create-ai-assistant-nodejs) already has a model, an optional knowledge base, optional tools, and an optional system prompt attached. When you call client.ai.assistants.chat(assistantId, { messages }), you are talking to that entire stack through one SDK call.
This means the Node.js app can stay small. It validates input, calls the SDK, maps errors, and returns JSON. There is no SDK-of-SDKs to keep in sync.
The example is intentionally minimal. Before you ship it to production, consider:
Authentication on the Express side. Right now anyone who can reach /chat can talk to your assistant. Add a JWT check, an API key header, or a session lookup depending on who calls this endpoint. The assistant itself does not authenticate callers; it trusts whatever your code sends it.
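As one illustration of the API key option, here is a minimal Express-style middleware sketch. The X-API-Key header name, the CHAT_API_KEY environment variable, and the requireApiKey helper are assumptions for this sketch and do not appear in the example repository.

```javascript
const { createHash, timingSafeEqual } = require("node:crypto");

// Express-style middleware that checks a shared secret in the X-API-Key
// header. Header name and CHAT_API_KEY are hypothetical choices.
function requireApiKey(expectedKey = process.env.CHAT_API_KEY) {
  return function (req, res, next) {
    const provided = req.headers["x-api-key"];
    // Fail closed: reject when the header is missing or no key is configured.
    if (typeof provided !== "string" || !expectedKey) {
      return res.status(401).json({ error: "Missing or invalid API key" });
    }
    // Hash both values so timingSafeEqual always compares equal-length buffers.
    const a = createHash("sha256").update(provided).digest();
    const b = createHash("sha256").update(expectedKey).digest();
    if (!timingSafeEqual(a, b)) {
      return res.status(401).json({ error: "Missing or invalid API key" });
    }
    next();
  };
}
```

It would be mounted ahead of the chat handler, for example app.post("/chat", requireApiKey(), handler). A shared key suits service-to-service callers; browser or mobile clients usually call for a JWT or session check instead.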
Rate limiting. Telnyx enforces per-account rate limits and returns 429 when you cross them. Add per-IP or per-user rate limiting in front of this endpoint if you expose it to untrusted clients, otherwise one bad actor can exhaust your quota.
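A minimal sketch of per-key limiting, using a fixed-window counter held in memory. The createRateLimiter helper and its default limits are illustrative assumptions. It only works within a single process and never evicts old keys, so a multi-instance deployment would need a shared store or an established rate-limiting middleware instead.

```javascript
// Fixed-window, in-memory rate limiter keyed by an arbitrary string
// (for example req.ip or a user ID). Limits below are placeholders.
function createRateLimiter({ limit = 30, windowMs = 60_000, now = Date.now } = {}) {
  const windows = new Map(); // key -> { start, count }

  // Returns true if the request is allowed, false if the key is over limit.
  return function allow(key) {
    const t = now();
    const entry = windows.get(key);
    if (!entry || t - entry.start >= windowMs) {
      // First request from this key, or the previous window has expired.
      windows.set(key, { start: t, count: 1 });
      return true;
    }
    entry.count += 1;
    return entry.count <= limit;
  };
}
```

In an Express middleware this would read roughly: if allow(req.ip) returns false, respond with 429 before the Telnyx call is ever made.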
Conversation history. The example is stateless: every request is a single message with no prior context. If you want a multi-turn chat, pass the full message array (with the assistant role turns you have stored) on each request. The Assistant will use that as the conversation context.
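One way to keep those turns is sketched below. The in-memory conversations map, the session ID, and the buildMessages and recordTurn helpers are hypothetical; a production app would use a real datastore and cap the history length.

```javascript
// In-memory conversation store keyed by a session ID (hypothetical).
// Each entry is an array of { role, content } messages.
const conversations = new Map();

// Build the messages array for the next request: stored turns plus the
// new user message. Does not modify the stored history.
function buildMessages(sessionId, userMessage) {
  const history = conversations.get(sessionId) ?? [];
  return [...history, { role: "user", content: userMessage }];
}

// Record a completed exchange so the next request carries it as context.
function recordTurn(sessionId, userMessage, assistantResponse) {
  const history = conversations.get(sessionId) ?? [];
  history.push(
    { role: "user", content: userMessage },
    { role: "assistant", content: assistantResponse },
  );
  conversations.set(sessionId, history);
}
```

The array returned by buildMessages would be passed as messages in the SDK call, and recordTurn would run once the response comes back.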
Observability. Log the assistant ID, message length, response length, latency, and any error class. Those five fields tell you 95% of what you need to debug a chat endpoint in production.
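A sketch of how those fields could be captured in one structured log line per call. The withChatLogging wrapper and the field names are assumptions for illustration; chatFn stands in for chatWithAssistant, and a real deployment would likely send the entry to a logging library rather than console.log.

```javascript
// Wrap a chat function and emit one JSON log line per call, whether it
// succeeds or fails. `log` defaults to console.log.
function withChatLogging(chatFn, log = console.log) {
  return async function (assistantId, message) {
    const started = process.hrtime.bigint();
    const entry = {
      assistant_id: assistantId,
      message_length: typeof message === "string" ? message.length : 0,
      response_length: null,
      latency_ms: null,
      error_class: null,
    };
    try {
      const result = await chatFn(assistantId, message);
      entry.response_length = String(result.assistant_response ?? "").length;
      return result;
    } catch (error) {
      entry.error_class = error?.constructor?.name ?? "Unknown";
      throw error; // rethrow so the route handler can still map the error
    } finally {
      entry.latency_ms = Number(process.hrtime.bigint() - started) / 1e6;
      log(JSON.stringify(entry));
    }
  };
}
```

Logging lengths rather than message text keeps user content out of the logs, which is usually the safer default.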
Streaming. The example returns the full response in one shot. Telnyx AI Assistants support streaming responses for lower time-to-first-token: consume the streamed response and pipe chunks to the client as they arrive. Useful for chat UIs where perceived latency matters more than total latency.
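The client-facing half of that can be sketched independently of the SDK. The snippet below forwards chunks as Server-Sent Events. It assumes the streamed response can be consumed as an async iterable of text chunks, which is an assumption about the SDK and not something this article shows; pipeChunksAsSse is a hypothetical helper.

```javascript
// Pipe text chunks from any async iterable to an HTTP response as
// Server-Sent Events. `chunks` is a stand-in for the SDK's stream; its
// shape (async iterable of strings) is assumed, not confirmed.
async function pipeChunksAsSse(chunks, res) {
  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");
  for await (const chunk of chunks) {
    // JSON-encode each chunk so embedded newlines cannot break SSE framing.
    res.write(`data: ${JSON.stringify(chunk)}\n\n`);
  }
  // Signal completion so the client knows to stop listening.
  res.write("event: done\ndata: {}\n\n");
  res.end();
}
```

A browser client could read this with EventSource or a fetch stream reader and append each decoded chunk to the chat window.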
Keep-alive on the SDK client. The example creates the Telnyx client once at module load and reuses it. Do not re-instantiate per request: that defeats HTTP keep-alive and adds tens of milliseconds of TLS overhead per call.
The full example is open source:
https://github.com/team-telnyx/telnyx-code-examples/tree/main/chat-with-ai-assistant-nodejs
Telnyx is an AI Communications Infrastructure platform: voice, messaging, SIP, AI, and IoT on one private, global network. AI Assistants run on that same network, so you can pair conversational AI with telephony and messaging through a single API and SDK instead of stitching together multiple vendors.