Build a Simple RAG App with Telnyx AI Inference

Telnyx released a tutorial for building a retrieval-augmented generation (RAG) API using its AI Inference and Embeddings services. The Python/Flask app embeds user questions, retrieves relevant documents, and generates grounded answers via chat completions. It aims to help developers integrate RAG with Telnyx's AI infrastructure.
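The retrieval step described above (comparing the question embedding against document embeddings) can be sketched with plain cosine similarity. This is a minimal illustration, not the tutorial's actual code: the function names `cosine_similarity` and `top_k`, the three-dimensional toy vectors, and the "Billing" document are all hypothetical. In the real app the vectors would come from the Telnyx embeddings endpoint.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(question_vec, docs, k=2):
    """Score every document against the question and return the k best as title/score pairs."""
    scored = [
        {
            "title": doc["title"],
            "score": round(cosine_similarity(question_vec, doc["embedding"]), 4),
        }
        for doc in docs
    ]
    scored.sort(key=lambda item: item["score"], reverse=True)
    return scored[:k]


# Toy 3-dimensional vectors standing in for real embeddings (hypothetical data).
DOCS = [
    {"title": "API Key Authentication", "embedding": [0.9, 0.1, 0.0]},
    {"title": "Verification Message Delivery", "embedding": [0.5, 0.5, 0.1]},
    {"title": "Billing", "embedding": [0.0, 0.1, 0.9]},
]

results = top_k([1.0, 0.0, 0.0], DOCS, k=2)
for item in results:
    print(item["title"], item["score"])
```

The highest-scoring documents would then be passed as context to the chat completion request, and their titles and scores returned as the sources.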

| name | build-rag-with-telnyx-inference |
|---|---|
| title | Build RAG with Telnyx Inference |
| description | Build a retrieval-augmented generation API with Telnyx embeddings and chat completions. |
| language | python |
| framework | flask |
| telnyx products | |

Build a retrieval-augmented generation API with Telnyx embeddings and chat completions.

- Embeddings: `POST /v2/ai/embeddings` - create vectors for documents and questions
- AI Inference: `POST /v2/ai/chat/completions` - [API reference](https://developers.telnyx.com/api/inference/chat-completions)

```
User question
      |
      v
Embed question with Telnyx
      |
      v
Compare against document embeddings
      |
      v
Send retrieved context to Telnyx AI
      |
      v
Grounded answer + source titles
```

Copy `.env.example` to `.env` and fill in:

| Variable | Type | Example | Required | Description | Where to get it |
|---|---|---|---|---|---|
| `TELNYX_API_KEY` | string | `KEY0123456789ABCDEF` | yes | Telnyx API v2 key | |
| `AI_MODEL` | string | `moonshotai/Kimi-K2.6` | | | [Models](https://developers.telnyx.com/docs/inference/models) |
| `EMBEDDING_MODEL` | string | `thenlper/gte-large` | | | [Models](https://developers.telnyx.com/docs/inference/models) |
| `HOST` | string | `127.0.0.1` | | | |
| `PORT` | integer | `5000` | | | |

```bash
git clone https://github.com/team-telnyx/telnyx-code-examples.git
cd telnyx-code-examples/build-rag-with-telnyx-inference-python
cp .env.example .env
pip install -r requirements.txt
python app.py
```

Ask a question. The app retrieves relevant in-memory support docs and answers using only that context.

```bash
curl -X POST http://localhost:5000/rag/ask \
  -H "Content-Type: application/json" \
  -d '{
    "question": "Production signup broke after rotating an API key. Logs show 401 errors. What should we check first?"
  }'
```

Response:

```json
{
  "answer": "Check that production services are using the new active API key and that the key has the required permissions. Also verify no old key is cached in deployment secrets.",
  "model": "moonshotai/Kimi-K2.6",
  "embedding_model": "thenlper/gte-large",
  "sources": [
    {"title": "API Key Authentication", "score": 0.9123},
    {"title": "Verification Message Delivery", "score": 0.7811}
  ]
}
```

Returns the sample knowledge base.

Returns service status, configured models, and document count.

| Issue | Cause | Fix |
|---|---|---|
| 401 Unauthorized | Invalid or missing Telnyx API key | Verify `TELNYX_API_KEY` in `.env` |
| Slow first request | The app creates document embeddings lazily | First request may take longer; later requests reuse embeddings in memory |
| Weak answers | Sample knowledge base is too small | Add more documents or replace `DOCUMENTS` with your own content |

- [Run LLM Inference Python](https://raw.githubusercontent.com/team-telnyx/telnyx-code-examples/main/run-llm-inference-python/README.md)
- [Extract Structured JSON with AI Python](https://raw.githubusercontent.com/team-telnyx/telnyx-code-examples/main/extract-structured-json-with-ai-python/README.md)
- [AI Assistant Knowledge Base Python](https://raw.githubusercontent.com/team-telnyx/telnyx-code-examples/main/ai-assistant-knowledge-base-python/README.md)

Telnyx is an AI Communications Infrastructure platform - voice, messaging, SIP, AI, and IoT on one private, global network.
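The "slow first request" behavior noted in the troubleshooting table (document embeddings created lazily, then reused in memory) can be sketched as a small cache. This is an assumption about the pattern, not the example app's actual implementation: the class `LazyEmbeddingIndex`, the `embed_fn` parameter, and `fake_embed` are hypothetical names. In a real app `embed_fn` would presumably wrap a call to `POST /v2/ai/embeddings`; here it is a local stub so the sketch runs offline.

```python
class LazyEmbeddingIndex:
    """Computes document embeddings on first use and reuses them afterwards."""

    def __init__(self, documents, embed_fn):
        self._documents = documents
        self._embed_fn = embed_fn  # hypothetical: would call the embeddings API in practice
        self._vectors = None       # nothing is embedded until vectors() is first called

    def vectors(self):
        # The first call pays the embedding cost; later calls return the cached list.
        if self._vectors is None:
            self._vectors = [self._embed_fn(doc["text"]) for doc in self._documents]
        return self._vectors


calls = []


def fake_embed(text):
    """Stand-in for a real embeddings request; records each call for inspection."""
    calls.append(text)
    return [float(len(text)), 1.0]


index = LazyEmbeddingIndex(
    [{"title": "API Key Authentication", "text": "Rotate keys carefully."}],
    fake_embed,
)
first = index.vectors()   # triggers embedding
second = index.vectors()  # served from memory
print(len(calls))
```

Because the cache lives in process memory, it is rebuilt after every restart, which is consistent with only the first request being slow.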