{"slug": "build-a-simple-rag-app-with-telnyx-ai-inference", "title": "Build a Simple RAG App with Telnyx AI Inference", "summary": "Telnyx released a tutorial for building a retrieval-augmented generation (RAG) API using its AI Inference and Embeddings services. The Python/Flask app embeds user questions, retrieves relevant documents, and generates grounded answers via chat completions. It aims to help developers integrate RAG with Telnyx's AI infrastructure.", "body_md": "| name | build-rag-with-telnyx-inference |\n|---|---|\n| title | Build RAG with Telnyx Inference |\n| description | Build a retrieval-augmented generation API with Telnyx embeddings and chat completions. |\n| language | python |\n| framework | flask |\n| telnyx_products | |\n\nBuild a retrieval-augmented generation API with Telnyx embeddings and chat completions.\n\n- **Embeddings**: `POST /v2/ai/embeddings` - create vectors for documents and questions\n- **AI Inference**: `POST /v2/ai/chat/completions` - [API reference](https://developers.telnyx.com/api/inference/chat-completions)\n\n```\n  User question\n        |\n        v\n  Embed question with Telnyx\n        |\n        v\n  Compare against document embeddings\n        |\n        v\n  Send retrieved context to Telnyx AI\n        |\n        v\n  Grounded answer + source titles\n```\n\nCopy `.env.example` to `.env` and fill in:\n\n| Variable | Type | Example | Required | Description | Where to get it |\n|---|---|---|---|---|---|\n| `TELNYX_API_KEY` | `string` | `KEY0123456789ABCDEF` | yes | Telnyx API v2 key | |\n| `AI_MODEL` | `string` | `moonshotai/Kimi-K2.6` | | | [Models](https://developers.telnyx.com/docs/inference/models) |\n| `EMBEDDING_MODEL` | `string` | `thenlper/gte-large` | | | [Models](https://developers.telnyx.com/docs/inference/models) |\n| `HOST` | `string` | `127.0.0.1` | | | |\n| `PORT` | `integer` | `5000` | | | |\n\n```\ngit clone https://github.com/team-telnyx/telnyx-code-examples.git\ncd 
telnyx-code-examples/build-rag-with-telnyx-inference-python\ncp .env.example .env\npip install -r requirements.txt\npython app.py\n```\n\nAsk a question. The app retrieves relevant in-memory support docs and answers using only that context.\n\n```\ncurl -X POST http://localhost:5000/rag/ask \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"question\": \"Production signup broke after rotating an API key. Logs show 401 errors. What should we check first?\"\n  }'\n```\n\n**Response:**\n\n```\n{\n  \"answer\": \"Check that production services are using the new active API key and that the key has the required permissions. Also verify no old key is cached in deployment secrets.\",\n  \"model\": \"moonshotai/Kimi-K2.6\",\n  \"embedding_model\": \"thenlper/gte-large\",\n  \"sources\": [\n    {\"title\": \"API Key Authentication\", \"score\": 0.9123},\n    {\"title\": \"Verification Message Delivery\", \"score\": 0.7811}\n  ]\n}\n```\n\nReturns the sample knowledge base.\n\nReturns service status, configured models, and document count.\n\n| Issue | Cause | Fix |\n|---|---|---|\n| `401 Unauthorized` | Invalid or missing Telnyx API key | Verify `TELNYX_API_KEY` in `.env` |\n| Slow first request | The app creates document embeddings lazily | First request may take longer; later requests reuse embeddings in memory |\n| Weak answers | Sample knowledge base is too small | Add more documents or replace `DOCUMENTS` with your own content |\n\n- [Run LLM Inference (Python)](https://raw.githubusercontent.com/team-telnyx/telnyx-code-examples/main/run-llm-inference-python/README.md)\n- [Extract Structured JSON with AI (Python)](https://raw.githubusercontent.com/team-telnyx/telnyx-code-examples/main/extract-structured-json-with-ai-python/README.md)\n- [AI Assistant Knowledge Base (Python)](https://raw.githubusercontent.com/team-telnyx/telnyx-code-examples/main/ai-assistant-knowledge-base-python/README.md)\n\nTelnyx is an **AI Communications Infrastructure** platform - voice, messaging, 
SIP, AI, and IoT on one private, global network.", "url": "https://wpnews.pro/news/build-a-simple-rag-app-with-telnyx-ai-inference", "canonical_source": "https://github.com/team-telnyx/telnyx-code-examples/tree/main/build-rag-with-telnyx-inference-python", "published_at": "2026-06-26 15:19:05+00:00", "updated_at": "2026-06-26 15:36:09.250832+00:00", "lang": "en", "topics": ["artificial-intelligence", "ai-tools", "ai-infrastructure", "developer-tools", "natural-language-processing"], "entities": ["Telnyx", "Flask", "Python", "Kimi-K2.6", "gte-large", "Telnyx API"], "alternates": {"html": "https://wpnews.pro/news/build-a-simple-rag-app-with-telnyx-ai-inference", "markdown": "https://wpnews.pro/news/build-a-simple-rag-app-with-telnyx-ai-inference.md", "text": "https://wpnews.pro/news/build-a-simple-rag-app-with-telnyx-ai-inference.txt", "jsonld": "https://wpnews.pro/news/build-a-simple-rag-app-with-telnyx-ai-inference.jsonld"}}