# Build a Simple RAG App with Telnyx AI Inference

> Source: <https://github.com/team-telnyx/telnyx-code-examples/tree/main/build-rag-with-telnyx-inference-python>
> Published: 2026-06-26 15:19:05+00:00

| name | build-rag-with-telnyx-inference | |
|---|---|---|
| title | Build RAG with Telnyx Inference | |
| description | Build a retrieval-augmented generation API with Telnyx embeddings and chat completions. | |
| language | python | |
| framework | flask | |
| telnyx_products |
|

Build a retrieval-augmented generation API with Telnyx embeddings and chat completions.

**Embeddings**:`POST /v2/ai/embeddings`

- create vectors for documents and questions**AI Inference**:`POST /v2/ai/chat/completions`

-[API reference](https://developers.telnyx.com/api/inference/chat-completions)

```
  User question
        |
        v
  Embed question with Telnyx
        |
        v
  Compare against document embeddings
        |
        v
  Send retrieved context to Telnyx AI
        |
        v
  Grounded answer + source titles
```

Copy `.env.example`

to `.env`

and fill in:

| Variable | Type | Example | Required | Description | Where to get it |
|---|---|---|---|---|---|
`TELNYX_API_KEY` |
`string` |
`KEY0123456789ABCDEF` |
yes |
Telnyx API v2 key |
|

`AI_MODEL`

`string`

`moonshotai/Kimi-K2.6`

[Models](https://developers.telnyx.com/docs/inference/models)`EMBEDDING_MODEL`

`string`

`thenlper/gte-large`

[Models](https://developers.telnyx.com/docs/inference/models)`HOST`

`string`

`127.0.0.1`

`PORT`

`integer`

`5000`

```
git clone https://github.com/team-telnyx/telnyx-code-examples.git
cd telnyx-code-examples/build-rag-with-telnyx-inference-python
cp .env.example .env
pip install -r requirements.txt
python app.py
```

Ask a question. The app retrieves relevant in-memory support docs and answers using only that context.

```
curl -X POST http://localhost:5000/rag/ask \
  -H "Content-Type: application/json" \
  -d '{
    "question": "Production signup broke after rotating an API key. Logs show 401 errors. What should we check first?"
  }'
```

**Response:**

```
{
  "answer": "Check that production services are using the new active API key and that the key has the required permissions. Also verify no old key is cached in deployment secrets.",
  "model": "moonshotai/Kimi-K2.6",
  "embedding_model": "thenlper/gte-large",
  "sources": [
    {"title": "API Key Authentication", "score": 0.9123},
    {"title": "Verification Message Delivery", "score": 0.7811}
  ]
}
```

Returns the sample knowledge base.

Returns service status, configured models, and document count.

| Issue | Cause | Fix |
|---|---|---|
`401 Unauthorized` |
Invalid or missing Telnyx API key | Verify `TELNYX_API_KEY` in `.env` |
| Slow first request | The app creates document embeddings lazily | First request may take longer; later requests reuse embeddings in memory |
| Weak answers | Sample knowledge base is too small | Add more documents or replace `DOCUMENTS` with your own content |

[Run LLM Inference (Python)](https://raw.githubusercontent.com/team-telnyx/telnyx-code-examples/main/run-llm-inference-python/README.md)[Extract Structured JSON with AI (Python)](https://raw.githubusercontent.com/team-telnyx/telnyx-code-examples/main/extract-structured-json-with-ai-python/README.md)[AI Assistant Knowledge Base (Python)](https://raw.githubusercontent.com/team-telnyx/telnyx-code-examples/main/ai-assistant-knowledge-base-python/README.md)

Telnyx is an **AI Communications Infrastructure** platform - voice, messaging, SIP, AI, and IoT on one private, global network.
