Observing LLM Applications with OpenTelemetry

OpenTelemetry, an open-source observability framework, is being adopted to monitor non-deterministic outputs and performance issues in LLM-based applications. It addresses challenges like hallucinations, inconsistent responses, and provider-side latency spikes that arise when integrating large language models into production systems. Developers can use OpenTelemetry's standardized instrumentation to collect telemetry data without vendor lock-in, enabling backend-agnostic monitoring of AI features.

Ever since OpenAI launched ChatGPT in November 2022, AI usage has exploded worldwide. Integrating LLMs into applications began soon after, rapidly going from an experimental, nice-to-have feature to a competitive, baseline requirement. And while you can find an AI implementation in almost every product today, shipping production-ready LLM features introduces its own set of challenges that developers must contend with.

In this article, we’ll dive into why observing LLM-based applications is now a critical requirement, what OpenTelemetry is, and how to integrate it into your applications with a practical demo. Along the way, we will also look at the current maturity level of LLM-specific OpenTelemetry libraries, the GenAI Semantic Conventions, and some practical challenges you can face while instrumenting your LLM applications.

Why do LLM applications need observability?

If you are already familiar with the challenges of maintaining LLM applications across their lifecycle, feel free to skip to the next section, "What is OpenTelemetry?".

Handling non-determinism

Now you might think that observing your LLM integrations is not that different from classic observability. The key difference is that the output generated by LLMs is non-deterministic: the same input can produce completely different outputs across runs.
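To make the point about non-determinism concrete, here is a minimal sketch of why identical input can yield different output. It assumes the common setup where a model samples each token from a probability distribution instead of always taking the top-scoring one. The token scores and helper names (LOGITS, greedy_pick, sample_pick) are hypothetical and not taken from any real model or library.

```python
import math
import random

# Hypothetical next-token scores for one fixed prompt; not from a real model.
LOGITS = {"likely": 2.0, "possible": 1.5, "unlikely": 0.5}


def softmax(logits: dict[str, float], temperature: float) -> dict[str, float]:
    """Convert raw scores to probabilities; lower temperature sharpens them."""
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def greedy_pick(logits: dict[str, float]) -> str:
    """Always return the highest-scoring token: same input, same output."""
    return max(logits, key=logits.get)


def sample_pick(logits: dict[str, float], temperature: float, rng: random.Random) -> str:
    """Draw a token in proportion to its probability: output can vary per call."""
    probs = softmax(logits, temperature)
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]


if __name__ == "__main__":
    rng = random.Random()  # unseeded, so results differ between runs
    print("greedy :", [greedy_pick(LOGITS) for _ in range(5)])
    print("sampled:", [sample_pick(LOGITS, 1.0, rng) for _ in range(5)])
```

The greedy row repeats one token, while the sampled row usually mixes several. This is why a single passing test run says little about how an LLM feature behaves in production, and why recording each request and response as telemetry is useful.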
Developers often equip LLMs with dedicated tools because models can hallucinate unpredictably on tasks that require precise, deterministic output.

Ensuring context-appropriate responses

Non-determinism does not mean that the responses are necessarily incorrect. In most scenarios, though, developers want their responses to be structured in a certain way. For example, while the response "very likely" might be suitable for a query like "chances of rain tomorrow", the same response to a query like "chances of stock market climbing tomorrow" might be unacceptable, because the user likely expects more nuance from the application. Ensuring that responses remain consistent across a range of user queries is one of the key factors that separates a polished LLM product from an unreliable one.

Managing quality across updates

LLM providers frequently release model updates, modify their backends, and publish guidance on optimal usage. Meanwhile, developers also experiment with model configurations and share the ones that work for them. All in all, the space is developing quickly, and each of these factors can affect the response quality of your LLM setup.

As a practical example, LLM providers can suffer "brown-outs" where their infrastructure cannot keep up with user demand. These lead to latency spikes, timeouts, or even degraded response quality in certain scenarios, which makes it critical to observe how your LLM setup holds up over time.

What is OpenTelemetry?

OpenTelemetry (https://signoz.io/opentelemetry/), or OTel, is a Cloud Native Computing Foundation (CNCF) project aimed at standardizing the way we instrument applications for generating telemetry data. Before OpenTelemetry arrived, telemetry data lived in silos and often had little or no correlation between signals.
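As a rough illustration of what correlation between signals means, the sketch below tags a log entry and a span with the same trace ID so they can be joined later. It is a hand-rolled model using only the standard library, not the OpenTelemetry SDK. The record layout and the helper names (new_trace_id, make_span, make_log, correlate) are assumptions made for this example.

```python
import secrets
import time


def new_trace_id() -> str:
    # 16 random bytes as 32 hex characters, the size OpenTelemetry uses for trace IDs.
    return secrets.token_hex(16)


def make_span(trace_id: str, name: str, started: float, ended: float) -> dict:
    """A simplified span record: a named, timed operation within a trace."""
    return {"signal": "trace", "trace_id": trace_id, "name": name,
            "duration_s": ended - started}


def make_log(trace_id: str, message: str) -> dict:
    """A simplified log record that carries the trace ID of the request it belongs to."""
    return {"signal": "log", "trace_id": trace_id, "message": message}


def correlate(records: list[dict], trace_id: str) -> list[dict]:
    """Return every record, whatever its signal type, that shares the trace ID."""
    return [r for r in records if r["trace_id"] == trace_id]


if __name__ == "__main__":
    request_a, request_b = new_trace_id(), new_trace_id()
    start = time.monotonic()
    records = [
        make_log(request_a, "calling LLM provider"),
        make_log(request_b, "cache hit, skipping LLM call"),
        make_span(request_a, "chat completion", start, time.monotonic()),
    ]
    # Only the log and span that belong to request_a are printed.
    for record in correlate(records, request_a):
        print(record)
```

With siloed tooling, the log line and the timing data would sit in separate systems with no shared key. Carrying one identifier across signals is what lets a backend show the logs for a slow LLM call next to its trace.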
OpenTelemetry follows a specification-driven development model (https://github.com/open-telemetry/opentelemetry-specification?tab=readme-ov-file) that standardizes how telemetry is generated and collected, meaning any compatible backend can process and visualize telemetry data emitted via its SDKs. Because you do not need to rewrite the instrumentation plumbing each time you change observability backends, there is no vendor lock-in.

Implementing OpenTelemetry in LLM Applications

Prerequisites

- Python 3.12 or newer. Download the latest version (https://www.python.org/downloads/).
- A SigNoz Cloud account (https://signoz.io/teams/) for visualizing the telemetry data.
- An OpenAI API key (https://platform.openai.com/api-keys) to use with the application.
- An API client like Postman (https://www.postman.com/) or Bruno (https://www.usebruno.com/) for managing API payloads and visualizing responses.

While earlier Python versions like 3.10 may technically work, they are nearing their end of life (https://devguide.python.org/versions/#supported-versions). Python 3.12 will continue to receive security updates until late 2028.

Setting up SigNoz

SigNoz is an OpenTelemetry-native observability platform that provides logs, traces, and metrics in one place.

- Sign up (https://signoz.io/teams/) for a free SigNoz Cloud account.
- Follow the documentation (https://signoz.io/docs/ingestion/signoz-cloud/keys/) to create ingestion keys for your account.
- Ensure the region and ingestion key values are readily accessible for the following steps.

Once done, you’re ready to configure the application and point it towards your SigNoz instance.

Running the Demo Application

Application Setup

Clone the SigNoz Examples repository and navigate to the application folder:

    git clone https://github.com/SigNoz/examples.git
    cd examples/python/opentelemetry-llm-demo

Create and activate a Python virtual environment:
    python3.12 -m venv .venv
    source .venv/bin/activate

The requirements.txt file contains all the necessary OpenTelemetry Python (https://signoz.io/docs/instrumentation/opentelemetry-python/) packages. Install them by running:

    python -m pip install -r requirements.txt

The following dependencies enable the OpenTelemetry instrumentation process:

- opentelemetry-distro: provides a convenient mechanism to automatically configure some of the more common options, helping us get started with OpenTelemetry auto-instrumentation quickly.
- opentelemetry-exporter-otlp: installs the OTLP (https://signoz.io/blog/what-is-otlp/) exporters required to transmit telemetry data to any OpenTelemetry backend (https://signoz.io/blog/opentelemetry-backend/).

The following command detects libraries and frameworks used in our application, such as FastAPI, and installs their respective instrumentation libraries:

    opentelemetry-bootstrap --action=install

Finally, we will configure our environment variables and start the application, wrapping the entrypoint with opentelemetry-instrument to auto-instrument our application code.

    OPENAI_API_KEY="