{"slug": "bring-production-agent-traces-from-arize-into-databricks-unity-catalog", "title": "Bring production agent traces from Arize into Databricks Unity Catalog", "summary": "Arize Data Fabric now supports Databricks, enabling teams to sync production agent traces, evaluations, and annotations into customer-owned cloud storage and register them in Unity Catalog. This integration allows Databricks teams to query trace data alongside business tables, product events, and customer records within the lakehouse, eliminating the need for custom pipelines or data exports. The move addresses the challenge of analyzing agent failures in isolation by joining observability data with enterprise context for more effective production analysis.", "body_md": "Production traces are most useful when teams can query them with the rest of their data.\n\nThat becomes harder when traces, [evaluations](https://arize.com/docs/ax/get-started/get-started-evaluations), and annotations live only inside an observability system. Agent failures often require context from business tables, product events, customer records, model metadata, prompts, release history, and human review workflows. When observability data sits outside the lakehouse, teams have to export data, maintain custom pipelines, or debug production behavior from a system that cannot easily join against the data that explains what happened.\n\nToday, [Arize Data Fabric](https://arize.com/docs/ax/security-and-settings/data-fabric#databricks) adds support for Databricks. With this integration, Arize can sync production agent traces, evaluations, and annotations into customer-owned cloud storage. 
Databricks teams can then register that data in [Unity Catalog](https://arize.com/blog/harnessing-databricks-mosaic-ai-agent-framework-and-arize-for-next-level-genai-applications/), apply governance controls, and query it alongside the rest of their lakehouse data.\n\nThis gives teams a more direct path from production observability to production analysis. Trace data remains accessible in customer-owned storage, available in open table formats, and governed through the same data platform teams already use for analytics, ML, and AI applications.\n\n**Why trace data belongs in the lakehouse**\n\nAgent traces are operational records of how an AI system behaves in production. A trace can show which prompt ran, which model responded, which tools were called, which retrieval results were used, how long each step took, which evaluations ran, and how a human reviewer annotated the output.\n\nThat data becomes more useful when teams can join it with the rest of the business.\n\nA support agent trace can be analyzed with ticket metadata, account tier, CRM data, escalation history, and customer health scores. A shopping agent trace can be analyzed with product catalog data, conversion events, refund data, inventory, and revenue. A coding assistant trace can be analyzed with repository metadata, pull request outcomes, CI failures, and user feedback.\n\nThose joins help teams move beyond isolated debugging. They can understand which failures matter, where they happen, which users or customers are affected, and whether a fix improved production behavior.\n\nArize [Data Fabric for Databricks](https://arize.com/blog/data-fabric-querying-agent-traces-in-bigquery/) is designed to bring that production context into the lakehouse.\n\n**How Arize and Databricks fit together**\n\nDatabricks gives teams a governed environment for data, analytics, ML, and AI workloads. 
Unity Catalog provides governance for data assets, including access controls, auditability, and lineage.\n\nArize captures production behavior from AI applications and agents, including traces, spans, model inputs and outputs, tool calls, evaluation results, feedback, and human annotations.\n\nWith Arize Data Fabric, those production signals can flow into the same lakehouse architecture teams already use to analyze data and improve AI systems. Arize syncs trace and evaluation data to customer-owned cloud storage. Databricks teams can then use Unity Catalog to register, govern, and query that data with the rest of their enterprise data.\n\n**How the sync works**\n\nArize Data Fabric syncs production trace data, evaluations, and annotations to cloud storage every 60 minutes. Each Arize project is written to its own table path, so teams can choose which projects to expose in Databricks and how to organize them in Unity Catalog.\n\nA typical setup looks like this:\n\n- Create a Data Fabric connector in Arize.\n- Point the connector at your cloud storage bucket and select the Arize projects to sync.\n- Grant Arize the required access to the bucket and prefix.\n- Validate the connector and start the first sync.\n- Register the synced data in Unity Catalog using the setup flow documented by Arize.\n\nOnce the first sync completes, trace data is available in the storage location you configured. 
From there, Databricks teams can register the data in Unity Catalog, apply permissions, query it from Databricks SQL, and use it in downstream workflows.\n\nFor the exact setup steps, including cloud storage permissions and Unity Catalog registration, use the Arize Data Fabric documentation.\n\n**Query production agent behavior with Databricks SQL**\n\nOnce traces are available through Unity Catalog, teams can analyze production agent behavior with the same SQL workflows they already use for lakehouse data.\n\nThe exact schema depends on your Arize project, trace structure, and configured attributes. The queries below show the kind of analysis teams can run once trace fields, evaluation results, and business identifiers are available in the synced tables.\n\nFor example, a team could look for prompt and evaluator combinations associated with failed traces:\n\n```\nSELECT\n  prompt_template,\n  evaluator_name,\n  COUNT(*) AS total_traces,\n  SUM(CASE WHEN eval_result = 'fail' THEN 1 ELSE 0 END) AS failed_traces\nFROM my_catalog.arize_traces.production_agent\nWHERE start_time >= current_date() - INTERVAL 7 DAYS\nGROUP BY prompt_template, evaluator_name\nORDER BY failed_traces DESC;\n```\n\nTrace data becomes more useful when it is joined with business data:\n\n```\nSELECT\n  c.account_tier,\n  t.tool_name,\n  COUNT(*) AS tool_calls,\n  AVG(t.latency_ms) AS avg_latency_ms,\n  SUM(CASE WHEN t.eval_result = 'fail' THEN 1 ELSE 0 END) AS failed_runs\nFROM my_catalog.arize_traces.production_agent t\nJOIN my_catalog.salesforce.customers c\n  ON t.customer_id = c.customer_id\nWHERE t.start_time >= current_date() - INTERVAL 30 DAYS\nGROUP BY c.account_tier, t.tool_name\nORDER BY failed_runs DESC;\n```\n\nThese examples are illustrative. In practice, teams can adapt the query pattern to match their synced Arize schema, application metadata, and lakehouse tables.\n\nThe value is the join. 
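\n\nAs a further sketch under the same illustrative schema, the join can report a failure rate per segment instead of raw counts, so that large account tiers do not dominate the ranking simply because they produce more rows. The table and column names are the same hypothetical ones used in the queries above and will differ in a real synced Arize schema.\n\n```\n-- Illustrative only: table and column names are assumed, as in the queries above.\nSELECT\n  c.account_tier,\n  COUNT(*) AS total_rows,\n  SUM(CASE WHEN t.eval_result = 'fail' THEN 1 ELSE 0 END) AS failed_rows,\n  -- Share of joined rows in each tier whose evaluation result is 'fail'\n  ROUND(AVG(CASE WHEN t.eval_result = 'fail' THEN 1.0 ELSE 0.0 END), 4) AS failure_rate\nFROM my_catalog.arize_traces.production_agent t\nJOIN my_catalog.salesforce.customers c\n  ON t.customer_id = c.customer_id\nWHERE t.start_time >= current_date() - INTERVAL 30 DAYS\nGROUP BY c.account_tier\nORDER BY failure_rate DESC;\n```\n\nNote that an inner join drops trace rows with no matching customer record; a LEFT JOIN would keep them under a NULL account tier, which may be the better choice when customer identifiers are only partially populated.\n\n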
Teams can analyze which tools fail most often, which customer segments are affected, whether a prompt change increased latency, or which evaluation failures correlate with downstream business impact.\n\n**Use Genie Spaces for natural language trace analysis**\n\nNot every stakeholder who needs trace data wants to write SQL.\n\nWhen Arize trace data is registered in Unity Catalog, teams can make that data available through Databricks surfaces such as Genie Spaces and Databricks SQL. A Genie Space lets users ask natural-language questions about curated Unity Catalog datasets and receive SQL-backed answers, result tables, and visualizations.\n\nThat can help product managers, analysts, support leaders, and operations teams explore agent behavior without starting from a blank SQL editor.\n\nUseful questions might include:\n\n- Which prompts had the highest evaluation failure rate last week?\n- Which tool calls added the most latency yesterday?\n- Which customer accounts saw the most failed agent runs this month?\n- Did the latest release reduce failed checkout-agent traces?\n- Which retrieval sources appear most often in low-quality responses?\n\nGenie Spaces work best when the underlying datasets are well documented and curated. Teams should define the relevant tables, views, joins, column descriptions, and business context so natural-language questions map to accurate SQL.\n\nBecause the trace data lives alongside product, customer, and revenue data, these questions can connect agent behavior to operational outcomes.\n\n**Turn production traces into agent improvement workflows**\n\nProduction traces can also help teams improve agents.\n\nTeams building agents on Databricks can use synced Arize data to identify failures, build datasets, compare variants, and evaluate changes against real production behavior. A failed production run can become a regression example. A high-quality human annotation can become evaluation data. 
A recurring tool failure can become the basis for an experiment.\n\nProduction traces and evaluations can help teams:\n\n- Find recurring failure patterns across tools, prompts, retrieval steps, and agent workflows.\n- Build datasets from real production examples.\n- Compare new agent variants against historical failures.\n- Use evaluation results to define MLflow experiments and compare changes.\n- Track whether a prompt, tool, model, or retrieval change improved production quality.\n\nThe goal is to make production behavior reusable. Teams should be able to take what Arize observes in production and apply it to the workflows they use to improve agents.\n\n**Govern agent observability data with Unity Catalog**\n\nAgent traces can contain sensitive operational context, including user inputs, retrieved documents, tool responses, and model outputs. That makes governance a core requirement for production agent observability.\n\nBy registering Arize Data Fabric tables in Unity Catalog, teams can apply familiar governance controls to trace and evaluation data. Access policies, audit logs, and lineage can follow the same patterns teams already use for analytics, ML, and AI workloads.\n\nThat matters because agent observability data often needs to be shared across teams. Engineers need traces for debugging. ML teams need evaluation data for experiments. Product teams need quality and latency trends. Support teams need visibility into customer-impacting failures. Security and governance teams need confidence that access to sensitive data is controlled.\n\nWith Unity Catalog, those workflows can operate from a shared governed layer instead of a separate export pipeline.\n\n**Built on open data in customer-owned storage**\n\nArize Data Fabric is designed around a simple architectural principle: production observability data should remain accessible in customer-owned infrastructure.\n\nTrace and evaluation data is written to cloud storage controlled by the customer. 
Databricks teams can register, query, and govern that data through Unity Catalog using the Data Fabric setup flow. From there, teams can analyze it with Databricks SQL, expose curated datasets through Genie Spaces, build dashboards, and use the data in downstream ML and agent-improvement workflows.\n\nFor teams running agents on Databricks, this brings production observability data into the same environment where they already build, govern, and improve AI systems.\n\n**Get started**\n\nArize Data Fabric for Databricks is rolling out to enterprise customers now.\n\nReach out to your Arize account team or email support@arize.com to join the waitlist. For setup details, see [the Arize Data Fabric documentation](https://arize.com/docs/ax/security-and-settings/data-fabric#databricks).", "url": "https://wpnews.pro/news/bring-production-agent-traces-from-arize-into-databricks-unity-catalog", "canonical_source": "https://arize.com/blog/arize-data-fabric-databricks-unity-catalog-agent-traces/", "published_at": "2026-06-11 17:00:03+00:00", "updated_at": "2026-06-11 18:35:46.639158+00:00", "lang": "en", "topics": ["ai-agents", "ai-infrastructure", "mlops", "large-language-models", "generative-ai"], "entities": ["Arize", "Databricks", "Unity Catalog", "Arize Data Fabric"], "alternates": {"html": "https://wpnews.pro/news/bring-production-agent-traces-from-arize-into-databricks-unity-catalog", "markdown": "https://wpnews.pro/news/bring-production-agent-traces-from-arize-into-databricks-unity-catalog.md", "text": "https://wpnews.pro/news/bring-production-agent-traces-from-arize-into-databricks-unity-catalog.txt", "jsonld": "https://wpnews.pro/news/bring-production-agent-traces-from-arize-into-databricks-unity-catalog.jsonld"}}