Databricks at SIGMOD 2026

Databricks will present its work on Spark Declarative Pipelines at SIGMOD 2026, where the paper has received an honorable mention award. The company's Enzyme system introduces innovations in incrementally maintaining materialized views to simplify extract-transform-load (ETL) workloads at scale. Databricks is attending the conference as a Platinum Sponsor in Bangalore, India, a key R&D hub for the company.

by Indrajit Roy

Databricks continues to lead the way in engineering innovation, consistently pushing the boundaries of what’s possible in the Data and AI space. We are thrilled to announce that our work on Spark Declarative Pipelines will be featured at SIGMOD 2026 and has received an honorable mention award (https://2026.sigmod.org/sigmod awards.shtml) at the conference.

We’re headed to SIGMOD (https://2026.sigmod.org/) this June 1-5 as a Platinum Sponsor. SIGMOD will take place in Bangalore, India, which is also a large Databricks R&D hub (https://www.databricks.com/company/newsroom/press-releases/databricks-invest-over-us250-million-india-over-next-three-years).

Our upcoming papers on data engineering show how Databricks has simplified incremental processing for customers. There are two ways to write incremental programs in Spark Declarative Pipelines (SDP), and customers can mix and match these within a pipeline.

Here’s a sneak peek at the Enzyme paper and what the team has been working on:

Enzyme at SIGMOD 2026

Let’s say you are an analyst at a company and want to analyze the total number of orders sold in each region. The materialized view below provides the answer.

    CREATE MATERIALIZED VIEW order_report AS
    SELECT region, SUM(orders)
    FROM customer_and_order_table
    GROUP BY region;

As new orders are added, you expect the materialized view to remain up to date. This data maintenance is essentially the incremental view maintenance problem. While keeping the above toy MV updated seems simple, imagine if the MV needed to join data across multiple tables, used window functions, or made calls to LLM functions.

Materialized views (MVs) are popular for query acceleration, for example speeding up dashboards on data residing in data warehouses. When creating Spark Declarative Pipelines, we decided to go beyond query acceleration and apply materialized views to extract-transform-load (ETL) use cases.
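To make the incremental view maintenance problem concrete, here is a minimal Python sketch of the textbook approach for the order report view above: keep a running sum and row count per region, and apply only the inserted and deleted rows instead of rescanning the whole table. This is an illustration under stated assumptions, not Enzyme's implementation; the function names (`build_state`, `apply_delta`, `as_view`, `full_recompute`) and the sample rows are hypothetical.

```python
# Illustrative sketch only: a textbook delta rule for SUM ... GROUP BY,
# not the algorithm used by Enzyme. All names here are hypothetical.

def apply_delta(state, inserted=(), deleted=()):
    """Update per-region (sum, row_count) using only the changed rows."""
    for region, orders in inserted:
        total, count = state.get(region, (0, 0))
        state[region] = (total + orders, count + 1)
    for region, orders in deleted:
        # Assumes the deleted row was previously inserted for this region.
        total, count = state[region]
        if count == 1:
            # Last row of the group is gone, so the group leaves the view.
            del state[region]
        else:
            state[region] = (total - orders, count - 1)
    return state


def build_state(rows):
    """Build the maintained state from an initial snapshot of the table."""
    return apply_delta({}, inserted=rows)


def as_view(state):
    """Project the maintained state down to region -> SUM(orders)."""
    return {region: total for region, (total, _count) in state.items()}


def full_recompute(rows):
    """Baseline: recompute the view from scratch over all rows."""
    return as_view(build_state(rows))


# Initial table contents as (region, orders) rows.
base = [("EMEA", 10), ("APAC", 5), ("EMEA", 7)]
state = build_state(base)

# A later batch of changes: two new rows and one deleted row.
inserted = [("APAC", 3), ("AMER", 4)]
deleted = [("EMEA", 10)]
apply_delta(state, inserted=inserted, deleted=deleted)

# The incrementally maintained view matches a full recomputation.
current_rows = [("APAC", 5), ("EMEA", 7), ("APAC", 3), ("AMER", 4)]
assert as_view(state) == full_recompute(current_rows)
print(as_view(state))
```

The row count is tracked alongside the sum so that a region disappears from the view only when its last row is deleted, rather than whenever its sum happens to reach zero. The work done by `apply_delta` scales with the size of the change, not the size of the table, which is the property that becomes much harder to preserve for joins, window functions, and similar operators.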
Our key observation is that if MVs can be efficiently and incrementally maintained, ETL workloads that would otherwise require complex custom code become significantly simpler. Enzyme adds to the rich literature on incrementally maintaining materialized views and demonstrates how to scale these techniques on production workloads. Some of the innovations that the team worked on are:

Figure 1: Enzyme has significantly better performance than a competing industry solution (name anonymized to CV-IVM due to licensing restrictions).

Interested in learning more? Check out the paper (https://arxiv.org/abs/2603.27775) and, if you're at SIGMOD, attend our talk (https://2026.sigmod.org/sigmod program detailed.shtml i-1) for more details. Stop by our booth to meet the team and learn more about the innovation that is happening at Databricks. Plus, don’t miss the chance to hear directly from Ritwik Yadav during his presentation at SIGMOD.