{"slug": "can-databricks-ltap-actually-run-oltp-and-olap-at-the-same-time", "title": "Can Databricks LTAP Actually Run OLTP and OLAP at the Same Time?", "summary": "Databricks announced LTAP (Lake Transactional/Analytical Processing) at Data + AI Summit, aiming to unify OLTP and OLAP on a single storage layer by writing directly into Delta and Iceberg formats via copy-on-write. However, LTAP is still 'coming soon' and Lakehouse//RT is in beta, raising questions about consistency and latency under load. A developer from Regatta, which builds a competing product, notes that while LTAP removes the CDC pipeline, it does not merge the execution engines, leaving a potential consistency gap between Postgres ACID and Delta snapshot isolation.", "body_md": "Databricks announced LTAP, Lake Transactional/Analytical Processing, at Data + AI Summit this week. The pitch is clean: write transactional data once, and have it usable for analytics without a CDC pipeline in between. I read through what's public and want to dig into the parts the keynote slide doesn't answer.\n\nQuick disclosure up front: I run Regatta. We build RegattaDB, which is going after the same root problem: no separate OLTP, OLAP, and vector stacks. I'm not a neutral party here. But the technical questions below are ones any architect evaluating this should be asking regardless of who's doing the asking.\n\nLakebase, Databricks' serverless Postgres service built on object storage, is GA. LTAP is the architectural layer on top of it. Instead of writing rows in Postgres format and running CDC to convert them into columnar format for the Lakehouse later, Lakebase now writes directly into Delta and Iceberg formats at the point of write, via copy-on-write. Same physical data, from the moment it lands, for both sides.\n\nWorth checking the status before you architect around it: LTAP itself is listed as \"coming soon.\" Lakehouse//RT, the compute engine behind the sub-100ms analytical query claim, is in beta. 
If you're designing agent infrastructure today against this announcement, you're designing against unshipped product.\n\nOLTP wants low-latency random I/O for point reads and writes. OLAP wants high-throughput sequential scans across large datasets. Vector search wants approximate nearest-neighbor lookups over high-dimensional embeddings. Each of these wants a different data layout to perform well, and optimizing one tends to hurt the other two. That's the actual reason CDC pipelines and zero-ETL connectors exist: not because anyone loves pipelines, but because keeping the engines separate and syncing them was the only way to avoid resource contention. Syncing doesn't resolve the underlying tension. It relocates it to a sync boundary.\n\nLTAP doesn't merge the two engines. Postgres still handles transactions. Spark and the Lakehouse still handle analytics. What changes is the storage layer underneath both of them: copy-on-write buffering removes the pipeline that used to connect them. That's a legitimate engineering contribution to the \"stop paying the CDC tax\" problem. It is not the same thing as a single execution engine.\n\nHere's the path, as best I can reconstruct it from public material: Postgres writes to WAL and buffer cache first. A caching layer converts rows to columnar format. Data then lands in Delta or Iceberg in object storage.\n\nThat conversion step is doing a lot of work, and it sits between two separate consistency domains: Postgres ACID on one side, Delta snapshot isolation on the other. The gap between them is a flush window, and as far as I can tell, nothing Databricks has published puts a bound on it.\n\nConcretely: a long-running Spark query reads a snapshot as of when it starts. A transaction that commits in Postgres after that point is invisible to the query, which is expected and fine. The harder case is a transaction that committed *before* the query started but hasn't finished converting and flushing yet. 
Is the analytical result just stale, or is it actually wrong? Databricks describes data as \"immediately queryable\" without defining what immediately means under sustained load. That's not a gotcha; it's a genuinely open question, and I'd want a number before I built agent logic on top of it.\n\nThis matters specifically for agents. An agent that reads live transactional state, joins it against historical data, and writes back a decision is crossing an engine boundary regardless of how the storage underneath is organized. [LTAP](https://regatta.dev/blog/databricks-ltap-storage-unification-is-not-enough) removes the *data movement* cost of that crossing. It doesn't remove the crossing itself.\n\nThe published number for Lakehouse//RT is against TPC-H Q6: a single-table scan with a filter and a simple aggregation. That's a reasonable test of scan speed, predicate pushdown, and vectorized execution. It does not test joins. There's no published number yet for multi-table analytical queries running concurrently with high-frequency transactional writes, which is the workload that actually matters for the unification claim.\n\nLTAP addresses OLTP and OLAP at the storage layer. Vector search is still an extension bolted onto the Postgres side, with the contention and single-node ceiling that come with bolting a high-dimensional ANN workload onto a row-oriented transactional engine. All three workloads (transactional, analytical, semantic) have genuinely different compute profiles. Solving two of three at the storage layer is real progress. 
It's not the full problem.\n\nFull disclosure again: this is the part where I'll tell you what we did differently, because we made the opposite bet: that you can't bolt this together, and the concurrency model has to be designed for all three workloads from day one.\n\nA few of the specific decisions that came out of that constraint:\n\nThat gives you one schema, one connection, one concurrency model across all three workload types, not three systems sharing a storage layer.\n\nLTAP is the right diagnosis: fragmented data is the bottleneck for agent systems. It just stops short of a unified execution model. If anyone's dug further into the LTAP docs than what's public so far, I'd genuinely like to see it, and if Databricks publishes the flush bound, I'll update this.", "url": "https://wpnews.pro/news/can-databricks-ltap-actually-run-oltp-and-olap-at-the-same-time", "canonical_source": "https://dev.to/itay_waisman_fc21bb7d2a3a/can-databricks-ltap-actually-run-oltp-and-olap-at-the-same-time-4ia7", "published_at": "2026-06-30 11:09:18+00:00", "updated_at": "2026-06-30 11:19:45.371358+00:00", "lang": "en", "topics": ["artificial-intelligence", "large-language-models", "ai-infrastructure", "ai-agents", "developer-tools"], "entities": ["Databricks", "Regatta", "RegattaDB", "Lakebase", "Lakehouse//RT", "Postgres", "Delta", "Iceberg"], "alternates": {"html": "https://wpnews.pro/news/can-databricks-ltap-actually-run-oltp-and-olap-at-the-same-time", "markdown": "https://wpnews.pro/news/can-databricks-ltap-actually-run-oltp-and-olap-at-the-same-time.md", "text": "https://wpnews.pro/news/can-databricks-ltap-actually-run-oltp-and-olap-at-the-same-time.txt", "jsonld": "https://wpnews.pro/news/can-databricks-ltap-actually-run-oltp-and-olap-at-the-same-time.jsonld"}}