Vector database and lakebase supplier Zilliz announced a new storage engine called Loon to support online search, offline analysis, backfills, compaction, and external compute without constantly copying, rewriting, or reimporting data. Why is it needed?
Loon serves real-time search, large-scale discovery, and batch analytics from a single copy of vector data on low-cost object storage. Zilliz says that it's “the storage layer behind Zilliz Cloud's evolution from a vector database into a unified data platform for AI.” The storage engine has to behave like two systems at once: fast, record-level lookups for serving and wide scans for analytics. The data it holds is not static, as admin teams re-embed, re-label, and re-index the same records as their models improve.
The Loon engine is based on three notions. First, it uses hybrid file formats, storing different database columns in different formats:
Parquet is used for scalar and filter fields, for scan efficiency,
Vortex is used for dense and sparse vectors, with fast, byte-precise row-level reads on object storage,
External object storage holds raw videos, PDFs, and images, referenced instead of being copied into the database.
Secondly, database columns “split across different formats still behave as one logical table, so a new embedding model can be added as its own column without rewriting the captions, metadata, or vectors already stored.” The columns have a shared row ID space.
The third notion is that the dataset’s current version is defined by a single source of truth, a versioned manifest detailing its files, indexes, delete logs, and statistics. Zilliz says “serving clusters, on-demand compute, and external engines such as Spark and Ray can all read and safely update the same dataset instead of maintaining separate copies.”
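To make the second and third notions more concrete, here is a minimal Python sketch of a versioned manifest that tracks column groups stored in different formats. It is an illustration only, not Loon's actual manifest format or API: the class names, fields, and file names are all hypothetical, and indexes, delete logs, and statistics are left out. It shows a new embedding column being added as a new manifest version that points at one additional file while the existing files are left untouched.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnGroup:
    """Hypothetical: a set of columns stored together in one file format."""
    columns: tuple[str, ...]
    file_format: str          # e.g. "parquet" or "vortex"
    files: tuple[str, ...]    # object-storage keys (made-up names)


@dataclass(frozen=True)
class Manifest:
    """Hypothetical: one immutable version of a dataset's file listing."""
    version: int
    row_count: int            # all groups share the same row ID space
    groups: tuple[ColumnGroup, ...] = ()

    def add_column_group(self, group: ColumnGroup) -> Manifest:
        """Return the next version with one more group; old groups are reused."""
        existing = {c for g in self.groups for c in g.columns}
        clash = existing & set(group.columns)
        if clash:
            raise ValueError(f"columns already exist: {sorted(clash)}")
        return Manifest(self.version + 1, self.row_count, self.groups + (group,))


# Version 1: scalar fields only, stored as Parquet.
v1 = Manifest(
    version=1,
    row_count=100_000_000,
    groups=(ColumnGroup(("clip_id", "caption"), "parquet", ("scalars-0001.parquet",)),),
)

# Version 2: a new embedding column arrives as its own Vortex-format group.
v2 = v1.add_column_group(
    ColumnGroup(("embedding_v2",), "vortex", ("embedding_v2-0001.vortex",))
)

print(v2.version, [g.file_format for g in v2.groups])
```

In this sketch, readers pinned to version 1 keep seeing the old file list, while readers of version 2 see the extra column, which is the general idea behind treating a new column as a version update rather than a rewrite.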
The way Loon organizes data means that data accesses can be much more efficient. Zilliz's internal testing of object storage showed that Loon's Vortex-based layout reduced the data pulled per record read by about 135x compared to Parquet. It also says that “adding a new embedding model becomes a lightweight version update rather than a multi-hundred-gigabyte rewrite.”
A deep-dive blog by Zilliz principal engineer Ted Xu explores some of the thinking behind Loon’s development. One idea is that a vector database is not a relatively static store. It actively evolves, and the upload of a long video to object storage is used to illustrate this:
In the first week, the table may only contain clip_id, video_id, start_offset, and duration.
In the second week, the team adds aesthetic_score.
In the third week, a captioning model runs, and each clip gets a caption.
In the fourth week, the first embedding model goes online, and each clip gets a 768-dimensional CLIP embedding.
A month later, the team switches models and backfills embedding_v2, now with 1024 dimensions.
Two months later, hybrid search becomes a requirement, so the team adds a sparse vector column.
Three months later, captions undergo human review and must be corrected in place.
The dataset was never completed. It kept accumulating new interpretations of the same underlying rows. The same row gets reprocessed again and again. And scale turns this from an inconvenience into a storage problem: multimodal datasets are often not millions of records but hundreds of millions or billions.
Xu says: “Suppose a dataset has 100 million video clips. Adding a new 1024-dimensional fp32 embedding column means writing roughly 400 GB of raw vector data. That does not include statistics, indexes, metadata updates, object storage overhead, validation, or serving-path integration. …Ship a column like this every month and schema evolution becomes a recurring terabyte-scale data-engineering job.”
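Xu's 400 GB figure can be checked with simple arithmetic. The short Python sketch below multiplies rows by dimensions by bytes per value, assuming 4 bytes per fp32 value and no compression; the function name is ours, not from the blog.

```python
def raw_vector_bytes(num_rows: int, dims: int, bytes_per_value: int = 4) -> int:
    """Raw size of a dense vector column, ignoring compression and metadata."""
    return num_rows * dims * bytes_per_value


# 100 million clips, 1024-dimensional fp32 embeddings (4 bytes per value).
column_bytes = raw_vector_bytes(100_000_000, 1024)
print(f"{column_bytes / 1e9:.1f} GB")           # decimal gigabytes

# One such column shipped every month for a year.
yearly_bytes = 12 * column_bytes
print(f"{yearly_bytes / 1e12:.2f} TB per year")  # decimal terabytes
```

The result is 409.6 GB per column, in line with the "roughly 400 GB" quoted, and close to 5 TB a year if a column like this ships monthly.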
There is also a write amplification problem in vector storage, as a small logical update can trigger a large physical rewrite. For example: “A human review job may only correct a few hundred bytes in a caption. But if the caption, dense vector, sparse vector, and other derived features share the same physical file lifecycle, the system may end up rewriting the vectors too. The logical change is small. The physical I/O can be huge.”
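A rough back-of-the-envelope model shows how large that gap can get. The Python sketch below compares the bytes rewritten when a caption fix forces a whole file to be rewritten, first with all columns sharing the file and then with captions in their own file. Every number in it is an assumption for illustration (per-row column sizes, one million rows per file, whole-file rewrite granularity); none comes from Zilliz.

```python
def rewrite_bytes(rows_in_file: int, bytes_per_row: dict[str, int], columns) -> int:
    """Bytes written when every listed column of a file is rewritten in full."""
    return rows_in_file * sum(bytes_per_row[c] for c in columns)


# Assumed average per-row sizes in bytes (illustrative only).
bytes_per_row = {
    "caption": 200,             # short text
    "embedding_v2": 1024 * 4,   # 1024-dimensional fp32 dense vector
    "sparse_vector": 512,       # assumed size of a sparse vector
}
rows_in_file = 1_000_000        # assumed rows per physical file

# Case 1: all columns share one file, so a caption fix rewrites everything.
shared = rewrite_bytes(rows_in_file, bytes_per_row, bytes_per_row.keys())

# Case 2: captions live in their own file, so only that file is rewritten.
split = rewrite_bytes(rows_in_file, bytes_per_row, ["caption"])

print(f"shared layout: {shared / 1e9:.2f} GB rewritten")
print(f"split layout:  {split / 1e9:.2f} GB rewritten")
print(f"ratio: {shared / split:.1f}x")
```

Under these assumed sizes the shared layout rewrites about 24 times more data for the same caption correction; the real ratio depends on actual column widths and on how finely a storage engine can rewrite data.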
Read the blog to learn more about the issues that drove Zilliz’s engineers to develop Loon.
Loon is the engine inside Milvus 3.0 and serves as the storage layer for Zilliz Vector Lakebase on Zilliz Cloud, which is available across 30-plus regions on AWS, Google Cloud, and Microsoft Azure.