Milvus invents Vector Lakebase


Source: blocksandfiles.com · Published: 2026-06-15T12:51Z


Zilliz, the Milvus AI vector database supplier, launched Vector Lakebase, a product that extends real-time vector search with an external data lake connector, batch analytics, and interactive discovery. The new offering enables AI developers to run production queries, discovery sessions, and training-data pipelines on a unified S3-based data foundation without data copies or migration.


Zilliz, the Milvus AI vector database supplier, has launched Vector Lakebase, a product that extends cloud-resident, real-time vector search with an external data lake connector, batch analytics, and interactive discovery.

Milvus stores the vector embeddings used in large language model (LLM) natural language and AI agent retrieval-augmented generation (RAG) workloads. Production vector search is its bread and butter. It’s an open-source vector database, purpose-built for 100-billion-scale vector search, with 44,000+ GitHub stars and over 100 million Docker pulls. It’s used by more than 10,000 enterprises and AI-native startups worldwide, including MiniMax, OpenEvidence, Filevine, Exa, Salesforce, and Read AI.

Charles Xie, Zilliz founder and CEO, said: "Production vector search is and will remain at the heart of what Zilliz does — it's why thousands of teams choose Milvus and Zilliz Cloud, and it's getting faster and more cost-efficient every release.

"Vector Lakebase is what we believe comes next: one data foundation where the same vectors can serve a production query, anchor a discovery session, and power a multi-petabyte training-data pipeline — without copies, migration, or a parallel stack.”

It builds on an S3-based unified data foundation.

Robert Guo, VP of Product at Zilliz and one of the architects behind Milvus, said: "Teams asked for a way to keep their data in one place and run very different workloads against it - from real-time agent memory to overnight semantic deduplication. Vector Lakebase delivers that through a unified storage layer on Vortex, tiered serving for the production path, and on-demand compute for everything else."

Vortex is a Linux Foundation project (previously SpiralDB) and an open-source columnar file format and toolkit optimized for high performance in AI and analytics workloads. It provides ~100x faster random access, 10-20x faster scans, and 5x faster writes, with similar or better compression than Parquet or Lance.

A blog post by Guo says: “AI systems are no longer just a single-query retrieval problem. They operate as a continuous loop of serving, learning, and improving.” As LLMs and agents use a vector database, the production AI systems generate feedback data, such as user behavior, logs, statistics, and agent notes. This data is used to uncover new areas for improvement and, through batch analytics, to improve training datasets and agent strategies.

Guo says: “AI developers (either manually or through agentic systems) often need to explore feedback data and the underlying corpus to understand why serving quality is poor. They may also run large-scale semantic deduplication and clustering on newly crawled data, then mine edge clusters to discover new training data candidates.”
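The semantic deduplication Guo mentions can be illustrated with a minimal sketch. This is not Zilliz's implementation: the function names, the greedy strategy, and the 0.95 cosine threshold are all assumptions chosen for illustration, and the pairwise comparison would be replaced by an approximate nearest-neighbor index at real scale.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def semantic_dedup(embeddings, threshold=0.95):
    """Greedy near-duplicate removal (hypothetical helper).

    An item is kept only if its similarity to every already-kept item
    is below the threshold. Returns the indices of the kept items.
    """
    kept = []
    for i, vec in enumerate(embeddings):
        if all(cosine(vec, embeddings[j]) < threshold for j in kept):
            kept.append(i)
    return kept

# Toy embeddings; real ones would come from an embedding model.
docs = [
    [1.0, 0.0, 0.0],
    [0.99, 0.05, 0.0],  # near-duplicate of item 0
    [0.0, 1.0, 0.0],
    [0.0, 0.98, 0.1],   # near-duplicate of item 2
    [0.0, 0.0, 1.0],
]
kept = semantic_dedup(docs)
print(kept)  # [0, 2, 4]
```

The greedy pass is quadratic in the number of kept items, which is why a batch job of this kind benefits from sitting next to a vector index rather than scanning raw files.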

Up until now the feedback data has been stored separately from the vector database. Guo says: “Interactive discovery and batch analytics are naturally aligned with vector databases at both the data and compute layers.” So “Vector Lakebase accelerates this [feedback] loop through a straightforward but efficient approach: providing a zero-copy semantic data plane that can be efficiently accessed by all three workload modes — real-time retrieval, interactive discovery, and batch analytics.” It has five capabilities enabling this:

Tiered Serving Solutions - Flexible serving tiers optimized for different real-time workloads delivering ultra-high performance, balanced efficiency, and cost-effective scaling across massive datasets.

On-Demand Search - Designed for large-scale workloads where latency is less critical and compute remains idle most of the time, including infrequent search, data exploration, and batch analytics.

External Data Lake Search - Add state-of-the-art indexing and large-scale search capabilities directly to your existing lake data.

Full-Spectrum Search - From vector and text to JSON and geospatial, combined with hybrid retrieval, filtering, and reranking for expressive multi-modal queries.

Unified Lake-Native Storage - Unified storage for both serving and analytics, built on Vortex plus per-column format flexibility and broader data modeling capabilities.
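To make the hybrid retrieval, filtering, and reranking idea in the Full-Spectrum Search item concrete, here is a small sketch that filters two ranked hit lists on a metadata field and fuses them with reciprocal rank fusion (RRF). RRF is one common fusion method and is used here as an assumption; the article does not say which reranking methods Vector Lakebase applies. The document IDs, the `lang` filter, and the constant `k=60` are illustrative only.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one list.

    Each ID scores sum(1 / (k + rank)) over the lists it appears in.
    Ties are broken by ID so the output is deterministic.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical results from a vector search and a text search.
vector_hits = ["d3", "d1", "d7", "d2"]
text_hits = ["d1", "d9", "d3", "d5"]

# Hypothetical scalar metadata used for filtering.
lang = {"d1": "en", "d2": "en", "d3": "de", "d5": "en", "d7": "en", "d9": "en"}

def allowed(doc_id):
    """Keep only English documents."""
    return lang[doc_id] == "en"

fused = reciprocal_rank_fusion([
    [d for d in vector_hits if allowed(d)],
    [d for d in text_hits if allowed(d)],
])
print(fused)  # ['d1', 'd7', 'd9', 'd2', 'd5']
```

Here `d1` ranks first because it appears at the top of both filtered lists, while `d3` is excluded by the filter despite leading the vector results.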

The tiered serving feature has three serving tiers: Performance-Optimized, Capacity-Optimized, and Tiered-Storage. Each tier is built with dedicated indexing algorithms and data placement strategies across the storage hierarchy, offering a range of performance–cost tradeoffs.

As we understand it, Vector Lakebase is available only in Zilliz Cloud and not on-premises.

Vector Lakebase’s External Collection creates a zero-copy logical mapping from the Zilliz data plane to customer-owned lake tables, while enabling high-performance indexes and full-spectrum search on top of that mapping. Currently, External Collection supports two data lake table formats, Lance and Iceberg, and two open data formats, Parquet and Vortex. It also supports incremental synchronization: depending on the data lake's update pattern and query visibility requirements, users can sync data at any time with a refresh call.
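The zero-copy mapping and refresh behavior described above can be modeled with a toy class. This is a conceptual sketch only, not the Zilliz API: the `ExternalCollection` class here, its `refresh` and `get` methods, and the file names are all hypothetical. It shows an index that stores only locations of rows in lake files, and a refresh that picks up files added or removed since the last sync.

```python
from dataclasses import dataclass, field

@dataclass
class ExternalCollection:
    """Toy model of an index over lake files that never copies row data."""
    lake: dict  # file name -> list of (row_id, vector); owned by the lake
    indexed_files: set = field(default_factory=set)
    index: dict = field(default_factory=dict)  # row_id -> (file name, position)

    def refresh(self):
        """Sync the index with the lake; return (added, removed) file names."""
        current = set(self.lake)
        added = current - self.indexed_files
        removed = self.indexed_files - current
        # Drop index entries that point into files no longer in the lake.
        self.index = {
            rid: loc for rid, loc in self.index.items() if loc[0] not in removed
        }
        # Index only the new files; already-indexed files are not rescanned.
        for name in sorted(added):
            for pos, (row_id, _vec) in enumerate(self.lake[name]):
                self.index[row_id] = (name, pos)
        self.indexed_files = current
        return sorted(added), sorted(removed)

    def get(self, row_id):
        """Resolve a row through the mapping; returns the lake's own object."""
        name, pos = self.index[row_id]
        return self.lake[name][pos][1]

lake = {"part-0.parquet": [("a", [0.1, 0.2]), ("b", [0.3, 0.4])]}
coll = ExternalCollection(lake)
first = coll.refresh()

lake["part-1.parquet"] = [("c", [0.5, 0.6])]  # new data lands in the lake
visible_before = "c" in coll.index            # not visible until refresh
second = coll.refresh()
print(first, visible_before, second)
```

The new row becomes queryable only after the second `refresh` call, which mirrors the trade-off between sync frequency and query visibility that the article describes.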

There is a lot more information in Guo’s blog. Zilliz Vector Lakebase is available now in public preview on Zilliz Cloud, alongside Serverless, Dedicated, and BYOC deployment options across more than 30 regions on AWS, Google Cloud, and Microsoft Azure. New work email signups receive $100 in free credits at zilliz.com.


Read original on blocksandfiles.com → www.blocksandfiles.com/ai-ml/2026/06/15/milvus-i…
