ChromaDB vs Qdrant vs Weaviate vs pgvector: vector database shootout 2026

wpnews.pro

Every RAG pipeline I've reviewed this year hits the same decision point: which vector store do you actually ship? The wrong choice compounds — it shapes your architecture, your operational overhead, and how painful a future migration will be. I've run all four of these in production or near-production contexts. Here's what actually matters for the decision.

Before benchmarking anything, answer these:

Most teams over-optimize for a scale they won't reach for 18 months and under-weight the day-one operational cost of a new infrastructure component.

ChromaDB requires no server, no Docker, no schema definition upfront. It's embedded in Python, and you can have a working vector store in a few lines:

import chromadb
from chromadb.utils import embedding_functions

client = chromadb.PersistentClient(path="./chroma_db")
ef = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="all-MiniLM-L6-v2"
)

collection = client.get_or_create_collection(
    name="documents",
    embedding_function=ef,
    metadata={"hnsw:space": "cosine"}
)

collection.add(
    documents=[
        "FastAPI is great for building REST APIs",
        "Go outperforms Python on CPU-bound tasks",
        "Vector databases enable semantic search at scale",
    ],
    ids=["doc1", "doc2", "doc3"],
    metadatas=[
        {"source": "blog", "year": 2026},
        {"source": "blog", "year": 2025},
        {"source": "docs", "year": 2026},
    ]
)

results = collection.query(
    query_texts=["which backend language is fastest?"],
    n_results=2,
    where={"year": {"$gte": 2025}}
)
print(results["documents"])

The critical limitation: ChromaDB applies metadata filters after the ANN search. It over-fetches internally to compensate, but with a selective filter the over-fetched candidates can still miss matching documents, so recall degrades as the collection grows. Its distributed mode remains underdeveloped as of mid-2026. The practical ceiling is roughly 2–5M vectors before the degradation becomes noticeable.
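To make the post-filter failure mode concrete, here is a minimal sketch in plain Python. It is a toy model, not ChromaDB's internals: the `post_filter_search` function, the `overfetch` parameter, and the precomputed `score` field (standing in for similarity to the query) are all hypothetical names chosen for illustration.

```python
def post_filter_search(scored_records, predicate, top_k, overfetch=2):
    """Take the top_k * overfetch nearest records, then apply the metadata filter."""
    ranked = sorted(scored_records, key=lambda r: r["score"], reverse=True)
    candidates = ranked[: top_k * overfetch]
    return [r for r in candidates if predicate(r)][:top_k]


# Hypothetical corpus: the nearest neighbours all fail the filter.
records = [
    {"id": "a", "score": 0.99, "year": 2024},
    {"id": "b", "score": 0.97, "year": 2024},
    {"id": "c", "score": 0.95, "year": 2024},
    {"id": "d", "score": 0.92, "year": 2024},
    {"id": "e", "score": 0.40, "year": 2026},
    {"id": "f", "score": 0.35, "year": 2026},
]


def is_recent(record):
    return record["year"] >= 2025


# Over-fetching 2x looks at 4 candidates, none of which pass the filter.
narrow = post_filter_search(records, is_recent, top_k=2, overfetch=2)
# Over-fetching 3x happens to cover the whole toy corpus and finds both matches.
wide = post_filter_search(records, is_recent, top_k=2, overfetch=3)

print([r["id"] for r in narrow])
print([r["id"] for r in wide])
```

Two matching documents exist in both runs, but whether they come back depends on how far the over-fetch reaches. That gap widens as the collection grows and the filter gets more selective.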

Best for: local dev, internal tools, demos, early-stage products.

Qdrant is written in Rust and applies payload filters before the ANN search, which is the technically correct behavior. This matters when you have multi-tenant data or narrow filter conditions. A filter applied post-search means you're doing extra work and getting non-deterministic recall when the filtered result set is smaller than your requested top_k.
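A minimal sketch of the pre-filter idea on multi-tenant data, in plain Python. It assumes a linear scan over records with a precomputed `score` (similarity to the query); a real engine applies the filter inside the index rather than scanning. `pre_filter_search` and the tenant names are hypothetical.

```python
def pre_filter_search(scored_records, predicate, top_k):
    """Apply the metadata filter first, then rank only the eligible records."""
    eligible = [r for r in scored_records if predicate(r)]
    return sorted(eligible, key=lambda r: r["score"], reverse=True)[:top_k]


# Hypothetical two-tenant corpus: "globex" owns the globally nearest vectors.
records = [
    {"id": "g1", "score": 0.99, "tenant": "globex"},
    {"id": "g2", "score": 0.97, "tenant": "globex"},
    {"id": "g3", "score": 0.95, "tenant": "globex"},
    {"id": "a1", "score": 0.41, "tenant": "acme"},
    {"id": "a2", "score": 0.12, "tenant": "acme"},
    {"id": "a3", "score": 0.70, "tenant": "acme"},
]

acme_hits = pre_filter_search(records, lambda r: r["tenant"] == "acme", top_k=2)
print([r["id"] for r in acme_hits])
```

Because the filter runs first, the query returns the best `top_k` records for the tenant whenever at least `top_k` of them exist, no matter how many closer vectors belong to other tenants.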

from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance, VectorParams, PointStruct,
    Filter, FieldCondition, MatchValue
)

client = QdrantClient(url="http://localhost:6333")

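# recreate_collection is deprecated in recent qdrant-client releases;
# create the collection only if it does not exist yet.
if not client.collection_exists("documents"):
    client.create_collection(
        collection_name="documents",
        vectors_config=VectorParams(size=384, distance=Distance.COSINE),
    )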

points = [
    PointStruct(
        id=1,
        vector=[0.05] * 384,  # replace with real embeddings
        payload={"source": "blog", "year": 2026, "text": "FastAPI for REST APIs"}
    ),
    PointStruct(
        id=2,
        vector=[0.12] * 384,
        payload={"source": "docs", "year": 2025, "text": "HNSW index internals"}
    ),
]
client.upsert(collection_name="documents", points=points)

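# query_points replaces the deprecated search() in recent qdrant-client releases.
response = client.query_points(
    collection_name="documents",
    query=[0.08] * 384,  # replace with a real query embedding
    query_filter=Filter(
        must=[FieldCondition(key="year", match=MatchValue(value=2026))]
    ),
    limit=5,
)
for point in response.points:
    print(point.payload["text"], round(point.score, 4))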

Qdrant also supports sparse + dense hybrid search natively, which is useful when you want BM25 recall blended with semantic similarity — a common pattern for RAG over heterogeneous corpora. It handles concurrent writes well, exposes both REST and gRPC, and its Python SDK is actively maintained. The managed cloud tier is straightforward to size.
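One common way to blend a keyword ranking with a dense ranking is reciprocal rank fusion (RRF). The sketch below is a generic pure-Python version, not a description of Qdrant's internals; the function name, the document IDs, and the constant `k=60` (a conventional default) are assumptions for illustration.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked ID lists: each list contributes 1 / (k + rank) per document."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical rankings from a BM25 pass and a dense-vector pass.
bm25_ranking = ["doc3", "doc1", "doc2"]
dense_ranking = ["doc1", "doc2", "doc3"]

fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
print(fused)
```

RRF only looks at ranks, so it avoids having to put BM25 scores and cosine similarities on a common scale.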

Best for: production RAG pipelines, multi-tenant SaaS, datasets above 5M vectors.

Weaviate offers the largest feature set in this list: GraphQL querying, multi-tenancy, built-in hybrid search, modules for text and images, and a schema-based data model. If you genuinely need multi-modal search or a GraphQL interface over your vector data, it's the only option here that delivers it cleanly.

The operational cost is real. Weaviate ships frequent releases and requires careful memory tuning on self-hosted deployments. Its schema-first approach adds friction during the exploration phase when your embedding model is still changing. The managed tier (Weaviate Cloud) is generous at small scale, but costs climb fast past 1M objects.

It's also the most complex to reason about internally: its ANN implementation is HNSW, and it layers BM25 on top for hybrid search. When things behave unexpectedly, the debugging surface is wide.
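When hybrid results look odd, it helps to reason about how two score lists combine. The sketch below blends min-max-normalized keyword and vector scores with a weight `alpha`, where 1.0 means pure vector and 0.0 means pure keyword. It is a simplified model written for illustration, not Weaviate's implementation; the function names and the scores are hypothetical.

```python
def min_max_normalize(scores):
    """Rescale scores to [0, 1]; if all scores are equal, treat them all as 1.0."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def blend_scores(keyword_scores, vector_scores, alpha=0.5):
    """Weighted blend of normalized scores: alpha=1.0 is pure vector, 0.0 pure keyword."""
    kw = min_max_normalize(keyword_scores)
    vec = min_max_normalize(vector_scores)
    blended = {
        doc_id: alpha * vec.get(doc_id, 0.0) + (1 - alpha) * kw.get(doc_id, 0.0)
        for doc_id in set(kw) | set(vec)
    }
    return sorted(blended.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical raw scores: doc3 is second-best in both lists.
keyword_scores = {"doc1": 2.0, "doc2": 8.0, "doc3": 6.5}
vector_scores = {"doc1": 0.75, "doc2": 0.25, "doc3": 0.5}

balanced = blend_scores(keyword_scores, vector_scores, alpha=0.5)
print([doc_id for doc_id, _ in balanced])
```

With an even blend, the document that is second in both lists outranks the winner of either list. That kind of result is easy to misread as a bug when you only inspect one of the two rankings.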

Best for: product search with image embeddings, teams that need GraphQL, complex multi-modal use cases.

If your application already runs on Postgres, pgvector eliminates an entire infrastructure dependency. Version 0.5 added HNSW index support, which closed most of the performance gap with dedicated solutions at moderate scale.

import psycopg2
import numpy as np

conn = psycopg2.connect("dbname=mydb user=postgres host=localhost")
cur = conn.cursor()

cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
cur.execute("""
    CREATE TABLE IF NOT EXISTS documents (
        id SERIAL PRIMARY KEY,
        content TEXT,
        source TEXT,
        year INT,
        embedding vector(384)
    )
""")
cur.execute(
    "CREATE INDEX IF NOT EXISTS idx_doc_embedding "
    "ON documents USING hnsw (embedding vector_cosine_ops)"
)
conn.commit()

embedding = np.random.rand(384).tolist()
cur.execute(
    "INSERT INTO documents (content, source, year, embedding) VALUES (%s, %s, %s, %s)",
    ("pgvector HNSW makes semantic search viable in Postgres", "blog", 2026, embedding)
)
conn.commit()

query_vec = np.random.rand(384).tolist()
cur.execute("""
    SELECT content, 1 - (embedding <=> %s::vector) AS similarity
    FROM documents
    WHERE year >= 2025
    ORDER BY embedding <=> %s::vector
    LIMIT 5
""", (query_vec, query_vec))

for row in cur.fetchall():
    print(f"{row[0]} -- similarity: {round(row[1], 4)}")

cur.close()
conn.close()
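The query above relies on `<=>`, pgvector's cosine distance operator, and reports `1 - distance` as similarity. A minimal pure-Python sketch of that arithmetic follows; the `cosine_distance` helper is written here for illustration and is not part of pgvector or psycopg2.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity: 0.0 for identical direction, 1.0 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


same_direction = cosine_distance([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])

# Similarity as the SQL computes it: 1 - distance.
print(round(1 - same_direction, 4), round(1 - orthogonal, 4))
```

Cosine distance ignores vector length, which is why ordering by `<=>` ascending and ordering by similarity descending give the same ranking.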

Your existing Postgres tooling, including backups, monitoring, migrations, and access control, carries over. No new service to operate, no new runbook to write. The tradeoffs: no native hybrid search yet (you can approximate it by combining tsvector with cosine distance, but it's glue code), HNSW index builds are slower than Qdrant's, and at 10M+ vectors with high QPS, dedicated hardware starts to matter.

Best for: teams already on Postgres, datasets under 5M vectors, early-to-mid-stage RAG where operational simplicity matters.

| Criterion | ChromaDB | Qdrant | Weaviate | pgvector |
|---|---|---|---|---|
| Setup | Embedded | Docker / Cloud | Docker / Cloud | PG Extension |
| Pre-filter ANN | No | Yes | Yes | Partial |
| Hybrid search | No | Yes | Yes | No |
| Scale ceiling | ~5M | 100M+ | 50M+ | ~10M |
| Operational cost | Very low | Low | High | Low (on PG) |
| Managed option | No | Yes | Yes | Via PG providers |

Default path: start with ChromaDB to ship fast, migrate to Qdrant when you need pre-filter correctness or hit scale, use pgvector if you're already on Postgres and your dataset stays under a few million vectors. Reach for Weaviate only when you specifically need its feature set.
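The default path can be written down as a small decision function. This is a sketch of the rules of thumb above, not a benchmark-derived policy: the function name, the parameter names, the 5M threshold, and the order in which the rules are checked are assumptions.

```python
def recommend_store(
    vector_count,
    already_on_postgres=False,
    needs_prefilter=False,
    needs_multimodal_or_graphql=False,
):
    """Return a starting-point recommendation based on the article's rules of thumb."""
    if needs_multimodal_or_graphql:
        return "Weaviate"
    # Assumed threshold: the article recommends Qdrant above roughly 5M vectors.
    if needs_prefilter or vector_count > 5_000_000:
        return "Qdrant"
    if already_on_postgres:
        return "pgvector"
    return "ChromaDB"


print(recommend_store(vector_count=200_000))
print(recommend_store(vector_count=200_000, already_on_postgres=True))
print(recommend_store(vector_count=20_000_000, already_on_postgres=True))
```

Writing the rules as code forces the priority question into the open: here a need for pre-filter correctness outranks already being on Postgres, which is a judgment call you may want to reverse.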

The biggest mistake I see is teams optimizing for a scale they won't reach for 18 months while ignoring the operational burden of a new database they'll feel on day one. Pick the simplest option that fits your actual current requirements, and design a migration path before you need it.
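One way to keep a migration path open is to hide the store behind a narrow interface and keep IDs, vectors, and metadata exportable. The sketch below uses an in-memory stand-in; `VectorStore`, `InMemoryStore`, and `migrate` are hypothetical names, and a real adapter would wrap the ChromaDB, Qdrant, or pgvector client instead.

```python
from typing import Protocol


class VectorStore(Protocol):
    """The only surface application code is allowed to depend on."""

    def upsert(self, doc_id: str, vector: list, metadata: dict) -> None: ...

    def query(self, vector: list, top_k: int) -> list: ...


class InMemoryStore:
    """Toy backend: brute-force dot-product ranking over a dict."""

    def __init__(self):
        self._rows = {}

    def upsert(self, doc_id, vector, metadata):
        self._rows[doc_id] = (vector, metadata)

    def query(self, vector, top_k):
        def dot(a, b):
            return sum(x * y for x, y in zip(a, b))

        ranked = sorted(
            self._rows, key=lambda d: dot(vector, self._rows[d][0]), reverse=True
        )
        return ranked[:top_k]

    def export(self):
        """Yield (doc_id, vector, metadata) so another backend can be loaded."""
        for doc_id, (vector, metadata) in self._rows.items():
            yield doc_id, vector, metadata


def migrate(rows, target: VectorStore) -> int:
    """Copy exported rows into the target store and return how many were moved."""
    moved = 0
    for doc_id, vector, metadata in rows:
        target.upsert(doc_id, vector, metadata)
        moved += 1
    return moved


old_store = InMemoryStore()
old_store.upsert("doc1", [1.0, 0.0], {"source": "blog"})
old_store.upsert("doc2", [0.0, 1.0], {"source": "docs"})

new_store = InMemoryStore()
moved = migrate(old_store.export(), new_store)
print(moved, new_store.query([0.9, 0.1], top_k=1))
```

The design choice is that application code only ever sees `upsert` and `query`, so swapping backends means writing one adapter and running one copy job rather than touching every call site.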

For regulated deployments — healthcare, finance, government — verify encryption-at-rest guarantees and data residency options for each managed offering before committing. We track the right questions to ask in our security evaluation checklists.

I run AYI NEDJIMI Consultants, a cybersecurity consulting firm. We publish free security hardening checklists — PDF and Excel.

Source: dev.to (original article)