{"slug": "chromadb-vs-qdrant-vs-weaviate-vs-pgvector-vector-database-shootout-2026", "title": "ChromaDB vs Qdrant vs Weaviate vs pgvector: vector database shootout 2026", "summary": "A developer benchmarked ChromaDB, Qdrant, Weaviate, and pgvector for RAG pipelines, finding that most teams over-optimize for future scale while underestimating day-one operational costs. ChromaDB offers the fastest setup with no server or schema required but degrades beyond 2–5 million vectors due to post-search metadata filtering, while Qdrant applies filters before ANN search and supports native hybrid search, making it suitable for production RAG pipelines and multi-tenant datasets above 5 million vectors.", "body_md": "Every RAG pipeline I've reviewed this year hits the same decision point: which vector store do you actually ship? The wrong choice compounds — it shapes your architecture, your operational overhead, and how painful a future migration will be. I've run all four of these in production or near-production contexts. Here's what actually matters for the decision.\n\nBefore benchmarking anything, answer these:\n\nMost teams over-optimize for a scale they won't reach for 18 months and under-weight the day-one operational cost of a new infrastructure component.\n\nChromaDB requires no server, no Docker, no schema definition upfront. 
It's embedded in Python, and you can have a working vector store in a few lines:\n\n``` python\nimport chromadb\nfrom chromadb.utils import embedding_functions\n\n# Persist the index to local disk; no server process is needed.\nclient = chromadb.PersistentClient(path=\"./chroma_db\")\n# Embed documents locally with a SentenceTransformers model.\nef = embedding_functions.SentenceTransformerEmbeddingFunction(\n    model_name=\"all-MiniLM-L6-v2\"\n)\n\ncollection = client.get_or_create_collection(\n    name=\"documents\",\n    embedding_function=ef,\n    metadata={\"hnsw:space\": \"cosine\"}\n)\n\ncollection.add(\n    documents=[\n        \"FastAPI is great for building REST APIs\",\n        \"Go outperforms Python on CPU-bound tasks\",\n        \"Vector databases enable semantic search at scale\",\n    ],\n    ids=[\"doc1\", \"doc2\", \"doc3\"],\n    metadatas=[\n        {\"source\": \"blog\", \"year\": 2026},\n        {\"source\": \"blog\", \"year\": 2025},\n        {\"source\": \"docs\", \"year\": 2026},\n    ]\n)\n\n# Restrict results to documents from 2025 onward.\nresults = collection.query(\n    query_texts=[\"which backend language is fastest?\"],\n    n_results=2,\n    where={\"year\": {\"$gte\": 2025}}\n)\nprint(results[\"documents\"])\n```\n\nThe critical limitation: ChromaDB applies metadata filters *after* the ANN search. It over-fetches internally to compensate, which degrades recall at scale. Its distributed mode remains underdeveloped as of mid-2026. The practical ceiling is roughly 2–5M vectors before that degradation becomes noticeable.\n\n**Best for**: local dev, internal tools, demos, early-stage products.\n\nQdrant is written in Rust and applies payload filters before the ANN search, which is the technically correct behavior. This matters when you have multi-tenant data or narrow filter conditions. 
A filter applied post-search means you're doing extra work and getting non-deterministic recall when the filtered result set is smaller than your requested `top_k`.\n\n``` python\nfrom qdrant_client import QdrantClient\nfrom qdrant_client.models import (\n    Distance, VectorParams, PointStruct,\n    Filter, FieldCondition, MatchValue\n)\n\nclient = QdrantClient(url=\"http://localhost:6333\")\n\n# Drops and recreates the collection; size must match the embedding dimension.\nclient.recreate_collection(\n    collection_name=\"documents\",\n    vectors_config=VectorParams(size=384, distance=Distance.COSINE),\n)\n\npoints = [\n    PointStruct(\n        id=1,\n        vector=[0.05] * 384,  # replace with real embeddings\n        payload={\"source\": \"blog\", \"year\": 2026, \"text\": \"FastAPI for REST APIs\"}\n    ),\n    PointStruct(\n        id=2,\n        vector=[0.12] * 384,\n        payload={\"source\": \"docs\", \"year\": 2025, \"text\": \"HNSW index internals\"}\n    ),\n]\nclient.upsert(collection_name=\"documents\", points=points)\n\n# Only points whose payload has year == 2026 are considered.\nresults = client.search(\n    collection_name=\"documents\",\n    query_vector=[0.08] * 384,\n    query_filter=Filter(\n        must=[FieldCondition(key=\"year\", match=MatchValue(value=2026))]\n    ),\n    limit=5\n)\nfor r in results:\n    print(r.payload[\"text\"], round(r.score, 4))\n```\n\nQdrant also supports sparse + dense hybrid search natively, which is useful when you want BM25 recall blended with semantic similarity, a common pattern for RAG over heterogeneous corpora. It handles concurrent writes well, exposes both REST and gRPC, and its Python SDK is actively maintained. The managed cloud tier is straightforward to size.\n\n**Best for**: production RAG pipelines, multi-tenant SaaS, datasets above 5M vectors.\n\nWeaviate offers the largest feature set in this list: GraphQL querying, multi-tenancy, built-in hybrid search, modules for text and images, and a schema-based data model. 
If you genuinely need multi-modal search or a GraphQL interface over your vector data, it's the only option here that delivers it cleanly.\n\nThe operational cost is real. Weaviate ships frequent releases and requires careful memory tuning on self-hosted deployments. Its schema-first approach adds friction during the exploration phase when your embedding model is still changing. The managed tier (Weaviate Cloud) is generous at small scale but cost climbs fast past 1M objects.\n\nIt's also the most complex to reason about internally: its ANN implementation is HNSW, and it layers BM25 on top for hybrid search. When things behave unexpectedly, the debugging surface is wide.\n\n**Best for**: product search with image embeddings, teams that need GraphQL, complex multi-modal use cases.\n\nIf your application already runs on Postgres, pgvector eliminates an entire infrastructure dependency. Version 0.5 added HNSW index support, which closed most of the performance gap with dedicated solutions at moderate scale.\n\n``` python\nimport psycopg2\nimport numpy as np\n\nconn = psycopg2.connect(\"dbname=mydb user=postgres host=localhost\")\ncur = conn.cursor()\n\ncur.execute(\"CREATE EXTENSION IF NOT EXISTS vector\")\ncur.execute(\"\"\"\n    CREATE TABLE IF NOT EXISTS documents (\n        id SERIAL PRIMARY KEY,\n        content TEXT,\n        source TEXT,\n        year INT,\n        embedding vector(384)\n    )\n\"\"\")\ncur.execute(\n    \"CREATE INDEX IF NOT EXISTS idx_doc_embedding \"\n    \"ON documents USING hnsw (embedding vector_cosine_ops)\"\n)\nconn.commit()\n\nembedding = np.random.rand(384).tolist()\ncur.execute(\n    \"INSERT INTO documents (content, source, year, embedding) VALUES (%s, %s, %s, %s)\",\n    (\"pgvector HNSW makes semantic search viable in Postgres\", \"blog\", 2026, embedding)\n)\nconn.commit()\n\nquery_vec = np.random.rand(384).tolist()\ncur.execute(\"\"\"\n    SELECT content, 1 - (embedding <=> %s::vector) AS similarity\n    FROM documents\n    
WHERE year >= 2025\n    ORDER BY embedding <=> %s::vector\n    LIMIT 5\n\"\"\", (query_vec, query_vec))\n\nfor row in cur.fetchall():\n    print(f\"{row[0]} -- similarity: {round(row[1], 4)}\")\n\ncur.close()\nconn.close()\n```\n\nYour existing Postgres tooling (backups, monitoring, migrations, access control) carries over. No new service to operate, no new runbook to write. The tradeoffs: no native hybrid search yet (you can approximate with `tsvector` + cosine distance, but it's glue code), HNSW index builds are slower than Qdrant's, and at 10M+ vectors with high QPS, dedicated hardware starts to matter.\n\n**Best for**: teams already on Postgres, datasets under 5M vectors, early-to-mid-stage RAG where operational simplicity matters.\n\n| | ChromaDB | Qdrant | Weaviate | pgvector |\n|---|---|---|---|---|\n| Setup | Embedded | Docker / Cloud | Docker / Cloud | PG Extension |\n| Pre-filter ANN | No | Yes | Yes | Partial |\n| Hybrid search | No | Yes | Yes | No |\n| Scale ceiling | ~5M | 100M+ | 50M+ | ~10M |\n| Operational cost | Very low | Low | High | Low (on PG) |\n| Managed option | No | Yes | Yes | Via PG providers |\n\nDefault path: start with **ChromaDB** to ship fast, migrate to **Qdrant** when you need pre-filter correctness or hit scale, use **pgvector** if you're already on Postgres and your dataset stays under a few million vectors. Reach for **Weaviate** only when you specifically need its feature set.\n\nThe biggest mistake I see is teams optimizing for a scale they won't reach for 18 months while ignoring the operational burden of a new database they'll feel on day one. Pick the simplest option that fits your actual current requirements, and design a migration path before you need it.\n\nFor regulated deployments (healthcare, finance, government), verify encryption-at-rest guarantees and data residency options for each managed offering before committing. 
We track the right questions to ask in our [security evaluation checklists](https://ayinedjimi-consultants.fr/checklists).\n\n*I run AYI NEDJIMI Consultants, a cybersecurity consulting firm. We publish free security hardening checklists — PDF and Excel.*", "url": "https://wpnews.pro/news/chromadb-vs-qdrant-vs-weaviate-vs-pgvector-vector-database-shootout-2026", "canonical_source": "https://dev.to/ayinedjimi-consultants/chromadb-vs-qdrant-vs-weaviate-vs-pgvector-vector-database-shootout-2026-14n7", "published_at": "2026-05-28 10:07:46+00:00", "updated_at": "2026-05-28 10:23:01.620923+00:00", "lang": "en", "topics": ["ai-infrastructure", "ai-tools", "machine-learning", "large-language-models", "mlops"], "entities": ["ChromaDB", "Qdrant", "Weaviate", "pgvector", "FastAPI", "Go", "Python", "SentenceTransformer"], "alternates": {"html": "https://wpnews.pro/news/chromadb-vs-qdrant-vs-weaviate-vs-pgvector-vector-database-shootout-2026", "markdown": "https://wpnews.pro/news/chromadb-vs-qdrant-vs-weaviate-vs-pgvector-vector-database-shootout-2026.md", "text": "https://wpnews.pro/news/chromadb-vs-qdrant-vs-weaviate-vs-pgvector-vector-database-shootout-2026.txt", "jsonld": "https://wpnews.pro/news/chromadb-vs-qdrant-vs-weaviate-vs-pgvector-vector-database-shootout-2026.jsonld"}}