Six months ago I moved my RAG pipeline from Pinecone to self-hosted Qdrant. My vector search bill went from $210/month to about $10/month, with lower latency and the same recall. Here's exactly how.
The Setup
My app does document Q&A for legal contracts. The numbers:
5.2 million vectors (1536-dim, OpenAI embeddings)
~800K queries/month
P99 latency requirement: < 50ms
On Pinecone Serverless, this cost me roughly $210/month: storage plus read units plus write units for daily ingestion of new documents.
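For a sense of scale, here is a back-of-envelope sizing of that workload. It assumes uncompressed float32 storage (4 bytes per dimension) and a 30-day month, neither of which is stated above. Whether the result fits in RAM depends on the server and on settings such as quantization or on-disk vector storage, which this post does not go into.

```python
# Back-of-envelope sizing from the numbers above.
# Assumption: vectors are stored as uncompressed float32 (4 bytes per dimension).
NUM_VECTORS = 5_200_000
DIMENSIONS = 1536
BYTES_PER_FLOAT32 = 4
QUERIES_PER_MONTH = 800_000
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: a 30-day month

raw_bytes = NUM_VECTORS * DIMENSIONS * BYTES_PER_FLOAT32
raw_gb = raw_bytes / 1e9
avg_qps = QUERIES_PER_MONTH / SECONDS_PER_MONTH

print(f"Raw vector data: {raw_gb:.1f} GB")
print(f"Average query rate: {avg_qps:.2f} queries/second")
```

The average query rate is well under one per second, so this is a storage-bound workload far more than a throughput-bound one.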
What I Moved To
A single Hetzner CX32 server:
4 vCPU, 8 GB RAM, 80 GB SSD
€8.50/month (about $9.20)
Qdrant running in Docker
Automated daily backups to S3-compatible storage ($0.50/month)
Total: ~$10/month. That's a 95% cost reduction.
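The arithmetic behind that total, as a quick sketch. The dollar figures are the ones quoted in this post; the variable names are just for illustration.

```python
# Monthly costs quoted in this post (USD).
PINECONE_MONTHLY = 210.00
SERVER_MONTHLY = 9.20   # Hetzner CX32, EUR 8.50 at the conversion used above
BACKUP_MONTHLY = 0.50   # S3-compatible backup storage

self_hosted = SERVER_MONTHLY + BACKUP_MONTHLY
monthly_savings = PINECONE_MONTHLY - self_hosted
reduction = monthly_savings / PINECONE_MONTHLY
annual_savings = monthly_savings * 12

print(f"Self-hosted total: ${self_hosted:.2f}/month")
print(f"Reduction: {reduction:.0%}")
print(f"Annual savings: ${annual_savings:,.0f}")
```

Rounding the self-hosted total up to $10 gives the round figures used in the rest of the post: about $200 a month, or $2,400 a year.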
The Migration Was Easier Than Expected
```bash
# Export from Pinecone (I used their scroll API)
python export_pinecone.py --index legal-docs --output vectors.jsonl
# Start Qdrant with its storage persisted to a local directory
docker run -d -p 6333:6333 -v "$(pwd)/storage:/qdrant/storage" qdrant/qdrant
# Import the exported vectors into Qdrant
python import_qdrant.py --input vectors.jsonl --collection legal-docs
```
The whole migration took an afternoon. The Qdrant Python client is straightforward, and the API is surprisingly similar to Pinecone's.
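For anyone writing their own import step, here is a minimal sketch of what a script like `import_qdrant.py` could look like. It is not the actual script. It assumes each JSONL line has `id`, `values`, and optional `metadata` fields (a hypothetical export format), that the index used cosine distance, and that Qdrant is reachable on localhost. One thing to watch for: Qdrant point IDs must be unsigned integers or UUIDs, so string IDs need a stable mapping.

```python
import json
import uuid
from itertools import islice
from typing import Iterable, Iterator


def to_point_id(source_id: str) -> str:
    """Map an arbitrary string ID to a deterministic UUID string."""
    return str(uuid.uuid5(uuid.NAMESPACE_URL, source_id))


def read_batches(lines: Iterable[str], batch_size: int) -> Iterator[list[dict]]:
    """Yield lists of parsed JSONL records, at most batch_size per list."""
    records = (json.loads(line) for line in lines if line.strip())
    while batch := list(islice(records, batch_size)):
        yield batch


def import_file(path: str, collection: str, url: str = "http://localhost:6333") -> None:
    """Create the collection and upsert every record from a JSONL file."""
    # Imported here so the helpers above work without qdrant-client installed.
    from qdrant_client import QdrantClient
    from qdrant_client.models import Distance, PointStruct, VectorParams

    client = QdrantClient(url=url)
    client.create_collection(
        collection_name=collection,
        # Assumption: cosine distance; match whatever the source index used.
        vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
    )
    with open(path, encoding="utf-8") as f:
        for batch in read_batches(f, batch_size=256):
            points = [
                PointStruct(
                    id=to_point_id(rec["id"]),
                    vector=rec["values"],
                    # Keep the original ID so results can be traced back.
                    payload={**(rec.get("metadata") or {}), "source_id": rec["id"]},
                )
                for rec in batch
            ]
            client.upsert(collection_name=collection, points=points)
```

Calling `import_file("vectors.jsonl", "legal-docs")` would then load the export in batches of 256. The batch size is an arbitrary choice here, not a tuned value.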
Performance Comparison
I ran the same 10,000 test queries against both setups:
| Metric | Pinecone Serverless | Qdrant Self-Hosted |
|---|---|---|
| P50 latency | 23ms | 4ms |
| P99 latency | 89ms | 12ms |
| Recall@10 | 0.97 | 0.97 |
| Monthly cost | $210 | $10 |
The self-hosted Qdrant is actually faster because the data sits in memory on the same machine. Pinecone Serverless loads data from object storage on demand, which adds cold-start latency.
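If you want to run a comparison like this on your own data, the metrics in the table can be computed along these lines. This is a sketch, not the benchmark harness used for the numbers above: it assumes nearest-rank percentiles and assumes recall is measured against the exact top-k IDs from a brute-force search.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def recall_at_k(retrieved: Sequence[str], relevant: set[str], k: int = 10) -> float:
    """Fraction of the exact top-k neighbours found in the first k results."""
    if not relevant:
        raise ValueError("relevant must not be empty")
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / len(relevant)


# Made-up sample latencies in milliseconds, for illustration only.
latencies_ms = [4.1, 3.8, 4.4, 12.0, 3.9]
print(percentile(latencies_ms, 50), percentile(latencies_ms, 99))
```

Averaging `recall_at_k` over all test queries gives a single Recall@10 figure comparable across both setups.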
When Self-Hosting Is a Bad Idea
I want to be honest about the trade-offs:
Don't self-host if:
You have zero DevOps experience and neither does anyone on your team
You need 99.99% uptime SLA for enterprise customers
Your vector count is growing unpredictably (10M one month, 100M the next)
You're a team of 1-2 and every hour on infra is an hour not building product
Do self-host if:
Your scale is predictable (you know roughly how many vectors you'll have)
You're comfortable with Docker and basic server management
Cost matters: the difference between $10 and $210 a month is $2,400/year
You want full control over your data and indexing parameters
The Cost at Every Scale
I built a calculator to compare all four major vector DBs at different scales:
| Scale | Pinecone | Qdrant Cloud | Qdrant Self-Hosted | Supabase pgvector |
|---|---|---|---|---|
| 1M vectors | ~$22/mo | ~$14/mo | ~$7/mo | ~$27/mo |
| 10M vectors | ~$210/mo | ~$120/mo | ~$72/mo | ~$95/mo |
| 100M vectors | ~$1,900/mo | ~$950/mo | ~$480/mo | N/A |
Calculate your exact cost
One Thing I Miss About Pinecone
The dashboard. Pinecone's web console lets you browse vectors, run test queries, and see index stats visually. With self-hosted Qdrant, I'm using curl and Python scripts. There's a Qdrant Web UI but it's basic.
Would I go back? At $200/month savings, absolutely not. But if I were building a quick prototype and didn't want to think about infrastructure, Pinecone's free tier (100K vectors) is genuinely good for getting started.
Running self-hosted vector search? I'd love to hear your setup and costs. I also built comparison pages for specific matchups: Pinecone vs Qdrant, Supabase vs Pinecone.