Choosing a vector DB
📖 5 min read · Updated 2026-04-18
The vector DB decision gets over-engineered. Most of the time the right answer is "the one your team can operate" plus "it supports hybrid search and metadata filtering." Everything else is second-order. Here's how I actually pick.
The decision matrix
Prototype / < 1M vectors / low traffic
- Chroma: embeddable, zero-config. Great for notebooks and demos.
- pgvector: if you already have Postgres. Simple, fast enough.
- LanceDB: for local-first or desktop apps.
Don't pay for a managed vector DB at this scale. It's wasted money.
Production, small to medium (1M-10M vectors)
- pgvector: still viable. With properly tuned HNSW indexes it handles this range well.
- Qdrant: self-hosted, excellent performance, open-source. My usual pick for production at this size.
- Pinecone: if you want fully managed with low ops burden.
- Weaviate: if hybrid search and rich filtering are load-bearing features.
Production, medium to large (10M-100M vectors)
- Qdrant: scales well, excellent filtering performance.
- Pinecone: costs get real but operational simplicity is worth it for some teams.
- Weaviate: good choice, especially for hybrid search.
- Turbopuffer: cheapest at this scale if you can accept its cold-start characteristics.
Very large (100M+ vectors)
- Milvus: designed for this scale. Complex to operate.
- Vespa: battle-tested at Yahoo scale.
- Qdrant (cluster mode): now viable for this scale.
- Pinecone: works, expensive. Reserve for when your team's engineering cost outweighs DB cost.
Already running Elasticsearch / OpenSearch
Use their vector support. The integration with your existing BM25 + filtering infrastructure is worth more than a slightly better dedicated vector DB.
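The matrix above boils down to a few branches. As a purely illustrative sketch (the function name and thresholds are mine, encoding the article's picks, not a recommendation engine):

```python
def pick_vector_db(n_vectors: int, managed: bool = False,
                   has_postgres: bool = False,
                   has_elasticsearch: bool = False) -> str:
    """Encode the decision matrix above as a lookup. Illustrative only."""
    if has_elasticsearch:
        # Existing BM25 + filtering infrastructure beats a dedicated DB.
        return "Elasticsearch/OpenSearch vector support"
    if n_vectors < 1_000_000:
        # Prototype scale: don't pay for managed.
        return "pgvector" if has_postgres else "Chroma"
    if n_vectors < 100_000_000:
        return "Pinecone" if managed else "Qdrant"
    # 100M+: Milvus/Vespa territory, or pay Pinecone to not operate it.
    return "Pinecone" if managed else "Milvus"

print(pick_vector_db(500_000, has_postgres=True))  # pgvector
```
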
The features that actually matter
Must-have
- Hybrid search (dense + sparse), either native or via an integration
- Metadata filtering with indexed fields
- Upsert semantics (not just insert)
- Bulk insert performance (matters at ingestion time)
- Quantization support (becomes important at scale)
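"Hybrid search" in practice means fusing a dense (embedding) ranking with a sparse (BM25-style) ranking. One common fusion method, which several of these DBs implement, is reciprocal rank fusion; here's a minimal sketch (the document IDs and rankings are made up):

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: merge ranked ID lists into one ranking.
    k dampens the weight of top ranks; 60 is the commonly used default."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["doc3", "doc1", "doc2"]    # ranking from vector similarity
sparse = ["doc1", "doc4", "doc3"]   # ranking from BM25
print(rrf([dense, sparse]))  # → ['doc1', 'doc3', 'doc4', 'doc2']
```

Documents that appear high in both rankings (doc1, doc3) float to the top, which is exactly what you want hybrid search to do.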
Nice-to-have
- Multiple tenants / namespaces
- Reranking integration (some DBs can call rerankers natively)
- Multi-vector support (ColBERT-style late interaction)
- Streaming updates for real-time indexes
- Geographic replication
Often oversold
- "Billions of vectors": most teams never hit this scale.
- "Sub-millisecond latency": end-to-end RAG latency is dominated by generation, not retrieval.
- "Serverless": can be cost-effective for spiky or low-traffic workloads, but steady high-traffic workloads often don't benefit.
Cost comparison (2026 approximate)
For 10M vectors, 1024 dim, 100K queries/month:
- Pinecone serverless: ~$300-600/month
- Qdrant self-hosted on a modest server: ~$100-200/month in cloud costs
- pgvector on existing Postgres: marginal cost
- Weaviate Cloud: ~$400-800/month
- Turbopuffer: ~$50-200/month
Self-hosted wins on cost at scale. Managed wins on operational simplicity.
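A useful sanity check behind these numbers is the raw storage footprint of the embeddings themselves. This is just float32 arithmetic for the workload above; real indexes add graph/metadata overhead, and quantization subtracts:

```python
n_vectors, dims, bytes_per_float = 10_000_000, 1024, 4  # float32
raw_bytes = n_vectors * dims * bytes_per_float

# Raw vectors alone, before any index structures.
print(f"raw vectors: {raw_bytes / 2**30:.1f} GiB")       # ~38.1 GiB
# int8 scalar quantization cuts storage roughly 4x.
print(f"int8-quantized: {raw_bytes / 4 / 2**30:.1f} GiB")
```

~40 GB of raw vectors fits comfortably on one modest server, which is why self-hosted Qdrant or pgvector lands in the $100-200/month range at this scale.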
Lock-in considerations
Migrating vector DBs is moderately painful but not catastrophic:
- Re-index is the big cost. If you can keep embeddings, migration is data movement.
- Metadata schema differences require mapping.
- Query API differences require code changes.
Build an abstraction layer over your vector DB calls. Don't scatter provider-specific code across your app. A simple repository/interface pattern saves weeks when you eventually migrate.
The trap to avoid
Don't pick a vector DB based on a benchmark blog post. Every vendor publishes benchmarks that show them winning. The meaningful questions are:
- Can your team operate it?
- Does it support your required features?
- Is the cost sustainable at your projected scale?
- Does it integrate with your stack?
The performance differences between top-tier vector DBs at reasonable scale are usually less than the differences between chunking strategies. Pick a reasonable DB and move on.
My current defaults
- Prototype: Chroma or pgvector
- Production, self-hosted leaning: Qdrant
- Production, managed leaning: Pinecone
- Production with heavy hybrid search needs: Weaviate
- Already in AWS, already use Postgres: pgvector on RDS
- Already running ES/OS: their vector support
Next: Vector similarity search.