Choosing a vector DB
📖 5 min read · Updated 2026-04-18
The vector DB decision gets over-engineered. Most of the time the right answer is "the one your team can operate" plus "it supports hybrid search and metadata filtering." Everything else is second-order. Here's how I actually pick.
The decision matrix
Prototype / < 1M vectors / low traffic
- Chroma: embeddable, zero-config. Great for notebooks and demos.
- pgvector: if you already have Postgres. Simple, fast enough.
- LanceDB: for local-first or desktop apps.
Don't pay for a managed vector DB at this scale. It's wasted money.
Production, small to medium (1M-10M vectors)
- pgvector: still viable. With properly tuned HNSW indexes it handles this range well.
- Qdrant: self-hosted, excellent performance, open-source. My usual pick for production at this size.
- Pinecone: if you want fully managed with low ops burden.
- Weaviate: if hybrid search and rich filtering are load-bearing features.
Production, medium to large (10M-100M vectors)
- Qdrant: scales well, excellent filtering performance.
- Pinecone: costs get real but operational simplicity is worth it for some teams.
- Weaviate: good choice, especially for hybrid search.
- Turbopuffer: cheapest at this scale if you can accept its cold-start characteristics.
Very large (100M+ vectors)
- Milvus: designed for this scale. Complex to operate.
- Vespa: battle-tested at Yahoo scale.
- Qdrant (cluster mode): now viable for this scale.
- Pinecone: works, expensive. Reserve for when your team's engineering cost outweighs DB cost.
Already running Elasticsearch / OpenSearch
Use their vector support. The integration with your existing BM25 + filtering infrastructure is worth more than a slightly better dedicated vector DB.
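The matrix above boils down to a few branches. As a purely illustrative sketch (the function name and thresholds are mine, encoding the article's picks, not a recommendation engine):

```python
def pick_vector_db(n_vectors: int, managed: bool = False,
                   has_postgres: bool = False,
                   has_elasticsearch: bool = False) -> str:
    """Encode the decision matrix above as a lookup. Illustrative only."""
    if has_elasticsearch:
        # Existing BM25 + filtering infrastructure beats a dedicated DB.
        return "Elasticsearch/OpenSearch vector support"
    if n_vectors < 1_000_000:
        # Prototype scale: don't pay for managed.
        return "pgvector" if has_postgres else "Chroma"
    if n_vectors < 100_000_000:
        return "Pinecone" if managed else "Qdrant"
    # 100M+: Milvus/Vespa territory, or pay Pinecone to not operate it.
    return "Pinecone" if managed else "Milvus"

print(pick_vector_db(500_000, has_postgres=True))  # pgvector
```
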
The features that actually matter
Must-have
- Hybrid search (dense + sparse), either native or via an integration
- Metadata filtering with indexed fields
- Upsert semantics (not just insert)
- Bulk insert performance (matters at ingestion time)
- Quantization support (becomes important at scale)
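"Hybrid search" in practice means fusing a dense (embedding) ranking with a sparse (BM25-style) ranking. One common fusion method, which several of these DBs implement, is reciprocal rank fusion; here's a minimal sketch (the document IDs and rankings are made up):

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: merge ranked ID lists into one ranking.
    k dampens the weight of top ranks; 60 is the commonly used default."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["doc3", "doc1", "doc2"]    # ranking from vector similarity
sparse = ["doc1", "doc4", "doc3"]   # ranking from BM25
print(rrf([dense, sparse]))  # → ['doc1', 'doc3', 'doc4', 'doc2']
```

Documents that appear high in both rankings (doc1, doc3) float to the top, which is exactly what you want hybrid search to do.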
Nice-to-have
- Multiple tenants / namespaces
- Reranking integration (some DBs can call rerankers natively)
- Multi-vector support (ColBERT-style late interaction)
- Streaming updates for real-time indexes
- Geographic replication
Often oversold
- "Billions of vectors": most teams never hit this scale.
- "Sub-millisecond latency": end-to-end RAG latency is dominated by generation, not retrieval.
- "Serverless": can be cost-effective for spiky or low-traffic workloads, but steady high-traffic workloads often don't benefit.
Cost comparison (2026 approximate)
For 10M vectors, 1024 dim, 100K queries/month:
- Pinecone serverless: ~$300-600/month
- Qdrant self-hosted on a modest server: ~$100-200/month in cloud costs
- pgvector on existing Postgres: marginal cost
- Weaviate Cloud: ~$400-800/month
- Turbopuffer: ~$50-200/month
Self-hosted wins on cost at scale. Managed wins on operational simplicity.
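A useful sanity check behind these numbers is the raw storage footprint of the embeddings themselves. This is just float32 arithmetic for the workload above; real indexes add graph/metadata overhead, and quantization subtracts:

```python
n_vectors, dims, bytes_per_float = 10_000_000, 1024, 4  # float32
raw_bytes = n_vectors * dims * bytes_per_float

# Raw vectors alone, before any index structures.
print(f"raw vectors: {raw_bytes / 2**30:.1f} GiB")       # ~38.1 GiB
# int8 scalar quantization cuts storage roughly 4x.
print(f"int8-quantized: {raw_bytes / 4 / 2**30:.1f} GiB")
```

~40 GB of raw vectors fits comfortably on one modest server, which is why self-hosted Qdrant or pgvector lands in the $100-200/month range at this scale.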
Lock-in considerations
Migrating vector DBs is moderately painful but not catastrophic:
- Re-index is the big cost. If you can keep embeddings, migration is data movement.
- Metadata schema differences require mapping.
- Query API differences require code changes.
Build an abstraction layer over your vector DB calls. Don't scatter provider-specific code across your app. A simple repository/interface pattern saves weeks when you eventually migrate.
The trap to avoid
Don't pick a vector DB based on a benchmark blog post. Every vendor publishes benchmarks that show them winning. The meaningful questions are:
- Can your team operate it?
- Does it support your required features?
- Is the cost sustainable at your projected scale?
- Does it integrate with your stack?
The performance differences between top-tier vector DBs at reasonable scale are usually less than the differences between chunking strategies. Pick a reasonable DB and move on.
My current defaults
- Prototype: Chroma or pgvector
- Production, self-hosted leaning: Qdrant
- Production, managed leaning: Pinecone
- Production with heavy hybrid search needs: Weaviate
- Already in AWS, already use Postgres: pgvector on RDS
- Already running ES/OS: their vector support
Next: Vector similarity search.