Metadata filtering

📖 4 min read · Updated 2026-04-18

Metadata filtering combines vector similarity with structured predicates: "find chunks similar to this query, but only from documents owned by tenant X, published after date Y, with visibility = public." Nearly every production RAG system needs it, and filtering performance differs dramatically between vector databases.

The query shape

query: vector(embedding of user question)
filter: {
  tenant_id: "acme",
  visibility: {$in: ["public", "internal"]},
  updated_at: {$gte: "2024-01-01"},
  document_type: "policy"
}
top_k: 10

The database's job: find the 10 nearest neighbors that also satisfy all filter conditions. It sounds simple, but it isn't.
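
To make the filter semantics concrete, here is a minimal sketch of how such a filter could be evaluated against one document's metadata. The `matches` helper and the sample documents are hypothetical, and only the two operators shown above (`$in`, `$gte`) are handled; real databases support more operators and evaluate them against indexes rather than one document at a time.

```python
from typing import Any


def matches(metadata: dict[str, Any], flt: dict[str, Any]) -> bool:
    """Return True if metadata satisfies every condition in flt (implicit AND)."""
    for field, cond in flt.items():
        value = metadata.get(field)
        if isinstance(cond, dict):
            for op, operand in cond.items():
                if op == "$in":
                    if value not in operand:
                        return False
                elif op == "$gte":
                    # ISO-8601 date strings compare correctly as plain strings.
                    if value is None or value < operand:
                        return False
                else:
                    raise ValueError(f"unsupported operator: {op}")
        elif value != cond:
            # A bare value means equality.
            return False
    return True


FILTER = {
    "tenant_id": "acme",
    "visibility": {"$in": ["public", "internal"]},
    "updated_at": {"$gte": "2024-01-01"},
    "document_type": "policy",
}

doc_ok = {
    "tenant_id": "acme",
    "visibility": "public",
    "updated_at": "2024-06-30",
    "document_type": "policy",
}
doc_old = {**doc_ok, "updated_at": "2023-12-31"}

print(matches(doc_ok, FILTER), matches(doc_old, FILTER))
```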

Three strategies

Pre-filter

Narrow the candidate set to matching metadata first, then search only those. Exact: returns only filter-matching documents.

Fast when the filter is selective (narrows the search to a small subset)
Slow when the filter matches most documents (the search still scans nearly everything)
Requires metadata indexes
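
A toy illustration of pre-filtering, assuming an in-memory list of documents and a brute-force cosine search over the survivors. The `pre_filter_search` name and the document layout are made up for this sketch; a real database would use a metadata index rather than a list comprehension.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pre_filter_search(docs, query, predicate, k):
    # Step 1: keep only documents whose metadata passes the filter.
    candidates = [d for d in docs if predicate(d["meta"])]
    # Step 2: exact similarity search over the reduced candidate set.
    candidates.sort(key=lambda d: cosine(d["vec"], query), reverse=True)
    return candidates[:k]


DOCS = [
    {"id": "a", "vec": [1.0, 0.0], "meta": {"tenant_id": "acme"}},
    {"id": "b", "vec": [0.9, 0.1], "meta": {"tenant_id": "globex"}},
    {"id": "c", "vec": [0.0, 1.0], "meta": {"tenant_id": "acme"}},
]

hits = pre_filter_search(DOCS, [1.0, 0.0], lambda m: m["tenant_id"] == "acme", k=2)
print([d["id"] for d in hits])  # prints ['a', 'c']
```

Every returned document matches the filter, and the cost of step 2 grows with the number of documents that pass step 1.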

Post-filter

Run the vector search over all documents first, then filter the results. The search itself is fast, but it may return fewer than k results if the filter is selective.

Fast when the filter is non-selective
Broken when the filter is very selective: your top-10 vector results might all be filtered out, leaving empty results
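
The shortfall is easy to reproduce in a few lines. This is a sketch with a hypothetical `post_filter_search` helper and three hand-made documents, using a dot product as the similarity score.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def post_filter_search(docs, query, predicate, k):
    # Step 1: plain top-k vector search over the whole collection.
    top_k = sorted(docs, key=lambda d: dot(d["vec"], query), reverse=True)[:k]
    # Step 2: drop candidates that fail the filter; nothing refills the gap.
    return [d for d in top_k if predicate(d["meta"])]


DOCS = [
    {"id": "a", "vec": [1.0, 0.0], "meta": {"tenant_id": "acme"}},
    {"id": "b", "vec": [0.9, 0.1], "meta": {"tenant_id": "globex"}},
    {"id": "c", "vec": [0.0, 1.0], "meta": {"tenant_id": "acme"}},
]

hits = post_filter_search(DOCS, [1.0, 0.0], lambda m: m["tenant_id"] == "acme", k=2)
print([d["id"] for d in hits])  # prints ['a']
```

Two documents match the filter, but only one is returned: document "c" never made it into the unfiltered top 2, so the filter had nothing to promote.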

Dynamic / hybrid

The database decides based on estimated filter selectivity. Modern vector DBs (Pinecone, Qdrant, Weaviate) do this automatically.
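
As a rough sketch of the idea only (this is not how any of the named databases implement it), a planner could compare the estimated fraction of matching documents against a cutoff. The `choose_strategy` function and the 0.1 threshold are assumptions chosen for illustration.

```python
def choose_strategy(matching_count: int, total_count: int, threshold: float = 0.1) -> str:
    """Pick a filtering strategy from an estimated match count.

    The threshold is an arbitrary illustrative value, not a recommended setting.
    """
    if total_count == 0:
        return "pre_filter"
    selectivity = matching_count / total_count
    # Few matches: searching only the matching subset is cheap and exact.
    # Many matches: an index search followed by filtering rarely comes up short.
    return "pre_filter" if selectivity < threshold else "post_filter"


print(choose_strategy(50, 10_000))     # prints pre_filter
print(choose_strategy(8_000, 10_000))  # prints post_filter
```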

The performance failure mode

A query with a highly selective filter on a post-filter-only database:

Vector search returns top-100 candidates
Filter matches 2 of them
You wanted top-10, got 2

You can increase the search k to compensate (over-fetch), but this is wasteful and still unreliable when filters are very selective.
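
A back-of-the-envelope estimate shows why over-fetching gets expensive. The sketch below assumes matches are spread evenly through the ranking, which often does not hold in practice; `required_fetch` is a hypothetical helper, not a database API.

```python
def required_fetch(k: int, matched: int, fetched: int) -> int:
    """Estimate how many candidates to fetch so that about k survive the filter.

    Uses the match rate observed in a previous fetch (matched / fetched) and
    assumes that rate stays constant deeper in the ranking.
    """
    if matched <= 0:
        raise ValueError("no matches observed; the estimate is unbounded")
    # Integer ceiling of k * fetched / matched.
    return (k * fetched + matched - 1) // matched


# The example above: 2 of the top 100 matched, and we want 10 results.
print(required_fetch(k=10, matched=2, fetched=100))  # prints 500
```

Fetching 500 candidates to return 10 is 50 times the work, and the estimate still gives no guarantee that 10 matches actually appear.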

The right answer: use a vector DB that supports pre-filtering or dynamic filtering. Alternatively, query the metadata first to get the matching document IDs, then run a vector search restricted to those IDs.

Vector DB comparison on filtering

Qdrant: excellent. Payload indexes, pre-filter and dynamic filtering, fast on complex queries.
Weaviate: very good. Native filtering during HNSW traversal.
Pinecone: adequate. Filters applied during search, performance varies by selectivity.
Milvus: good. Supports partition-based filtering and field-level filters.
pgvector: strong for filters Postgres can index well. Combine vector and SQL.
Chroma: basic. Works but not optimized for high-selectivity filters at scale.

Index the metadata fields you'll filter on

Most vector DBs let you index specific metadata fields for fast filtering. Index every field that appears in common queries. Un-indexed fields typically force a full scan per query.

Typical fields to index:

tenant_id
source_system
document_type
visibility or permissions
Date fields used for freshness filters
Numeric fields used for range queries
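
To show what a metadata index buys you, here is a toy inverted index for equality filters. The `build_index` and `lookup` helpers are hypothetical; actual databases use their own index structures and also cover range queries.

```python
from collections import defaultdict


def build_index(docs, fields):
    """Map each indexed field to {value: set of document IDs}."""
    index = {field: defaultdict(set) for field in fields}
    for doc_id, meta in docs.items():
        for field in fields:
            if field in meta:
                index[field][meta[field]].add(doc_id)
    return index


def lookup(index, conditions):
    """Resolve AND-ed equality conditions by intersecting posting sets."""
    sets = [index[field].get(value, set()) for field, value in conditions.items()]
    return set.intersection(*sets) if sets else set()


DOCS = {
    "d1": {"tenant_id": "acme", "document_type": "policy"},
    "d2": {"tenant_id": "acme", "document_type": "faq"},
    "d3": {"tenant_id": "globex", "document_type": "policy"},
}

index = build_index(DOCS, ["tenant_id", "document_type"])
print(lookup(index, {"tenant_id": "acme", "document_type": "policy"}))  # prints {'d1'}
```

The lookup never touches documents outside the relevant posting sets. Filtering on a field that was not passed to `build_index` raises a `KeyError` here; in a real system the equivalent is falling back to a scan.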

The multi-tenant special case

In multi-tenant RAG, the tenant_id filter runs on every query. Options:

Single index, filter on tenant_id

One physical index, logical separation via filter. Simple. Performance depends on tenant distribution.

Namespace per tenant (Pinecone pattern)

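Each tenant has its own namespace. Physical separation. A query only ever touches one namespace, which greatly reduces the risk of cross-tenant leaks (the application must still pass the correct namespace). Better performance for tenant-scoped queries.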

Collection per tenant

One collection per tenant. Extreme isolation. Expensive in overhead if you have many small tenants.
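
The isolation argument can be sketched with a toy in-memory store in which the tenant is a required argument rather than an optional filter. `NamespacedStore` is a hypothetical class for illustration, not any vendor's client API.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class NamespacedStore:
    """Keeps each tenant's vectors in a separate list keyed by tenant ID."""

    def __init__(self):
        self._namespaces = {}

    def upsert(self, tenant_id, doc):
        self._namespaces.setdefault(tenant_id, []).append(doc)

    def search(self, tenant_id, query, k):
        # Only the caller's namespace is ever scanned, so there is no
        # tenant filter to forget and less data to search.
        docs = self._namespaces.get(tenant_id, [])
        return sorted(docs, key=lambda d: dot(d["vec"], query), reverse=True)[:k]


store = NamespacedStore()
store.upsert("acme", {"id": "a1", "vec": [1.0, 0.0]})
store.upsert("acme", {"id": "a2", "vec": [0.2, 0.8]})
store.upsert("globex", {"id": "g1", "vec": [1.0, 0.0]})

print([d["id"] for d in store.search("acme", [1.0, 0.0], k=5)])  # prints ['a1', 'a2']
```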

See multi-tenant RAG for more detail.

Common filter patterns

Access control

permissions: {$in: user.roles}
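
A small sketch of building and checking this filter, assuming the user's roles are a flat list of role names and each document stores a list of roles allowed to read it. Both helpers are hypothetical.

```python
def access_filter(user_roles):
    """Build the access-control filter; $in takes the flat list of roles."""
    return {"permissions": {"$in": sorted(set(user_roles))}}


def can_read(doc_permissions, user_roles):
    """A document is readable if it shares at least one role with the user."""
    return not set(doc_permissions).isdisjoint(user_roles)


print(access_filter(["hr", "staff", "hr"]))
print(can_read(["hr"], ["staff", "hr"]), can_read(["finance"], ["staff"]))
```

A user with no roles matches nothing, which is the safe default for access control.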

Freshness boost

Restrict results to published_after: (today - 2 years). For broad queries, remove the filter so that older documents remain searchable.
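
Computing the cutoff is simple date arithmetic, with one edge case around leap days. The `freshness_cutoff` helper below is a hypothetical sketch that returns an ISO date string suitable for a string-comparable date field.

```python
from datetime import date


def freshness_cutoff(today: date, years: int = 2) -> str:
    """Return the ISO date exactly `years` before `today`."""
    try:
        cutoff = today.replace(year=today.year - years)
    except ValueError:
        # February 29 has no counterpart in a non-leap year; fall back to the 28th.
        cutoff = today.replace(year=today.year - years, day=28)
    return cutoff.isoformat()


flt = {"published_after": freshness_cutoff(date(2026, 4, 18))}
print(flt)  # prints {'published_after': '2024-04-18'}
```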

Document type scoping

When a user asks "what's our refund policy?", filter to document_type = "policy".

Tenant + source

tenant_id: "acme" AND source_system: {$in: ["confluence", "drive"]}

Filter-aware reranking

After retrieval, you can apply soft filters (boosts rather than hard filters) via the reranker:

Boost recent documents
Boost canonical sources over derived
Boost documents user has previously engaged with
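
One way to express these boosts is as small additive terms on top of the similarity score. The sketch below is an assumption-heavy illustration: the weights, the linear two-year recency decay, and the field names (`similarity`, `updated_at`, `canonical`) are all invented for the example and would need tuning on real data.

```python
from datetime import date


def boosted_score(hit, today, engaged_ids,
                  recency_weight=0.1, canonical_weight=0.05, engagement_weight=0.05):
    """Similarity plus soft boosts; nothing is excluded outright."""
    age_days = (today - hit["updated_at"]).days
    # Recency decays linearly from 1.0 (updated today) to 0.0 (about 2 years old).
    recency = max(0.0, 1.0 - age_days / 730)
    score = hit["similarity"] + recency_weight * recency
    if hit.get("canonical"):
        score += canonical_weight
    if hit["id"] in engaged_ids:
        score += engagement_weight
    return score


def rerank(hits, today, engaged_ids=frozenset()):
    return sorted(hits, key=lambda h: boosted_score(h, today, engaged_ids), reverse=True)


HITS = [
    {"id": "old", "similarity": 0.80, "updated_at": date(2020, 1, 1), "canonical": False},
    {"id": "new", "similarity": 0.78, "updated_at": date(2026, 4, 1), "canonical": True},
]

ranked = rerank(HITS, today=date(2026, 4, 18))
print([h["id"] for h in ranked])  # prints ['new', 'old']
```

The older document drops in rank but stays in the results, which is the difference between a boost and a hard filter.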

See reranking.

What to do with this

Index every metadata field you filter on. Un-indexed fields force full scans.
For multi-tenant systems, prefer namespaces over filtering on tenant_id where your DB supports them.
Prefer a DB with dynamic pre/post-filter selection.