Multi-tenant RAG is what B2B SaaS companies build: one RAG system serving many customer organizations, each with its own documents, users, and permissions. The requirements are strict: no data can cross tenant boundaries, not even as retrieval candidates, and the scaling profile differs from that of single-tenant systems.
Absolute tenant isolation:
All tenants' vectors in one index. Every query filters by tenant_id.
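As a rough illustration of the shared-index approach, here is a minimal in-memory sketch. The `Chunk` type, the `search` function, and the toy two-dimensional vectors are all hypothetical stand-ins for a real vector database. The point it tries to show is that the tenant filter runs before ranking, so another tenant's chunks never enter the candidate set.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    tenant_id: str
    text: str
    vector: tuple[float, ...]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(index, tenant_id, query_vector, top_k=3):
    """Rank only the chunks that belong to the given tenant."""
    # Filter first: other tenants' chunks are never scored or returned.
    candidates = [c for c in index if c.tenant_id == tenant_id]
    ranked = sorted(
        candidates,
        key=lambda c: cosine(c.vector, query_vector),
        reverse=True,
    )
    return ranked[:top_k]


# Toy shared index holding two tenants' chunks (illustrative data only).
index = [
    Chunk("acme", "Acme refund policy", (1.0, 0.0)),
    Chunk("acme", "Acme onboarding guide", (0.6, 0.8)),
    Chunk("globex", "Globex refund policy", (1.0, 0.0)),
]

results = search(index, "acme", (1.0, 0.0))
```

In a real vector database the same idea is usually expressed as a metadata filter passed alongside the query vector; the exact syntax depends on the product.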
Pros:
Cons:
Vector DBs like Pinecone offer namespaces. Each tenant gets its own namespace within a shared index. Queries are scoped to the namespace.
Pros:
Cons:
Each tenant has a dedicated index.
Pros:
Cons:
Small tenants share an index with filters. Large tenants get dedicated indexes or namespaces.
This is what most mature multi-tenant systems end up with. It keeps costs low across the long tail of small tenants and provides stronger isolation where it matters most.
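One way to picture the hybrid layout is a small routing function that decides, per tenant, which index a query should hit. This is only a sketch: the `Route` type, the index naming scheme, and the `DEDICATED_THRESHOLD` cutoff are assumptions made for illustration, not values from any particular system.

```python
from dataclasses import dataclass

SHARED_INDEX = "shared-pool"       # hypothetical name of the shared index
DEDICATED_THRESHOLD = 1_000_000    # assumed cutoff, in chunks, for a dedicated index


@dataclass(frozen=True)
class Route:
    index_name: str
    filter: dict


def route_for(tenant_id: str, chunk_count: int) -> Route:
    """Pick an index for a tenant based on how much content it has."""
    # The tenant_id filter is kept on both paths as defense in depth,
    # which assumes dedicated indexes also store tenant_id metadata.
    tenant_filter = {"tenant_id": tenant_id}
    if chunk_count >= DEDICATED_THRESHOLD:
        return Route(index_name=f"tenant-{tenant_id}", filter=tenant_filter)
    return Route(index_name=SHARED_INDEX, filter=tenant_filter)


small = route_for("smallco", 5_000)
large = route_for("bigcorp", 2_000_000)
```

In practice the threshold would more likely come from the tenant's plan or an operator decision than from a raw chunk count, and moving a tenant between layouts requires a reindexing step that is not shown here.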
Even within one tenant, not all users see all documents:
Every chunk carries both tenant_id AND user/role-level permissions. Query filters apply both.
query filter:
    tenant_id: [from authenticated user]
    permissions: [intersect with user's roles]
    optional: additional filters from query

Never trust client-provided tenant_id. Always derive from authentication.
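The filter described above might be assembled along these lines. This is a minimal sketch: `AuthenticatedUser`, `build_filter`, `is_visible`, and the `allowed_roles` metadata key are hypothetical names, and the user object is assumed to come from a verified session or token rather than from the request body.

```python
from dataclasses import dataclass

# Keys the server always sets itself; client values for them are discarded.
RESERVED_KEYS = ("tenant_id", "allowed_roles")


@dataclass(frozen=True)
class AuthenticatedUser:
    user_id: str
    tenant_id: str
    roles: frozenset


def build_filter(user, extra_filters=None):
    """Build the retrieval filter from the authenticated identity."""
    extra = dict(extra_filters or {})
    for key in RESERVED_KEYS:
        # Drop any client attempt to set tenant or permission scope.
        extra.pop(key, None)
    return {
        **extra,
        "tenant_id": user.tenant_id,
        "allowed_roles": sorted(user.roles),
    }


def is_visible(chunk_meta, query_filter):
    """A chunk is visible only if the tenant matches and the roles intersect."""
    same_tenant = chunk_meta["tenant_id"] == query_filter["tenant_id"]
    shared_roles = set(chunk_meta["allowed_roles"]) & set(query_filter["allowed_roles"])
    return same_tenant and bool(shared_roles)


user = AuthenticatedUser("u1", "acme", frozenset({"sales"}))
# The client tries to override tenant_id; the override is ignored.
query_filter = build_filter(user, {"tenant_id": "globex", "doc_type": "pdf"})
```

Stripping reserved keys rather than rejecting the request is a design choice; a stricter service might return an error and log the attempt instead.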
Multi-tenant systems usually use one embedding model across all tenants:
Per-tenant embeddings are exotic and usually not worth it. Tenant-specific fine-tuning is possible but rare.
Common approach: offer tiers.
Let tenants choose their tier.
A typical B2B customer base has many small tenants and a few large ones. Shared infrastructure with filters serves the long tail cheaply.
A few enterprise tenants may have 100x the content of the average tenant. Giving them separate resources prevents them from dominating shared infrastructure.
Prevent one tenant from starving others. Apply rate limits at the tenant level in addition to per-user limits.
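A two-level limit can be sketched with token buckets: a request is admitted only if both the tenant's bucket and the user's bucket have a token left. The class names, capacities, and refill rate below are illustrative assumptions, and the sketch is single-process and not thread-safe; a production setup would usually keep this state in a shared store.

```python
import time


class TokenBucket:
    """Simple token bucket; `clock` is injectable to make testing easy."""

    def __init__(self, capacity, refill_per_sec, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def available(self):
        # Refill based on elapsed time, capped at capacity.
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        return self.tokens >= 1

    def take(self):
        self.tokens -= 1


class TenantRateLimiter:
    """Admits a request only if tenant-level AND user-level budgets allow it."""

    def __init__(self, tenant_capacity, user_capacity, refill_per_sec=1.0,
                 clock=time.monotonic):
        self.tenant_capacity = tenant_capacity
        self.user_capacity = user_capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self.tenant_buckets = {}
        self.user_buckets = {}

    def _bucket(self, table, key, capacity):
        if key not in table:
            table[key] = TokenBucket(capacity, self.refill_per_sec, self.clock)
        return table[key]

    def allow(self, tenant_id, user_id):
        tenant = self._bucket(self.tenant_buckets, tenant_id, self.tenant_capacity)
        user = self._bucket(self.user_buckets, (tenant_id, user_id), self.user_capacity)
        # Check both before spending, so a rejected request costs nothing.
        if tenant.available() and user.available():
            tenant.take()
            user.take()
            return True
        return False
```

With a tenant capacity of 3 and a user capacity of 2, one user can make two requests before hitting the per-user limit, and the whole tenant is cut off after three, while other tenants remain unaffected.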
This should be automated. Onboarding every tenant manually doesn't scale past a few dozen.
When a tenant leaves:
GDPR's "right to erasure" may apply. Build deletion pathways from day one.
Surfaces per-tenant issues before they escalate.
Some tenants want custom behavior:
Architect for this from the start with per-tenant configuration files, stored customizations, and feature flags.
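One possible shape for per-tenant configuration is a set of typed defaults plus a table of per-tenant overrides. The field names (`top_k`, `system_prompt`, `reranker_enabled`) and the override table below are hypothetical examples, not a prescribed schema; in a real system the overrides would likely live in a database or config service.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RagConfig:
    top_k: int = 5
    system_prompt: str = "Answer using only the provided context."
    reranker_enabled: bool = False  # acts as a simple feature flag


DEFAULTS = RagConfig()

# Hypothetical stored customizations, keyed by tenant.
TENANT_OVERRIDES = {
    "acme": {"top_k": 10, "reranker_enabled": True},
}


def config_for(tenant_id: str) -> RagConfig:
    """Return defaults merged with any overrides stored for the tenant."""
    # dataclasses.replace raises TypeError on unknown field names,
    # so a misspelled override key fails loudly instead of being ignored.
    return replace(DEFAULTS, **TENANT_OVERRIDES.get(tenant_id, {}))


acme_config = config_for("acme")
other_config = config_for("some-other-tenant")
```

Keeping every customizable knob in one typed structure makes it easier to see which behaviors can vary per tenant and to test each combination.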
For B2B RAG, regular audits confirm:
Third-party audits (SOC 2, ISO 27001) require these and will catch gaps.
Per-tenant cost attribution is essential:
Feeds into pricing decisions and lets you identify unprofitable tenants.
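As a rough sketch of how attribution could work, usage events tagged with a tenant can be rolled up into a per-tenant cost and compared against revenue. The unit prices, the `UsageEvent` fields, and the helper names here are made-up assumptions for illustration; real figures depend on your providers and infrastructure.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical unit prices in USD; substitute your actual rates.
PRICE_PER_1K_LLM_TOKENS = 0.002
PRICE_PER_1K_EMBED_TOKENS = 0.0001


@dataclass(frozen=True)
class UsageEvent:
    tenant_id: str
    llm_tokens: int = 0
    embed_tokens: int = 0


def cost_by_tenant(events):
    """Sum the estimated cost of all usage events per tenant."""
    totals = defaultdict(float)
    for event in events:
        totals[event.tenant_id] += (
            event.llm_tokens / 1000 * PRICE_PER_1K_LLM_TOKENS
            + event.embed_tokens / 1000 * PRICE_PER_1K_EMBED_TOKENS
        )
    return dict(totals)


def unprofitable_tenants(costs, revenue):
    """Tenants whose attributed cost exceeds their revenue for the period."""
    return sorted(t for t, cost in costs.items() if cost > revenue.get(t, 0.0))


events = [
    UsageEvent("acme", llm_tokens=500_000),
    UsageEvent("acme", embed_tokens=1_000_000),
    UsageEvent("globex", llm_tokens=10_000),
]
costs = cost_by_tenant(events)
flagged = unprofitable_tenants(costs, {"acme": 1.00, "globex": 5.00})
```

Storage and index hosting costs are left out of this sketch; they could be added as further event fields or as a separate per-tenant monthly charge.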
Multi-tenant RAG has all the challenges of single-tenant RAG plus isolation, scaling, and operational multi-tenancy concerns. The isolation concerns aren't optional: a single leak is a reputational catastrophe. Build with isolation as a first-class property, not an afterthought.