Embeddings

Turning text into vectors. Models, dimensions, trade-offs, and when to update them.

Closed vs open embedding models

OpenAI, Cohere, and Voyage AI offer hosted embedding models behind an API. Hugging Face hosts open-weight alternatives you can run yourself. Here's how to pick between them for production.

Dimensions, cost, and MRL

Vector dimensions drive storage cost and search latency. Matryoshka Representation Learning (MRL) trains models whose vectors can be truncated to fewer dimensions without retraining. Here's the math.
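As a rough sketch of that math: raw vector storage is the number of vectors times dimensions times bytes per value. The corpus size (1,000,000 vectors), the 1536 and 256 dimension counts, and the function names below are illustrative assumptions, not figures from any specific model. The estimate ignores index overhead and metadata, and truncation only preserves quality for models trained with MRL.

```python
import math

BYTES_PER_FLOAT32 = 4


def raw_vector_bytes(num_vectors: int, dims: int,
                     bytes_per_value: int = BYTES_PER_FLOAT32) -> int:
    """Raw storage for the vectors alone (no index overhead, no metadata)."""
    return num_vectors * dims * bytes_per_value


def truncate_and_renormalize(vector: list[float], dims: int) -> list[float]:
    """Keep the first `dims` values, then rescale to unit length.

    Assumption: the vector comes from an MRL-trained model, so its leading
    dimensions carry most of the signal. For other models this loses quality.
    """
    head = vector[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head
    return [x / norm for x in head]


# Hypothetical corpus: 1,000,000 vectors stored as float32.
full_bytes = raw_vector_bytes(1_000_000, 1536)   # 6,144,000,000 bytes
small_bytes = raw_vector_bytes(1_000_000, 256)   # 1,024,000,000 bytes
savings_factor = full_bytes / small_bytes        # 6.0

# Toy 3-dimensional vector truncated to 2 dimensions.
short = truncate_and_renormalize([3.0, 4.0, 12.0], 2)  # [0.6, 0.8]
```

Renormalizing after truncation keeps cosine and dot-product scores comparable, since a truncated prefix is no longer unit length.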

Fine-tuning embeddings

Fine-tuning an embedding model on your domain data is one of the highest-leverage moves in specialized RAG systems. Here's when it's worth it and how to do it.

Picking an embedding model

There are hundreds of embedding models. The right choice depends on domain, volume, latency, cost, and language. Here's the decision framework.

What embeddings are

Embeddings turn text into vectors where semantically similar text sits close together in vector space. Here's the mental model and what actually matters about them for RAG.
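A minimal sketch of "close in space", using cosine similarity as the closeness measure. The three vectors are hand-made toy values invented for illustration, not output from a real model; real embeddings have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: near 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 3-dimensional "embeddings" (hypothetical values).
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.2, 0.9]

related = cosine_similarity(cat, kitten)
unrelated = cosine_similarity(cat, invoice)

# Related texts should score higher than unrelated ones.
print(related > unrelated)
```

In a RAG system the same comparison runs between a query vector and every candidate chunk vector, and the highest-scoring chunks are retrieved.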