Reranking

A second pass that reorders search results by how well they actually answer the question.

Explained simply.

The first step of RAG retrieval casts a wide net: it quickly pulls back perhaps 20-50 potentially relevant chunks. Reranking is a smarter but slower second pass: a specialized model (typically a cross-encoder) reads the query and each candidate chunk together, then scores how well that chunk actually answers the query. The top 3-5 chunks are passed to the LLM. The two-step approach works better than either step alone: the first pass is fast but imprecise, and the reranker is precise but too slow to run over the whole corpus.
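The two passes can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: `first_pass`, `rerank`, and `toy_pair_scorer` are hypothetical names, the first pass uses raw term counts in place of embedding search, and the scorer is a stand-in for a real reranking model.

```python
from typing import Callable, List, Tuple


def first_pass(query: str, chunks: List[str], k: int) -> List[str]:
    """Cheap, wide retrieval: rank chunks by how often query words occur in them."""
    q_words = set(query.lower().split())
    scored = [(sum(1 for w in c.lower().split() if w in q_words), c) for c in chunks]
    # Python's sort is stable, so ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:k]]


def rerank(query: str, candidates: List[str],
           score_pair: Callable[[str, str], float],
           top_n: int) -> List[Tuple[float, str]]:
    """Slower second pass: score each (query, chunk) pair jointly, keep the top_n."""
    scored = [(score_pair(query, c), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]


def toy_pair_scorer(query: str, chunk: str) -> float:
    """Hypothetical stand-in for a reranking model: share of query words the chunk covers."""
    q_words = set(query.lower().split())
    c_words = set(chunk.lower().split())
    return len(q_words & c_words) / max(len(q_words), 1)


chunks = [
    "holiday calendar lists time zones and time off dates and office time",
    "how to request time off",
    "office parking rules",
    "laptop request form",
]
query = "request time off"

candidates = first_pass(query, chunks, k=3)                  # wide net
top = rerank(query, candidates, toy_pair_scorer, top_n=2)    # precise second pass
for score, chunk in top:
    print(f"{score:.2f}  {chunk}")
```

Here the first pass puts the holiday calendar first because it repeats "time" several times, while the second pass promotes the chunk that covers the whole question. In a real system, `score_pair` would call a reranking model rather than count words.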

An example.

You ask your company bot, 'How do I request time off?' Initial retrieval returns 20 HR docs. A reranker scores all 20 against your question and promotes the PTO-request policy page to #1, even though the retrieval step had ranked it #7 on embedding similarity alone.
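The promotion from #7 to #1 can be made concrete with a small sketch. Everything here is made up for illustration: the document ids, the reranker scores, and the helper name `rank_changes` are assumptions, not output from a real system.

```python
from typing import Dict, List, Tuple


def rank_changes(first_pass_order: List[str],
                 rerank_scores: Dict[str, float]) -> Dict[str, Tuple[int, int]]:
    """Map each doc id to (old_rank, new_rank), both 1-based."""
    new_order = sorted(first_pass_order, key=lambda d: rerank_scores[d], reverse=True)
    old_rank = {d: i for i, d in enumerate(first_pass_order, start=1)}
    return {d: (old_rank[d], i) for i, d in enumerate(new_order, start=1)}


# Hypothetical data: 20 HR docs in first-pass order, PTO policy at position 7.
docs = [f"hr-doc-{i}" for i in range(1, 21)]
docs[6] = "pto-request-policy"

# Made-up reranker scores: the other docs decay with first-pass rank,
# while the PTO page gets the highest score.
scores = {d: 0.5 - 0.01 * i for i, d in enumerate(docs)}
scores["pto-request-policy"] = 0.97

changes = rank_changes(docs, scores)
print(changes["pto-request-policy"])  # (old_rank, new_rank)
```

Logging old and new ranks like this is a quick way to check whether a reranker is actually changing what reaches the LLM.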

Why it matters.

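Reranking is often the quickest way to improve a mediocre RAG system, because it slots in between retrieval and generation without re-indexing anything. Cohere, Voyage AI, and others sell rerankers as standalone APIs. Adding a reranker is commonly credited with a 10-30% boost in answer quality, though the actual gain depends on the corpus and on how quality is measured.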
