Cutting long documents into smaller pieces so the AI can work with them.
A 300-page manual won't fit in a prompt, and even if it did, asking the model to reason over all of it at once produces worse answers. Chunking is the practice of splitting documents into smaller, self-contained pieces (usually 200-800 words each) before you store them. Each chunk gets its own embedding. When a question comes in, you retrieve the best-matching CHUNKS, not whole documents.
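A minimal sketch of what this looks like in practice: word-based chunking with a small overlap between neighboring chunks so context that straddles a boundary isn't lost. The function name, parameters, and defaults here are illustrative, not from any particular library.

```python
def chunk_words(text, chunk_size=300, overlap=50):
    """Split text into overlapping word-based chunks.

    chunk_size and overlap are word counts; the 200-800 word
    range mentioned above is a reasonable starting band.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    words = text.split()
    chunks = []
    step = chunk_size - overlap  # advance by this many words per chunk
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # last chunk reached the end of the document
    return chunks

doc = ("word " * 700).strip()  # stand-in for a 700-word document
chunks = chunk_words(doc, chunk_size=300, overlap=50)
print(len(chunks))             # 3 chunks
print(len(chunks[0].split()))  # 300 words in the first chunk
```

Each string in `chunks` would then be embedded and stored individually. The overlap means the last 50 words of one chunk repeat as the first 50 of the next, trading a little storage for fewer answers split across a chunk boundary.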
Bad chunking is the single biggest reason RAG systems give bad answers. Chunks too small = missing context. Chunks too big = the real answer drowned in irrelevant noise. Getting this right matters more than which model or database you pick.