Context Window

The AI's working memory. Everything it can 'see' and 'think about' at once.

Explained simply.

Imagine the model is reading a book out loud to answer your question, but it can only hold a certain number of pages in its hands at one time. The context window is the number of pages it can hold. It includes your question, the conversation so far, any documents you pasted in, the AI's instructions, and the answer it's generating. When the window is full, the oldest pages get put down and forgotten. The AI does not remember anything that falls off the edge of the window unless you tell it again.
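That "oldest pages get put down" behavior can be sketched in a few lines. This is a simplified illustration, not how any real model works internally: the `count_tokens` helper here is a crude word-count stand-in (real systems use actual tokenizers), and `trim_to_window` is a hypothetical name.

```python
def count_tokens(text: str) -> int:
    # Crude stand-in: real tokenizers split text differently,
    # but word count is close enough to show the idea.
    return len(text.split())

def trim_to_window(messages: list[str], max_tokens: int) -> list[str]:
    """Drop the oldest messages until the conversation fits the window."""
    trimmed = list(messages)
    while trimmed and sum(count_tokens(m) for m in trimmed) > max_tokens:
        trimmed.pop(0)  # the oldest message "falls off the edge"
    return trimmed

history = [
    "first question",
    "a very long pasted document " * 50,  # 250 words of filler
    "latest question",
]
print(trim_to_window(history, max_tokens=60))
# → ['latest question'] — the early messages no longer fit, so they're gone
```

Notice that the model has no say in what gets dropped: once the budget is exceeded, the oldest content simply disappears from view.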

An example.

You paste a 50-page PDF into Claude and ask a question. That PDF plus your question plus Claude's answer all have to fit inside the context window. If you then paste ANOTHER 50-page PDF, the first one might start getting squeezed out. Ask about page 3 of the first PDF later and Claude might not remember it - not because the model is bad, but because the page is no longer in its hands.
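You can do the back-of-the-envelope math yourself. This sketch assumes roughly 500 words per page and roughly 0.75 words per token; both are ballpark figures, not exact, and the function name is made up for illustration.

```python
# Ballpark assumptions (not exact): ~500 words per page,
# ~0.75 words per token (i.e. ~1.33 tokens per word).
WORDS_PER_PAGE = 500
WORDS_PER_TOKEN = 0.75

def pages_to_tokens(pages: int) -> int:
    """Rough token estimate for a document of the given page count."""
    return round(pages * WORDS_PER_PAGE / WORDS_PER_TOKEN)

one_pdf = pages_to_tokens(50)
print(one_pdf)                 # ≈ 33,000 tokens for a single 50-page PDF
print(one_pdf * 2 <= 4_000)    # two PDFs vs. an old 4,000-token window: False
```

One 50-page PDF alone would have blown through an old 4,000-token window more than eight times over, which is why pasting long documents only became practical as windows grew.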

Why it matters.

Bigger window = you can feed it more without it losing track. Claude Sonnet 4 supports a context window of up to 1 million tokens (roughly 750,000 words, about 10 full novels). Older models held 4,000 tokens (about 3,000 words, 10 pages). When an agent starts acting weird after a long session, the usual reason is that the important context has fallen out of the window. Fix it by summarizing, chunking, or starting fresh.
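Of those fixes, chunking is the easiest to picture in code. This is a minimal sketch, assuming word-based counting as a stand-in for a real tokenizer: split a long document into overlapping pieces that each fit the budget, then feed them in one at a time.

```python
def chunk(words: list[str], size: int, overlap: int) -> list[list[str]]:
    """Split a word list into chunks of `size` words, each sharing
    `overlap` words with the previous chunk so context isn't cut mid-thought."""
    step = size - overlap
    return [words[i:i + size] for i in range(0, len(words), step)]

doc = ("word " * 100).split()           # a 100-word stand-in document
pieces = chunk(doc, size=40, overlap=10)
print(len(pieces))                      # → 4 chunks cover the whole document
print(len(pieces[0]))                   # → 40 words in a full chunk
```

The overlap matters: without it, a sentence split exactly at a chunk boundary would lose its second half's context, which is the small-scale version of the same falling-off-the-window problem.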