Home›Framework›Claude›Claude, overview

Claude, overview

📖 4 min readUpdated 2026-04-18

Claude is the model at the center of everything else on this site. Before you read about MCP or Claude Code or agent patterns, it helps to understand what Claude actually is, what makes it a good choice for agent work, and how to pick between the different Claude variants. This page is the short answer, with enough detail to make the rest of the framework make sense.

What Claude is.

Claude is a family of large language models made by Anthropic. "Large language model" is a long phrase for what it is: software that, given a prompt, continues the text in a way that looks like what a smart human would write. That's the whole mechanism. Everything clever Claude does, answering questions, writing code, calling tools, reasoning through problems, is built on top of "predict the next likely piece of text."

What makes Claude specifically useful for autonomous AI is that Anthropic built it with agents in mind. Other LLMs were mostly designed for chat and bolted on tool use later. Claude has tool use, long context, and careful reasoning as first-class design goals from the start.

The Claude family: Opus, Sonnet, Haiku.

At any given generation, Claude ships as three tiers. They're the same model family with different sizes and speeds. Think of them like Large / Medium / Small.

~ the Claude family ~

The practical rule: default to Sonnet. It's a great general-purpose model for agent work. Reach for Opus when you're hitting the ceiling on hard reasoning. Drop to Haiku when you need to run many calls in parallel for worker tasks and the complexity is low.

A common pattern in multi-agent setups is "Opus plans, Haiku executes." The smart model makes a plan, then spawns Haiku sub-agents to carry out the easy steps in parallel. You get the quality of Opus and the speed/cost of Haiku.

What makes Claude good for agents specifically.

~ Claude's agent-specific strengths ~

Tool use is the big one. Agents live or die by how reliably they can call a sequence of tools across many turns. Claude is consistently the most stable in long agent loops; fewer dropped calls, fewer weird tool-argument mistakes.

System-prompt following matters because autonomous agents depend on you being able to put rules in the prompt and have them actually persist. Claude is noticeably better at keeping a complex set of rules in mind across a long conversation.

Long context means the window can hold a lot at once. At 1M tokens, you can fit an entire medium-sized codebase, a long session's worth of tool results, or a big document set. This changes what's practical.

Extended thinking lets the model reason privately before it answers. You give it a budget ("think for up to 10k tokens before responding") and it uses the space to work through the problem. See Extended thinking for when to use it.

Prompt caching makes long-running agents economically viable. When your system prompt is big (lots of tool descriptions, rules, context), each request normally pays for all of that. With caching, you pay once and get ~90% off for subsequent requests that share the prefix. See Prompt caching.

Safety tuning means Claude is trained to refuse clearly harmful tool calls even when the user asks for them, to flag prompt-injection attempts in tool output, and to be careful with irreversible actions. It's not bulletproof, but it's one more layer.

How Claude compares to the other big models.

Honest take, without the vendor pitch:

Pure language tasks (writing, translation, summary), GPT-4, Claude, and Gemini are all comparable in 2026. Pick on price and latency.
Agent work, Claude has a consistent edge on multi-turn tool-use stability and complex system-prompt adherence. Not huge, but noticeable when you're shipping.
Code, Claude and GPT-4 trade blows. For agentic coding (inside Claude Code, Cursor, etc.), Claude is usually the better pick.
Image reasoning, Gemini is often ahead; Claude and GPT-4 are comparable.
Cost, varies by variant. Haiku is cost-competitive with small OpenAI and Gemini models.

Vendor lock-in is a real thing, though. If you're building for the long term, prefer abstractions (OpenRouter, LiteLLM, or your own thin wrapper) that let you swap providers without rewriting your agent. You gain resilience and negotiation leverage without giving up the ability to use the best model per task.

Three ways you actually use Claude.

You'll encounter Claude through three different surfaces, and it's worth knowing when to use which:

The Claude API, direct programmatic access. This is what you use when you're building your own custom agent from scratch. Most flexible, most work.
Claude.ai, the web/desktop chat interface with MCP connectors. Good for exploration, non-technical users, and personal productivity. Not the right choice for production agents.
Claude Code, the CLI agent harness. Anthropic's opinionated wrapper around the API that ships permissions, hooks, skills, MCP support, and memory. For most agent builders this is the right starting point. See Claude Code overview.

Pricing, at a glance.

Claude is priced per token, input and output counted separately. A few things to know:

Output costs ~5× input. The model pays more for what it generates than for what you send it.
Extended thinking counts as output. A reasoning budget of 10k tokens costs what 10k output tokens cost. Worth it on hard problems; wasted on easy ones.
Prompt caching gets you ~90% off cached portions. Huge for agents with stable system prompts.
Opus > Sonnet > Haiku on price, roughly 5× per tier. Opus is expensive; Haiku is cheap enough to run in parallel.

For a well-designed agent with caching + the right model per task, the economics usually work out much better than you'd expect. The raw per-token prices look high; once you account for caching and tier selection, a typical daily-running agent costs single-digit dollars a month.

Where to go from here.

Prompting for agents, how prompting changes when the model's running in a loop with tools
Extended thinking, when to turn on the reasoning budget and when not to
Tool use, the mechanics of how tool calls actually work