Tool use

Tool use is how an agent acts. The model doesn't execute anything itself, it requests tool calls, the harness runs them, and the results become new input. Getting the mechanics right is half the battle.

The contract

You define a tool with three things:

  1. Name, a short identifier (web_search, read_file).
  2. Description, what the tool does, when to use it, when not to.
  3. Input schema. JSON schema describing arguments.

The model decides when to call a tool based on the description. Description quality matters more than the name. A mediocre description = a tool the model won't use or will misuse.

The loop

1. Send messages + tool definitions to the model
2. Model responds with either:
   - A final answer (done)
   - A tool_use block (model wants to call a tool)
3. Harness executes the tool, captures output
4. Harness sends back a tool_result referencing the tool_use ID
5. Model produces the next step (more tool calls or final answer)
6. Repeat until final answer or max_turns reached

Parallel tool calls

Modern Claude can request multiple tools in one turn. Your harness should execute them in parallel when possible:

// Model emits:
[tool_use: search("MCP spec")]
[tool_use: search("MCP vs API")]
[tool_use: fetch_url("https://modelcontextprotocol.io")]

// Harness runs all three concurrently,
// returns all three results in the next turn.

Parallel execution cuts latency dramatically on multi-search or multi-read tasks. Don't serialize by default.

Tool errors

Tools fail. The network times out. The file doesn't exist. Credentials expire. Your tool result should include error info in a shape the model can reason about:

{
  "status": "error",
  "error_type": "not_found",
  "message": "File /path/to/file.txt does not exist",
  "suggestion": "Check the path; list the directory first."
}

The model can read this and take a smarter next step (list the directory). If you return a raw stack trace, the model will often just retry the same bad call.

Input schemas that actually work

Common failure modes

Insight: Treat tool definitions like microservices. The description is the API docs. The input schema is the contract. Write them with the same care.