ReAct

📖 4 min read · Updated 2026-04-19

ReAct stands for Reason + Act. It's the foundational agent loop: the model thinks out loud about what to do, calls a tool, reads the result, thinks again, calls another tool, and keeps going until it's done. Almost every agent you've ever used runs some flavor of ReAct. If you only learn one loop pattern, learn this one. Everything else (planning, reflection, multi-agent) is a layer on top.

Why "thinking out loud" is the trick

The original insight from the ReAct paper: when you make the model write its reasoning before it picks an action, its actions get noticeably better. The model commits to a rationale ("I need to find X, so I'll search"), then acts on that rationale. The reasoning also stays in the context, so subsequent steps build on it instead of starting fresh. Same model, same tools, higher success rate, just because the reasoning got written down.

The loop, drawn out
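
In code, the loop can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `react_loop`, the step dictionary format, the `scripted_model` stand-in for a real LLM call, and the `lookup` tool are all hypothetical names chosen for this example.

```python
from typing import Any, Callable, Dict, List

# Assumed step format (hypothetical): the model returns either
#   {"thought": str, "action": str, "args": dict}  -> call a tool
#   {"thought": str, "final": str}                 -> stop and answer
Step = Dict[str, Any]


def react_loop(model: Callable[[List[Step]], Step],
               tools: Dict[str, Callable[..., str]],
               max_steps: int = 10) -> str:
    history: List[Step] = []  # thoughts, actions and observations so far
    for _ in range(max_steps):
        step = model(history)  # Reason: the model sees everything so far
        if "final" in step:
            return step["final"]
        observation = tools[step["action"]](**step["args"])   # Act
        history.append({**step, "observation": observation})  # Observe
    raise RuntimeError("step cap reached without a final answer")


# Stand-in for a real LLM call, scripted so the example runs offline.
def scripted_model(history: List[Step]) -> Step:
    if not history:
        return {"thought": "I need the capital, so I'll look it up.",
                "action": "lookup", "args": {"key": "capital_of_france"}}
    return {"thought": "I have what I need.",
            "final": f"The capital is {history[-1]['observation']}."}


tools = {"lookup": lambda key: {"capital_of_france": "Paris"}[key]}
answer = react_loop(scripted_model, tools)
```

The second call to `scripted_model` reads the observation from the first step, which is the "build on it instead of starting fresh" behavior: every thought, action, and observation stays in `history`.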

What a trace actually looks like

Thought: User wants the current AAPL price. I don't have real-time data. I'll search.
Action: web_search(query="AAPL stock price today")
Observation: AAPL is $178.42 as of 3:45pm ET, up 0.8%.
Thought: I have the price. No more tools needed.
Final answer: Apple (AAPL) is trading at $178.42, up 0.8% today.

Three things to notice: the model wrote a short rationale before every action, the action named a specific tool with specific arguments, and the observation came back and was referenced in the next thought. That's the whole pattern.
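
A trace in this text format can be split back into steps with a few regular expressions. The sketch below assumes exactly the labels and `name(arg="value")` action syntax shown above; `parse_trace` and `parse_action` are hypothetical helpers, and many real systems use a structured tool-call format rather than parsing free text.

```python
import re
from typing import Dict, List, Tuple

TRACE = """Thought: User wants the current AAPL price. I don't have real-time data. I'll search.
Action: web_search(query="AAPL stock price today")
Observation: AAPL is $178.42 as of 3:45pm ET, up 0.8%.
Thought: I have the price. No more tools needed.
Final answer: Apple (AAPL) is trading at $178.42, up 0.8% today."""

LINE_RE = re.compile(r"^(Thought|Action|Observation|Final answer):\s*(.*)$")
ACTION_RE = re.compile(r"^(\w+)\((.*)\)$")
ARG_RE = re.compile(r'(\w+)="([^"]*)"')  # only handles string arguments


def parse_trace(text: str) -> List[Tuple[str, str]]:
    """Split a trace into (label, content) pairs, skipping unlabeled lines."""
    steps = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            steps.append((match.group(1), match.group(2)))
    return steps


def parse_action(action: str) -> Tuple[str, Dict[str, str]]:
    """Turn 'tool(arg="value")' into ('tool', {'arg': 'value'})."""
    match = ACTION_RE.match(action)
    if match is None:
        raise ValueError(f"not a tool call: {action!r}")
    name, raw_args = match.groups()
    return name, dict(ARG_RE.findall(raw_args))


steps = parse_trace(TRACE)
tool_name, tool_args = parse_action(steps[1][1])
```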

Stop conditions (you must set these)

A ReAct loop will happily run forever if you let it. Real production loops combine several stop conditions, for example a hard cap on steps and a halt when the model repeats the same action.
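
A sketch of how such checks might be combined into one function. The step cap and the repeated-action check come from elsewhere in this article; the wall-clock budget, the default values, and the names `stop_reason` and `same_call` are assumptions added for illustration.

```python
import time
from typing import Any, Dict, List, Optional

Step = Dict[str, Any]  # hypothetical format: {"action": ..., "args": ...} or {"final": ...}


def same_call(a: Step, b: Step) -> bool:
    """True when two steps call the same tool with the same arguments."""
    return a["action"] == b["action"] and a["args"] == b["args"]


def stop_reason(history: List[Step],
                started_at: float,
                max_steps: int = 10,
                max_seconds: float = 60.0,
                now: Optional[float] = None) -> Optional[str]:
    """Return why the loop should stop, or None to keep going."""
    now = time.monotonic() if now is None else now
    if history and "final" in history[-1]:
        return "final_answer"
    actions = [step for step in history if "action" in step]
    if len(actions) >= max_steps:
        return "step_cap"
    if len(actions) >= 2 and same_call(actions[-1], actions[-2]):
        return "repeated_action"
    if now - started_at > max_seconds:  # assumed budget, not from the article
        return "time_budget"
    return None


call = {"action": "web_search", "args": {"query": "AAPL stock price today"}}
reason = stop_reason([call, dict(call)], started_at=0.0, now=1.0)
```

Calling this once per iteration keeps every exit path in one place, and the returned string tells you which limit fired when you read the logs later.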

A worked example: "What's the weather in my meeting city?"

Thought: "I need to know where the user's next meeting is." → Action: get_calendar_next_event().
Observation: "Meeting in Austin, 3pm tomorrow."
Thought: "Tomorrow's Austin weather." → Action: weather_forecast(city="Austin", date="tomorrow").
Observation: "86°F, light rain."
Thought: "Got it, I can answer now." → Final: "Austin will be 86°F with light rain when you're there tomorrow afternoon."

Two tool calls, three thoughts, clean answer. The model didn't know the city until the first observation came back, so the weather call couldn't have been planned up front. ReAct fits precisely because the path depends on what you learn along the way.

The four failure modes

Tool hallucination. Model calls a tool that doesn't exist, or passes arguments that don't match the tool's schema. Fix: tight tool schemas + validation before execution.
Premature answer. Model stops exploring and guesses. Fix: system prompt should say "call tools when uncertain" and give examples.
Infinite loop. Same tool, same args, over and over. Fix: detect duplicate consecutive actions, halt with an error.
Lost-in-context. At turn 15 the context is huge and the model forgets what it was doing. Fix: summarize old observations every N steps, or switch to a planning loop for long tasks.
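
Two of these fixes are easy to sketch: validating a tool call before running it, and compacting old observations. Everything here is illustrative: the schema format (argument name mapped to a Python type), the helper names, and the placeholder summarizer are assumptions, and a real summarizer would more likely be another LLM call.

```python
from typing import Any, Callable, Dict, List

# Hypothetical schema format: tool name -> {argument name: expected type}
TOOL_SCHEMAS: Dict[str, Dict[str, type]] = {
    "web_search": {"query": str},
    "weather_forecast": {"city": str, "date": str},
}


def validate_call(name: str, args: Dict[str, Any],
                  schemas: Dict[str, Dict[str, type]] = TOOL_SCHEMAS) -> List[str]:
    """Return a list of problems; an empty list means the call can run."""
    if name not in schemas:
        return [f"unknown tool: {name}"]
    schema = schemas[name]
    problems = [f"missing argument: {key}" for key in schema if key not in args]
    problems += [f"unexpected argument: {key}" for key in args if key not in schema]
    problems += [f"wrong type for {key}: expected {schema[key].__name__}"
                 for key in args
                 if key in schema and not isinstance(args[key], schema[key])]
    return problems


def placeholder_summary(old: List[str]) -> str:
    # Stand-in for a real summarizer; it only records how much was dropped.
    return f"[summary of {len(old)} earlier observations]"


def compact_observations(observations: List[str], keep_last: int = 3,
                         summarize: Callable[[List[str]], str] = placeholder_summary
                         ) -> List[str]:
    """Replace all but the last `keep_last` observations with one summary."""
    split = len(observations) - keep_last
    if split <= 0:
        return list(observations)
    return [summarize(observations[:split])] + observations[split:]


problems = validate_call("weather_forecast", {"city": "Austin"})
compacted = compact_observations(["a", "b", "c", "d", "e"], keep_last=2)
```

Feeding the `problems` list back to the model as an observation, rather than raising, gives it a chance to correct the call on the next step.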

ReAct variants you'll meet

Plan-then-execute: write the whole plan up front, then run ReAct inside each plan step. Structure without rigidity.
Reflection: after each answer, a second LLM call critiques it; the agent revises. Higher quality, double the cost.
Multi-agent: one agent's "tool" is another agent. Orchestrator-worker is the usual shape.
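
As one illustration, the reflection variant can be sketched as a small wrapper around any function that produces an answer. The names `answer_with_reflection`, `draft`, `critique`, and `revise` are hypothetical, and the stubs below stand in for what would be LLM calls in practice.

```python
from typing import Callable, Optional


def answer_with_reflection(draft: Callable[[], str],
                           critique: Callable[[str], Optional[str]],
                           revise: Callable[[str, str], str],
                           max_rounds: int = 2) -> str:
    """Draft an answer, then critique and revise it up to `max_rounds` times."""
    answer = draft()
    for _ in range(max_rounds):
        feedback = critique(answer)  # second call: None means "looks fine"
        if feedback is None:
            break
        answer = revise(answer, feedback)
    return answer


# Offline stubs so the sketch runs without a model.
def draft() -> str:
    return "Austin will be 86°F tomorrow."


def critique(answer: str) -> Optional[str]:
    return None if "rain" in answer else "Mention the rain."


def revise(answer: str, feedback: str) -> str:
    return answer.rstrip(".") + " with light rain."


result = answer_with_reflection(draft, critique, revise)
```

The `max_rounds` cap matters for the same reason the step cap does: a critic that is never satisfied would otherwise keep the loop running.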

When ReAct is the wrong shape

If you can write the sequence of steps in advance, skip ReAct. ReAct's value is exploration. If there's nothing to explore, the reasoning overhead is wasted tokens and latency. Use a deterministic workflow and save 70% on both.

What to do with this

Build a minimum ReAct loop with 3 tools and a 10-step cap. You'll learn more from that than from reading.
Read tool use for the mechanics of the action step.
Read planning loops for tasks where the path is partially knowable up front.