Reflection

Reflection: after the agent produces an answer or a plan, it's prompted to critique itself. "Is this correct? Is anything missing? What would I change?" The revised answer is often meaningfully better than the first.

The pattern

  1. Agent produces initial output (answer, plan, code, whatever)
  2. Reflection step: another LLM call asks "review this output, find issues, suggest improvements"
  3. Revision step: agent produces a new version incorporating the reflection
  4. Optional: loop reflection and revision until the critic is satisfied or an iteration cap is hit
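The four steps above can be sketched as a small loop. This is a minimal illustration, not a real client: `call_llm` is a hypothetical stand-in for your model API, stubbed here so the control flow runs on its own.

```python
def call_llm(prompt: str) -> str:
    # Stub model: flags a missing edge case once, then approves.
    # Replace with a real model call in practice.
    if "review this output" in prompt.lower():
        return "NO ISSUES" if "edge case" in prompt else "Missing: handle the empty-input edge case."
    if "revise" in prompt.lower():
        return "Sorted result, now handling the empty-input edge case."
    return "Sorted result."

def reflect_and_revise(task: str, max_iterations: int = 3) -> str:
    output = call_llm(f"Task: {task}")  # 1. initial output
    for _ in range(max_iterations):
        # 2. reflection step: a second call critiques the output
        critique = call_llm(
            f"Review this output, find issues, suggest improvements:\n{output}"
        )
        if "NO ISSUES" in critique:  # 4. stop when the critic is satisfied
            break
        # 3. revision step: produce a new version incorporating the critique
        output = call_llm(
            f"Revise the output to address this critique:\n"
            f"Critique: {critique}\nOutput: {output}"
        )
    return output
```

The stop condition matters: without the `NO ISSUES` check and the iteration cap, the loop has no reason to terminate.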

When reflection wins

Complex, multi-step outputs: reasoning chains, plans, generated code. These have many places to go wrong, and a second look can actually catch the errors.

When it doesn't

Simple factual retrieval. If the model didn't know the fact the first time, reflecting won't conjure it; the revision mostly restates the original answer.

Self vs separate critic

Two approaches:

  1. Self-critique: the same model reviews its own output in a second call. Cheap, but it shares the blind spots that produced the first draft.
  2. Separate critic: a different model, or the same model with a dedicated critic prompt, reviews the output. Costs more, catches more.

For high-stakes outputs (production code, published content), separate critic is worth the cost.
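The two configurations differ only in which model (or role) does the reviewing. A hypothetical sketch, with `generate` stubbed in place of a real client and the model names (`generator-model`, `critic-model`) invented for illustration:

```python
def generate(prompt: str, model: str = "generator-model") -> str:
    # Stub: a real implementation would call the named model here.
    return f"[{model}] response to: {prompt}"

def self_critique(draft: str) -> str:
    # Same model critiques its own output.
    return generate(f"Critique this draft: {draft}", model="generator-model")

def separate_critique(draft: str) -> str:
    # A second model, prompted only to find problems.
    return generate(f"Critique this draft: {draft}", model="critic-model")
```

Swapping the model name is the whole difference in code; the quality difference comes from the critic not sharing the generator's blind spots.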

The empirical gain

On standard benchmarks, reflection typically improves output quality by 5-20% over a single pass, with the bigger gains on complex reasoning tasks and the smaller ones on factual retrieval.

Infinite reflection loops

Without a stop condition, reflection can run forever: the critic will always find something to nitpick. Cap at 2-3 iterations, exit early when the critic approves, and accept diminishing returns.