The agent architecture map

An agent looks simple on a slide: LLM, tools, loop. In production, it's a stack of components you need to design and operate. Here's the map.

The layers

  1. User interface: where the agent is invoked (chat, API, CLI, batch)
  2. Input processing: validation, authentication, session state
  3. Prompt assembly: system prompt, user prompt, context, memory
  4. LLM inference: the reasoning model call
  5. Tool registry: the functions the LLM can call, with descriptions
  6. Tool execution: calling the tool, handling errors and timeouts
  7. Memory: short-term (session) and long-term (persistent facts)
  8. Orchestration: the loop that drives reasoning until a stop condition
  9. Safety + guardrails: prompt injection defenses, content filters, rate limits
  10. Observability: tracing every step for debugging and eval
  11. Cost + latency control: budgets, timeouts, caching
  12. Output formatting: structured output, streaming, finalizing the response
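The layers above can be sketched as a composable stack. This is an illustrative sketch only: the class and field names (`Tool`, `AgentStack`, `budget_usd`, and so on) are invented for this example and don't come from any particular framework.

```python
# A sketch of the layer stack as plain data. Names are hypothetical.
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class Tool:
    name: str
    description: str          # what the LLM sees in the tool registry (layer 5)
    fn: Callable[..., Any]    # what tool execution actually calls (layer 6)
    timeout_s: float = 10.0   # per-tool timeout

@dataclass
class AgentStack:
    system_prompt: str                                       # prompt assembly (layer 3)
    tools: dict[str, Tool] = field(default_factory=dict)     # tool registry (layer 5)
    short_term: list[str] = field(default_factory=list)      # session memory (layer 7)
    long_term: dict[str, str] = field(default_factory=dict)  # persistent facts (layer 7)
    max_steps: int = 8        # orchestration stop condition (layer 8)
    budget_usd: float = 0.50  # cost control (layer 11)

    def register(self, tool: Tool) -> None:
        self.tools[tool.name] = tool

stack = AgentStack(system_prompt="You are a support agent.")
stack.register(Tool("lookup_order", "Fetch an order by id",
                    fn=lambda oid: {"id": oid}))
```

Keeping the registry as data (name, description, callable, timeout) rather than hard-coding tools into the loop is what lets the other layers, such as guardrails and observability, wrap tool calls uniformly.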

The flow

  1. User input arrives
  2. System builds context: system prompt + user prompt + relevant memory
  3. Agent calls LLM with context and available tools
  4. LLM responds: either a final answer or a tool call request
  5. If tool call: orchestrator executes, captures result, adds to context
  6. Loop continues until final answer or max steps
  7. Final response formatted and returned
  8. Session state persisted
  9. Trace logged for observability
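The flow above is a loop around the model call. Here is a minimal runnable sketch of steps 2 through 6, with the LLM stubbed out so the example is self-contained; `fake_llm`, `TOOLS`, and `MAX_STEPS` are all illustrative stand-ins, and real code would call a model provider instead.

```python
# Minimal orchestration loop: build context, call LLM, execute tool calls,
# loop until a final answer or the step budget runs out.
import json

def fake_llm(context: list[dict]) -> dict:
    # Stub: request one tool call, then produce a final answer.
    if not any(m["role"] == "tool" for m in context):
        return {"tool_call": {"name": "get_time", "args": {}}}
    return {"final": "It is noon."}

TOOLS = {"get_time": lambda: "12:00"}
MAX_STEPS = 5

def run_agent(user_input: str) -> str:
    # Step 2: build context from system prompt + user prompt.
    context = [{"role": "system", "content": "You are helpful."},
               {"role": "user", "content": user_input}]
    for _ in range(MAX_STEPS):                  # step 6: bounded loop
        reply = fake_llm(context)               # steps 3-4: LLM inference
        if "final" in reply:
            return reply["final"]               # final answer ends the loop
        call = reply["tool_call"]               # step 5: execute the tool...
        try:
            result = TOOLS[call["name"]](**call["args"])
        except Exception as e:
            result = f"tool error: {e}"         # errors also go back into context
        context.append({"role": "tool", "content": json.dumps(result)})
    return "Stopped: max steps reached."

print(run_agent("what time is it?"))  # → It is noon.
```

Note that tool errors are appended to the context rather than raised: the model gets a chance to recover, and the max-steps bound guarantees the loop still terminates.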

Where things break

The minimum viable production agent

Miss any of these layers and your agent will fail in production. Not immediately, but silently, over time, in ways that are hard to diagnose without the missing piece.
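Observability is the layer that makes the other failures diagnosable, and it is cheap to add from day one. A minimal sketch, assuming nothing beyond the standard library; the `Trace` class and event names are invented for illustration.

```python
# A minimal per-request trace: every step gets a timestamped event,
# and the whole trace serializes to JSON for a log pipeline.
import json
import time
import uuid

class Trace:
    def __init__(self) -> None:
        self.trace_id = str(uuid.uuid4())
        self.events: list[dict] = []

    def log(self, step: str, **data) -> None:
        # Record one step of the agent loop with arbitrary metadata.
        self.events.append({"t": time.time(), "step": step, **data})

    def dump(self) -> str:
        return json.dumps({"trace_id": self.trace_id, "events": self.events})

trace = Trace()
trace.log("llm_call", tokens_in=512)
trace.log("tool_exec", tool="lookup_order", latency_ms=42)
print(trace.dump())
```

Even this much, one trace id per request and one event per step, turns "the agent gave a weird answer" into a replayable sequence of model calls and tool results.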