The agent architecture map
📖 5 min read · Updated 2026-04-19
An agent looks simple in a slide: LLM, tools, loop. In production, it's a stack of components you need to design and operate. Here's the map.
The layers
- User interface: where the agent is invoked (chat, API, CLI, batch)
- Input processing: validation, authentication, session state
- Prompt assembly: system prompt, user prompt, context, memory
- LLM inference: the reasoning model call
- Tool registry: the functions the LLM can call, with descriptions
- Tool execution: calling the tool, handling errors, timeouts
- Memory: short-term (session), long-term (persistent facts)
- Orchestration: the loop that drives reasoning until a stop condition
- Safety + guardrails: prompt injection defenses, content filters, rate limits
- Observability: tracing every step for debugging and eval
- Cost + latency control: budgets, timeouts, caching
- Output formatting: structured output, streaming, finalizing response
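Two of the layers above, the tool registry and short-term memory, can be sketched in a few lines. This is an illustrative sketch, not any real framework's API; all class and field names here (`Tool`, `ToolRegistry`, `SessionMemory`) are assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Tool:
    name: str
    description: str          # what the LLM reads when choosing a tool
    fn: Callable[..., str]    # the function the orchestrator actually runs

@dataclass
class SessionMemory:
    # Short-term memory: lives only as long as this session
    turns: list = field(default_factory=list)

    def add(self, role: str, content: str) -> None:
        self.turns.append({"role": role, "content": content})

class ToolRegistry:
    def __init__(self) -> None:
        self._tools: dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    def describe(self) -> list[dict]:
        # The descriptions sent to the LLM alongside the prompt
        return [{"name": t.name, "description": t.description}
                for t in self._tools.values()]

    def call(self, name: str, **kwargs) -> str:
        if name not in self._tools:
            # The LLM asked for a tool that doesn't exist: surface it loudly
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name].fn(**kwargs)
```

The point of splitting registry from execution is that descriptions go to the model while execution stays under the orchestrator's control, including the error path.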
The flow
- User input arrives
- System builds context: system prompt + user prompt + relevant memory
- Agent calls LLM with context and available tools
- LLM responds: either a final answer or a tool call request
- If tool call: orchestrator executes, captures result, adds to context
- Loop continues until final answer or max steps
- Final response formatted and returned
- Session state persisted
- Trace logged for observability
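The flow above can be sketched as a single loop. The LLM is stubbed out (`fake_llm`) so the example runs standalone; the tool, its canned result, and the message format are all illustrative assumptions, but the control flow mirrors the steps listed.

```python
MAX_STEPS = 10

def fake_llm(context: list) -> dict:
    # Stub standing in for a real model call: it requests one tool call,
    # then returns a final answer once a tool result is in context.
    if any(m["role"] == "tool" for m in context):
        return {"type": "final", "content": "It is 7 degrees in Oslo."}
    return {"type": "tool_call", "name": "get_weather", "args": {"city": "Oslo"}}

def get_weather(city: str) -> str:
    return f"7 degrees in {city}"      # canned result in place of a real API

TOOLS = {"get_weather": get_weather}

def run_agent(user_input: str) -> str:
    # Step 2: build context from system prompt + user input
    context = [{"role": "system", "content": "You are a helpful agent."},
               {"role": "user", "content": user_input}]
    for _ in range(MAX_STEPS):                  # stop condition: max steps
        reply = fake_llm(context)               # step 3: call the LLM
        if reply["type"] == "final":            # stop condition: final answer
            return reply["content"]
        # Step 5: execute the tool, capture the result, add it to context
        result = TOOLS[reply["name"]](**reply["args"])
        context.append({"role": "tool", "content": result})
    return "Error: step budget exhausted"
```

Swap `fake_llm` for a real model call and `TOOLS` for a real registry and this is the skeleton of a ReAct-style orchestrator.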
Where things break
- Prompt assembly: context too long, too short, stale memory
- Tool descriptions: LLM calls wrong tools with wrong args
- Tool execution: slow APIs, rate limits, error responses
- Infinite loops: agent keeps trying, never reaches stop
- Hallucination in tool args: LLM invents parameter values
- Runaway cost: long sessions, no budget enforcement
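Two of these failure modes, runaway cost and sessions that never stop, have the same cure: a hard budget checked on every call. A minimal sketch, with made-up token prices and limits:

```python
import time

class BudgetExceeded(Exception):
    pass

class SessionBudget:
    def __init__(self, max_usd: float, max_seconds: float) -> None:
        self.max_usd = max_usd
        self.max_seconds = max_seconds
        self.spent_usd = 0.0
        self.started = time.monotonic()

    def charge(self, input_tokens: int, output_tokens: int) -> None:
        # Illustrative pricing: $3 / $15 per million input/output tokens
        self.spent_usd += input_tokens * 3e-6 + output_tokens * 15e-6
        if self.spent_usd > self.max_usd:
            raise BudgetExceeded(
                f"cost ${self.spent_usd:.4f} exceeds budget ${self.max_usd}")

    def check_clock(self) -> None:
        # Wall-clock timeout, independent of token spend
        if time.monotonic() - self.started > self.max_seconds:
            raise BudgetExceeded("session timeout")
```

The orchestrator calls `charge` after every LLM response and `check_clock` at the top of every loop iteration, so a runaway session dies on the next step rather than at the end of the month.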
The minimum viable production agent
- System prompt with clear goal + tool instructions
- 3-8 well-described tools
- ReAct-style loop with max 10-20 steps
- Structured output schema
- Tracing (every LLM call + tool call logged)
- Cost budget per session
- Timeout per session
- Error handling on every tool
- Eval set of at least 30 test cases
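The tracing item in the list above is the one most often skipped and most painful to retrofit. A minimal sketch, one JSON line per event; the event fields and names here are illustrative, not a standard:

```python
import json
import time
import uuid

class Trace:
    def __init__(self) -> None:
        self.session_id = str(uuid.uuid4())
        self.events: list[dict] = []

    def log(self, kind: str, **fields) -> None:
        self.events.append({"ts": time.time(), "kind": kind, **fields})

    def dump(self) -> str:
        # One JSON object per line: easy to grep, easy to replay into an eval
        return "\n".join(json.dumps(e) for e in self.events)

trace = Trace()
trace.log("llm_call", model="some-model", input_tokens=812, output_tokens=64)
trace.log("tool_call", name="get_weather", args={"city": "Oslo"}, ok=True)
```

Logging every LLM call and tool call this way means a failed session can be diagnosed step by step, and the same traces double as raw material for the eval set.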
Skip any of these and your agent will fail in production: not immediately, but silently, over time, in ways that are hard to diagnose without the missing piece.