Multi-agent orchestration
📖 5 min readUpdated 2026-04-18
Sometimes one agent isn't the right unit. Splitting work across multiple agents, with isolated context and specialized roles, can outperform a monolith. It can also be over-engineering. Know which is which.
The basic pattern: orchestrator-worker
One "orchestrator" agent decomposes a task into sub-tasks. Each sub-task is handed off to a "worker" agent with its own fresh context. Workers return results; the orchestrator synthesizes the final answer.
User goal → Orchestrator plans → spawn Worker 1, Worker 2, Worker 3
↓
Orchestrator collects results → synthesizes → returns
Why it helps
- Context isolation. Each worker starts clean, the orchestrator doesn't fill up with every worker's scratch work.
- Parallelism. Independent workers run concurrently. A 10-minute task becomes 3 minutes.
- Specialization. Different workers can use different models or different tool sets.
- Reusability. Workers become reusable functions.
Where it backfires
- Latency explosion. Each spawned agent is a model call. If workers need to be serial, you've multiplied latency, not reduced it.
- Coordination cost. Orchestrator now has to describe the task, parse the result, handle errors. Sometimes this overhead exceeds the task itself.
- Lost context. Worker doesn't know what the orchestrator learned in previous steps unless you pass it explicitly.
- Error propagation. One worker fails, how does the orchestrator know? What does it do?
When to use it
Good fits:
- Independent research subtasks (search A, search B, search C → synthesize)
- Parallel file processing (agent per file)
- Role-based workflows (one agent writes code, another reviews)
- Context-heavy sub-tasks you don't want polluting the main context
Bad fits:
- Linear, dependent steps (step 2 needs step 1's result)
- Short tasks (overhead > benefit)
- Tasks where the orchestrator could have just done it
Delegation discipline
If you're spawning a sub-agent, the parent must:
- Write a complete task description. The worker has no memory of the parent's thinking.
- Specify the return format. Unstructured worker output is a pain to consume.
- Set a time/turn budget. Workers can spin forever.
- Handle worker failure. Assume it will fail; decide what happens.
Claude Code sub-agents
Claude Code has built-in support for sub-agents. You can invoke specialized agents (Explore, Plan, general-purpose) that run with isolated context and return a single message back to the parent. For research-heavy or branching work, sub-agents are a major productivity unlock.
Peer-to-peer (usually avoid)
Two agents that chat directly, no orchestrator. Popular in demos. Rarely worth it in production, most "P2P agent" setups are better modeled as orchestrator-worker with one role delegating.
Watch out: multi-agent architectures are fashionable. That doesn't make them right. The single-agent ReAct baseline is often faster, cheaper, and more debuggable.