Orchestrator-worker
📖 3 min read · Updated 2026-04-19
Orchestrator-worker is the most useful multi-agent pattern in production. One agent is the orchestrator: it reads the task, breaks it up, hands pieces to specialized workers, and stitches the results. The workers are themselves agents, but narrow. Each has its own system prompt, its own tools, its own budget. This pattern is a workflow engine where the workflow is designed by an agent.
The shape
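In code, the shape is roughly this. A minimal sketch with planning and synthesis stubbed out; `orchestrate`, `Worker`, and the lambda workers are illustrative names, not a real API:

```python
# Sketch of the orchestrator-worker shape: the orchestrator plans subtasks,
# dispatches each to a narrow worker, and stitches the results together.
from typing import Callable

Worker = Callable[[str], str]  # a worker takes a subtask, returns a result

def orchestrate(task: str, workers: dict[str, Worker]) -> str:
    # Planning is normally an LLM call; here it is stubbed as a fixed split.
    plan = [("research", task), ("draft", task)]
    results = [workers[name](subtask) for name, subtask in plan]
    # Synthesis is also normally an LLM call; here we just join.
    return " | ".join(results)

workers = {
    "research": lambda t: f"facts about {t}",
    "draft": lambda t: f"draft covering {t}",
}
print(orchestrate("serverless DBs", workers))
```

The point of the shape: the orchestrator only sees named workers and their results, never the workers' internal tool calls.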
Why split the work
- Specialization. Each worker gets a tight system prompt for its domain. The research worker doesn't have the coding tools cluttering its prompt.
- Parallelism. Independent subtasks run concurrently. Total wall-clock is max(workers) instead of sum.
- Cost control. Orchestrator can use an expensive smart model; cheap workers handle bulk.
- Modularity. One worker breaks, you fix it in isolation without touching the others.
- Tool isolation. Workers only have the tools they need. Smaller attack surface, fewer wrong-tool mistakes.
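The parallelism point is easy to demonstrate with stubs: replace each worker's LLM-plus-tools loop with a sleep and the wall-clock comes out near the max, not the sum.

```python
# Sketch of the parallelism claim: independent workers run concurrently,
# so total wall-clock is roughly max(worker latencies), not their sum.
import asyncio
import time

async def worker(name: str, latency: float) -> str:
    await asyncio.sleep(latency)  # stands in for an LLM + tool-call loop
    return name

async def main() -> float:
    start = time.monotonic()
    await asyncio.gather(worker("research", 0.2), worker("outline", 0.1))
    return time.monotonic() - start

elapsed = asyncio.run(main())
print(f"{elapsed:.2f}s")  # ~0.2s (the max), not 0.3s (the sum)
```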
When it's worth it vs overkill
Splitting pays off when subtasks are independent enough to run in parallel, need different tools, or when a single agent's loop would balloon into dozens of calls with a bloated context. It's overkill when the whole task is a handful of tool calls: then the orchestration overhead costs more than it saves.
Workers are just agents with narrower scope
Every worker is itself a small ReAct agent:
- Own system prompt, tuned to its specialty.
- Own subset of tools (research worker: web_search, fetch_page, summarize).
- Own loop with its own step + cost caps.
- Returns a structured result (not free-form text) to the orchestrator.
The orchestrator treats each worker as a single tool call: research_worker(topic: "..."). From the orchestrator's point of view, it's making tool calls. Under the hood, each tool call is a whole sub-agent running.
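A minimal sketch of that boundary, with a stubbed `research_worker` (the names, step cap, and stopping condition are all illustrative): the worker runs its own capped loop and hands back a structured result, while the caller sees one function call.

```python
# Sketch: a worker exposed to the orchestrator as a single "tool call".
from dataclasses import dataclass

@dataclass
class ResearchResult:          # structured output, not free-form text
    facts: list[str]
    sources: list[str]

MAX_STEPS = 8                  # the worker's own step cap

def research_worker(topic: str) -> ResearchResult:
    facts, sources = [], []
    for step in range(MAX_STEPS):     # the worker's own ReAct-style loop
        # each step would call the model + web_search / fetch_page here
        facts.append(f"fact {step} about {topic}")
        sources.append(f"https://example.com/{step}")
        if step >= 1:                 # stub stopping condition
            break
    return ResearchResult(facts=facts, sources=sources)

# From the orchestrator's side, this is just one tool call:
result = research_worker("serverless databases 2026")
print(len(result.facts), len(result.sources))
```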
A worked example: writing a technical report
Task: "Write a 1000-word piece about the state of serverless databases in 2026."
- Orchestrator plans: need research, need an outline, need a draft. The two research calls can run in parallel; the outline depends on the research, and the draft depends on both.
- In parallel:
research_worker(topic: "serverless databases 2026") → uses web_search, fetch_page, returns key facts + 10 source URLs.
research_worker(topic: "limitations of serverless DBs") → returns counter-arguments.
- After research:
outline_worker(facts: ..., style: "technical") → returns a numbered outline.
- Draft:
draft_worker(outline: ..., facts: ..., word_count: 1000) → returns the piece.
- Orchestrator synthesizes: combines the draft with a citation footer from the research URLs, returns to user.
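With the four worker calls stubbed out, the orchestration itself is just this dependency structure (all names and return shapes here are illustrative):

```python
# Sketch of the report flow: two research calls run in parallel, then
# outline, then draft, then the orchestrator synthesizes with citations.
import asyncio

async def research_worker(topic: str) -> dict:
    return {"facts": [f"fact about {topic}"],
            "urls": [f"https://example.com/{topic}"]}

async def outline_worker(facts: list[str], style: str) -> list[str]:
    return [f"{i + 1}. {f}" for i, f in enumerate(facts)]

async def draft_worker(outline: list[str], facts: list[str],
                       word_count: int) -> str:
    return f"Draft ({word_count} words) covering {len(outline)} sections."

async def run(task: str) -> str:
    # Phase 1: independent research, in parallel.
    r1, r2 = await asyncio.gather(
        research_worker("serverless databases 2026"),
        research_worker("limitations of serverless DBs"),
    )
    facts = r1["facts"] + r2["facts"]
    # Phase 2: outline depends on research.
    outline = await outline_worker(facts, style="technical")
    # Phase 3: draft depends on both; then synthesize with a citation footer.
    draft = await draft_worker(outline, facts, word_count=1000)
    citations = "\n".join(r1["urls"] + r2["urls"])
    return f"{draft}\n\nSources:\n{citations}"

print(asyncio.run(run("serverless report")))
```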
The same task done single-agent would have been 30+ tool calls in one confusing loop with a bloated context. Split, it's 4 bounded workers across 3 phases, with the research phase parallelized. Faster, cheaper, and far easier to debug.
Design checklist
- Strict I/O schema per worker. Structured input, structured output. No free-form parsing.
- Budget per worker. Step cap + cost cap. Otherwise one worker's runaway kills the whole task.
- Orchestrator does coordination, not work. If the orchestrator is calling search itself, the worker isn't doing its job.
- Synthesis step is its own prompt. Don't rely on the orchestrator's implicit synthesis; prompt it explicitly.
- Trace every worker separately. You need per-worker observability to debug.
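The budget item from the checklist can be sketched as a guard inside the worker loop (the caps and the per-step cost are made-up numbers):

```python
# Sketch of per-worker budgets: a step cap plus a cost cap enforced inside
# the worker loop, so one runaway worker can't stall or bankrupt the task.
class BudgetExceeded(Exception):
    pass

def run_worker(step_cap: int = 10, cost_cap: float = 0.50) -> str:
    spent = 0.0
    for step in range(step_cap):      # step cap: the loop bound itself
        spent += 0.08                 # stub per-step model cost
        if spent > cost_cap:          # cost cap: checked every step
            raise BudgetExceeded(f"cost cap hit at step {step}")
        # ... model call + tool call would go here ...
    return "done"

try:
    run_worker(step_cap=10, cost_cap=0.50)
except BudgetExceeded as e:
    print(e)  # cost cap hit at step 6
```

The orchestrator catches `BudgetExceeded` and decides whether to retry, reassign, or report a partial result.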
Pitfalls
- Over-delegation. Orchestrator spawns a worker for a task that's literally one tool call. Now you're paying three model calls (dispatch, worker, synthesis) where one would do.
- Inconsistent worker outputs. Worker A returns a list, worker B returns a paragraph. Orchestrator spends half its turns reconciling.
- No per-worker budget. One stuck worker costs $5 and stalls the task forever.
- Synthesis loss. Orchestrator summarizes too aggressively; important details disappear in the final answer.
- Circular delegation. Worker calls orchestrator which calls worker. Infinite recursion; impose a max depth.
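The standard fix for that last pitfall is a depth counter threaded through every delegation. A sketch, with `MAX_DEPTH` and `delegate` as illustrative names:

```python
# Sketch of a recursion guard: every delegation carries a depth counter
# and refuses to spawn past MAX_DEPTH, breaking orchestrator<->worker cycles.
MAX_DEPTH = 3

def delegate(task: str, depth: int = 0) -> str:
    if depth >= MAX_DEPTH:
        return f"refused: max depth {MAX_DEPTH} reached"
    # A pathological worker that delegates straight back to the orchestrator:
    return delegate(task, depth + 1)

print(delegate("loop"))  # refused: max depth 3 reached
```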
What to do with this
- Start from a single-agent ReAct. Profile traces: which tools always cluster together? Those clusters are candidate workers.
- Read agent routing for how to pick which worker gets a given subtask.
- Read agent handoffs for the payload passed between agents.