Customer support agent
Updated 2026-04-19
Customer support is the most-deployed agent pattern in production, and also the one most teams get wrong. The concept is simple: read the ticket, retrieve relevant KB content, answer, escalate if stuck. The execution has a dozen failure modes, most of them about being too eager to answer. The best support agents say "I'll get a human on this" more readily than they say "here's my guess."
The flow
Core tools
search_kb(query) - retrieval over your knowledge base. Quality here drives everything.
get_user_context(user_id) - plan, recent activity, tickets. Scoped by user.
create_ticket(summary, priority) - escalation path to humans.
send_reply(user_id, text) - the actual response.
mark_for_review(reason) - flag for later human audit even when auto-responding.
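The flow these tools imply can be sketched in a few lines. This is a minimal illustration, not a real implementation: the tools are stubbed, and `draft_reply` stands in for the LLM call.

```python
# Stub tools for illustration; real versions would hit your KB, CRM, and ticketing system.
def search_kb(query):
    return [{"id": "kb-101", "score": 0.82}]

def get_user_context(user_id):
    return {"plan": "pro", "recent_charges": 2}

def create_ticket(summary, priority):
    return "T-1"

def send_reply(user_id, text):
    pass

def mark_for_review(reason):
    pass

def draft_reply(ticket_text, articles, context):
    # Stands in for the LLM call that writes a reply from retrieved articles.
    return f"Based on {articles[0]['id']}: ..."

def handle_ticket(user_id, ticket_text):
    """Retrieve first, then answer or escalate -- never answer with no sources."""
    articles = search_kb(ticket_text)
    if not articles:
        create_ticket(summary=ticket_text, priority="normal")
        return "escalated"
    context = get_user_context(user_id)
    send_reply(user_id, draft_reply(ticket_text, articles, context))
    mark_for_review(reason="auto-response")  # keep a human audit trail
    return "answered"
```

Note the ordering: retrieval happens before any drafting, and the no-results branch escalates instead of guessing.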
Key design choices
Retrieval quality drives everything
A customer support agent is mostly a retrieval system with an LLM on top. If retrieval returns the wrong articles, no amount of prompting fixes the answer. Invest here first:
- Hybrid search (keyword + semantic) instead of either alone.
- A reranker on the top-20 results to surface the truly relevant ones.
- Eval on a labeled set of "this query should return that article."
Without retrieval quality, you have a confident hallucinator, not a support agent.
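One shape the hybrid-plus-reranker pipeline can take, with the scoring functions passed in as stand-ins (real systems would use BM25, an embedding model, and a cross-encoder reranker):

```python
def hybrid_search(query, docs, keyword_score, semantic_score, rerank_score, k=20):
    """Blend keyword and semantic scores, then rerank the top-k short list."""
    blended = sorted(
        docs,
        key=lambda d: 0.5 * keyword_score(query, d) + 0.5 * semantic_score(query, d),
        reverse=True,
    )[:k]
    # Rerank only the short list: rerankers are accurate but expensive.
    return sorted(blended, key=lambda d: rerank_score(query, d), reverse=True)
```

The 0.5/0.5 blend is arbitrary here; in practice the weights (and k) are exactly what your labeled eval set is for.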
Escalation: the hardest design decision
Every support-agent team tunes this. Escalate too aggressively and humans drown in tickets the agent could've handled. Too rarely and customers get garbage answers to questions the agent shouldn't have attempted.
Good escalation triggers:
- Agent confidence below threshold.
- No relevant KB article above a relevance score.
- Question mentions refund, cancellation, legal, or safety.
- User explicitly asks for a human.
- User sentiment is angry (don't make it worse with a bot reply).
- Question involves making account changes (don't auto-execute).
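The triggers above compose naturally into a single gate that runs before any reply is sent. The thresholds, keywords, and field names below are illustrative, not recommendations:

```python
ESCALATION_KEYWORDS = {"refund", "cancel", "cancellation", "legal", "lawyer", "unsafe"}

def should_escalate(ticket):
    """Return the reason string for the first trigger that fires, else None."""
    text = ticket["text"].lower()
    if ticket.get("confidence", 1.0) < 0.7:
        return "low confidence"
    if ticket.get("best_kb_score", 0.0) < 0.5:
        return "no relevant KB article"
    if any(word in text for word in ESCALATION_KEYWORDS):
        return "sensitive topic"
    if ticket.get("wants_human"):
        return "user asked for a human"
    if ticket.get("sentiment") == "angry":
        return "hostile sentiment"
    if ticket.get("requires_account_change"):
        return "account change requested"
    return None
```

Returning a reason rather than a bare boolean pays off later: the escalation subcategories in the Measurement section fall straight out of it.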
A worked example: a billing question
- Ticket: "Why did I get charged twice last month?"
- Classify: billing, neutral sentiment.
- Get user context: 2 charges in March, one was a proration, one was the regular monthly bill.
- Retrieve: KB article on proration after plan upgrades.
- Draft reply: explains the two charges with specific dates and amounts, links to the KB article, offers to refund the proration if the user prefers.
- Human approval required (policy: any reply mentioning refunds goes through a human).
- Agent submits the draft, human approves, reply sends.
The agent did 80% of the work; the human approved. Perfect use case for pre-action approval.
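The refund policy from this example is easy to enforce in code rather than in the prompt. A sketch, assuming a simple list-backed approval queue and an injected `send_reply`:

```python
def route_reply(user_id, draft, approval_queue, send_reply):
    """Auto-send safe drafts; hold any reply mentioning refunds for a human."""
    if "refund" in draft.lower():
        approval_queue.append({"user_id": user_id, "draft": draft})
        return "pending_approval"
    send_reply(user_id, draft)
    return "sent"
```

The point is that the gate lives outside the model: even if the agent is prompted badly, a refund-mentioning draft cannot reach the user without a human click.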
Personalization with caution
Pulling user context makes answers dramatically better. It also makes cross-user leaks possible. Rules:
- Every retrieval filters by the current user_id. No exceptions.
- Code-review any path that touches user data for missing filters.
- Don't let the agent share raw tool output with the user. Summarize first.
- Audit a sample of transcripts weekly looking for any cross-user leak patterns.
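The "every retrieval filters by user_id, no exceptions" rule is safest enforced at the data-access layer, where a forgotten filter simply cannot happen. A toy sketch with an in-memory row list standing in for the real store:

```python
class ScopedStore:
    """Data access that bakes the current user_id into every query,
    so a missing (or attacker-supplied) filter can't leak another user's rows."""

    def __init__(self, rows, current_user_id):
        self._rows = rows
        self._uid = current_user_id

    def query(self, **filters):
        # user_id is always overwritten with the authenticated user's id,
        # so even an explicit user_id filter cannot widen the scope.
        filters["user_id"] = self._uid
        return [r for r in self._rows
                if all(r.get(k) == v for k, v in filters.items())]
```

With this shape, the code-review question becomes "does anything bypass ScopedStore?" instead of auditing every query for a missing clause.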
Hard guardrails
- Never promise refunds, discounts, or policy exceptions without explicit authorization.
- Never acknowledge or deny account specifics the agent can't verify from the tools.
- Escalate any legal threat, abuse, or safety concern to a human immediately.
- Don't respond in anger-matching tone; escalate when sentiment is hostile.
Measurement
- Deflection rate: % of tickets resolved without human touch.
- CSAT on agent-resolved tickets: are users actually happy with the answer?
- Accuracy: did the answer use the right KB article?
- Escalation rate: with subcategories (correct escalation vs. "should've handled").
- Resolution time: faster isn't always better if accuracy drops.
Track accuracy and CSAT together. High deflection + low CSAT = agent is just shoving wrong answers at users.
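A toy computation over ticket records makes the pairing concrete (field names are assumed; CSAT here is a 1-5 score on agent-resolved tickets only):

```python
def support_metrics(tickets):
    """Deflection rate, CSAT on agent-resolved tickets, and escalation rate."""
    total = len(tickets)
    agent_resolved = [t for t in tickets if t["resolved_by"] == "agent"]
    escalated = [t for t in tickets if t["escalated"]]
    rated = [t for t in agent_resolved if t.get("csat") is not None]
    return {
        "deflection_rate": len(agent_resolved) / total,
        "csat_agent": sum(t["csat"] for t in rated) / len(rated) if rated else None,
        "escalation_rate": len(escalated) / total,
    }
```

Reading `deflection_rate` and `csat_agent` side by side is the whole point: a rising deflection rate with falling CSAT is the wrong-answers-at-scale failure mode.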
The "too eager to answer" trap
Models are trained to be helpful. That bias makes them over-answer. In support, confident wrong answers are worse than "I'll get a human." Prompt the agent to escalate when anything is uncertain, not to make a best guess.
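One way to encode that correction directly in the system prompt. The wording below is illustrative, not a tested prompt; the load-bearing move is making escalation an explicitly good outcome rather than a failure:

```python
SUPPORT_SYSTEM_PROMPT = """\
You are a customer support agent. Rules:
- Answer only when a retrieved KB article directly supports the answer.
- If you are uncertain, or no article clearly applies, escalate to a human
  instead of guessing. "I'll get a human on this" is a good outcome.
- Never promise refunds, discounts, or policy exceptions.
"""
```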
What to do with this
- Start with draft-first, not auto-send. Measure quality before expanding autonomy.
- Nail retrieval quality before worrying about the agent loop.
- Read human-in-the-loop for approval patterns.
- Read safety + guardrails for the rules every support agent needs.