Customer Support Agent#

Multi-turn customer-support agent with tool use, memory persistence, and human-in-the-loop escalation.

What it does#

Handles a customer query end-to-end: looks up the order, checks shipping status, pulls up FAQ answers, and escalates to a human when the request is high-stakes or the agent has failed repeatedly.

Architecture#

Maps to course chapters:

  • Ch 3a (Agent Loop) — the ReAct loop dispatching tool calls
  • Ch 3b (Tools & MCP) — the ToolRegistry holding the 5 lookup tools
  • Ch 4 (State & Collaboration) — MemoryStore preserving conversation context across turns
  • Ch 7 (Routing, Compaction & Guardrails) — make_hitl_hook gating irreversible actions (refund > $100, account deletion)
User ──► Agent loop ──► lookup_order / check_shipping_status / list_faq_topics
                  ├── on 3 repeat errors: create_escalation_ticket + HITL gate
                  └── on irreversible action: HITL gate blocks until approval
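The loop in the diagram can be sketched as a small function. This is a hedged illustration, not the project's actual code: `decide`, `call_tool`, and `hitl_gate` are hypothetical callables standing in for the model step, the ToolRegistry dispatch, and the make_hitl_hook gate.

```python
def run_turn(decide, call_tool, hitl_gate, max_errors=3):
    """One conversation turn of a ReAct-style loop (illustrative sketch).

    decide() -> ("reply", text) or ("tool", name, args)
    call_tool(name, args) executes a registered tool (may raise)
    hitl_gate(name, args) -> False blocks an irreversible action
    """
    errors = 0
    while True:
        step = decide()
        if step[0] == "reply":
            return step[1]
        _, name, args = step
        if not hitl_gate(name, args):
            # Irreversible action: hold until a human approves.
            return "blocked: awaiting human approval"
        try:
            call_tool(name, args)  # result would be fed back to the model
            errors = 0
        except Exception as exc:
            errors += 1
            if errors >= max_errors:
                # Three consecutive failures: open a ticket and stop.
                call_tool("create_escalation_ticket",
                          {"summary": str(exc), "priority": "high"})
                return "escalated after repeated tool errors"
```

The two exit paths on the right of the diagram correspond to the `hitl_gate` check and the `max_errors` branch.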

Tools#

| Tool | Arguments | Returns |
|------|-----------|---------|
| `lookup_order(order_id)` | `str` | dict with status, items, total |
| `lookup_customer(email)` | `str` | dict with customer profile |
| `check_shipping_status(tracking_id)` | `str` | dict with carrier, status, ETA |
| `create_escalation_ticket(summary, priority)` | `str, str` | `ticket_id` |
| `list_faq_topics()` | none | list of FAQ topic titles |

All five tools are mocked against a fixture database, so the whole project runs with SWARM_MOCK=true and no external dependencies.
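A fixture-backed mock tool can be as simple as a dict lookup. The fixture data and error behavior below are assumptions for illustration; only the tool names and return shapes come from the table above.

```python
# Hypothetical fixture database; the real fixtures will differ.
FIXTURE_ORDERS = {
    "12345": {"status": "shipped", "items": ["USB hub"], "total": 89.99},
}

def lookup_order(order_id: str) -> dict:
    """Mocked lookup_order: resolves an order id against fixtures."""
    order = FIXTURE_ORDERS.get(order_id.lstrip("#"))
    if order is None:
        # Raising here is what lets the "3 consecutive tool errors"
        # escalation rule fire during testing.
        raise KeyError(f"unknown order {order_id}")
    return order

def list_faq_topics() -> list:
    """Mocked list_faq_topics: returns static topic titles."""
    return ["Returns & refunds", "Shipping times", "Account settings"]
```

Because the mocks raise on unknown ids, the error-escalation path can be exercised without any real API failures.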

Escalation rules (in escalation_rules.py)#

  • 3 consecutive tool errors → create ticket with priority=high, block agent
  • Customer frustration keywords ("frustrated", "angry", "speak to manager", etc.) → create ticket with priority=medium
  • Refund > $100 → HITL gate, require human approval
  • Account deletion → HITL gate, require human approval
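The four rules above reduce to one pure function. This is a sketch of what escalation_rules.py might contain; the function name, signature, and return convention are guesses.

```python
# Keywords listed in the rules above; the real set is likely longer.
FRUSTRATION_KEYWORDS = {"frustrated", "angry", "speak to manager"}

def needs_escalation(message, consecutive_errors=0, refund_amount=0.0,
                     account_deletion=False):
    """Return ("ticket", priority) or ("hitl_gate", None), else None.

    Rules are checked in the order they are documented.
    """
    if consecutive_errors >= 3:
        return ("ticket", "high")          # block the agent, page a human
    text = message.lower()
    if any(k in text for k in FRUSTRATION_KEYWORDS):
        return ("ticket", "medium")
    if refund_amount > 100 or account_deletion:
        return ("hitl_gate", None)         # require human approval
    return None
```

Keeping the rules in a pure function like this makes them trivial to unit-test against the sample conversations.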

Cost estimate#

~$0.01 per 3-turn conversation on Haiku (mock mode is free).

Run it#

SWARM_MOCK=true python -m projects.customer_support_agent.agent \
    "I want to return my order #12345"

Expected output: a realistic multi-turn transcript showing the agent looking up the order, checking the shipping status, and either processing the return or escalating.

Run the tests:

SWARM_MOCK=true .venv/bin/pytest projects/customer_support_agent/ -v

Expected: 18 tests passing (happy path, escalation paths, HITL gates, memory continuity).

Sample conversations#

sample_conversations/ has three JSONL fixtures showing expected conversation flows:

  • return_happy_path.jsonl — simple return, no escalation
  • refund_over_threshold.jsonl — refund request > $100, HITL gate fires
  • angry_customer.jsonl — frustration keywords detected, ticket created
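JSONL fixtures like these are one JSON object per line, so a loader is a few lines. The per-turn schema (`role`/`content` keys) is an assumption; check the fixture files for the actual keys.

```python
import json

def load_conversation(path):
    """Read a JSONL fixture into a list of turn dicts (illustrative)."""
    with open(path) as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

A loader like this is handy for replaying a fixture turn-by-turn against the agent in tests.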

Extending for production#

  • Replace mocked tools with real integrations (Shopify/Stripe/Zendesk APIs)
  • Wire the HITL gate to Slack: a message goes to #support-approvals with Approve/Reject buttons
  • Persist MemoryStore to Redis or Postgres instead of local disk so multiple daemons can serve one customer pool
  • Add the CostGovernor from Appendix F for per-customer budget caps
  • Add a feedback loop: after escalation, log whether the human's decision matched what the agent proposed — tune the escalation rules based on that signal
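A per-customer budget cap in the spirit of the CostGovernor mentioned above can be sketched as follows; the class name and interface here are hypothetical, not the Appendix F implementation.

```python
class BudgetCap:
    """Illustrative per-customer spend cap (not the real CostGovernor)."""

    def __init__(self, limit_usd):
        self.limit = limit_usd
        self.spent = {}  # customer_id -> running total

    def charge(self, customer_id, cost_usd):
        """Record a model-call cost; raise once the cap would be exceeded."""
        total = self.spent.get(customer_id, 0.0) + cost_usd
        if total > self.limit:
            raise RuntimeError(f"budget exceeded for {customer_id}")
        self.spent[customer_id] = total
        return self.limit - total  # remaining budget
```

Charging inside the agent loop (before each model call) turns a runaway conversation into a clean, catchable error instead of an open-ended bill.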