Customer Support Agent#

Multi-turn customer-support agent with tool use, memory persistence, and human-in-the-loop escalation.

What it does#

Handles a customer query end-to-end: looks up the order, checks shipping status, pulls up FAQ answers, and escalates to a human when the request is high-stakes or the agent has failed repeatedly.

Architecture#

Maps to course chapters:

  • Ch 3a (Agent Loop) — the ReAct loop dispatching tool calls
  • Ch 3b (Tools & MCP) — the ToolRegistry holding the 5 lookup tools
  • Ch 4 (State & Collaboration) — MemoryStore preserving conversation context across turns
  • Ch 7 (Routing, Compaction & Guardrails) — make_hitl_hook gating irreversible actions (refund > $100, account deletion)
User ──► Agent loop ──► lookup_order / check_shipping_status / list_faq_topics
                  ├── on 3 repeat errors: create_escalation_ticket + HITL gate
                  └── on irreversible action: HITL gate blocks until approval
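The loop in the diagram can be sketched as a small function. This is a hedged illustration, not the project's actual code: `decide`, `call_tool`, and `hitl_gate` are hypothetical callables standing in for the model step, the ToolRegistry dispatch, and the make_hitl_hook gate.

```python
def run_turn(decide, call_tool, hitl_gate, max_errors=3):
    """One conversation turn of a ReAct-style loop (illustrative sketch).

    decide() -> ("reply", text) or ("tool", name, args)
    call_tool(name, args) executes a registered tool (may raise)
    hitl_gate(name, args) -> False blocks an irreversible action
    """
    errors = 0
    while True:
        step = decide()
        if step[0] == "reply":
            return step[1]
        _, name, args = step
        if not hitl_gate(name, args):
            # Irreversible action: hold until a human approves.
            return "blocked: awaiting human approval"
        try:
            call_tool(name, args)  # result would be fed back to the model
            errors = 0
        except Exception as exc:
            errors += 1
            if errors >= max_errors:
                # Three consecutive failures: open a ticket and stop.
                call_tool("create_escalation_ticket",
                          {"summary": str(exc), "priority": "high"})
                return "escalated after repeated tool errors"
```

The two exit paths on the right of the diagram correspond to the `hitl_gate` check and the `max_errors` branch.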

Tools#

| Tool | Arguments | Returns |
|------|-----------|---------|
| `lookup_order(order_id)` | `str` | dict with status, items, total |
| `lookup_customer(email)` | `str` | dict with customer profile |
| `check_shipping_status(tracking_id)` | `str` | dict with carrier, status, ETA |
| `create_escalation_ticket(summary, priority)` | `str, str` | `ticket_id` |
| `list_faq_topics()` | none | list of FAQ topic titles |

All five tools are mocked against a fixture database, so the whole project runs with SWARM_MOCK=true and no external dependencies.
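A fixture-backed mock tool can be as simple as a dict lookup. The fixture data and error behavior below are assumptions for illustration; only the tool names and return shapes come from the table above.

```python
# Hypothetical fixture database; the real fixtures will differ.
FIXTURE_ORDERS = {
    "12345": {"status": "shipped", "items": ["USB hub"], "total": 89.99},
}

def lookup_order(order_id: str) -> dict:
    """Mocked lookup_order: resolves an order id against fixtures."""
    order = FIXTURE_ORDERS.get(order_id.lstrip("#"))
    if order is None:
        # Raising here is what lets the "3 consecutive tool errors"
        # escalation rule fire during testing.
        raise KeyError(f"unknown order {order_id}")
    return order

def list_faq_topics() -> list:
    """Mocked list_faq_topics: returns static topic titles."""
    return ["Returns & refunds", "Shipping times", "Account settings"]
```

Because the mocks raise on unknown ids, the error-escalation path can be exercised without any real API failures.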

Escalation rules (in escalation_rules.py)#

  • 3 consecutive tool errors → create ticket with priority=high, block agent
  • Customer frustration keywords ("frustrated", "angry", "speak to manager", etc.) → create ticket with priority=medium
  • Refund > $100 → HITL gate, require human approval
  • Account deletion → HITL gate, require human approval
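The four rules above reduce to one pure function. This is a sketch of what escalation_rules.py might contain; the function name, signature, and return convention are guesses.

```python
# Keywords listed in the rules above; the real set is likely longer.
FRUSTRATION_KEYWORDS = {"frustrated", "angry", "speak to manager"}

def needs_escalation(message, consecutive_errors=0, refund_amount=0.0,
                     account_deletion=False):
    """Return ("ticket", priority) or ("hitl_gate", None), else None.

    Rules are checked in the order they are documented.
    """
    if consecutive_errors >= 3:
        return ("ticket", "high")          # block the agent, page a human
    text = message.lower()
    if any(k in text for k in FRUSTRATION_KEYWORDS):
        return ("ticket", "medium")
    if refund_amount > 100 or account_deletion:
        return ("hitl_gate", None)         # require human approval
    return None
```

Keeping the rules in a pure function like this makes them trivial to unit-test against the sample conversations.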

Cost estimate#

~$0.01 per 3-turn conversation on Haiku (mock mode is free).

Run it#

SWARM_MOCK=true python -m projects.customer_support_agent.agent \
    "I want to return my order #12345"

Expected output: a realistic multi-turn transcript showing the agent looking up the order, checking the shipping status, and either processing the return or escalating.

Run the tests:

SWARM_MOCK=true .venv/bin/pytest projects/customer_support_agent/ -v

Expected: 18 tests passing (happy path, escalation paths, HITL gates, memory continuity).

Sample conversations#

sample_conversations/ has three JSONL fixtures showing expected conversation flows:

  • return_happy_path.jsonl — simple return, no escalation
  • refund_over_threshold.jsonl — refund request > $100, HITL gate fires
  • angry_customer.jsonl — frustration keywords detected, ticket created
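JSONL fixtures like these are one JSON object per line, so a loader is a few lines. The per-turn schema (`role`/`content` keys) is an assumption; check the fixture files for the actual keys.

```python
import json

def load_conversation(path):
    """Read a JSONL fixture into a list of turn dicts (illustrative)."""
    with open(path) as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

A loader like this is handy for replaying a fixture turn-by-turn against the agent in tests.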

Extending for production#

  • Replace mocked tools with real integrations (Shopify/Stripe/Zendesk APIs)
  • Wire the HITL gate to Slack: a message goes to #support-approvals with Approve/Reject buttons
  • Persist MemoryStore to Redis or Postgres instead of local disk so multiple daemons can serve one customer pool
  • Add the CostGovernor from Appendix F for per-customer budget caps
  • Add a feedback loop: after escalation, log whether the human's decision matched what the agent proposed — tune the escalation rules based on that signal
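A per-customer budget cap in the spirit of the CostGovernor mentioned above can be sketched as follows; the class name and interface here are hypothetical, not the Appendix F implementation.

```python
class BudgetCap:
    """Illustrative per-customer spend cap (not the real CostGovernor)."""

    def __init__(self, limit_usd):
        self.limit = limit_usd
        self.spent = {}  # customer_id -> running total

    def charge(self, customer_id, cost_usd):
        """Record a model-call cost; raise once the cap would be exceeded."""
        total = self.spent.get(customer_id, 0.0) + cost_usd
        if total > self.limit:
            raise RuntimeError(f"budget exceeded for {customer_id}")
        self.spent[customer_id] = total
        return self.limit - total  # remaining budget
```

Charging inside the agent loop (before each model call) turns a runaway conversation into a clean, catchable error instead of an open-ended bill.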