Crafting Agentic Swarms#
Build a production AI agent swarm from a single httpx request — no frameworks, no magic, no skipped steps.#
By the end of this course you will have written — from scratch — every component that powers the AI agents you use every day: the API call, the tool loop, the memory system, the evaluation harness, the safety layer, and the orchestrator that coordinates dozens of parallel workers.
Start with Chapter 1 → Download the book → Open a Colab notebook →
Download the book#
203 pages · 10 chapters · 15 appendices · MIT-licensed · generated from the same markdown you're reading here.
-
PDF
Print-ready. 203 pages, Georgia body, JetBrains Mono code, high-resolution diagrams. Best for reading on a laptop or tablet.
-
EPUB
Works on Kindle, Apple Books, Kobo, iPhone, Android. Reflowable — respects your font size and line spacing.
-
DOCX
Microsoft Word. Useful for annotating chapters, dropping excerpts into slides, or importing into a publishing pipeline.
-
AsciiDoc
Source format. Convert to anything — O'Reilly Atlas, Manning AsciiDoctor, Pandoc, custom toolchains.
Prefer a marketplace? Leanpub (pay-what-you-want), Amazon Kindle/Paperback, and Gumroad links are rolling out — tracked on the Downloads page.
Run the code#
Every major primitive in the book has a matching Colab notebook — 11 in total. They run in your browser, free, on Google's infrastructure. Default mode is SWARM_MOCK=true so the exercises cost $0 until you explicitly plug in an API key.
-
01 · Token Mechanics
Tokenize a sentence, watch the attention mask, see why prompt length drives latency and cost.
-
03a · The ReAct Loop
A 30-line agent loop, live. Observe how a wrong tool call in step 2 corrupts the whole trace.
-
03b · Tools & MCP
Register a tool, call it, wrap the same server as MCP, swap transports without changing the agent.
-
04 · Memory Visualisation
The three-layer memory system — working, episodic, semantic — with a matplotlib timeline of what got consolidated and why.
-
05 · Eval & Pareto
LLM-as-judge with position-bias mitigation, bootstrap confidence intervals, Pareto frontiers over cost vs accuracy.
-
07 · Cost Routing
A router that picks between Haiku, Sonnet, and Opus per query. Charts show where the learned baseline beats the heuristic.
Plus five more — full notebook gallery →
Ship a project this weekend#
Six portfolio projects, each ~500–1,500 lines, each with tests, each forkable. Pick the one closest to what you actually want to build.
-
Customer Support Agent
Tool-using agent over a support ticket DB. Escalation logic, refund policy guardrails, deterministic eval harness.
-
Code Review Bot
Reviews PRs against a style guide. Worker-pool parallelisation across files. Posts inline GitHub comments.
-
Data Analyst Agent
Natural-language → SQL → chart. Critic checks the SQL before it runs. Sandbox isolates the database.
-
Research Assistant
Multi-source retrieval with citation tracking. LLM-as-judge scores answer grounding.
-
Knowledge-Base RAG
Vector retrieval over documentation. Hybrid search (BM25 + embeddings) with reranker.
-
Multi-Agent Debate
Two agents argue, a judge scores. Shows how adversarial setups improve answer quality on ambiguous questions.
Why this course exists#
Most AI agent courses teach you to configure frameworks. This one teaches you to build the thing frameworks are wrapping.
There is a difference. When LangChain updates its API (it does, constantly), framework users update their imports. Engineers who understand the underlying loop debug the actual failure, fix it in ten minutes, and move on. This course is for the second type of developer.
The structure is borrowed from Nand2Tetris — the famous computer-science course where you build a computer from NAND gates up to a running operating system. Same idea, new domain: one HTTP call up to a full production multi-agent swarm.
The one rule: you build every primitive before you're allowed to abstract it away.
Who this is for#
-
Software engineers new to agents
You've used LLMs but when something breaks, you don't know why. After Chapter 3 you can write a production tool executor from scratch. After Chapter 5 you can evaluate an agent system without outsourcing judgment to a leaderboard.
-
Backend engineers going deep
You know distributed systems, async, observability. The course maps your existing mental models onto the agent layer. Jump to Chapter 8 for the production daemon + multi-tenant cost governance.
-
ML engineers transitioning
You understand models. Concurrency, state management, crash recovery, and cost routing are new territory. The eval harness in Chapter 5 will feel native; Chapters 6-8 fill the system-engineering gap.
How the course works (repeat 9 times)#
1. Read the chapter ← textbook section for this module
2. Study the reference code ← the complete implementation
3. Fill in the exercises ← your work, from scratch
4. Run the auto-grader ← bash scripts/grade_module.sh NN
5. Observe the failure ← run the deliberate-break demo
6. That failure motivates the next chapter → proceed
Do not skip the failure step. Each chapter ends on a broken system on purpose. That break is the emotional hook that makes the next chapter feel necessary rather than arbitrary.
What you'll ship#
- A production
call_agent()with multi-provider support, prompt caching, retry, streaming, and cost tracking - A ReAct agent loop with a sandboxed tool executor and a live MCP server you wrote yourself
- A three-layer memory system with scheduled consolidation and optional vector-DB backend
- A generator/critic pair that refines its own output
- An evaluation harness with LLM-as-judge, position-bias mitigation, statistical-significance testing, OpenTelemetry traces
- A parallel swarm with fork-join orchestration, KV cache inheritance, and a DAG executor for complex workflows
- A cost-aware router (heuristic + a learned baseline) and five compaction strategies
- A safety layer with a 29-event hook bus, Constitutional AI rules, human-in-the-loop gates, and prompt-injection defense
- A production daemon with crash recovery, a Voyager-style skill library, and a per-tenant cost governor
- A Claude Code plugin that bundles a skill, a hook, and an MCP server
- A complete run on SWE-bench Lite and GAIA Level 1
- 6 portfolio projects you can fork and ship this weekend: customer support, code review, data analyst, research assistant, knowledge-base RAG, multi-agent debate
What you'll need#
- Python 3.11+ — async/await,
tomllib, exception groups - An API key (optional) — every exercise runs offline in mock mode with
SWARM_MOCK=true. Anthropic key is enough when you're ready for the real thing; other providers exercised in Chapter 2. - Git — Chapter 6 uses worktrees for parallel agent isolation
- A terminal and a text editor. No notebook required — production agents run as processes, not notebooks.
First-time setup is in the repo README.
A note on "gold standard"#
Some people will ask whether this course is the Nand2Tetris of agent engineering.
It's not. Not yet. Nand2Tetris earned that label through 20+ years of use in hundreds of universities, thousands of errata cycles, and peer review by two generations of CS educators. This book is a promising first draft of something that could earn the label through the same path.
What you have here:
- 203 pages, 10 chapters, 15 appendices
- 12 modules with 50+ auto-graded exercises and reference solutions
- 11 Colab notebooks with live visualisations
- 6 runnable portfolio projects (~80 passing tests across them)
- A production reference package (
swarm/) with 100+ passing tests - The full supporting apparatus: instructor guide, debugging playbook, async primer, glossary, bibliography
Read it. Fork a project. Tell us what broke. Every errata issue, every forum question, every pull request is a step toward whatever this book actually becomes.
Get involved#
- GitHub Discussions — ask questions, share what you built, flag errata
- File an issue — typos, bugs, suggestions
- Fork the repo — it's MIT-licensed; use it at your company, teach it at your university, extend it however you want
What's next#
If you're ready to start: Chapter 1 — The Raw Call →
If you've finished the course: What's next →
If you're teaching it: Instructor Guide →
If you want all the formats with a bibtex entry: Downloads →