How to Read This Book

This book does not work in isolation. It is one part of a three-part course:

The Book (book/) — you are here
The Repo (modules/) — where you implement things
The YouTube Series (episodes/) — watch before or after reading each chapter

The System You Are Building

This book builds an agentic AI system in nine layers. Each chapter adds one layer on top of the last:

Layer 9  Capstone                          ← full swarm, benchmarked on SWE-bench and GAIA
Layer 8  Production, Skills & Plugins      ← daemon, crash recovery, skill library, .plugin bundle
Layer 7  Routing, Compaction & Guardrails  ← tier routing, 5 compaction strategies, hooks, HITL
Layer 6  Orchestrator-Workers              ← fork-join, KV cache inheritance, mixture of agents
Layer 5  Evaluation & Observability        ← LLM-as-judge, position bias, OpenTelemetry
Layer 4  State & Collaboration             ← 3-layer memory, autoDream, generator/critic
Layer 3  Agent Loop, Tools & MCP           ← ReAct loop, tool registry, sandbox, MCP server
Layer 2  Providers & Prompt Caching        ← multi-provider chokepoint, cache_control
Layer 1  Raw Call                          ← single HTTP call to an LLM API

Every layer depends on the one below. By the end you will have built every piece by hand — no frameworks, no black boxes.

Fast Track (for experienced readers)

If you already work with LLMs and want the core build in three sessions:

Session 1: Chapters 1–3 (raw call → providers → agent loop + tools + MCP) — you get a working agent
Session 2: Chapter 4 (state and collaboration — memory + two agents) — your agent can remember and self-critique
Session 3: Chapters 6 + 9 (orchestrator-workers → capstone) — you get a parallel swarm running on SWE-bench

The remaining chapters (5, 7, 8) add evaluation, routing-and-guardrails, and production polish. Come back when you need them.

The Expected Workflow

For each chapter, in order:

Read the chapter — start with book/chNN_*.md or the matching modules/NN_*/lesson.md (same content). Understand the primitive before touching code.
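Watch the episode — the YouTube episode is a live implementation session. Watch it to see how an experienced engineer approaches the chapter's primitive. If you can, attempt the implementation yourself before watching, so the episode confirms or corrects your own approach.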
Do the exercises — open modules/NN_*/exercises/ and work through them in order without peeking at code/ or solutions/.
Grade yourself — bash scripts/grade_module.sh NN. If tests fail, the output tells you what's wrong. Fix and re-run. Do not proceed until all tests pass.
Observe the failure — the last section of each chapter describes a specific failure mode. Run modules/NN_*/what_goes_wrong.py and watch it break. This is the motivation for the next chapter.
Read the next chapter — you are ready.
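
The per-chapter paths and the grader command from the steps above can be gathered in one place. The sketch below is only an illustration: the function name `chapter_plan` and its dictionary keys are hypothetical, and it assumes that "NN" stands for the chapter number zero-padded to two digits, which the text does not state explicitly. It builds strings and runs nothing.

```python
def chapter_plan(chapter: int) -> dict:
    """Collect the paths and the grader command for one chapter.

    Assumption: "NN" means the chapter number zero-padded to two digits.
    The function name and dictionary keys are hypothetical.
    """
    if not 1 <= chapter <= 9:
        raise ValueError("chapter must be between 1 and 9")
    nn = f"{chapter:02d}"
    return {
        # Step 1: either file works, since both carry the same content.
        "read": [f"book/ch{nn}_*.md", f"modules/{nn}_*/lesson.md"],
        # Step 3: work through these in order.
        "exercises": f"modules/{nn}_*/exercises/",
        # Step 4: argument list for the grader script named in the text.
        "grade": ["bash", "scripts/grade_module.sh", nn],
        # Step 5: the failure demo that motivates the next chapter.
        "observe_failure": f"modules/{nn}_*/what_goes_wrong.py",
    }


if __name__ == "__main__":
    plan = chapter_plan(3)
    print(" ".join(plan["grade"]))  # prints: bash scripts/grade_module.sh 03
```

If the grader turns out to expect an unpadded number, only the `nn` line needs to change.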

Reading the SOTA Guide

README_SOTA.md is a production reference, not a tutorial. It is most useful after you've completed a chapter and want to understand why the production implementation makes the choices it does. Read it reactively, not proactively.

Appendices

The appendices are reference material, not required reading:

Appendix A — Frameworks survey: when deciding between building and buying
Appendix B — Benchmark methodology: read before Chapter 9
Appendix D — Glossary: look things up as needed
Appendix E — Bibliography: follow citations when you want depth

On Skipping Chapters

Don't. The failure-to-motivation chain is load-bearing. If you skip Chapter 4 (state and collaboration), Chapter 5 (evaluation) won't feel necessary — you won't have watched two agents argue about which output is "better" without any way to measure it. The pedagogy depends on experiencing each failure before fixing it.

If you already know a chapter's content cold, at least skim the chapter and run the grader to confirm your existing implementation matches the spec.