Appendix: Async in 10 Minutes
If you know sync Python but have never written async def, this is the shortest path to reading and writing the agent code in this book. It is not the full story of asyncio; it is the minimum you need to stop being confused by the examples.
Why async exists
An LLM call spends most of its wall time waiting. You send a few kilobytes of JSON to api.anthropic.com, the server thinks for two seconds, and ships a few kilobytes back. During those two seconds your Python process is idle. CPU is pinned at zero. Network is idle between the send and the receive. You are, concretely, paying for two seconds of wall clock to do about two milliseconds of work.
If you need to call three LLMs and the calls do not depend on each other, sync Python takes three times two seconds. Async Python takes about two seconds total, because the runtime can suspend each call while it waits and let the others make progress. The wins compound as soon as you fan out to workers, tool servers, or a batch of evaluations. This is why every agent framework in this book is async.
async def vs def
async def declares a coroutine. Calling it does not run the body. It returns a coroutine object, which is a thing the event loop can schedule. Calling it and ignoring the result is almost always a bug.
async def fetch(url): ...
fetch("http://x") # returns a coroutine, does nothing
await fetch("http://x") # runs it, returns its result
Think of async def as "this function knows how to pause." A plain def cannot pause, so it cannot participate in concurrency without threads.
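A minimal sketch of that difference, assuming a hypothetical greet coroutine that is not part of the book's code. Calling it only builds a coroutine object; the body runs once the object is awaited.

```python
import asyncio

calls: list[str] = []  # records each time the body of greet actually runs


async def greet(name: str) -> str:
    # Hypothetical coroutine, used only for illustration.
    calls.append(name)
    return f"hello {name}"


async def main() -> tuple[bool, str]:
    calls.clear()
    coro = greet("ada")              # builds a coroutine object; body has not run
    ran_before_await = bool(calls)   # False: nothing was recorded yet
    result = await coro              # now the body runs
    return ran_before_await, result


if __name__ == "__main__":
    print(asyncio.run(main()))
```

If the await line were dropped, the coroutine object would be discarded without ever running, which is the "almost always a bug" case.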
await and where it is legal
await is how you run a coroutine and wait for its result. It is only legal inside an async def body. Putting await in a regular function is a SyntaxError. Putting it at the module top level is also usually illegal, except in an interactive REPL that supports it.
When the interpreter hits await foo(), it runs foo until foo itself awaits something slow (a socket, a timer). At that point the event loop takes over, runs other ready coroutines, and resumes you when foo's answer arrives.
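The hand-off at an await can be made visible with a small sketch. The worker name and the shared log list are illustrative assumptions; asyncio.sleep(0) stands in for "something slow" and simply yields to the event loop once.

```python
import asyncio


async def worker(name: str, log: list[str]) -> None:
    log.append(f"{name} start")
    await asyncio.sleep(0)       # pause here; the event loop may run others
    log.append(f"{name} end")


async def main() -> list[str]:
    log: list[str] = []
    # Run both workers on the same event loop and wait for both to finish.
    await asyncio.gather(worker("a", log), worker("b", log))
    return log


if __name__ == "__main__":
    print(asyncio.run(main()))
```

With the default event loop the log reads a start, b start, a end, b end: worker a pauses at its await, the loop runs worker b up to its own await, and then each is resumed in turn.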
asyncio.run is a one-way door
Your program is either sync or async. asyncio.run(coro) is the crossing. It starts an event loop, runs the coroutine to completion, and shuts the loop down. Call it exactly once, at the top of your program.
import asyncio

async def main():
    result = await some_coro()  # some_coro stands in for any coroutine function
    print(result)

if __name__ == "__main__":
    asyncio.run(main())
Do not call asyncio.run inside library code. Do not call it from inside another coroutine. It creates and destroys a loop each time, and nesting is not supported.
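A short sketch of what the nesting rule looks like in practice. The inner, wrong, and right names are hypothetical; the point is that calling asyncio.run while a loop is already running raises RuntimeError, and that a plain await is what you want inside a coroutine.

```python
import asyncio


async def inner() -> str:
    return "inner result"


async def wrong() -> str:
    coro = inner()
    try:
        return asyncio.run(coro)     # a loop is already running here
    except RuntimeError as exc:
        coro.close()                 # tidy up the coroutine that never ran
        return f"RuntimeError: {exc}"


async def right() -> str:
    return await inner()             # inside a coroutine, just await


async def main() -> tuple[str, str]:
    return await wrong(), await right()


if __name__ == "__main__":
    for line in asyncio.run(main()):  # the single crossing into async
        print(line)
```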
asyncio.gather runs coroutines concurrently
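asyncio.gather(*coros) schedules N coroutines at once and, when all are done, returns their results in the order you passed them in. The coroutines share one thread and interleave at their await points, so the waits overlap even though only one coroutine runs at any instant. This is the workhorse for fanning out LLM calls, tool calls, or worker agents.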
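If any one of them raises, gather raises that first exception to the caller; the other coroutines are not cancelled and keep running. Pass return_exceptions=True if you want the exceptions returned as values in the results list instead.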
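A minimal sketch of return_exceptions=True, using two hypothetical coroutines (ok and boom) in place of real LLM calls. Failures come back in the results list at the position of the call that raised, so you can sort successes from errors afterwards.

```python
import asyncio


async def ok(value: int) -> int:
    await asyncio.sleep(0.01)        # pretend network
    return value


async def boom() -> int:
    await asyncio.sleep(0.01)
    raise ValueError("model overloaded")   # illustrative failure


async def main() -> list:
    # Exceptions are returned as values instead of being raised.
    return await asyncio.gather(ok(1), boom(), ok(3), return_exceptions=True)


if __name__ == "__main__":
    results = asyncio.run(main())
    good = [r for r in results if not isinstance(r, Exception)]
    bad = [r for r in results if isinstance(r, Exception)]
    print("ok:", good, "failed:", bad)
```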
Common errors, decoded
RuntimeWarning: coroutine 'foo' was never awaited. You called an async function but forgot await. The coroutine was created, never run, and garbage collected. Add await.
SyntaxError: 'await' outside async function. You used await inside a plain def. Either change the outer function to async def, or call the coroutine from an async def caller.
RuntimeError: no running event loop. Something async-only was called from sync code with no loop active. You probably need asyncio.run(main()) at the entry point, and the async-only code needs to live inside main.
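The last error is easy to reproduce on purpose. This sketch uses asyncio.get_running_loop() as one example of an async-only call; the sync_caller and async_caller names are hypothetical.

```python
import asyncio


def sync_caller() -> str:
    # Plain sync code: no event loop is running on this thread.
    try:
        asyncio.get_running_loop()
    except RuntimeError as exc:
        return f"RuntimeError: {exc}"
    return "loop found"


async def async_caller() -> str:
    # Inside a coroutine driven by asyncio.run, a loop is always running.
    asyncio.get_running_loop()
    return "loop found"


if __name__ == "__main__":
    print(sync_caller())
    print(asyncio.run(async_caller()))
```

The same call fails in sync code and succeeds once it lives inside a coroutine started from asyncio.run, which is the fix the error message is pointing at.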
A runnable example
Two fake LLMs, run concurrently:
import asyncio, random, time

async def fake_llm(name: str) -> str:
    await asyncio.sleep(random.uniform(0.5, 1.0))  # pretend network
    return f"{name} replied"

async def main() -> None:
    t0 = time.perf_counter()
    a, b = await asyncio.gather(fake_llm("haiku"), fake_llm("sonnet"))
    print(a, b, f"({time.perf_counter() - t0:.2f}s)")

asyncio.run(main())
Total wall time is the slower of the two calls, not the sum. Swap the sleep for real HTTP and the shape of the code does not change.
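To see the "three times two seconds" arithmetic from the opening section, a sketch like the following times the same three fake calls twice: once awaited one after another, once with gather. The 0.1-second DELAY and the fake_call helper are assumptions chosen to keep the run short.

```python
import asyncio
import time

DELAY = 0.1  # seconds; stands in for one slow LLM call


async def fake_call(name: str) -> str:
    await asyncio.sleep(DELAY)       # pretend network
    return name


async def sequential(names: list[str]) -> float:
    t0 = time.perf_counter()
    for name in names:
        await fake_call(name)        # each call waits for the previous one
    return time.perf_counter() - t0


async def concurrent(names: list[str]) -> float:
    t0 = time.perf_counter()
    await asyncio.gather(*(fake_call(n) for n in names))  # waits overlap
    return time.perf_counter() - t0


async def main() -> tuple[float, float]:
    names = ["a", "b", "c"]
    return await sequential(names), await concurrent(names)


if __name__ == "__main__":
    seq, conc = asyncio.run(main())
    print(f"sequential: {seq:.2f}s  concurrent: {conc:.2f}s")
```

The sequential version should take roughly three delays and the gather version roughly one. Note that awaiting in a loop is still async code; it is only concurrent once several coroutines are in flight at the same time.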