Async Coroutines with co_await
The previous lessons built the coroutine machinery — promise types, the coroutine handle, generators with co_yield. This lesson focuses on the other half of the picture: co_await, which suspends a coroutine while waiting for some asynchronous result, then resumes it when the result is ready. The result is code that reads like straight-line sequential logic but executes asynchronously — no callbacks, no state machines, no threads required for concurrency.
The problem: I/O blocks or callbacks
Network I/O, file I/O, and timer waits are slow relative to CPU work — often microseconds to milliseconds when CPU instructions take nanoseconds. The classic approaches each have problems. Blocking I/O is the simplest to write but wastes a thread per connection. Callback-based async I/O scales but splits one logical operation across dozens of small functions, making error handling and control flow nearly unreadable. Thread-per-task avoids callbacks but burns memory (each thread needs a stack) and triggers costly context switches at high connection counts.
Coroutines with co_await offer a fourth path. A coroutine suspends at an await point, releases the thread back to the event loop, and resumes on the same or a different thread when the I/O completes. The code reads sequentially, and the coroutine's state lives in a (typically heap-allocated) coroutine frame instead of on a dedicated thread stack, so millions of suspended coroutines cost far less memory than millions of threads.
Blocking I/O
// Simple, but burns one
// thread per connection
Socket s = accept();
auto request = s.read();
auto response = handle(request);
s.write(response);
1 thread per connection. Doesn't scale.
Callback-based
acceptor.async_accept(
    [](Socket s) {
        s.async_read(
            [=](auto req) {
                s.async_write(
                    handle(req), []{});
            });
    });
Scales, but error handling is brutal.
Coroutines
// Reads like blocking,
// executes like callbacks
Socket s = co_await accept();
auto req = co_await s.read();
co_await s.write(handle(req));
Scales + readable sequential code.
How co_await suspends and resumes
When the compiler encounters co_await expr, it calls the three awaitable interface functions on expr in sequence. First, await_ready() — if it returns true, the result is already available and the coroutine does not suspend at all. If it returns false, await_suspend(handle) is called with the handle to the current coroutine. This is where the awaitable stores the continuation — when the I/O completes, it calls handle.resume() to wake the coroutine up. Finally, await_resume() provides the actual value that the entire co_await expression evaluates to.
// What the compiler generates for: auto value = co_await some_task;
//
// Step 1: check if result is already ready
if (!some_task.await_ready()) {
    // Step 2: save the continuation and suspend;
    // the awaitable schedules the resume when done
    some_task.await_suspend(std::coroutine_handle<>::from_address(frame));
    // --- coroutine is now suspended; thread is free to do other work ---
    // <resume point>
}
// Step 3: retrieve the value (called after resume)
auto value = some_task.await_resume();
The key insight is that await_suspend receives the coroutine handle and decides what to do with it. For a simple in-memory task this might be an immediate handle.resume(). For a real async I/O operation, it would store the handle in a completion callback registered with the OS (via io_uring, epoll, IOCP, etc.) so the coroutine resumes exactly when the data arrives.
Chaining async coroutines
Coroutines compose naturally: a coroutine can co_await another coroutine that itself uses co_await, building a chain of suspended computations. Each suspension yields the thread back to whatever is driving the execution — the polling loop, the event loop, or the runtime scheduler. This is the foundation of structured async code: individual async operations are small, testable coroutines that compose into larger workflows with no callback nesting.
#include <coroutine>
#include <print>

// Forward declaration of our task type (built in the promise-type lesson)
template <typename T = void> struct task;

// Leaf coroutine: produces a value asynchronously
task<int> get_answer()
{
    // In a real program, this would co_await a timer or I/O
    // Here we co_return directly for illustration
    co_return 42;
}

// Middle coroutine: awaits another coroutine's result, then uses it
task<> print_answer()
{
    int value = co_await get_answer(); // suspends here until get_answer() is done
    std::println("the answer is {}", value);
}

// Driver: executes a coroutine by polling it from non-coroutine context
template <typename T>
void execute(T&& t)
{
    while (!t.is_ready())
        t.resume();
}

int main()
{
    // Pattern 1: get the value from a coroutine directly
    auto t = get_answer();
    execute(t);
    std::println("answer = {}", t.value()); // 42

    // Pattern 2: chain coroutines (print_answer drives get_answer internally)
    execute(print_answer());
}
The execute() function is a synchronous driver: it resumes the coroutine repeatedly until it is done. This works for coroutines that suspend and resume synchronously (no OS I/O involved). The important thing to understand is that main() cannot itself be a coroutine: it is one of the functions explicitly excluded by the C++20 standard. So any async chain must be driven from non-coroutine code at its root.
The canonical async pattern: a network server
The textbook demonstration of async coroutines is a network accept loop. Without coroutines, accepting connections, reading a request, and writing a response require either blocking (one thread per client) or a chain of callbacks. With coroutines, the logic collapses to a loop that reads like synchronous code but suspends the coroutine at each I/O boundary, freeing the thread to handle other connections while this one waits for data.
// Pseudo-code: the coroutine reads like blocking I/O
// but each co_await yields the thread back to the event loop
task<> handle_connection(Socket socket)
{
    while (true) {
        auto request = co_await socket.read();   // suspend until data arrives
        auto response = process(request);
        co_await socket.write(response);         // suspend until write completes
        if (request.is_close()) break;
    }
}

task<> accept_loop(Acceptor acceptor)
{
    while (true) {
        Socket socket = co_await acceptor.accept(); // suspend until connection arrives
        // Spawn handle_connection without co_await (fire and forget)
        // (in cppcoro: schedule_on(pool, handle_connection(socket)))
        auto conn = handle_connection(std::move(socket));
        conn.detach(); // or schedule on a thread pool
    }
}

int main()
{
    io_context ctx; // event loop (Asio, io_uring, etc.)
    Acceptor acceptor { ctx, 443 };
    ctx.run(accept_loop(acceptor)); // drives everything
}
The event loop (io_context in this sketch) is what makes the pattern truly non-blocking. When a coroutine hits co_await acceptor.accept(), it registers a completion callback with the OS and suspends. The event loop processes other ready events (timers, other connections, in-memory tasks) until the OS signals that a new connection has arrived, at which point the event loop resumes the suspended coroutine.
Exception propagation in async coroutines
When an exception propagates out of a coroutine body without being caught, the compiler calls promise.unhandled_exception(). In the minimal task<T> we built in the promise-type lesson, unhandled_exception() calls std::terminate(), which is fine for illustration but fatal in production. A robust implementation stores the exception with std::current_exception() and rethrows it from await_resume() when the caller retrieves the result. This means exceptions propagate across co_await boundaries exactly as they do in synchronous code: the awaiting coroutine sees the exception as if the co_await expression itself threw.
// Robust promise stores the exception for rethrow
struct promise_base {
    std::exception_ptr exception_; // stores an uncaught exception
    void unhandled_exception() noexcept
    {
        exception_ = std::current_exception(); // capture, don't terminate
    }
};

// In await_resume: rethrow so the caller sees it
decltype(auto) await_resume()
{
    if (!handle_) throw std::runtime_error{"broken coroutine"};
    if (auto& ep = handle_.promise().exception_; ep)
        std::rethrow_exception(ep); // propagates to the awaiting co_await
    return handle_.promise().get_value();
}

// Usage: exception propagates naturally across co_await
task<int> might_throw()
{
    throw std::runtime_error{"network error"};
    co_return 42;
}

task<> caller()
{
    try {
        int v = co_await might_throw(); // exception rethrown here
        std::println("{}", v);
    } catch (const std::exception& e) {
        std::println("caught: {}", e.what()); // "caught: network error"
    }
}
Bridging to main(): the sync_wait pattern
Since main() cannot be a coroutine, you need a synchronous bridge that blocks the current thread until a coroutine chain completes. The minimal version is the execute() polling loop shown earlier. Production code uses a library primitive — cppcoro calls it sync_wait — that integrates with the scheduler and avoids busy-polling. The pattern looks like:
// Minimal polling driver (no scheduler, busy-waits)
template <typename T>
auto sync_wait_poll(T&& t) // named t so it does not shadow the task template
{
    while (!t.is_ready())
        t.resume(); // busy-polls until done
    return t.value();
}

// cppcoro's production version (blocks without busy-waiting)
// #include <cppcoro/sync_wait.hpp>
// auto result = cppcoro::sync_wait(my_coroutine());

int main()
{
    // With the minimal version:
    auto result = sync_wait_poll(get_answer()); // blocks until 42 is ready
    std::println("answer: {}", result);

    // With cppcoro (integrates with the scheduler, no busy-wait):
    // cppcoro::sync_wait(print_answer());
}
When async coroutines pay off
Scenario | Recommended approach
High-concurrency network servers (many connections, I/O-bound) | Coroutines + io_uring/IOCP: millions of suspended coroutines vs millions of threads
Composing multiple async operations sequentially | co_await chains: reads sequentially, executes asynchronously
Running a single async computation to completion from synchronous code | sync_wait() or a polling execute() loop; use a library for production
CPU-bound parallelism | std::async / thread pools are still the right tool; coroutines add no parallelism
Simple scripts or tools with one or two async operations | Blocking I/O is fine here; coroutines only pay off at scale