Async I/O

When you look at servers, it feels like they're doing all sorts of smart stuff. But if you step back, most of the time they're just... moving bytes around. Read some data, write some data, wait for more. It's kind of boring, but also, that's the core of everything.

I keep coming back to IO because no matter what kind of system we try to build, it's always there underneath. Frameworks usually hide the details - which is convenient - but I kind of want to poke at it myself. Like, how exactly do those bytes get packed up and shipped across the wire?

Terminology

"IO" can mean a lot of different things - reading from disk, sending bytes over the network, or even talking to hardware through DMA. Since we're focused on distributed systems, the piece we care about most is network IO. Network devices are accessible via sockets. A socket is basically the OS handing you a handle and saying: "use this to send bytes out, and I'll poke you when new bytes show up." Every major operating system has this concept.

Now, sockets aren't all the same. They can work in blocking mode, where a thread waits until data shows up, or in non-blocking mode, where a call returns immediately if there's nothing to do yet and you tell the OS "let me know when this is ready." Blocking is easy to use but doesn't scale if you've got lots of connections - threads just end up sitting idle. Non-blocking avoids that by handing control back to you right away, so a single thread can juggle many connections and only touch the ones with actual work to do.
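To make the difference concrete, here's a minimal sketch of what non-blocking mode looks like from the caller's side. It uses a pipe as a stand-in for a socket (an assumption made only to keep the example self-contained), and the helper names setNonBlocking and readOnEmptyPipeWouldBlock are made up for illustration.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <unistd.h>

// Switches fd to non-blocking mode; returns false if fcntl fails.
bool setNonBlocking(int fd) {
    assert(fd >= 0);
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) return false;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Hypothetical demo: reads from an empty non-blocking pipe and reports
// whether the call failed right away with EAGAIN/EWOULDBLOCK instead of
// putting the thread to sleep.
bool readOnEmptyPipeWouldBlock() {
    int fds[2];
    if (pipe(fds) == -1) return false;

    bool ok = setNonBlocking(fds[0]);
    char buf[16];
    ssize_t n = read(fds[0], buf, sizeof(buf));
    // Check errno before any other call can overwrite it.
    bool wouldBlock = ok && n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK);

    close(fds[0]);
    close(fds[1]);
    return wouldBlock;
}
```

In blocking mode the same read would simply park the thread until someone writes to the pipe.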

On Linux, this is achieved via epoll, which gives us a way to monitor many sockets at once and only act on the ones that are ready. Libraries like Asio for C++ or Tokio for Rust use it internally on Linux.
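As a rough illustration of "monitor many, act on the ready ones", here's a small sketch built on epoll_create1, epoll_ctl and epoll_wait. The helper names readableFds and onlyWrittenPipeIsReady are hypothetical, and pipes again stand in for sockets.

```cpp
#include <cassert>
#include <sys/epoll.h>
#include <unistd.h>
#include <vector>

// Hypothetical helper: registers every fd for read readiness and returns
// the ones epoll reports as readable within timeoutMs milliseconds.
std::vector<int> readableFds(const std::vector<int>& fds, int timeoutMs) {
    assert(timeoutMs >= 0);  // this demo helper expects a finite timeout
    std::vector<int> ready;
    int epfd = epoll_create1(0);
    if (epfd == -1) return ready;

    for (int fd : fds) {
        epoll_event ev{};
        ev.events = EPOLLIN;  // interested in "readable"
        ev.data.fd = fd;      // handed back to us by epoll_wait
        epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
    }

    epoll_event events[16];
    int n = epoll_wait(epfd, events, 16, timeoutMs);
    for (int i = 0; i < n; ++i) ready.push_back(events[i].data.fd);

    close(epfd);
    return ready;
}

// Demo: two pipes are watched, but only the second one has data pending.
bool onlyWrittenPipeIsReady() {
    int a[2], b[2];
    if (pipe(a) == -1) return false;
    if (pipe(b) == -1) { close(a[0]); close(a[1]); return false; }

    bool wrote = write(b[1], "x", 1) == 1;
    std::vector<int> ready = readableFds({a[0], b[0]}, 100);
    bool ok = wrote && ready.size() == 1 && ready[0] == b[0];

    for (int fd : {a[0], a[1], b[0], b[1]}) close(fd);
    return ok;
}
```

The idle pipe costs us nothing here: epoll only reports the descriptor that actually has bytes waiting.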

Events, events are everywhere

At the bottom of it, our toy distributed system would have to talk to other machines via some sort of RPC. Sure, we could just grab an off-the-shelf RPC library and call it a day... but where's the fun in that, right? In this post I want to poke right at the heart of efficient IO.

Interface

For this task, we won't be creating a fully functional event loop. Instead, we'll just focus on the notification mechanism. Let's call it EventWatcher. It lets you watch a file descriptor and calls an IWatchCallback whenever that file descriptor is ready for reading or writing.

class EventWatcher {
public:
    void watch(int fd, WatchFlag flag, IWatchCallbackPtr cb);
    void unwatch(int fd, WatchFlag flag);
    void unwatchAll();
};

The callback is called once the fd (e.g. a socket) is ready for I/O.

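class IWatchCallback {
public:
    // Virtual destructor so implementations can be destroyed safely through the interface.
    virtual ~IWatchCallback() = default;
    // Called with the fd that became ready.
    virtual void run(int fd) = 0;
};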

In the /tasks/async-io directory you'll find a half-baked implementation of EventWatcher. Feel free to tear it apart and change things however you like - but the main missing pieces are already called out with blocks like this:

// ==== YOUR CODE: @0000 ====
...
// ==== END YOUR CODE ====

Flexing, multiplexing

The man page for epoll is great. It tells you what calls to use, all the ways you can shoot yourself in the foot, and even has code samples for the loop. That's basically what EventWatcher::waitLoop is supposed to be doing.

Callbacks

So what happens after epoll_wait hands us a (possibly empty) batch of ready file descriptors? We grab each one, find its callback, and run it. But where? Inside the event loop thread itself, or offload to some thread pool?

For pure IO, throwing it to another core usually doesn't help and often just slows things down with extra context switching. But once we step into request handling, it's a different story. Besides just reading and writing bytes, a typical server also parses the request, maybe talks to a database, maybe does some business logic, maybe kicks off more IO before replying. That's the kind of work where offloading to a pool can actually pay off.
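The simpler of the two options, running callbacks inline on the loop thread, might look roughly like this. It's a sketch, not the reference solution for the task: the Callback alias is a stand-in for IWatchCallback::run, and dispatchOnce and demoDispatch are made-up names.

```cpp
#include <cassert>
#include <functional>
#include <sys/epoll.h>
#include <unistd.h>
#include <unordered_map>

using Callback = std::function<void(int)>;  // stand-in for IWatchCallback::run

// Waits once, then runs the callback of every ready fd on the calling
// thread. Returns how many callbacks were run.
int dispatchOnce(int epfd, const std::unordered_map<int, Callback>& callbacks,
                 int timeoutMs) {
    assert(epfd >= 0);
    epoll_event events[16];
    int n = epoll_wait(epfd, events, 16, timeoutMs);
    int dispatched = 0;
    for (int i = 0; i < n; ++i) {
        int fd = events[i].data.fd;
        auto it = callbacks.find(fd);
        if (it == callbacks.end()) continue;  // no callback registered for this fd
        it->second(fd);
        ++dispatched;
    }
    return dispatched;
}

// Demo: a callback that reads one byte from a pipe once it becomes readable.
char demoDispatch() {
    int p[2];
    if (pipe(p) == -1) return '\0';
    int epfd = epoll_create1(0);
    if (epfd == -1) { close(p[0]); close(p[1]); return '\0'; }

    epoll_event ev{};
    ev.events = EPOLLIN;
    ev.data.fd = p[0];
    epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev);

    char received = '\0';
    std::unordered_map<int, Callback> callbacks;
    callbacks[p[0]] = [&received](int fd) {
        if (read(fd, &received, 1) != 1) received = '\0';
    };

    if (write(p[1], "k", 1) == 1) dispatchOnce(epfd, callbacks, 100);

    close(epfd);
    close(p[0]);
    close(p[1]);
    return received;
}
```

Offloading would replace the direct it->second(fd) call with a hand-off to a pool, at the price of synchronizing whatever state the callbacks share.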

Handle watch / unwatch

One gotcha: epoll_wait can block forever if no events come in. If the loop only applies watch / unwatch changes between waits, newly added fds won't be picked up until the current wait cycle ends, which, well, may not happen. In the past I used timeouts for that kind of stuff, but recently I discovered the self-pipe trick.
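For the curious, here's a bare-bones sketch of the self-pipe trick: the loop watches the read end of its own pipe, and any thread that wants the loop's attention writes a byte to the other end. The function name wakeBlockedWait is hypothetical, and the "apply pending changes" step is left as a comment because it depends on how EventWatcher stores them.

```cpp
#include <atomic>
#include <cassert>
#include <sys/epoll.h>
#include <thread>
#include <unistd.h>

// Returns true if a thread blocked in epoll_wait with no timeout was woken
// up by a byte written to the wake-up pipe.
bool wakeBlockedWait() {
    int wakeup[2];
    if (pipe(wakeup) == -1) return false;
    int epfd = epoll_create1(0);
    assert(epfd >= 0);

    // The loop watches the read end of its own pipe like any other fd.
    epoll_event ev{};
    ev.events = EPOLLIN;
    ev.data.fd = wakeup[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, wakeup[0], &ev) == -1) {
        close(epfd); close(wakeup[0]); close(wakeup[1]);
        return false;
    }

    std::atomic<bool> woken{false};
    std::thread loop([&] {
        epoll_event events[4];
        int n = epoll_wait(epfd, events, 4, -1);  // -1: block indefinitely
        for (int i = 0; i < n; ++i) {
            if (events[i].data.fd != wakeup[0]) continue;
            char buf[64];
            // Drain the pipe so the next wait blocks again.
            if (read(wakeup[0], buf, sizeof(buf)) > 0) woken = true;
            // A real loop would apply pending watch / unwatch changes here.
        }
    });

    // Another thread (here: the caller) pokes the loop by writing one byte.
    char byte = 1;
    bool wrote = write(wakeup[1], &byte, 1) == 1;
    close(wakeup[1]);  // closing the write end also wakes the reader, so join() cannot hang
    loop.join();

    close(wakeup[0]);
    close(epfd);
    return wrote && woken;
}
```

Compared to a timeout, the loop stays fully asleep when nothing happens and still reacts immediately when someone calls watch or unwatch.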

🧠 Task

Your task is to implement the waitLoop method of the EventWatcher class using epoll_wait.

Testing

Tests are located in event_watcher_test.cpp.