Monday, April 05, 2010

Timeouts by analogy

Most networking-enabled applications have to deal with timeouts. Read or write operations may continue indefinitely, and programs need a way to determine when to tear down connections, resend requests, or take whatever other measures are necessary.

Asio includes the deadline_timer class for managing timeouts. This class aims to provide a minimal interface for scheduling events. Of course, minimalism gives little in the way of design guidance, so some users struggle to find an elegant way to incorporate timers and timeouts into their programs.

From the minimalist perspective of Asio, there's no one true right way to do it. (Perhaps there's no better proof of that than my design preferences having changed over the years.) Yet that answer doesn't get programs written, so in this post I will try to present a simple mental model for managing timers.

Parking meters

High-traffic, commercial areas near where I live have limited on-street parking. The street parking that is available is metered. It's the usual drill:


  • Park your vehicle.

  • Feed some coins into the parking meter (or, as is more likely these days, swipe your credit card or send an SMS).

  • Go do whatever you came to do.

  • Make sure you return to your vehicle before the meter expires.

If you don't get back in time, you'd better hope your vehicle hasn't had a visit from the parking inspector. A visit means a ticket under the wipers and a nasty fine due.

Parking meters are a good analogy for reasoning about timeouts because it's easy to identify the two actors:


  • The driver of the vehicle.

  • The parking inspector.

The driver performs the following steps:


  1. Feeds the meter.

  2. Leaves the vehicle to run some errands.

  3. Returns to the vehicle.

  4. If no ticket has been issued, repeats from step 1.

  5. If a ticket has been issued, goes home.

The parking inspector's job is simple:


  1. Checks whether the meter has expired.

  2. If the meter has expired, writes up a ticket.

  3. If the meter has not expired, notes how much time is remaining.

  4. Goes off for a walk until the remaining time has elapsed.
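Before bringing Asio into the picture, the two roles above can be sketched as a small library-free model. The names parking_meter, inspection and inspect are hypothetical and exist only for this illustration; the "current time" is passed in explicitly so the logic is easy to follow and test.

```cpp
#include <cassert>
#include <chrono>

using sim_clock = std::chrono::steady_clock;
using std::chrono::seconds;

// Hypothetical stand-in for a timer: it only remembers when it expires.
struct parking_meter {
    sim_clock::time_point expiry;

    // Driver: feed the meter so that it expires `paid` after `now`.
    void feed(sim_clock::time_point now, seconds paid) { expiry = now + paid; }
};

// Result of one visit from the inspector.
struct inspection {
    bool ticket;                    // true if the meter had expired
    sim_clock::duration remaining;  // time left when no ticket was issued
};

// Inspector: write a ticket if the meter has expired, otherwise note
// how much time remains (the length of the next walk).
inline inspection inspect(const parking_meter& m, sim_clock::time_point now) {
    if (m.expiry <= now)
        return inspection{true, sim_clock::duration::zero()};
    return inspection{false, m.expiry - now};
}
```

Note that the driver never talks to the inspector directly. The only shared state is the meter's expiry time, which is what keeps the two actors simple.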

Using the analogy to inform program design

Hopefully you've already guessed how these actors map to networked applications:


  • The driver represents your protocol handling code.

  • The parking inspector corresponds to your timeout management logic.

Let's take a look at how this works in a very simple use case.

    // The "driver" actor.
    void session::handle_read(error_code ec, size_t length)
    {
      // On entering this function we have returned to the vehicle.

      if (!ec)
      {
        // Phew, no ticket. Feed the meter.
        my_timer.expires_from_now(seconds(5));

        // Process incoming data.
        // ...

        // Run some more errands.
        my_socket.async_read_some(buffer(my_buffer),
            bind(&session::handle_read, this, _1, _2));
      }
      else
      {
        // We got a ticket. Go home.
      }
    }

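    // The "parking inspector" actor.
    void session::handle_timeout(error_code ec)
    {
      // On entering this function we are checking the meter.
      // The error code is deliberately ignored: feeding the meter cancels
      // the pending wait, so this handler may run with operation_aborted.
      // Either way, we simply check the meter again.

      // Has the meter expired?
      if (my_timer.expires_from_now() < seconds(0))
      {
        // Write up a ticket.
        my_socket.close();
      }
      else
      {
        // Note remaining time and go for a walk.
        my_timer.async_wait(
            bind(&session::handle_timeout, this, _1));
      }
    }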
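To see how the two actors interact over time, here is a rough, Asio-free simulation of the session above. It is only a sketch under some assumptions: ticket_time is a hypothetical helper, times are whole seconds, read completion times are given in ascending order, the meter is first fed at time 0, and the inspector only looks at the meter at the end of each walk (with a real deadline_timer, feeding the meter also wakes the inspector early, but the outcome is the same).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the simulated time (in seconds) at which the inspector writes
// the ticket, i.e. the point where the real code would close the socket.
// `read_times` must be sorted ascending; `timeout` is fed on every read.
inline int ticket_time(const std::vector<int>& read_times, int timeout) {
    int expiry = timeout;  // session start: feed the meter at t = 0
    int wake = expiry;     // the inspector's first walk ends at that expiry
    std::size_t next = 0;

    for (;;) {
        // Driver: every read completing before the inspector returns
        // feeds the meter again.
        while (next < read_times.size() && read_times[next] < wake) {
            expiry = read_times[next] + timeout;
            ++next;
        }

        // Inspector: check the meter.
        if (expiry <= wake)
            return wake;   // expired: write up a ticket
        wake = expiry;     // still running: walk until it would expire
    }
}
```

For example, with a 5 second timeout and reads completing at 2, 4 and 6 seconds, the meter is pushed out to 7, then 9, then 11 seconds, so the ticket is written at 11 seconds. With no reads at all, it is written at 5 seconds.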

It's important to remember that the driver may need to run multiple errands each time they leave the vehicle. In protocol terms, you might have a fixed-length header followed by a variable-length body. You only want to "feed the meter" once you have received a complete message:

    // First part of the "driver" actor.
    void session::handle_read_header(error_code ec)
    {
      // We're not back at the vehicle yet.

      if (!ec)
      {
        // Process header.
        // ...

        // Run some more errands.
        async_read(my_socket, buffer(my_body),
            bind(&session::handle_read_body, this, _1));
      }
    }

    // Second part of the "driver" actor.
    void session::handle_read_body(error_code ec)
    {
      // On entering this function we have returned to the vehicle.

      if (!ec)
      {
        // Phew, no ticket. Feed the meter.
        my_timer.expires_from_now(seconds(5));

        // Process complete message.
        // ...

        // Run some more errands.
        async_read(my_socket, buffer(my_header),
            bind(&session::handle_read_header, this, _1));
      }
      else
      {
        // We got a ticket. Go home.
      }
    }

There are many variations on this theme. For example, you may feed the meter between consecutive errands, varying the amount of money inserted (i.e. setting timeouts of different lengths) depending on which errand comes next. In protocol terms, that might mean allowing up to 30 seconds between messages, but permitting only a further 5 seconds once the message header has been received.
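One way to express that last variation is to make the amount fed into the meter depend on the errand that comes next. The following is a minimal sketch, not a prescription: read_phase and timeout_for are hypothetical names, and it uses std::chrono purely for illustration (with deadline_timer you would return the Boost.Date_Time duration type instead).

```cpp
#include <cassert>
#include <chrono>

// Hypothetical: which errand the driver is about to run.
enum class read_phase { awaiting_header, awaiting_body };

// Hypothetical policy: how much to feed the meter before the next errand.
// Up to 30 seconds may pass between messages, but once a header has
// arrived the body must follow within a further 5 seconds.
constexpr std::chrono::seconds timeout_for(read_phase next) {
    return next == read_phase::awaiting_header ? std::chrono::seconds(30)
                                               : std::chrono::seconds(5);
}
```

With this in place, each handler would feed the meter with timeout_for(...) for the phase it is about to start, just before launching the next asynchronous read. The inspector does not need to change at all, since it only ever looks at the meter.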

As I indicated earlier, there's no single right way to manage timeouts. In fact, there are many different facets to this problem that are probably worth exploring in their own right. However, I think that the approach shown here is probably suited to most applications and I would recommend it as a starting point when designing your timeout handling.
