Sunday 24 June 2007

Microsecond timing with millisecond clocks

The standard C library used with GCC is glibc, which provides the POSIX standard functions for timing, sleeping, and so on. On Unix platforms such as Solaris on SPARC or HP-UX on PA-RISC these can provide very high resolution timing, from microseconds down to nanoseconds. On Linux 2.6 the resolution is typically 4ms; earlier versions used 1ms, but certain machine configurations would fail because the timing routine took longer than 1ms to execute.

Special Linux kernel versions are appearing that support real time or 1ms or finer resolution, for example SUSE Linux Enterprise Real Time (SLERT) or Ubuntu Studio. Using the latter allows microsecond timing with the gettimeofday() function and usleep() to 1ms resolution. To get finer-grained sleeps we have to write our own routines; a basic loop that checks the current time until the microsecond period has expired will do. One caveat is that on single-core systems the thread in the loop is likely to take all the CPU time, so we need to yield the processor to other threads while the timer hasn't expired. On Linux we can use sched_yield(); to be cross-platform we would want to use pthread_yield(), however this does not exist with NPTL threads, so we can use the GLib thread API version g_thread_yield() instead.
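
The loop described above can be sketched roughly as follows. This is a minimal illustration, not the code used in the transport: the names now_usec() and busy_usleep() are made up for the example, and it calls sched_yield() directly rather than g_thread_yield() so that it builds without GLib.

```c
#include <assert.h>
#include <sched.h>
#include <stdint.h>
#include <sys/time.h>

/* Current wall-clock time in microseconds, read with gettimeofday(). */
static uint64_t now_usec(void)
{
    struct timeval tv;
    int rc = gettimeofday(&tv, NULL);
    assert(rc == 0);  /* only fails when given invalid arguments */
    (void)rc;
    return (uint64_t)tv.tv_sec * 1000000u + (uint64_t)tv.tv_usec;
}

/* Hypothetical helper: spin until at least `usec` microseconds have passed,
 * yielding the processor on every pass so other threads can run.
 * Returns the time actually spent, in microseconds. */
static uint64_t busy_usleep(uint64_t usec)
{
    const uint64_t start = now_usec();
    uint64_t elapsed = 0;

    while (elapsed < usec) {
        sched_yield();                 /* give up the CPU if anyone else wants it */
        elapsed = now_usec() - start;  /* a backwards clock step wraps and ends the loop */
    }
    return elapsed;
}
```

A call such as busy_usleep(500) should return after roughly 500us on an otherwise idle machine, although the overshoot depends on how long other threads hold the processor after each yield.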

A custom high resolution sleep function doesn't immediately help with a GLib event loop either, because the loop abstracts its own timer management. We need to add a new source to the event loop that can fire events at the new microsecond resolution. To implement this we can derive from the existing timer source base: if the requested sleep time has a low resolution component, e.g. 1.5ms, we can use the existing timer to sleep for 1ms and then take over with our high resolution timer for the remaining 500us. The new source is an idle source, that is, it executes when no other higher priority events need to be processed. Effectively GLib is going to run a select()/poll() with a timeout, then execute all the idle sources, and repeat. With a low resolution timer the select()/poll() manages the timeout; for high resolution timing it runs with a zero timeout.
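
The split between the low and high resolution parts can be illustrated with a small helper. This is only a sketch under the assumption of a 1ms coarse timer; struct split_timeout and split_timeout_usec() are hypothetical names, and the real source would do this inside the GLib source callbacks rather than in a free function.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical pair: how long select()/poll() may block, and how much
 * is left over for the high resolution timer afterwards. */
struct split_timeout {
    int      poll_ms;    /* millisecond timeout for select()/poll() */
    uint64_t spin_usec;  /* sub-millisecond remainder, in microseconds */
};

/* Split a microsecond timeout into whole milliseconds plus a remainder.
 * Very long timeouts are clamped to INT_MAX milliseconds; the caller is
 * expected to recompute the timeout after each wakeup anyway. */
static struct split_timeout split_timeout_usec(uint64_t usec)
{
    struct split_timeout s;
    uint64_t ms = usec / 1000u;

    if (ms > (uint64_t)INT_MAX)
        ms = (uint64_t)INT_MAX;    /* poll() takes an int timeout */

    s.poll_ms   = (int)ms;
    s.spin_usec = usec % 1000u;
    assert(s.spin_usec < 1000u);   /* remainder is always under one millisecond */
    return s;
}
```

For the 1.5ms case this gives a 1ms poll timeout followed by 500us of high resolution waiting, and any request under 1ms gives a zero poll timeout, which matches the zero timeout behaviour described above.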

In a standard PGM transport we might expect hundreds to thousands of timers waiting to fire, from sending session keep alive messages (SPMs) to re-requesting lost data (NAKs). We want to minimize the number of high resolution timers, and to minimize the overhead of changing timers due to incoming data or receiver state changes. We can do that by managing all of the transport's timers internally and presenting one global timer to the underlying GLib event loop. The following diagram shows the two sides of the transport. On the receive side there is one of three timers per packet: NAK_RB_IVL for NAK request back-off, NAK_RPT_IVL to repeat sending a NAK, and NAK_RDATA_IVL to wait for RDATA after seeing a NAK confirm (NCF). The send side includes an ambient SPM keeping the session alive, and heartbeat SPMs to help flush out trailing packets that might have been lost.
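
Presenting a single global timer amounts to asking, on each pass, how long it is until the earliest internal timer is due. The sketch below shows one simple way to compute that with a linear scan. The types and the function next_timeout_usec() are invented for illustration and are not taken from the transport; a real implementation would more likely keep the timers in a sorted structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical labels for the timers named in the text. */
enum timer_kind { TIMER_NAK_RB, TIMER_NAK_RPT, TIMER_NAK_RDATA, TIMER_SPM };

/* Hypothetical record for one pending timer. */
struct transport_timer {
    enum timer_kind kind;
    uint64_t        expiry_usec;  /* absolute expiry time in microseconds */
};

/* Microseconds until the earliest timer is due: 0 if one has already
 * expired, UINT64_MAX if there are no timers at all. */
static uint64_t next_timeout_usec(const struct transport_timer *timers,
                                  size_t n, uint64_t now)
{
    uint64_t earliest = UINT64_MAX;

    assert(timers != NULL || n == 0);
    if (n == 0)
        return UINT64_MAX;

    for (size_t i = 0; i < n; i++) {
        if (timers[i].expiry_usec < earliest)
            earliest = timers[i].expiry_usec;
    }
    return earliest > now ? earliest - now : 0;
}
```

The event loop then only ever sees this one value, so adding, cancelling or rescheduling a per-packet timer changes internal state without touching the GLib source itself.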


Once Linux implements high resolution timers for select()/poll(), this method will no longer be required and we should expect improved CPU usage on the timer thread.