Thread safety of Timer callbacks

I didn't realize I've stopped blogging for 1 year. What a shame! Fortunately I didn’t waste the time: we ship Whidbey Beta1 and Beta2 in the past year! Now with Beta2 out of door, I have more spare time for blogging. :)

Today I want to talk about some interesting facts about Timer in CLR. There is an example for how to use timer in MSDN: https://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpref/html/frlrfsystemthreadingtimerclasstopic.asp

This sample starts a timer and does certain things when the timer fires for certain times, like killing the timer. However, this sample has a bug which will cause trouble in stress scenario. To demonstrate the problem, I made a little change to the code:

 

using System;
using System.Threading;

class TimerExample
{
static void Main()
{
AutoResetEvent autoEvent = new AutoResetEvent(false);
StatusChecker statusChecker = new StatusChecker(100);

   // Create the delegate that invokes methods for the timer.
TimerCallback timerDelegate =
new TimerCallback(statusChecker.CheckStatus);

  Console.WriteLine("{0} Creating timer.\n",
DateTime.Now.ToString("h:mm:ss.fff"));
Timer stateTimer =
new Timer(timerDelegate, autoEvent, 0, 10);

   // start another thread to post work items to thread pool
Thread t = new Thread (new ThreadStart (PostWortItem));
t.Start ();

   // When autoEvent signals, dispose of
// the timer.
autoEvent.WaitOne();
stateTimer.Dispose();
Console.WriteLine("\nDestroying timer.");
}

  // a Thread proc which keeps posting work items to thread pool
static void PostWortItem ()
{
// Post some user work items to thread pool
for (int i = 0; i < 1000; i++)
{
ThreadPool.QueueUserWorkItem (new WaitCallback (WorkItem));
Thread.Sleep (10);
}
}

  // An nop work item for thread pool
static void WorkItem (object o)
{
Thread.Sleep (500);
}
}

class StatusChecker
{
int invokeCount, maxCount;

public StatusChecker(int count)
{
invokeCount = 0;
maxCount = count;
}

  // This method is called by the timer delegate.
public void CheckStatus(Object stateInfo)
{
Console.WriteLine("Checking status " + (++invokeCount));

  if(invokeCount == maxCount)
{
//signal Main.
AutoResetEvent autoEvent = (AutoResetEvent)stateInfo;
autoEvent.Set();
}
}
}

 

Basically I added another thread to keep posting work items to threadpool, but the rest part is still expected to behave the same: when the timer fires the 100th time, it should set an event so the main thread would stop the timer.

 

In one of 5 runs in my machine, I got such output:

 

5:48:07.625 Creating timer.

Checking status 1
Checking status 2
Checking status 3
Checking status 4

Checking Status 93
Checking Status 94
Checking Status 95
Checking Status 96
Checking Status 97
Checking Status 98
Checking Status 102
Checking Status 99
Checking Status 103
Checking Status 104
Checking Status 105

Checking Status 698
Checking Status 700
Checking Status 701
Checking Status 703
Checking Status 703
Checking Status 704
Checking Status 705

^C

It seems that invokeCount never hits 100 thus the program doesn't stop and some other sequence in the output looks to be out of order.  

 

How does this happen? First we need to understand how timer is implemented in CLR, who is executing the timer callbacks?

 

One simple idea would be putting all timers in a queue and having a dedicate thread doing something like this (pseudo code):

**

while (true)
{
foreach (Timer t in timer queue)
{
if (t.TimeToFire ())
{
t.InvokeCallback ();
}
}
Sleep(MinumInterval);
}

However with this logic one lengthy timer callback would block all other timers. In CLR, we do have a timer queue and a dedicate timer thread. However the only job of timer thread is to maintain the timer queue, when a timer needs to fire, timer thread queue a work item to threadpool, then a thread pool's worker thread will pick up the work item and invoke the timer callback.

 

In Rotor's source, the timer thread's logic is in vm/Win32threadpool.cpp, the thread proc is ThreadpoolMgr::TimerThreadStart and ThreadpoolMgr::FireTimers does most of interesting work. The pseudo code looks like:

 

while (true)
{
foreach (Timer t in timer queue)
{
if (t.TimeToFire ())
{
// put a work item to thread pool
// to call timer cal back on t once
WorkItem work = CallTimerCallbackOnce (t);
ThreadPool.QueueWorkItem (work);
}
}
//MinumInterval is minum of next firing interval
// for all timers in the queue
Sleep(MinumInterval);
}
 

The timer thread only guarantees to put timer callback requests to a queue in thread pool (ThreadpoolMgr::QueueUserWorkItem) in order of timer firing. But timer callbacks are not called in a serialized way. If a timer fires twice and there are more than one worker thread in thread pool, there's no guarantee that the first callback will be finished before the next callback starts. Therefore, it's not thread safe for timer callbacks to access shared data without locking. That's why the example in MSDN breaks:  when CheckStatus is executed in multiple threads, it's possible that "if (invokeCount == maxCount)"   will never be satisfied. Changing the code to this would make it more robust:

 

public void CheckStatus(Object stateInfo)
{
int count = Interlocked.Increment (ref invokeCount);
Console.WriteLine("Checking status " + count);

if(count == maxCount)

Another interesting thing about timer implementation is that when a client thread creates a new timer, it doesn't insert the timer to timer queue directly. Instead, it queues a user APC to the timer thread (see ThreadpoolMgr::CreateTimerQueueTimer and InsertNewTimer). Similar thing is done for updating (ThreadpoolMgr::ChangeTimerQueueTimer and UpdateTimer) and deleting timer (ThreadpoolMgr:: DeleteTimerQueueTimer and DeregisterTimer). That way, client threads don't need to synchronize to access the shared timer queue. After all, the timer thread is sleeping (alertable) for most of time.

PS: to make the race happen more easily, I did more tweaks to the MSDN sample than the threadpool workitems, it should be obvious to you. ;)

This posting is provided "AS IS" with no warranties, and confers no rights.