Preventing Race Conditions in Code That Accesses Global Data

Abstract

Race conditions in C/C++ code are amazingly easy to introduce and notoriously difficult to debug. The costs associated with a race condition can vary from 0 to “very expensive” depending on where in the code they occur. For example, if there is a bug in a driver that causes it to write to memory that it doesn’t own, the symptoms will most likely be totally arbitrary, possibly showing up as kernel or user mode heap corruptions, stack trashes or hardware malfunctions. Developers throughout different parts of the system could spend a lot of time debugging issues that are really just symptoms of a problem in the underlying code. This is why it is so important to code defensively against race conditions. Even in environments where the behaviors of the system are known (like an embedded system) it is important that the code express the intent of the programmer, even if the output that the compiler generates for that code does not exhibit bad behavior today. That code can and will be copied and potentially ported forward to new environments, thus propagating any existing bugs. It is important that code be free of race conditions at all levels – source code level, assembly code level and object code level.

Example

So what does a global data race condition look like? Let’s consider this code:

// Global data declaration and initialization

ULONG GlobalData;

// Original Thread 1 code

ULONG LocalData;

if (GlobalData != 0) {

    // ß Race can occur here

    LocalData = GlobalData;

    // Go do work using LocalData

}

// Thread 2 code

if (SomeCondition != FALSE) {

    GlobalData = 0;

}

The problem with the code for Thread 1 is that the value of GlobalData can be changed asynchronously by Thread 2 while it is reading from it. In other words, GlobalData may not be 0 in the condition check “if (GlobalData != 0)“, but may be 0 by the time “LocalData = GlobalData “ is executed.

Fix – Attempt #1

We may decide to fix the problem by capturing the value to a local variable and then doing the operations on the local:

// Thread 1 code (using capture)

ULONG CapturedGlobalData;

ULONG LocalData;

CapturedGlobalData = GlobalData;

if (CapturedGlobalData != 0) {

    LocalData = CapturedGlobalData;

}

At first glance this seems like a great solution; however, an optimizing compiler is free to remove the assignment to the local variable CapturedGlobalData because, as far as the compiler is concerned, CapturedGlobalData is just an alias for GlobalData. This is clearly not what we intended in our code. What we really want to happen is to store the current value of GlobalData locally and then operate on that saved value to ensure that we operate on a static value.

Fix – Attempt #2

So how do we write the code to ensure that we are always operating on a static value? The simplest way would be to always use a lock around all accesses to GlobalData like so:

// Thread 1 code (using a lock)

Lock(GlobalDataLock);

if (GlobalData != 0) {

    LocalData = GlobalData;

}

Unlock(GlobalDataLock);

// Thread 2 code

if (SomeCondition != FALSE) {

    Lock(GlobalDataLock);

    GlobalData = 0;

    Unlock(GlobalDataLock);

}

This code ensures that only one thread operates on (reads or writes) GlobalData at a time. Using locks is the correct way to fix code like this 99% of the time. Also, lockless programming is fraught with danger and is very, very difficult to get correct. But for the sake of this example and the fact that we don’t need a completely synchronized view of the variable – GlobalData let’s continue on and say that we used locks and they were too slow – even though if this was true you could make an argument for optimizing the locks instead of writing lock free code – but – I digress.

 For this code, we only need to make sure that the value of GlobalData stays consistent for the duration of the time we are using it – we don’t need to block anyone else – we just need to grab the value and operate on it – so a non-synchronized view of GlobalData is fine, an inconsistent view of it is not.

Fix – Attempt #3

Let’s improve the performance of the code with regard to accesses of GlobalData, without losing the consistent view. We can do this like so:

// Global data declaration and initialization

volatile ULONG GlobalData;

// Thread 1 code

ULONG CapturedGlobalData;

ULONG LocalData;

CapturedGlobalData = GlobalData;

if (CapturedGlobalData != 0) {

    LocalData = CapturedGlobalData;

}

// Thread 2 code

if (SomeCondition != FALSE) {

    GlobalData = 0;

}

Definition of volatile From CPPReference.com: The volatile keyword is an implementation-dependent modifier, used when declaring variables, which prevents the compiler from optimizing those variables. Volatile should be used with variables whose value can change in unexpected ways (e.g. through an interrupt), which could conflict with optimizations that the compiler might perform.

Now with GlobalData declared volatile, the single read has to be preserved. The original use of volatile was for programs that talked directly to hardware – where every action can have side-effects. For instance – consider this code:

*p;

This line of code might be the code to launch a nuclear missile if the “Launch” register is mapped to the address that p points to. As a result, the compiler can make no assumptions about what to do with a volatile variable. So making GlobalData volatile fixes the problem. However, we have now also prevented the compiler from optimizing any uses of GlobalData. This is for the same reason it fixes our problem – the compiler can’t make any assumptions about the use of GlobalData .

Fix – Attempt #4

What we really want is the best of both worlds – for GlobalData to be treated as if it was volatile for the code in Thread 1, but not force that restriction onto the rest of the system. Well, as it turns out we can do exactly that with some clever casting at the point of assignment:

// Global data declaration and initialization

ULONG GlobalData;

// Thread 1 code

ULONG CapturedGlobalData;

ULONG LocalData;

CapturedGlobalData = *(ULONG volatile *)&GlobalData;

if (CapturedGlobalData != 0) {

    LocalData = CapturedGlobalData;

}

// Thread 2 code

if (SomeCondition != FALSE) {

    GlobalData = 0;

}

Here we are telling the compiler to treat GlobalData as if it were a pointer to a volatile ULONG and then dereference it. This will force the local capture to occur and guarantee no re-fetch from GlobalData. The syntax is a little messy and could be wrapped up into a macro like this:

//

// Macro used to ensure that a variable is captured and

// not optimized away by the compiler

//

#define VOLATILE_READ(Target, TargetType) \

                      *(TargetType volatile *)&Target

Then the original code could be written as such:

// Global data declaration and initialization

ULONG GlobalData;

// Thread 1 code

ULONG CapturedGlobalData;

ULONG LocalData;

CapturedGlobalData = VOLATILE_READ(GlobalData, ULONG);

if (CapturedGlobalData != 0) {

    LocalData = CapturedGlobalData;

}

// Thread 2 code

if (SomeCondition != FALSE) {

    GlobalData = 0;

}

Conclusion

Race conditions in code that accesses global data are very hard to spot yet very easy to introduce. This fact is compounded by the fact that code that looks solid in a high level programming language can be rewritten, rearranged or completely removed by the compiler, rendering the code ultimately susceptible to races. Race conditions will probably continue to be a major source of bugs in code for the foreseeable future, but analyzing code for race conditions by ensuring proper locking and synchronization of data as well as more esoteric techniques like the “volatile read”, shown above, can help to reduce the surface area that these types of bugs can appear in.