Concurrency, Part 2 – Avoiding the problem

Yesterday's article on concurrency discussed the basic concepts of concurrency.  Now I'd like to start talking about how you deal with concurrency...

The first and most important thing to realize about concurrent programming is that it's all about two things: your data and your threads. If you only have one thread, then you don't have to worry about concurrency issues. If you have more than one thread, then you only have to worry about concurrency issues if more than one thread can simultaneously access the same data. And that's my first principle of concurrent programming: If your data is never accessed on more than one thread, then you don't have to worry about concurrency. Again, the guys who get concurrency are cringing at this principle; the reality is (of course) more complicated than that, and I'll get back to why later (I need to introduce some more concepts beforehand).

In Win32, in general, there are three ways that you can guarantee that your thread is the only one accessing the data.

The first is your stack. On Win32, the data on your stack is owned by the thread (this might not be true for other architectures, I don't know :(). Unless you explicitly pass pointers to your stack to another thread, you don't have to worry about other threads messing with your stack data, so you don't need to protect it.
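Here's a minimal sketch of the stack rule, written with standard C++ threads rather than the Win32 thread APIs so it stays portable. The names sum_range and parallel_sum are made up for illustration. Each worker accumulates into a local variable that no other thread can reach, and only writes a single result slot that belongs to it alone.

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Each worker sums its own range into a local (stack) variable.
// No other thread holds a pointer to `local_sum`, so it needs no lock.
void sum_range(int begin, int end, long long* result) {
    long long local_sum = 0;              // lives on this thread's stack
    for (int i = begin; i < end; ++i) {
        local_sum += i;
    }
    *result = local_sum;                  // one write to a slot only this thread uses
}

// Hypothetical helper: splits [0, n) across `workers` threads.
long long parallel_sum(int n, int workers) {
    assert(workers > 0);
    std::vector<long long> results(workers, 0);
    std::vector<std::thread> threads;
    int chunk = n / workers;
    for (int w = 0; w < workers; ++w) {
        int begin = w * chunk;
        int end = (w == workers - 1) ? n : begin + chunk;
        threads.emplace_back(sum_range, begin, end, &results[w]);
    }
    for (auto& t : threads) t.join();     // join before reading the results
    long long total = 0;
    for (long long r : results) total += r;
    return total;
}
```

Calling parallel_sum(1000, 4) gives the same answer as a single-threaded loop, and nothing in the workers is synchronized. The only coordination is the join on the calling thread before it reads the result slots.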

The second way of ensuring that only one thread can access your data is to use thread local storage, or TLS. The idea behind TLS is that when your process starts, it allocates a "slot" in TLS. That allocation returns you an index into a table, and you can stick whatever value you want into that table. When your thread starts up, you can allocate a block of memory, stick it into the table, and then, later on during the execution of the thread, you can go back and query the value of that block. The block remains per-thread, and can be accessed without protecting the data. This allows you to maintain per-thread context blocks which can be used to hold state that's more global than the stack. Btw, the C runtime library allows you to declare variables in TLS by simply decorating them with __declspec(thread) - there are some caveats about using this, but the facility is available...
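As a rough illustration of per-thread variables, here's a sketch that uses the standard C++11 thread_local keyword as a portable stand-in for __declspec(thread). That substitution is my assumption for the example, not what the Win32 TLS slot APIs look like. The names t_calls, bump, and count_on_new_thread are hypothetical.

```cpp
#include <cassert>
#include <thread>

// One copy of this counter exists per thread, so it needs no lock.
thread_local int t_calls = 0;

// Increments the calling thread's own counter and returns the new value.
int bump() { return ++t_calls; }

// Hypothetical helper: calls bump() `n` times on a new thread and
// returns the count that thread ended up with.
int count_on_new_thread(int n) {
    assert(n >= 0);
    int seen = 0;
    std::thread worker([&seen, n] {
        for (int i = 0; i < n; ++i) seen = bump();
    });
    worker.join();                        // join before reading `seen`
    return seen;
}
```

If the calling thread bumps its counter twice and then runs count_on_new_thread(5), the new thread counts from 1 to 5 on its own copy, and the caller's counter is still 2 afterwards.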

The third way of ensuring that only one thread can access your data is simply to be careful in how you write your code. As an example, in my last "What's wrong with this code" article, I purposely allocated the FileCopyBlock structures in one thread, put them on a queue and executed them in worker threads. As a result, I didn't have to protect the FileCopyBlock fields - since only one thread could ever access the data at a time, they didn't need to be protected. Now, more than one thread did access the data (the block was constructed on the main thread and destructed on the worker threads). But at any given time, no block was accessed by more than one thread. This principle can be applied in a number of different ways - my example was quite simple, but it wouldn't be difficult to imagine a FSM where the state was kept in a block that was enqueued and dequeued based on state transitions - the block would only ever be accessed by one thread at a time and thus wouldn't have to be protected.
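The hand-off pattern could be sketched roughly like this in standard C++. This is not the code from the earlier article: CopyBlock, BlockQueue, and process_blocks are hypothetical names, and the small locked queue is an assumption standing in for whatever OS facility does the queuing. Only the queue is synchronized; the fields of each block are touched by exactly one thread at a time.

```cpp
#include <cassert>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical stand-in for the article's FileCopyBlock.
struct CopyBlock {
    int id = 0;
    int bytes = 0;
    bool done = false;
};

// A tiny hand-off queue. The queue is locked; the blocks are not.
class BlockQueue {
public:
    void push(std::unique_ptr<CopyBlock> block) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            blocks_.push(std::move(block));
        }
        ready_.notify_one();
    }
    std::unique_ptr<CopyBlock> pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !blocks_.empty(); });
        std::unique_ptr<CopyBlock> block = std::move(blocks_.front());
        blocks_.pop();
        return block;
    }
private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::unique_ptr<CopyBlock>> blocks_;
};

// Builds `count` blocks on the calling thread, processes them on a
// worker thread, and returns the total bytes the worker saw.
int process_blocks(int count) {
    assert(count >= 0);
    BlockQueue queue;
    int total = 0;
    std::thread worker([&queue, &total, count] {
        for (int i = 0; i < count; ++i) {
            std::unique_ptr<CopyBlock> block = queue.pop();
            block->done = true;           // no lock: this thread owns the block now
            total += block->bytes;
        }                                 // each block is destroyed on the worker
    });
    for (int i = 0; i < count; ++i) {
        auto block = std::make_unique<CopyBlock>();
        block->id = i;                    // no lock: nobody else has seen it yet
        block->bytes = 100 * (i + 1);
        queue.push(std::move(block));     // ownership leaves this thread here
    }
    worker.join();                        // join before reading `total`
    return total;
}
```

Using std::unique_ptr makes the ownership transfer explicit: once a block is pushed, the producing thread no longer has a pointer to it. The lock inside the queue is also what makes the producer's writes visible to the worker, so the hand-off mechanism itself still has to be a properly synchronized one.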


It turns out that you can write some fairly sophisticated multithreaded code without ever having to worry about synchronizing your shared data. Just by being careful and setting up your data structures appropriately, you can do pretty amazing things.

But, of course, there are times that you can't avoid having more than one thread accessing your data.  Tomorrow, I'll talk about some of the ways around that problem.

Edit: Principal->Principle (thanks Mike :))

Comments (25)

  1. Anonymous says:

    "the C runtime library allows you to declare variables in TLS by simply decorating them with __declspec(thread)"

    I think you mean the Microsoft C Compiler, rather than the "C runtime library", as it is not a function that you can call, but rather a compiler attribute that generates code to call TLS functions automatically.

  2. Anonymous says:

    Rob, actually it’s both – the MS C compiler AND the runtime library conspire to bring that feature.

    But if you write code without the C runtime library, you can’t take advantage of it (and I work on components that don’t link with the C runtime libraries)

  3. Anonymous says:

    Being picky – you’re getting your principals and your principles mixed up again. 🙂

    Contributing to the conversation: I’m listening pretty hard. I don’t get to do a lot of concurrent work and haven’t had much training in it. Most of what I know I’ve learned from Jeff Richter’s excellent books "Programming Applications for Microsoft Windows" and "Programming Server-Side Applications for Microsoft Windows". They were written around 2000-2001, but are still hugely relevant.

    This is happening around the right time as I’m currently working on a service to adapt from a narrow-band RF hand-held computer (accessed from a base station over TCP/IP or serial cable) to our custom UDP-based application server. I aim to write it correctly this time, with much use made of asynchronous I/Os and thread pooling where possible.

  4. Anonymous says:

    Wouldn’t the notes previously linked to about the double-check locking paradigm in Java possibly also apply to the create-enqueue-dequeue paradigm you talk about? I.e, in an architecture with non-ordered writes (like x64?), you can’t be sure that the pointer you put into the queue is really filled with appropriate data.

  5. Anonymous says:

    CN: Ah – Did you notice how I queued the request? I called QueueUserWorkItem – that’s a Win32 API call that handles all the concurrency issues for me.

    I’m a firm believer of letting the OS get the concurrency issues right – they’re almost certainly more likely to get them right than I am (more on this one in a later article in the series).

  6. Anonymous says:

    Letting the OS or app framework do the right thing for you is also a good security principle – simplicity.

    Security and concurrency don’t usually come up in the same sentence, but there are significant security issues with concurrency.

    Race conditions are veterans of privilege escalation attacks and are generically called time of check – time of use (TOCTOU) attacks. The primary issue is that one thread may assume that it’s atomic to check that something is okay and then go ahead and use it, but another thread (and remember in NT, all processes are really just threads) does something shifty between the time of check and the time of use.

    Areas to check for race conditions:

    * file system calls, particularly those which check and then create files with high privilege permissions

    * temporary file handling – generally awful in my experience. This is the favoured attack under Unix, and I honestly don’t know why it’s not used more frequently under Win32 attacks

    * dealing with semaphores and WaitFor… when the event fires, and you assume you have the resource to yourself… you’re probably wrong

    This is particularly prevalent on Unix due to the setuid / setgid architecture (the NT kernel uses impersonation, which is far more flexible, or simply runs everything as LOCALSYSTEM, urgh).

    If you’re doing .NET development, read this:

    Michael Howard wrote this in 2002, and it’s still pertinent today:

    It deals with race conditions for things I hadn’t even thought about, but it also doesn’t deal with the usual suspects.

    Race conditions can also affect distributed systems, particularly those which use broadcast load balancing. In this instance, an attacker may be able to slow down the real server making its announcements by DoSing it, and then answering quickly to the broadcast, creating a man-in-the-middle attack. Such issues have affected Microsoft’s own code in the past.



  7. Anonymous says:

    Good point Andrew.

    In fact, IIRC the original UPnP bug was a result of code that attempted to solve a concurrency issue.

  8. Anonymous says:

    Most of the points you make in this article are true in the general case, and not just C/C++ under win32. Of note: While you can’t rely on TLS existing in any specific way, most C runtimes have some sort of TLS. Most, I think, make you manage it somewhat more explicitly, though.

    I think your stack is always private — at least from the point your thread is started onward.

    Of course, this only applies at all to languages of the same basic sort as C/C++ — for example, in current Perl, /everything/ is thread-local, unless you explicitly make it otherwise. This means less worrying about concurrency issues, at the price of making starting threads very slow, and communication between threads somewhat cumbersome.

  9. Anonymous says:

    > On Win32, the data on your stack is owned by the thread (this might not be true for other architectures, I don’t know :().

    Me neither, but, yikes, how would this work? Assuming the stack stores return addresses, each thread *needs* to have a private stack or you lose coherent function calls. I think this is reliably true everywhere.

  10. Anonymous says:


    I was thinking of some RISC or mainframe-like architectures where the "stack" wasn’t really a stack in the conventional sense of the word, but instead a pointer to some kind of per task memory, and thus wasn’t necessarily shared.

    There are some really weird computer architectures out there.

  11. Anonymous says:

    Andrew, temporary files aren’t a viable attack vector in Windows because there’s no global temporary directory, every user has its own. You can’t pollute another user’s temporary namespace, unless you’re already privileged (but there are several other shared namespaces to pollute, instead…). And in Windows you can’t forget to set O_EXCL, because CreateFile has an explicit creation disposition parameter

  12. Anonymous says:

    Also good points KJK – Clearly I’m having swiss-cheese-brain issues today, and turned off my critical thinking.

    Andrew’s points about security ARE valid – if a high-privileged component makes decisions based on external input and then creates objects with those high privileges, then there IS a potential issue.

    And temporary files CAN be issues, but, as KJK pointed out, there’s no /tmp in Windows, which mitigates many of the issues (but not all of them).

    This isn’t to say that Windows doesn’t have its own class of issues – the semaphore issue that Andrew pointed out IS real, and has been the cause of vulnerabilities in applications before – just because your call to WaitForSingleObject wakes up, if you don’t check the return code and assume that you’ve got access to the object, you’re in for a surprise.

    The bottom line is that there have been security holes that were made exploitable because of concurrency issues, and thus this needs to be considered.

  13. Anonymous says:


    Sometime in your series on concurrency, could you also cover the following:

    1) Techniques to partition your code so that concurrency issues become more apparent. It is easy to get confused between the "object view" and the "thread view" in a program — threads weave paths through objects in a way that is not obvious at first sight when reading the code.

    2) Coding and commenting conventions that might help highlight concurrency issues. An obvious thing is to explicitly mark all shared variables in comments (variables shared across threads, that is). Perhaps also use a naming convention for these variables.

    3) Deadlock prevention techniques. (Years ago, Ruediger Asche wrote an article for MSDN that used Petri Nets to detect deadlocks. It was a bit over the top for me at the time though. Note to self: Read and understand it sometime.)



  14. Anonymous says:

    Larry, please publicize this. Apologies for offtopic, but this is VERY IMPORTANT.
    Bruce Schneier reports that SHA-1, a commonly used cryptographic hashing protocol, has reportedly been broken by a prestigious research team from Shanghai University. Together with recent attacks on MD5, as previously covered by /., we need new hashing functions as a matter of urgency, and we need them now.

  15. Anonymous says:

    I disagree with the no global temporary file issue. There are three alternatives:
    * Running as the user (%tmp%) – most apps the user invokes
    * Running as a unique per-service account (rare!)
    * Running as a (semi-)privileged system account (LOCAL SYSTEM, LOCAL SERVICE, or NETWORK SERVICE) (%temp% = %systemroot%\temp) – most services
    You really want the last one for a privilege escalation attack (or just to do something interesting). Until the day Windows gives each process its own %temp% and provides strong isolation, race conditions will still be an issue.
    Lastly, it’s not just temporary file handling. Any *shared* or *potentially* shared resource which your app uses could be useful to an attacker. If your app does not perform adequate checks (whether it is files, registry keys, semaphores, events or WaitablePorts), it may be vulnerable. There are good well known solutions to TOCTOU issues. It just takes a bit of care is all.

  16. Anonymous says:

    LocalService and NetworkService have their own profiles so they shouldn’t use %systemroot%\temp.

    A lot of potential issues with %systemroot%\temp are mitigated by the strong DACLs that are by default assigned to files created there. So as long as you specify CREATE_NEW when creating the file you should be fine.

  17. Anonymous says:

    Hi, Larry

    IMHO, TLS in Win32 is nearly unusable. At least in EXEs (as opposed to DLLs).

    The problem is that in an EXE it’s impossible to free the data referred to by a TLS slot.

    TLS neither provides "destructors" similar to pthread, nor any other means to intercept thread exit. So, when, say, ExitThread() is called, there is no way to free data stored in TLS.

  18. Anonymous says:


    If it’s an EXE, then you wrote the code to create the threads, you wrote the code to signal that the threads are going to terminate, and you wrote the code that cleans up for the threads.

    Since you wrote the thread routine in the first place, why can’t you free the memory? That’s what the C runtime library does (that’s why the C runtime library recommends/requires that you use _beginthread() – it’s to do per-thread initialization and tear down of data).

    In a DLL, you don’t get to control the threads, but in an EXE, you do.

  19. Anonymous says:

    Actually you can register callbacks to be run during thread startup or termination with EXEs. It just so happens that VC++ doesn’t support this, but it’s in the PE spec and (IIRC) LDR supports calling the callbacks at the appropriate times.

  20. Anonymous says:

    Yes, In EXE I have control over thread entry point. But when I call ExitThread(), thread dies "here and now" – without getting to the point where TLS stuff gets freed.

    Comparing PTHREADS and Win32 threading makes Win32 threading look slightly "underdone". It concerns TLS destructors and cleanup handlers – stuff obviously implemented via some sort of "thread exit callback". On Win32, the presence of "DLL_THREAD_ATTACH" in DllMain hints that some sort of such callback is available for "private use". And I wonder why this callback wasn’t made public…

  21. Anonymous says:

    From the doc for ExitThread:

    > However, in C++ code, the thread is

    > exited before any destructors can be

    > called or any other automatic cleanup

    > can be performed. Therefore, in C++

    > code, you should return from your

    > thread function.

    It sounds like what you really want to do is have some kind of flag or IPC that tells your thread when to return from the thread proc. Then you would be able to do all your cleanup prior to dying, and you wouldn’t even need to call ExitThread.

  22. Anonymous says:

    I’m gonna add to Tom’s comment – storing return addresses on the stack is not that common, x86 does it but other architectures (if I’m not mistaken IA-64 for example) do not. It’s one of the things that should kill the x86 architecture, since it allows for easy buffer overflow attacks, but that doesn’t look very likely anymore, now with x86-64.
