What is this thing you call "thread safe"?


Caveat: I am not an expert on multi-threaded programming. In fact, I wouldn’t even say that I am competent at it. My whole career, I’ve needed to write code to spin up a secondary worker thread probably less than half a dozen times. So take everything I say on the subject with some skepticism.

A question I’m frequently asked: “is this code thread safe?” To answer the question, clearly we need to know what “thread safe” means.

But before we get into that, there’s something I want to clear up first. A question I am less frequently asked is “Eric, why does Michelle Pfeiffer always look so good in photographs?” To help answer this pressing question, I consulted Wikipedia:

“A photogenic subject is a subject that usually appears physically attractive or striking in photographs.”

Why does Michelle Pfeiffer always look so good in photographs? Because she’s photogenic. Obviously.

Well, I’m glad we’ve cleared up that mystery, but I seem to have wandered somewhat from the subject at hand. Wikipedia is just as helpful in defining thread safety:

 “A piece of code is thread-safe if it functions correctly during simultaneous execution by multiple threads.”

As with photogenicity, this is obvious question-begging. When we ask “is this code thread safe?” all we are really asking is “is this code correct when called in a particular manner?” So how do we determine if the code is correct? We haven’t actually explained anything here.

Wikipedia goes on:

“In particular, it must satisfy the need for multiple threads to access the same shared data, …”

This seems fair; this scenario is almost always what people mean when they talk about thread safety. But then:

“…and the need for a shared piece of data to be accessed by only one thread at any given time.”

Now we’re talking about techniques for creating thread safety, not defining what thread safety means. Locking data so that it can only be accessed by one thread at a time is just one possible technique for creating thread safety; it is not itself the definition of thread safety.

My point is not that the definition is wrong; as informal definitions of thread safety go, this one is not terrible. Rather, my point is that the definition indicates that the concept itself is completely vague and essentially means nothing more than “behaves correctly in some situations”. Therefore, when I’m asked “is this code thread safe?” I always have to push back and ask “what are the exact threading scenarios you are concerned about?” and “exactly what is correct behaviour of the object in every one of those scenarios?”

Communication problems arise when people with different answers to those questions try to communicate about thread safety. For example, suppose I told you that I have a “threadsafe mutable queue” that you can use in your program. You then cheerfully write the following code that runs on one thread while another thread is busy adding and removing items from the mutable queue:

if (!queue.IsEmpty) Console.WriteLine(queue.Peek());

Your code then crashes when the Peek throws a QueueEmptyException. What is going on here? I said this thing was thread safe, and yet your code is crashing in a multi-threaded scenario.

When I said “the queue is threadsafe” I meant that the queue maintains its internal state consistently no matter what sequence of individual operations is happening on other threads. But I did not mean that you can use my queue in any scenario that requires logical consistency maintained across multiple operations in a sequence. In short, my opinion of “correct behaviour” and your opinion of the same differed because what we thought of as the relevant scenario was completely different. I care only about not crashing, but you care about being able to reason logically about the information returned from each method call.
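To make the disagreement concrete, here is a minimal sketch of such a queue. The names LockedQueue, QueueEmptyException and CheckThenActDemo are hypothetical, chosen for illustration; the sketch assumes the queue protects each operation with a single lock. Rather than relying on thread timing, the demo replays the unlucky interleaving on one thread, with a plain Dequeue call standing in for the other thread.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical exception type, named after the one mentioned in the text.
public sealed class QueueEmptyException : Exception
{
    public QueueEmptyException() : base("The queue is empty.") { }
}

// Hypothetical queue: every individual operation takes the same lock,
// so concurrent callers can never corrupt its internal state.
public sealed class LockedQueue<T>
{
    private readonly object gate = new object();
    private readonly Queue<T> items = new Queue<T>();

    public bool IsEmpty
    {
        get { lock (gate) { return items.Count == 0; } }
    }

    public void Enqueue(T item)
    {
        lock (gate) { items.Enqueue(item); }
    }

    public T Dequeue()
    {
        lock (gate)
        {
            if (items.Count == 0) throw new QueueEmptyException();
            return items.Dequeue();
        }
    }

    public T Peek()
    {
        lock (gate)
        {
            if (items.Count == 0) throw new QueueEmptyException();
            return items.Peek();
        }
    }
}

public static class CheckThenActDemo
{
    // Replays, on one thread, the interleaving that hurts the caller:
    // the emptiness check passes, "another thread" dequeues, then Peek runs.
    public static bool PeekThrowsAfterStaleCheck()
    {
        var queue = new LockedQueue<int>();
        queue.Enqueue(42);

        bool looksNonEmpty = !queue.IsEmpty;   // the caller's check
        queue.Dequeue();                       // stands in for the other thread
        try
        {
            if (looksNonEmpty) Console.WriteLine(queue.Peek());
            return false;
        }
        catch (QueueEmptyException)
        {
            return true;                       // the check was already stale
        }
    }
}
```

Every method here is atomic on its own, which is all the queue author promised. The pair "check IsEmpty, then Peek" is not atomic, which is what the caller assumed.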

In this example, you and I are probably talking about different kinds of thread safety. Thread safety of mutable data structures is usually all about ensuring that the operations on the shared data always operate on the most up-to-date state of the shared data as it mutates, even if that means that a particular combination of operations appears to be logically inconsistent, as in our example above. Thread safety of immutable data structures is all about ensuring that use of the data across all operations is logically consistent, at the expense of the fact that you’re looking at an immutable snapshot that might be out-of-date.

The problem here is that the choice about whether to access the first element or not is based on “stale” data. Designing a truly thread-safe mutable data structure in a world where nothing is allowed to be stale can be very difficult. Consider what you’d have to do in order to make the “Peek” operation above actually threadsafe. You’d need a new method:

if (queue.Peek(out first)) Console.WriteLine(first);

Is this “thread safe”? It certainly seems better. But what if after the Peek, a different thread dequeues the queue? Now you’re not crashing, but you’ve changed the behaviour of the previous program considerably. In the previous program, if, after the test there was a dequeue on another thread that changed what the first element was, then you’d either crash or print out the up-to-date first element in the queue. Now you’re printing out a stale first element. Is that correct? Not if we always want to operate on up-to-date data!

But wait a moment — actually, the previous version of the code had this problem as well. What if the dequeue on the other thread happened after the call to Peek succeeded but before the Console.WriteLine call executed? Again, you could be printing out stale data.
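A sketch of that combined test-and-read operation might look like the following. TryPeekQueue and StaleHeadDemo are hypothetical names, and the method is called TryPeek here only to follow the usual .NET naming for operations that report failure through a bool. The demo again replays the interleaving on a single thread, so the "other thread" is just a TryDequeue call placed between the peek and the print.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical queue whose peek reports emptiness instead of throwing.
public sealed class TryPeekQueue<T>
{
    private readonly object gate = new object();
    private readonly Queue<T> items = new Queue<T>();

    public void Enqueue(T item)
    {
        lock (gate) { items.Enqueue(item); }
    }

    public bool TryDequeue(out T item)
    {
        lock (gate)
        {
            if (items.Count == 0) { item = default(T); return false; }
            item = items.Dequeue();
            return true;
        }
    }

    // The emptiness test and the read happen under one lock,
    // so this never throws on an empty queue.
    public bool TryPeek(out T first)
    {
        lock (gate)
        {
            if (items.Count == 0) { first = default(T); return false; }
            first = items.Peek();
            return true;
        }
    }
}

public static class StaleHeadDemo
{
    // Returns true when the value the caller prints is no longer the head.
    public static bool PrintedValueIsStale()
    {
        var queue = new TryPeekQueue<string>();
        queue.Enqueue("first");
        queue.Enqueue("second");

        string first;
        bool gotHead = queue.TryPeek(out first);
        string removed;
        queue.TryDequeue(out removed);          // stands in for the other thread
        if (gotHead) Console.WriteLine(first);  // prints a value that has already left the queue

        string current;
        queue.TryPeek(out current);
        return gotHead && first != current;
    }
}
```

The crash is gone, but the caller is now holding a snapshot. For comparison, .NET's ConcurrentQueue<T> exposes a TryPeek method of this general shape, and it has the same snapshot property.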

What if you want to ensure that you are always printing out up-to-date data? What you really need to make this threadsafe is:

queue.DoSomethingToHead(first=>{Console.WriteLine(first);});

Now the queue author and the queue user agree on what the relevant scenarios are, so this is truly threadsafe. Right?

Except… there could be something super-complicated in that delegate. What if whatever is in the delegate happens to cause an event that triggers code to run on another thread, which in turn causes some queue operation to run, which in turn blocks in such a manner that we’ve produced a deadlock? Is a deadlock “correct behaviour”? And if not, is this method truly “safe”?
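Here is a minimal sketch of that hazard, assuming the queue implements DoSomethingToHead by holding its lock while the delegate runs. CallbackQueue, TryEnqueue and CallbackHazardDemo are hypothetical names. To keep the demonstration from hanging, the second thread uses a timed lock attempt; with an ordinary blocking Enqueue in its place, the two threads would wait on each other forever.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical queue that runs caller code while holding its own lock.
public sealed class CallbackQueue<T>
{
    private readonly object gate = new object();
    private readonly Queue<T> items = new Queue<T>();

    public void Enqueue(T item)
    {
        lock (gate) { items.Enqueue(item); }
    }

    // Like Enqueue, but gives up instead of waiting forever for the lock.
    public bool TryEnqueue(T item, TimeSpan timeout)
    {
        if (!Monitor.TryEnter(gate, timeout)) return false;
        try { items.Enqueue(item); return true; }
        finally { Monitor.Exit(gate); }
    }

    // Invokes the delegate on the head element with the lock held,
    // so the head cannot change while the delegate is running.
    public bool DoSomethingToHead(Action<T> action)
    {
        lock (gate)
        {
            if (items.Count == 0) return false;
            action(items.Peek());
            return true;
        }
    }
}

public static class CallbackHazardDemo
{
    // Returns true when a second thread could not get at the queue
    // while the delegate was running.
    public static bool OtherThreadIsLockedOut()
    {
        var queue = new CallbackQueue<int>();
        queue.Enqueue(1);
        bool enqueued = true;

        queue.DoSomethingToHead(first =>
        {
            // The delegate waits for work done on another thread,
            // and that work needs the lock this thread is holding.
            var worker = new Thread(() =>
                enqueued = queue.TryEnqueue(2, TimeSpan.FromMilliseconds(100)));
            worker.Start();
            worker.Join();
        });

        return !enqueued;
    }
}
```

The delegate always sees an up-to-date head, which is what was asked for. The price is that arbitrary caller code now runs inside the queue's critical section.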

Yuck.

By now you take my point, I’m sure. As I pointed out earlier, it is unhelpful to say that a building or a hunk of code is “secure” without somehow communicating which threats the utilized security mechanisms are and are not proof against. Similarly, it is unhelpful to say that code is “thread safe” without somehow communicating what undesirable behaviors the utilized thread safety mechanisms do and do not prevent. “Thread safety” is nothing more nor less than a code contract, like any other code contract. You agree to talk to an object in a particular manner, and it agrees to give you correct results if you do so; working out exactly what that manner is, and what the correct responses are, is a potentially tough problem.

************

(*) Yes, I’m aware that if I think something on Wikipedia is wrong, I can change it. There are two reasons why I should not do so. First, as I’ve already stated I’m not an expert in this area; I leave it to the experts to sort out amongst themselves what the right thing to say here is. And second, my point is not that the Wikipedia page is wrong, but rather that it illustrates that the term itself is vague by nature.

 

Comments (30)

  1. Steve Bjorg says:

    It can get even more complicated.  In our implementation of the work-stealing queue (based on the excellent work of Danny Hendler, et al), we have thread-safety in one method only, namely TrySteal().  The other methods, Push() and TryPop() are by-design always called from the same thread.  Calling them from different threads would be a violation of the contract.  However, TrySteal() can be called by a number of different threads and not corrupt the state of the queue.  What’s even more amazing is that the implementation achieves this without any locks or loops!

    For those curious, the C# implementation of LockFreeWorkStealingDeque is available under Apache 2 in MindTouch Dream.

  2. Vish says:

    Wouldn’t putting access to data structures that mutate in a critical section (lock) be better? Thread safe (simultaneous access by many threads) should only be for code that doesn’t handle shared data structures…

  3. Aaargh! says:

    In this particular example, the queue is perfectly thread safe, the code using the queue isn’t. The queue is only responsible for itself, the fact that you’re doing multi-threaded operations on the queue implies that any information you get from the queue can be out of date a few moments later.

    Your example code does not take into account the fact that the queue can be modified at any time by other threads, and your code is thus not thread safe. If you need multiple calls to the queue to be consistent, you have an unguarded critical path in _your code_ not in the queue’s, and it’s your responsibility to guard it. In this case it could be solved by synchronizing on the queue.
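One way to read "synchronizing on the queue" is sketched below. It assumes every thread that touches the queue agrees to take the same lock; the names CallerSideLockingDemo, CountFailedPeeks and gate are hypothetical. The queue here is the ordinary System.Collections.Generic.Queue<T>, which makes no thread-safety promises of its own.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class CallerSideLockingDemo
{
    // Runs a check-then-peek loop against a queue that another thread
    // keeps filling and draining, and counts how often Peek fails.
    public static int CountFailedPeeks(int iterations)
    {
        var queue = new Queue<int>();   // not thread safe by itself
        var gate = new object();        // the lock every thread agrees to use
        int failures = 0;

        var churn = new Thread(() =>
        {
            for (int i = 0; i < iterations; i++)
            {
                lock (gate) { queue.Enqueue(i); }
                lock (gate) { if (queue.Count > 0) queue.Dequeue(); }
            }
        });
        churn.Start();

        for (int i = 0; i < iterations; i++)
        {
            lock (gate)
            {
                // The check and the peek form one critical section,
                // so the check cannot go stale before the peek runs.
                if (queue.Count > 0)
                {
                    try { queue.Peek(); }
                    catch (InvalidOperationException) { failures++; }
                }
            }
        }

        churn.Join();
        return failures;
    }
}
```

The guarantee holds only as long as every caller follows the locking convention; nothing in the queue itself enforces it.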

  4. Aaron Friel says:

    When you reference Wikipedia, especially by link as you have done, it’s especially important to use the "Permanent Link" button in the bottom left. That allows you to permanently link to the same version of the article that you saw.

    Otherwise what you’re linking to could be vandalized (popular articles frequently are, though they are corrected that much more quickly,) and your users could see something other than what you intended, or problems that you point out could be corrected, and your point is moot. 🙂

  5. Frank Booth says:

    > As with photogenicity, this is obvious question-begging.

    How is this question-begging?

    Question begging does not mean “to avoid providing the answer to a question” it means that a new question needs to be answered as a result of the answer to the original question.

    That is all.

    No, it doesn’t mean that. Let me give you another example of question begging.

    Suppose I asked “why is diamond hard but butter is soft?” and you answered “diamond and butter are both made out of atoms; the atoms of diamonds are hard and the atoms of butter are soft.” You would have begged the question; your answer to my question “why are some things hard and some things soft” is “because some things are made out of stuff that is hard and some things are made out of stuff that is soft” — that is, you’ve avoided answering the question by providing an “explanation” that itself cannot be understood without answering the original question — namely, why is some stuff hard and some stuff soft? This pseudo-explanation has no predictive power; it doesn’t tell us anything new, it just circles back on itself. 

    A non-question-begging answer would be “diamond and butter are both made of atoms; the atoms of diamond are all identical and arranged in a stable, rigid lattice where every point in the lattice is reinforced by a strong bond to four other points. The atoms of butter are a disorganized collection of many different atoms that hold weakly to each other. It takes only a small force to disrupt the loose arrangement of butter atoms but a very large force to disrupt the strong arrangement of diamond atoms.”

    Now, this explanation does *raise* more questions. It raises questions like “why are some lattices strong and some weak?” and “why are some objects composed of many different kinds of atoms, and some composed of just one atom?” Question-begging is not the act of raising more questions. Every explanation raises more questions. This particular explanation is testable, and has predictive power; we can investigate the hardness or softness of other substances, and make predictions about what sorts of atomic structures they will have — or, vice versa, we can look at an atomic structure and try to figure out from it how hard the substance will be.

    My point here is that “because she’s photogenic” is question-begging. Why does she look so good? Because she’s photogenic. Why is she photogenic? Because she looks so good. We have learned nothing about photogenicity (or Ms. Pfeiffer).

    Similarly, if you ask “why is this code thread-safe?” and the answer is “because it can be correctly called on multiple threads”, you’ve begged the question. Why is it thread-safe? Because it’s correct. Why is it correct? Because it’s thread-safe. Again, we have learned nothing about the nature of thread safety.

    — Eric

  6. Aaargh! says:

    > Question begging does not mean "to avoid providing the answer to a question"

    Correct, it does not mean that.

    > it means that a new question needs to be answered as a result of the answer to the original question.

    It doesn’t mean that either.

    The way it’s used in the article is correct, see for more information: http://en.wikipedia.org/wiki/Begging_the_question

  7. Dan Diplo says:

    This raises the bigger question – is Michelle Pfeiffer  thread safe?

  8. McAravey says:

    >In this particular example, the queue is perfectly thread safe, the code using the queue isn’t.

    I am inclined to agree. I think that trying to enforce some sort of "thread-safe" contract on the caller is too much work. I was just thinking of the LINQ Select where the selector takes anything, and gives you anything. I could make my selector throw an exception, but I don’t think that is the fault of the Select statement.

    I see it as an implicit contract that the caller of Select will not pass in an invalid function (or one with exceptions, etc…), and if the caller does pass in garbage, then it is the responsibility of caller to catch and handle the exception.

    Now to turn this around on the "thread-safe" issue. I don’t believe stateless method calls like Select should worry about anything but what they are designed to do. Even if the selector has state wrapped in a closure, I don’t think that is something Select has to worry about.

    The wrench in the works is shared state like a queue, and I don’t have an answer for that one.

  9. Nym says:

    Hey throw some shared memory in there for yet more fun.

  10. John Carter says:

    I’d correct the Wikipedia, except the wikipedia doesn’t allow you to know anything, it merely allows you to paraphrase other references. The definition in the wikipedia as it was at the time of that article is patently wrong.

    “A piece of code is thread-safe if it functions correctly during simultaneous execution by multiple threads.”

    That’s called “thread lucky”, not “thread safe”.

    A piece of code is thread safe if it

    a) _always_ functions according to spec irrespective

    b) of the number of threads running,

    c) or the number of threads invoking that code simultaneously,

    d) or the order in which the code is invoked by those threads,

    e) or the timing and sequence of external events that may trigger a thread context switch.

    That’s what Thread Safety is… but it is incredibly hard to _prove_ anything is thread safe!

    Note this is subtly different from multi-processor safe or scheduler safe, which are somewhat harsher requirements. (However I consider it Bad Form to code to the assumption a particular scheduler or number of CPU’s!)

    Alas, the Queue class _is_ thread safe. End of Story.

    The initial code examples you give using the Queue class aren’t. Using Thread unsafe code without appropriately serializing access to it makes your code unsafe. However, using Thread Safe code does _not_ make your code Thread safe. So going back to the “meaning of Thread Safety” and what to do about the fact proving Thread Safety is so hard…

    Instead of trying to prove code is thread safe… we can either design our code to be so simple, that it is obviously thread safe.

    Or we, as is usual for this industry, design it so complex that there are no obvious thread safety issues…. and then rely on inspection to round up the usual suspects…

    The most usual causes of thread unsafety come down to one of…

    * Thread races. This is the problem the blogger had. A thread race on access to the shared queue resource.

    * Deadlocks.

    * Caching invalidation (Usually the failure to use the volatile in the rather rare cases where needed.)

    * Priority Inversion.

    * Starvation.

    While that may be a shortish list… the ways in which they may occur are legion and incredibly subtle.

    Why? Thread safety is a global property of a system, not a local property. (Although it is possible to code in such a manner that it _is_ a local property. In fact I heartily recommend _always_ coding that way for many other Good Reasons!)

  11. Rob McCready says:

    I disagree. The fact that thread-safe code can be invoked by broken code does not mean that the term “thread-safe” is meaningless.

    I didn’t say it was meaningless. I said it was vague. — Eric

    The term “thread-safe” is perfectly well-defined. It means that the code in question will operate correctly regardless of the number and relative timing of threads calling into it.

    And now you are re-stating my point; I’m glad we agree. “Thread safe” means “correct”. What does the entirely vague “correct” mean? That depends on what the code is supposed to do in a particular situation. We haven’t said anything new. — Eric

    It specifically does /not/ imply that any external code that invokes the thread-safe code is similarly guaranteed to operate correctly. Obviously it cannot provide any such guarantee, as the author has no control over the external code . The examples given in the post are thus completely beside the point. They in no way show that the term “thread-safe” is meaningless. They only show that it is possible to write broken code that uses thread-safe code.

    Obviously, the author of a piece of code is responsible for documenting how the code works: if Queue.Peek() will throw on an empty queue, this must be documented. And, of course, a thread-safe queue is more useful if it provides an atomic Peek() such as was suggested. But the bottom line is that it is not the author’s responsibility if a user of the code is not sufficiently thread-savvy to know that calling IsEmpty and Peek sequentially introduces a hazard. That’s Threading 101.

    Apparently we are in violent agreement. My whole point was that simply saying that an object is “thread-safe” tells you almost nothing about how to correctly use that object. Rather, the object must be extensively documented so that its exact contract can be stated. — Eric

  12. Denis says:

    Well, I cannot, obviously, speak for all people who ask whether or not a piece of code is “thread-safe”, but what I mean by this question is, basically, “can I be sure that this code will have no surprises for me when I execute it from within multiple threads, or, more specifically, does each thread that runs this code have to worry about the other threads, or can it ‘believe it’s the one and only’, so to speak?”

    Two points. First, consider my example of a queue that you can ask if it’s empty, but then immediately peeking it throws an exception. It certainly wouldn’t do that if there were only one thread. So for you, is this behaviour surprising? And is the object therefore not “thread-safe”?

    Second, I note that “Denis finds the object’s behaviour unsurprising” is not a particularly useful definition of “thread safe”. (If you’d like, I can simply forward all the questions I get about thread safety to you, so that you can tell people whether you’re surprised or not.) — Eric

    And, personally, I won’t even insist on the code being “correct” (having no bugs or gotchas); I would just like to know that those bugs and gotchas are the same (as are the ways to work around them), no matter how many threads run the thing.

    In short, “thread-safe” means, to me, “don’t have to worry about multi-threading: don’t need any special care, any additional code to manage it, and so on”.

    Of course, this is a highly subjective definition: I would, for example, define “financial security” as “don’t have to worry about money: never so little as to think where to find more, and never so much as to think what to do with it; just enough to never think about it”. Not many people will agree with that. 🙂

    But, back to the subject, my definition of thread-safety would work well for any server-side web application: no matter how many people hammer the site, the code behaves the same (well, maybe, a little slower at the end), full stop. Not such a bad thing, is it?

  13. Matthew Hannigan says:

    RogueWave/Sun define the terms mt-hot vs mt-safe:

    http://www.roguewave.com/kb/index.php?View=entry&EntryID=1169

  14. JonB says:

    "Thread safe" is a bit of a C throw-back, where you can determine if a particular function is thread safe if it isn’t dependent on any stored state.  However an object could only be deemed thread safe in the same manner if it didn’t maintain any state, which wouldn’t be much of an object.

  15. Alun Harford says:

    "Thread-safe" is not at all vague.

    It’s a statement that if each thread’s rely condition is satisfied then that thread satisfies its guarantee condition and each thread’s guarantee condition implies the others’ rely conditions.

    It’s only vague when we choose to rely on intuition rather than mathematics when we describe a system.

  16. Joe says:

    If something is thread safe, say an API, it can be used by several threads at the same time without the caller having to think about it. This is as old as computers.  It is not something you can wave your hand at and say is vague and then ignore. If something is not thread safe, you should only use it from a single thread, or it is likely to crash, hang, or just silently go wrong. If you want to develop in a thread safe manner, and you do, you must think defensively, and at the very least do a global lock so in your code, things are only happening single threaded. Later on, carefully break stuff up into separate locks, then maybe use read/write locks. Yes it’s harder, and there are many traps, but there are many off-the-shelf tools and techniques, plus most of computer history, to help you. That or develop in a language that does it for you, but that will always be more limited than a manual environment. Just like with cars and gears, all racing cars are manual and you can’t get the same miles per gallon without manual. You’re fooling yourself if you think automatic is as good.

  17. Carl Daniel says:

    I have to agree with Eric on this one.  While there are good definitions of "thread safe", there is no universal agreement on which of the many good definitions is "correct".  Correctness, of course, depends on the circumstances.  As a result, the term "thread safe" with no further qualification is vague at best, dangerously misleading at worst.

    Thread safety is somewhat analogous to exception safety.  The C++ community has settled on a multi-tier definition of "exception safe" – I would propose that a similar family of thread safety guarantees would be a useful addition to the dialog.  The mathematical definition that Alun provided above sounds like a good candidate for being "the strong guarantee".  At another extreme, a class that’s documented as providing a single method that can be invoked by less than 3 threads under specific circumstances would be an example of a very weak guarantee.

    The problem with the very strong guarantee that Alun provided – just like the strong exception safety guarantee in C++ – is that most cases don’t require a guarantee that strong, and generally speaking, providing such a strong guarantee is more difficult and less efficient (naturally, there are always exceptions).

  18. Adam V says:

    I’m hoping (fingers crossed!) that the inclusion of the "Michelle Pfeiffer" tag means that Ms. Pfeiffer will make additional appearances in future examples. I for one welcome our new photogenic example overlords.

  19. Rob McCready says:

    Hi Eric,

    “My whole point was that simply saying that an object is “thread-safe” tells you almost nothing about how to correctly use that object.”

    No. It tells you something very, very important about how to correctly use that object: It tells you that you do not have to restrict the number or relative timing of threads that call into it for the /rest/ of the documentation to hold. This is a non-trivial piece of knowledge, and the meaning is clearly defined. I can document up my code as much as I like, but if I don’t include the information that the code is thread-safe then any user must assume that it will all go to hell if they allow multiple threads to access with arbitrary timings.

    “Thread safe” means “correct”.

    No, it doesn’t. It means that whatever the code has been described to do, it will continue to do it properly regardless of the number and relative timing of threads that call it. Clearly, what “correct” means depends on what the code is intended to do, and clearly this must be defined. But that point is orthogonal to thread-safety; /all/ code must be documented so that the user knows what it is supposed to do. The fact that code must be documented before it is useable does not diminish the usefulness of the term “thread-safe” as part of that documentation.

    “Rather, the object must be extensively documented so that its exact contract can be stated.”

    /Any/ code must be extensively documented so that its exact contract can be stated. “Thread-safe” is a useful part of that documentation.

    The problem here is that you are proposing a broader meaning of “thread-safe” than has ever existed, and then pointing out that your broader proposed meaning is too broad to be meaningful. Well, yes; your proposed meaning /is/ way too broad. But the meaning of “thread-safe” has always been much more narrow and specific. Usefully so. “Thread-safe” means the code will work as documented regardless of the number and relative timing of the calling threads.

    “Thread-safe” does /not/ mean that the code will help solve whatever threading challenges your application has. It does not mean, and has never meant, that it will meet any and all needs for consistent presentation of data to multiple threads. It does not mean that it will save you from having to figure out how to solve those problems yourself. How could it possibly mean that?

    Look, you and I agree on what “thread safe” typically means in typical documentation. You don’t need to convince me of that. (Whether what it means is useful is another question, which I’ll come to in a moment.)

    My point, which you seem to both be doing an admirable job of forcefully supporting, and yet at the same time completely missing, is that when I am asked by a customer “is this code thread safe?” nine times out of ten, they have a definition of “thread-safe” in their head that you would describe as “not at all the real meaning of thread-safe”. Whether a precise mathematical definition exists or not is irrelevant; the way the term is used in practice by customers who ask me questions is completely vague. Which is why I call out that when I’m asked whether code is “thread safe” I have to immediately push back in order to determine exactly what the customer believes “thread safe” means. Because they almost certainly mean what Denis means above, not what you and I mean, and certainly not what our mathematically-inclined friend above means.- Eric

     

     

    Every single one of your examples involves problems in, or requirements of, the calling code, and how the Queue does not solve or meet these. “Thread-safe” has never been about any of these sorts of requirements. It has only ever been about /one/ thing: whether or not the code works as documented when called by multiple threads.

    You say that code should say what specific security threats it is armoured against rather than just saying it is “secure”. Well, saying code is “thread-safe” is doing /exactly/ what you want; it is stating a particular /and very specific/ capability of the code. It is saying that whatever the code is documented to do, it will do so correctly regardless of the number and relative timing of threads calling into it.

    In summary: “thread-safe” does /not/ mean “correct”. It does /not/ mean “useful in your situation”. It does /not/ mean “provides the consistent view of shared data that you need”. Yes indeed, all of those things must be described by the rest of the documentation. Then the term “thread-safe” lets you know that all of that rest of the documentation will continue to hold in the presence of multiple threads.

    Now, all that said, I think you’ve overstated the case somewhat. Let’s take my naive threadsafe queue as an example. Suppose the documentation for IsEmpty() says “This method of the threadsafe queue returns true if the queue is empty and false otherwise”. Is this documentation accurate? Does it actually do that in a multithreaded situation? No, it does not. It returns true if the queue has ever been empty in the past, and false if the queue has ever been non-empty in the past. If both conditions are met, which one you get depends on timings of other operations on other threads. That’s very different! If you want to know what the queue is NOW, you have to put a lock around every access to the queue.

    Now, the documentation should probably say that (and the method should probably be called “MightOrMightNotHaveBeenEmptyAtSomePointInThePast()”). What conclusion could we reach other than “if you use IsEmpty then every access to this allegedly-threadsafe queue needs to be synchronized, exactly as though it were not threadsafe in the first place”?

    Does a queue that needs to be globally synchronized in order for its most basic methods to be used predictably seem “threadsafe” to you? Maybe to you it does. To most of the customers who ask me questions like this? Certainly not.

    Another example. Consider my recent post on the thread safety of event delegate invocations. A lot of people ask “is this event invocation code threadsafe?” Are they asking “does this code never dereference null?” or are they asking “does this code never invoke a previously-removed event handler?” Both seem like perfectly reasonable interpretations of “is this code threadsafe?” but the answer to one of those questions is “yes” and the answer to the other is “no”, so it is rather important to know which one they’re asking about.
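The two readings can be told apart with a small sketch. It assumes the common "copy the delegate to a local, then null-check the local" invocation pattern; Publisher, Changed and RaiseChanged are hypothetical names, and the betweenCopyAndInvoke callback exists only to replay, on one thread, what another thread might do in the gap between the copy and the call.

```csharp
using System;

// Hypothetical event source using the copy-then-check invocation pattern.
public sealed class Publisher
{
    public event EventHandler Changed;

    // betweenCopyAndInvoke stands in for another thread that runs
    // after the snapshot is taken but before the handlers are called.
    public void RaiseChanged(Action betweenCopyAndInvoke)
    {
        EventHandler handler = Changed;   // snapshot of the current handlers
        betweenCopyAndInvoke();
        if (handler != null) handler(this, EventArgs.Empty);
    }
}

public static class EventRaceDemo
{
    // Returns how many times the handler ran even though it was removed
    // before the invocation happened.
    public static int CallsAfterRemoval()
    {
        var publisher = new Publisher();
        int calls = 0;
        EventHandler onChanged = (sender, e) => calls++;
        publisher.Changed += onChanged;

        // The handler is removed after the snapshot, so the snapshot still calls it.
        publisher.RaiseChanged(() => publisher.Changed -= onChanged);
        return calls;
    }
}
```

Under the first reading the pattern passes: the local copy is never null at the point it is invoked. Under the second it fails: a handler removed after the copy still runs once.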

    Most of the time when people say that their code is “threadsafe” they mean that it happens to have the almost completely useless property that it maintains internal consistency robustly in the face of arbitrary thread timings. Though I suppose that’s a nice property to have, does it really matter whether an object has that property if in order to use any of its basic methods, you’re going to have to synchronize access to it exactly as though it did not armor its internals against race conditions?

    Basically, what you’re saying is that a non-threadsafe object is an object that has completely undefined behaviour when called from multiple threads; if you want defined behaviour, you have to synchronize access to it. A threadsafe object, by contrast, has defined behaviour, and that behaviour is defined to be inconsistent and timing-dependent. If you want consistent behaviour then again, you have to synchronize access, exactly the same as if it had been non-threadsafe in the first place. Frankly, the difference between “undefined” and merely “inconsistent” seems pretty weak, hardly worth the effort of making the object threadsafe in the first place if every consumer of the object is going to need to synchronize access to it anyway. — Eric

  20. eff Five says:

    Eric:

    If I said that "this code is thread un-safe", would it be less vague than "is this code thread safe"?

  21. Larry says:

    I don’t think the assertion of thread safety tells us nothing, or even almost nothing. I’ve always interpreted "thread-safe" to assert the *minimal* contract of thread-safety: instance separation (no mutable static state) and method/property atomicity; operations on a shared instance in multiple threads will behave the same way and leave the instance in the same state at sequence points as the same operations called in an unpredictable, arbitrary order on one thread. In other words, if one thread calls obj.A() and another thread calls obj.B(), then those calls will be equivalent to calling obj.A(); obj.B(); or obj.B(); obj.A(); in one thread. I never have to lock a thread-safe object to do a single call; I have to lock it (or do something special) only when I want to be sure of the *order* of calls.

    As you note, this is not a very strong contract, and I’m not at all surprised that some people might believe "thread safety" asserts a stronger contract. But minimal thread-safety is still a non-trivial contract.
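
    A minimal sketch of that contract, assuming .NET's ConcurrentQueue<T> as the thread-safe object; the class and method names (MinimalContractDemo, EnqueueFromTwoThreads) are made up for illustration. Each Enqueue call is atomic on its own, so no call is lost, but nothing here says in which order the two threads' items end up.

    ```csharp
    using System;
    using System.Collections.Concurrent;
    using System.Threading;

    // Hypothetical demo class; only ConcurrentQueue<T> and Thread are real APIs.
    public static class MinimalContractDemo
    {
        // Two threads call Enqueue on one shared queue with no external lock.
        public static int EnqueueFromTwoThreads(int perThread)
        {
            var queue = new ConcurrentQueue<int>();

            var a = new Thread(() => { for (int i = 0; i < perThread; i++) queue.Enqueue(i); });
            var b = new Thread(() => { for (int i = 0; i < perThread; i++) queue.Enqueue(-i); });

            a.Start();
            b.Start();
            a.Join();
            b.Join();

            // Every single call took effect exactly once, so the count is
            // predictable even though the interleaving of the items is not.
            return queue.Count;
        }

        public static void Main()
        {
            Console.WriteLine(EnqueueFromTwoThreads(10000));
        }
    }
    ```

    If the caller also cared that all of thread a's items came before thread b's, that ordering would have to be arranged outside the queue, which is the "lock it (or do something special)" case.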

  22. Grant Husbands says:

    I agree with Larry, and I’ll add that a lack of thread-safety doesn’t just affect ordering. Code that isn’t thread-safe being called from multiple threads easily results in state corruption, seemingly arbitrary exceptions and memory corruption (not in all languages). When you call code that’s simply "thread-safe", you can expect behaviour approximately matching that of the contract.

    If there’s a thread-safe queue with TryPop, I know I can use that from multiple threads without breaking anything and that each call will get a uniquely-added entry (or false,null). However, Peek will obviously always be meaningless in such circumstances.

    This does all make me think, though; there may be some good mileage in having thread-safe libraries go out of their way to induce failure in multi-threaded callers, in an appropriate debug mode. Many queue methods, for example, could delay a little, in hope of returning an unexpected null to a caller or returning the same result simultaneously to two callers (where Peek is involved).

    Hopefully, more people will start using multi-threading idioms other than shared-state-with-locks, and much of the problem will be reduced.

  23. Filini says:

    @Dan: "This raises the bigger question – is Michelle Pfeiffer  thread safe? "

    Michelle Pfeiffer can do whatever she wants to my threads 🙂

    (sorry for the OT, I couldn’t resist)

  24. cpun says:

    I would define a method to be thread-safe if at least one of these is true:

    1. it is a pure function

    2. it reads and/or writes mutable shared state, but the class invariants can NOT be violated as a consequence of the number, timing and/or interleaving of threads executing this very same method concurrently.

    Eric, I would say that you are taking on a whole different animal in your example by talking about thread-safety at the level of conducting multiple operations on an object whose state can be modified by different threads. Thread-safe constructs as they are used now do NOT compose (EVER). That is to say in:

    if (!queue.IsEmpty) Console.WriteLine(queue.Peek());

    I might take and release a lock in IsEmpty and do the same in Peek(), but the outcome of composing these into the above statement is NOT thread safe. This is, I would argue, not a knock on the thread-safety of the member functions of Queue at all. You could easily have a TryPeekAndPrint() call on Queue that does the above operation in a thread-safe manner by composing all the other operations.

    This is pretty much where Software Transactional Memory comes in since you want to make sure that the outcome of doing IsEmpty and Peek is not influenced by other threads or transient behavior.
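
    To make the composition problem concrete, here is a rough sketch under stated assumptions: LockedQueue<T> is a hypothetical wrapper written for this illustration (it is not a framework type), built on the ordinary Queue<T> with one private lock. IsEmpty and Peek are each atomic, yet only TryPeek keeps the check and the read inside a single critical section.

    ```csharp
    using System;
    using System.Collections.Generic;

    // Hypothetical wrapper for illustration; not a framework type.
    public sealed class LockedQueue<T>
    {
        private readonly Queue<T> items = new Queue<T>();
        private readonly object gate = new object();

        public void Enqueue(T value) { lock (gate) { items.Enqueue(value); } }

        // Each of these two members is atomic by itself...
        public bool IsEmpty { get { lock (gate) { return items.Count == 0; } } }
        public T Peek() { lock (gate) { return items.Peek(); } }

        // ...but only here do the check and the read share one lock, so no
        // other thread can empty the queue between them.
        public bool TryPeek(out T value)
        {
            lock (gate)
            {
                if (items.Count == 0) { value = default(T); return false; }
                value = items.Peek();
                return true;
            }
        }

        public bool TryDequeue(out T value)
        {
            lock (gate)
            {
                if (items.Count == 0) { value = default(T); return false; }
                value = items.Dequeue();
                return true;
            }
        }
    }

    public static class CompositionDemo
    {
        public static void Main()
        {
            var queue = new LockedQueue<string>();
            queue.Enqueue("hello");

            // Racy when other threads can dequeue: the lock is released
            // between IsEmpty and Peek, so Peek may throw on an empty queue.
            //   if (!queue.IsEmpty) Console.WriteLine(queue.Peek());

            // Composed form: one call, one critical section.
            if (queue.TryPeek(out string head)) Console.WriteLine(head);
        }
    }
    ```

    Even with TryPeek, the value printed may no longer be at the head of the queue by the time it is printed; the composed call only guarantees that the check and the read agreed with each other.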

  25. jerome says:

    There’s exactly the same remark in Java Concurrency in Practice, section 2.1…

  26. Rob McCready says:

    Hi Eric,

    "My point, which you seem to both be doing an admirable job of forcefully supporting, and yet at the same time completely missing, is that when I am asked by a customer "is this code thread safe?" nine times out of ten, they have a definition of "thread-safe" in their head that you would describe as "not at all the real meaning of thread-safe"."

    Are you really going to propose that the meaningfulness or usefulness of a technical term depends on how well customers understand it? Really? How many meaningful or useful technical terms would we have left if that were the standard?

    Look: the fact that a customer (or you, or any of the hypothetical other people you mention) might misunderstand what "thread-safe" means does not mean it is meaningless or useless. Multi-threading is /hard/. It is hard to understand the issues, and it is hard to get right even when using thread-safe components. The fact that one has to have a good understanding of threading to write correct multi-threaded code does /not/ mean that "thread-safe" is meaningless.

    "Suppose the documentation for IsEmpty() says that it "This method of the threadsafe queue returns true if the queue is empty and false otherwise". Is this documentation accurate? Does it actually do that in a multithreaded situation? No, it does not. It returns true if the queue has ever been empty in the past, and false if the queue has ever been non-empty in the past. If both conditions are met, which one you get depends on timings of other operations on other threads. That’s very different! If you want to know what the queue is NOW, you have to put a lock around every access to the queue."

    Anyone who understood multithreading would know this immediately, without having to be told. Someone who didn’t definitely might not. This does /not/ show that the term "thread-safe" is meaningless. It shows that people who do not understand threading issues will likely not understand what "thread-safe" really means. In other shocking news, people who do not understand quantum mechanics will not likely understand what "superposition" really means.

    To be blunt (but not unkind), I think your surprise/outrage at how useless IsEmpty would be in a multithreaded scenario says much more about your experience with multithreading than anything else. It is, as I said before, Threading 101. Most people haven’t taken Threading 101. Their misunderstanding of threading terms does not say much about the usefulness or meaningfulness of those terms.

    I’ve written many, many thread-safe queues. None of them have an IsEmpty that is intended to be used when anything else could be accessing the queue at the same time. Neither do they have a Peek(). They have a "bool TryDequeue(out T value)", and even with /that/ one has to be very careful to make sure that whatever thread is servicing the queue won’t end up stranding some entry that was put in right after the last time they checked and got back a "false". So then one adds an Event or a "bool WaitPop(TimeSpan wait, out T value)", or both, to /help/ the calling code do things correctly. Not /guarantee/. /Help/. There is /nothing/ I can do to make sure someone understands for sure how to use my queue correctly. And even if my queue only had useless methods like IsEmpty and Peek, I could /still/ call it thread-safe as long as it wouldn’t become internally inconsistent when two threads called into it at the same time. Thread-safe, yes. Useful, no. Thread-safe does /not/ mean "useful" or "correct". It means "thread-safe".

    "Most of the time when people say that their code is "threadsafe" they mean that it happens to have the almost completely useless property that it maintains internal consistency robustly in the face of arbitrary thread timings."

    Eric, the fact that you consider this property to be "almost completely useless" indicates to me that you can’t possibly have had to write much multithreaded code. Good thread-safe components are gold. Try to write a good Queue that helps one thread pass information to another, and you’ll see.

    "A threadsafe object, by contrast, has defined behaviour, and that behaviour is defined to be inconsistent and timing-dependent. If you want consistent behaviour then again, you have to synchronize access, exactly the same as if it had been non-threadsafe in the first place."

    No, no, no. Please take your IsEmpty example and erase it from your mind. Nobody with any competence writes a thread-safe queue with an IsEmpty that is intended to be used when multiple threads could be running. Nobody with any competence at multi-threading would ever attempt to use IsEmpty when multiple threads are running. All you are demonstrating is that it is possible to be incompetent at multi-threading, and thread-safe components can’t save you from that. This is not a surprise, and it does not advance your argument at all.

    Take my Queue. If one or more threads want to push information to one or more other threads, all the supplier threads have to do is call Queue.Enqueue(T value), and all the listener threads need to do is call WaitDequeue() with some timeout (so they can periodically check other things, including whether they should exit). Done! Information safely gets from suppliers to listeners, with no other synchronization necessary.

    Now, imagine the queue wasn’t thread-safe! The whole thing becomes useless. You can’t just wrap all access in an external lock; the suppliers wouldn’t be able to call Enqueue when a listener was blocked in WaitDequeue. Thread-safety makes the queue useful.
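
    The hand-off described above can be sketched roughly as follows. The commenter's own Queue class is not shown, so this assumes .NET's BlockingCollection<T> as a stand-in: Add plays the role of Enqueue, and TryTake with a timeout plays the role of WaitDequeue. HandOffDemo and ProduceAndConsume are names invented for the example.

    ```csharp
    using System;
    using System.Collections.Concurrent;
    using System.Threading;

    // Hypothetical demo; BlockingCollection<T> stands in for the commenter's Queue.
    public static class HandOffDemo
    {
        public static int ProduceAndConsume(int itemCount)
        {
            using (var queue = new BlockingCollection<int>())
            {
                int received = 0;

                var listener = new Thread(() =>
                {
                    // Wait with a timeout so the loop periodically gets a
                    // chance to check whether it should exit.
                    while (!queue.IsCompleted)
                    {
                        if (queue.TryTake(out int item, TimeSpan.FromMilliseconds(50)))
                        {
                            received++;
                        }
                    }
                });
                listener.Start();

                // The supplier needs no lock of its own, and it can add items
                // while the listener is blocked waiting inside TryTake.
                for (int i = 0; i < itemCount; i++)
                {
                    queue.Add(i);
                }
                queue.CompleteAdding();

                listener.Join();
                return received;
            }
        }

        public static void Main()
        {
            Console.WriteLine(ProduceAndConsume(1000));
        }
    }
    ```

    Wrapping a non-thread-safe queue in one external lock would not allow this shape: the listener would hold the lock for as long as it waited, and the supplier could never get in to add the item being waited for.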

    Eric, I will be blunt once again (and once again, not unkindly): I think you might want to consider whether your admitted lack of experience with multi-threading is the real issue here.

    It is completely unsurprising that IsEmpty is useless in a multi-threaded scenario. It is completely unsurprising that any Queue that fails to expose an atomic check-and-dequeue will be useless regardless of thread-safety. It is completely unsurprising that even useful thread-safe components do not absolve the user of worrying about their own code, and issues like deadlocks. Multi-threading is hard, and the fact that thread-safe components fail to make it as easy as single-threading does not mean that thread-safe components are useless, or that the term "thread-safe" is badly defined.

    My final word on this: do some serious multi-threaded coding, write some useful thread-safe components, and then we’ll see how "almost completely useless" you think the term and the components are.

  27. Otatiaro says:

    Hi,

    Imagine I have some code, for which I give the specifications: "This is a simple addition of two integers, but if more than one thread accesses it at a time, it will completely explode and burn your computer".

    Then I run it, create 2 threads and execute my code … it explodes and burns my computer.

    This is the "correct" behavior (as stated in the specifications) and then my code is thread-safe!

    That’s why coders should never write specs 😉

    Bye!

  28. 0x60de says:

    I think nobody should ever attest to an "object" being thread-safe. You may have stateful behaviour, or you may have stateless method calls (each method takes everything it operates on as parameters and does not maintain state).

    The most we could attest to is that an "operation" is thread-safe. Once you follow this definition, you may alternatively attest that "each" operation in the Queue class is thread-safe.

  29. Lord Dust says:

    ""Thread-safe" has never been about any of these sorts of requirements. It has only ever been about /one/ thing: whether or not the code works as documented when called by multiple threads."

    That’s pretty straightforward. "Does the code work as documented?" Simple.

    "Information safely gets from suppliers to listeners, with no other synchronization necessary.

    Now, imagine the queue wasn’t thread-safe! The whole thing becomes useless."

    I am imagining receiving your component with no documentation. By your definition above, since I have no documentation, the component cannot behave as documented, and is therefore not thread-safe. Strangely, I go ahead and use the component and, being well-designed, it is not useless. It’s a great piece of code, and I find it indispensable, because it is, in fact, thread-safe, regardless of any documentation or lack thereof. Clearly, the definition of thread-safety revolving around the documentation is not helpful.

    Also, I believe the small aside concerning the exact definition of "begging the question" is not merely coincidental. It clearly illustrates that, however well documented the original use of some term may be, that definition is vague and useless when faced with overwhelming popular opinion. In similar fashion, because there are likely to be far more consumers of your code than creators of it(!), you might find that catering to the popular definition is of more use and promotes more clear communication. Certainly, not caring what definition of terms is popular amongst the majority of a product’s customer base will neither promote that product, nor help people to use it correctly if use of the term is critical to the use of the product.

    To be blunt (but not unkind), I think your surprise/outrage at how little people understand the term (particularly in the face of your less than stellar definition) says much more about your experience with large groups of code consumers than anything else. It’s Customer Relations 101. Most people haven’t taken Customer Relations 101 (particularly customer service agents!). Their misunderstanding of threading terms says much about the usefulness, if not meaningfulness, of those terms.

  30. Shawn says:

    If you updated Wikipedia you'd need to re-write your opening analogy. You wouldn't want to be referencing stale data.