Why are thread safe collections so hard?


Writing a collection which is mutable, thread safe and usable is an extremely difficult process.  At least that’s what you’ve likely been told all through your schooling.  But then you get out on the web and see a multitude of thread safe lists, maps and queues.  If it’s so hard, why are there so many examples? 

The problem is that there are several levels of thread safe collections.  I find that when most people say “thread safe collection” what they really mean is “a collection that will not be corrupted when modified and accessed from multiple threads”.  Let’s call this “data thread safe” for brevity.  This type of collection is rather easy to build.  Virtually any collection that is not thread safe can be made “data thread safe” by synchronizing access via a simple locking mechanism. 

For example, let’s build a data thread safe List<T>. 

public sealed class ThreadSafeList<T> : IEnumerable<T> {
    private List<T> m_list = new List<T>();
    private object m_lock = new object();

    public void Add(T value) {
        lock (m_lock) {
            m_list.Add(value);
        }
    }
    public void Remove(T value) {
        lock (m_lock) {
            m_list.Remove(value);
        }
    }
    public bool Contains(T value) {
        lock (m_lock) {
            return m_list.Contains(value);
        }
    }
    public int Count { get { lock (m_lock) { return m_list.Count; } } }
    public T this[int index] {
        get { lock (m_lock) { return m_list[index]; } }
        set { lock (m_lock) { m_list[index] = value; } }
    }
    // IEnumerable<T> left out
}

And there you have it.  The lock statement prevents concurrent access from multiple threads, so the actual m_list instance is only ever accessed by a single thread at a time, which is all it’s designed to handle.  This means instances of ThreadSafeList<T> can be used from any thread without fear of corrupting the underlying data. 

But if building a data thread safe list is so easy, why doesn’t Microsoft ship these as standard collections in the framework?

Answer: ThreadSafeList<T> is a virtually unusable class because the design leads you down the path to bad code.

The flaws in this design are not apparent until you examine how lists are commonly used.  For example,  take the following code which attempts to grab the first element out of the list if there is one. 

static int GetFirstOrDefault(ThreadSafeList<int> list) {
    if (list.Count > 0) {
        return list[0];
    }
    return 0;
}

This code is a classic race condition.  Consider the case where there is only one element in the list.  If another thread removes that element in between the if statement and the return statement, the return statement will throw an exception because it’s trying to access an invalid index in the list.  Even though ThreadSafeList<T> is data thread safe, there is nothing guaranteeing the validity of a return value of one call across the next call to the same object.  
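
To make the race concrete, here is one possible interleaving (illustrative only, many orderings are possible):

// Thread A: list.Count          returns 1, so the if branch is taken
// Thread B: list.Remove(value)  another thread empties the list
// Thread A: list[0]             throws ArgumentOutOfRangeException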

I refer to procedures like Count as decision procedures.  They serve only to allow you to make a decision about the underlying object.  Decision procedures on a concurrent object are virtually useless.  As soon as the decision is returned, you must assume the object has changed, and hence you cannot use the result to take any action.

Decision procedures are one of the reasons why immutable collections are so attractive.  They are both data thread safe and allow you to reason about their APIs.  Immutable collections never change.  Hence it’s perfectly OK to have decision procedures on them because the result will never be invalidated.
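
To make the contrast concrete, here is a minimal sketch of an immutable stack (an illustrative type, not something from this post).  Because no operation ever modifies an existing instance, a decision procedure like Count remains valid for as long as you hold the reference, no matter what other threads are doing.

public sealed class ImmutableStack<T> {
    private readonly T m_value;
    private readonly ImmutableStack<T> m_next;
    private readonly int m_count;

    public static readonly ImmutableStack<T> Empty = new ImmutableStack<T>();

    private ImmutableStack() { }
    private ImmutableStack(T value, ImmutableStack<T> next) {
        m_value = value;
        m_next = next;
        m_count = next.m_count + 1;
    }

    public bool IsEmpty { get { return m_count == 0; } }
    public int Count { get { return m_count; } }

    // Push and Pop never modify this instance; they return a new stack.
    public ImmutableStack<T> Push(T value) {
        return new ImmutableStack<T>(value, this);
    }
    public ImmutableStack<T> Pop(out T value) {
        if (IsEmpty) { throw new System.InvalidOperationException("The stack is empty."); }
        value = m_value;
        return m_next;
    }
}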

The fundamental issue with ThreadSafeList<T> is that it is designed to act like a List<T>, yet List<T> is not designed for concurrent access.  When building a mutable concurrent collection, in addition to considering the validity of the data, you must consider how to design the API to deal with the constantly changing nature of the collection. 

When designing a concurrent collection you should follow different guidelines than for a normal collection class.  For example: 

  1. Don’t add decision procedures.  They lead users down the path to bad code.
  2. Methods which query the object can always fail and the API should reflect this.

Based on that, let’s look at a refined design for ThreadSafeList<T>.

public sealed class ThreadSafeList<T> {
    private List<T> m_list = new List<T>();
    private object m_lock = new object();

    public void Add(T value) {
        lock (m_lock) {
            m_list.Add(value);
        }
    }
    public bool TryRemove(T value) {
        lock (m_lock) {
            return m_list.Remove(value);
        }
    }
    public bool TryGet(int index, out T value) {
        lock (m_lock) {
            if (index < m_list.Count) {
                value = m_list[index];
                return true;
            }
            value = default(T);
            return false;
        }
    }
}

Summary of the changes

  • Both the Contains and Count procedures were removed because they were decision procedures
  • Remove was converted to TryRemove to indicate its potential to fail
  • The TryGet method was added, and the Try prefix reflects the fragile nature of the operation.  Sure, it’s possible for users to simply ignore the return value and plow on with an invalid value.  However, the API is not lulling the user into a false sense of security
  • The collection no longer implements IEnumerable<T>.  IEnumerable<T> is only valid when a collection is not changing under the hood.  There is no way to easily make this guarantee with a collection built this way, and hence it was removed. 
  • The indexer was removed as well.  I’m a bit wishy washy on this particular point, as there is nothing in the API which gives a user a false sense of security.  But at the same time, mutable concurrent collections are dangerous and should be treated with a heightened sense of respect, so the indexer was removed.

This version of ThreadSafeList<T> is more resilient to, but not immune to, accidental user failure.  The design tends to lead users down the path to better code. 
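
For illustration, here is how the earlier GetFirstOrDefault example might be written against this refined API (a sketch, not part of the original post).  The existence check and the read now happen in a single call, so there is no window in which another thread can invalidate the decision.

static int GetFirstOrDefault(ThreadSafeList<int> list) {
    int value;
    if (list.TryGet(0, out value)) {
        return value;
    }
    return 0;
}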

But is it really usable?  Would you use it in your application?

Comments (52)

  1. rprimrose says:

    This is a good post, but I would avoid using lock though. It is like using a sledge hammer to tap a picture hook into the wall. It is very inefficient.

    Using ReaderWriterLock is better, but ReaderWriterLockSlim is better again if you can use the 3.5 framework. The main issue with lock is that it is an exclusive lock regardless of whether the request is a read or write action.

    The ReaderWriterLock/ReaderWriterLockSlim implementations allow concurrent reads because they don’t change state while writes get appropriately queued for exclusive write access.

  2. jaredpar says:

    @Rory, The last time I read a bit about ReaderWriterLock (non slim version), it said that ReaderWriterLock is actually only more efficient if the number of reads at least doubles the number of writes.  I think that’s probably a reasonable expectation for a list class but it would still be highly usage dependent.

    But this post was definitely not about efficiency, more about usage :)  

  3. Keith Patrick says:

    I generally find that kind of complexity to be not unusable but definitely not worth the limitations. I prefer to keep my data objects (classes of properties and collections of those classes) "stupid" and pass the responsibility for concurrency up the chain of command (either have the owning instance itself non-threadsafe or let it manage all its concurrency – especially if it has multiple collections)

  4. Great post. The truth is, mutable collections are only as thread-safe as the code that uses them. An enumerator will never be thread-safe, the whole if (exists) { remove } scenario will never be either.

    You mention that Microsoft doesn’t ship these classes. Oh but they do. :) Remember the whole .NET 1.0-era convention of using a Synchronized() method that returned an instance of the collection that had locks around its accessors? Then there’s the big SyncRoot blunder.

    So they did go down this path at first and realized it was a lost cause. Immutable lists, while they are certainly attractive, will not be suitable for some of the more performance sensitive scenarios. But they’d be a welcome addition to the framework.

  5. Larry Lard says:

    That a thread-safe collection can’t possibly be enumerable is a nice little packet of insight.

  7. FredV says:

    Sure that should be:

               if( index < m_list.Count ) {

                   value = m_list[index]; //index not 0

                   return true;

               }

    (in the public bool TryGet method)

  8. masklinn says:

    @Josh Einstein & Larry Lard

    > An enumerator will never be thread-safe, the whole if (exists) { remove } scenario will never be either.

    > That a thread-safe collection can’t possibly be enumerable is a nice little packet of insight.

    It depends on what "enumerator" or "enumerable" means to you.

    If it’s solely the Java/C# external enumeration (an externally consumed sequence object extracted from the collection) then yeah.

    If on the other hand it’s the general concept of traversing a sequence/collection and applying an operation (reduction, filtering, transformation, combination, side-effectful operation) at each step then it should be possible to iterate in a thread-safe manner using internal iterators, where you tell the collection/sequence the kind of iteration you want (or call a function that does it) and provide the block of code to run at each iterative step, as in Smalltalk, Ruby or most functional languages.

  9. jaredpar says:

    @FredV, yep that was a typo, updated now.  Thanks for pointing it out!

  10. Jordan says:

    A collection can be made both thread-safe and enumerable by using copy on write semantics.

    The enumerator that the client is given is then a snapshot of the list and is guaranteed not to change.

  11. jaredpar says:

    @Jordan,

    Yes that certainly does the trick.  Essentially you hook into the IEnumerable<T>.GetEnumerator() call and return a snapshot of the list. Then the data contract portion of IEnumerator<T> can be easily implemented.

    But this does create a semantic difference.  Most IEnumerator<T> implementations are completely invalidated when the collection is modified.  The new version won’t be which is fine (it’s in fact much more reliable).  But how do you convey that semantic change to the user?  

    Users who don’t understand the concrete details of the collection class are likely to assume that like any other collection, changing this thread safe collection, will also invalidate the enumerator.  They are likely to take steps to prevent it from occurring.  Astute users will see the insane complexity in getting that correct.  

    Instead of implementing IEnumerable<T>, I think it’s much better to have a method called GetSnapshot() which returns an IEnumerable<T>.  The name is more indicative of the behavior than just implementing IEnumerable<T> would be.  
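
    As a rough sketch (not code from this discussion), such a GetSnapshot() method on the ThreadSafeList<T> above might simply copy the contents while holding the lock and hand back the copy:

    public IEnumerable<T> GetSnapshot() {
        lock (m_lock) {
            // The snapshot is built under the lock; the copy itself never
            // changes, so it can be enumerated safely from any thread.
            return new List<T>(m_list);
        }
    }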

  12. Jordan says:

    @Jared

    The problem with only providing a GetSnapshot method is that by not implementing the IEnumerable<T> interface you’re excluding the programmer from participating in any functions that work on IEnumerable<T> objects.

  13. jaredpar says:

    @Jordan

    I disagree.  Because GetSnapshot() provides an IEnumerable<T>, participation is just as possible.  All this method prevents is implicit participation.  I feel this is justified because the implicit behavior is likely to be behavior the user is not expecting.

    GetSnapshot() is (at least hopefully) an unambiguous method and makes it clear to the user that the IEnumerable<T> it returns won’t ever change.

  15. Jordan says:

    @Jared

    Sorry, I see my mistake. I was taking GetSnapshot() to return an enumerator, not an enumerable.

    Still, I’d personally rather the collection just be directly enumerable. Like Java’s CopyOnWriteArrayList<E>

  16. Ken says:

    "Immutable lists, while they are certainly attractive, will not be suitable for some of the more performance sensitive scenarios."

    Josh, can you explain?  Clojure uses exclusively immutable lists (and hashtables), and it’s no performance slouch, to say the least.  I’ve seen demonstrations of how easy it is with Clojure to write correct code that distributes easily across hundreds of cores.  ISTM that the value of being able to easily write working multicore code will, except in pathological cases, far outweigh the minimal overhead that immutable data structures might have.

  17. Erik Guerrero says:

    It is not hard to do that. However, threading is a topic that isn’t easily explained. I couldn’t develop using threads until I discovered the book "C# 2008 and 2005 threaded programming", by Gaston Hillar, Packt Publishing. The book makes threading and multicore programming easy, with real life examples.

    http://www.packtpub.com/beginners-guide-for-C-sharp-2008-and-2005-threaded-programming

    Most C# books teach threading using Console.Writeline …… Console.Writeline ……. Console.Writeline

    Yeah! I know how to write lines, but when I have to update the UI ,……. I’m dead man!! :( That’s the problem. The aforementioned book saved my life with multithreading! Highly recommended.

  18. terry says:

    Jared to answer your question, no I don’t think I would want to use the ThreadSafeList class in my application. The lack of Contains and Count methods just make the api of the class cumbersome to use for any real world application. I can’t think of hardly any times I’ve used a collection or list when I didn’t need to check the "Count" or see what it "Contains".

  19. wekempf says:

    Oh boy.  Lots of bad info here.

    First, I don’t care for the term "data thread safe".  Whether or not data access is thread safe is the only thing thread safety discusses.  What’s important is WHAT data is protected.  The MSDN and C# XML documentation comments do a good job with this.  Thread safety is described for two aspects: static and instance.  Static means that EXTERNAL data is ensured to be thread safe.  Instance means that INTERNAL data is ensured to be thread safe.  Both assurances are AT THE METHOD LEVEL.  No assurances can be made between method calls.  With out external synchronization mechanisms, the only way to ensure thread safety between method calls is to ensure NO DATA IS MODIFIED.

    Your final "solution" fails.  Using "Try" style methods don’t solve the race conditions that exist here… all they do is translate an exception into a boolean result.  Client code is still left with a situation in which they have a race for which they have no way to "recover".  Tell me how I’m supposed to use TryGet, if I don’t know the length of the collection?  Methods can be made thread safe through the use of synchronization.  Classes can not.  Classes can be made thread safe (at the level your implying) only by being made immutable.  Your ThreadSafeList isn’t immutable, and thus is not thread safe.

    In general, methods should be made thread safe at the static level.  Making them thread safe at the instance level usually just leads to unnecessary overhead, since generally client code is still going to have to synchronize at a higher level.

    @Rory, reader/writer locks are an optimization in only some circumstances.  They’re a pessimization in other circumstances.  Blindly using them instead of a lock is at best a premature optimization.  Also be aware that the semantics of r/w locks are tricky.  Which should be given priority? Readers or writers?  The answer depends on specific needs.

    @masklinn, you’re correct except in regards to your language biases.  C#/Java and other languages can easily do internal iteration (well, if your language doesn’t support lambda’s, the "easily" is debatable, but C# has supported lambdas in some form since 2.0).

    @Jordan, COW is problematic, and results in a certain performance hit.  This concept was abandoned by most std:string implementations for a reason.  Will it do what you say?  Yes. Is it a good idea?  Probably not.

    @Erik Guerrero, anyone who says writing threaded code is easy, doesn’t understand threading (or they are using a functional language… and event then…).

    @terry, you need to broaden your programming experience.  Immutable collections don’t do away with "Count" and "Contains".  String is such a type in the .NET realm.

  20. jaredpar says:

    @wekempf

    The "data thread safe" term is just that a term.  It’s one I used for brevity in the blog post.  

    In your comment you ask "Tell me how I’m supposed to use TryGet, if I don’t know the length of the collection?".  Part of the point of the blog post is to illustrate just how difficult it is to build both a safe and useful collection.  This is just one example. There is a future post coming up (Monday hopefully) which expands on how to make these collections much more usable.  

    I also would like to counter your question with another one. "if you had the length, what exactly would you do with it?"  I contend knowing the length of a mutable thread safe collection is a useless quantity.  

    [wekempf] "Using "Try" style methods don’t solve the race conditions that exist here… all they do is translate an exception into a boolean result."

    I disagree.  Race conditions cause unpredictable behavior that is dependent on the relative ordering of events.  The original code in my blog is clearly a race condition because it is dependent on the background thread not updating the collection during the sequence of operations

    if ( m_col.Count > 0 ) {

     var first = m_col[0];

     …

    }

    This code could have many outcomes due to the many different operations that could occur in between the 2 accesses of the collection.  The second version of the collection makes no assumptions on the relative order of events because there is only one

    if ( m_col.TryGet(0, out first) ) {

     …

    }

    In this code, the timing of the background thread is irrelevant to the correct execution of this algorithm.  The code will get the first element if it exists or won’t.  Hence it’s not a race condition.  

  21. wekempf says:

    Knowing the length isn’t useless.  It’s extremely useful, in fact.  Relying on the length not changing, when the type is mutable, without synchronization, is a mistake.  Any attempts to remove the need for external synchronization are going to fail, so long as the type is mutable.  The best you can do is mess with the granularity of the problem, with schemes like using internal iterators.

    Tell me how I’m supposed to use TryGet?  It requires an index.  That index has a precondition that it be within a valid range.  If I don’t know what that range is, I can’t reliably call this method.  Whether it returns a boolean result or throws an exception is not relevant to the discussion.  What’s relevant is whether or not I can reliably call TryGet, which I can do only if I know the range of valid values.  If I can’t query the object for this info, then only in contrived scenarios can I do so even in a single threaded scenario.  In a multi-threaded scenario, I never can (well, in those same contrived scenarios I could, provided I use an external method of synchronizing, which just means you’ve complicated the interface with no benefit).

    I’m the author of Boost.Threads.  Trust me, I understand what a race condition is.  The example you gave "if (m_col.TryGet(0, out first))" is most certainly a race condition.  By your (correct) definition of a race condition (race conditions cause unpredictable behavior that is dependent on the relative ordering of events), this is a race condition.  The result of calling TryGet is unpredictable, based on the ordering of events.  It will either return true or false, depending on what other events have occurred prior to the call, which you can’t predict.

    Thread A:

    m_col.Add(foo);

    Thread B:

    if (m_col.TryGet(0, out first))

    {

    }

    If you can tell me whether or not we’ll get into the if block, then you win this argument.  If you can’t (and you can’t), then you lose.  This is a race condition.

  22. jaredpar says:

    @wekempf

    [wekempf] Knowing the length isn’t useless.  It’s extremely useful, in fact.  Relying on the length not changing, when the type is mutable, without synchronization, is a mistake.  

    You didn’t answer the question though.  What exactly would you do with the length?  I argue there is nothing useful that can be done with it except perhaps writing it to a log file somewhere.

    [wekempf] Tell me how I’m supposed to use TryGet?  It requires an index.  That index has a precondition that it be within a valid range.

    I think we’re conflicting on our definition of reliability as it relates to this method. You seem to be arguing that reliability means "I can always access item at index X".  

    For me this is not the case.  I would argue that you can never meet that standard with a mutable collection.  For me reliability means that "the method will be able to both determine if there is a value at the index and return that value back to me".  The method meets that standard.

    [wekempf] If I don’t know what that range is, I can’t reliably call this method.

    Even if you know what the range is, you cannot reliably call this method.  The range can change before the call executes and you’re right back where you started.  The range of a mutable collection is useless.  

    [wekempf] The example you gave "if (m_col.TryGet(0, out first))" is most certainly a race condition.

    I understand what you’re saying here but I disagree that it’s a race condition.  Yes the result will change depending on when the TryGet method is entered.  However the method itself will execute correctly in the face of multiple threads.  The original example won’t.  

  23. wekempf says:

    "For me this is not the case.  I would argue that you can never meet that standard with a mutable collection.  For me reliability means that "the method will be able to both determine if there is a value at the index and return that value back to me".  The method meets that standard."

    If that’s your definition, there’s NO need for this wrapper.  List<T> already meets that definition.  I can as easily change your original code you claim to be bad to be this:

    try

    {

      m_col[0];

    }

    catch (Exception)

    {

    }

    You’re changing syntax without changing the result.  Without external synchronization, you have a race condition.  You CAN NOT CHANGE THIS FACT as long as your class remains mutable.

    "I understand what you’re saying here but I disagree that it’s a race condition."

    Then I suggest you either do a LOT more research on the topic, or stop using threads.  This isn’t an arrogant claim.  What you have is a race condition, and disagreeing with that isn’t going to change reality.  It WILL cause your code to misbehave at some point.  You simply can not reason about the behavior of the code.

    It’s discussions like this that illustrate why the current threading models that "shared state" languages use need to be abandoned.

  24. RussellH says:

    I haven’t yet seen a need for any thread safe collections, at least in ASP.NET programming.  I let IIS or the database handle concurrency.  If I have to start a long running process, I usually do it by submitting a database job and not by starting a new .NET worker thread (or using one from a thread pool).  If I’m bulk updating database, and I know there is very light user activity, I can use parallel query facilities.

    I know there has been a lot of programming effort recently in the area of multi-threading, probably in response to the availability of multi-core processors.  Are people mainly using threading in client GUI applications?  I don’t get it.  It seems theoretical to me.  I guess I’m asking where does this fit into an overall application design?  In a web application, don’t we want to avoid sharing data structures between sessions?

  25. jaredpar says:

    [wekempf] If that’s your definition, there’s NO need for this wrapper.  List<T> already meets that definition.  I can as easily change your original code you claim to be bad to be this: (code snippet)

    No you can’t.  Besides swallowing potentially fatal exceptions, your version of the code doesn’t even guarantee that the latest version of the list is being accessed.  List<T> is constantly swapping out the underlying array implementation as it grows and shrinks.  The lack of a lock removes all memory barriers.  So with your example it’s quite possible this method will return an element from a previous version of the list that the background thread already deleted.  

    [wekempf] this isn’t an arrogant claim.  What you have is a race condition, and disagreeing with that isn’t going to change reality.  It WILL cause your code to misbehave at some point.  

    I ask you to point out where my code will misbehave.  My code takes one path when the item is present and a different path when it’s not.  The lack of an item does not represent a failure, it just represents a different scenario.  Both of which can be valid to the program.  

    It’s not much different than calling TryGetValue on an arbitrary Dictionary<TKey,TValue> implementation.  If you choose to ignore the fact that the dictionary could be empty then yes, your program will fail.  But I wouldn’t call it a race condition.  

    The point of this post is to think about how to design an API that makes it harder (but not impossible) for people to accidentally fail when using mutable thread safe collections.  IMHO, the TryGet() method is an explicit notification to the user that "hey this method might not give you back a value".  As opposed to a Count property which is essentially giving back the user false information.  

    I argue that eliminating "decision procedures" makes it much harder for a user to get into error states like the original case.  

  26. Great article. With multi-core sparking more concurrency discussion it is a good thing to really make people aware of the possible pitfalls.

    The only thing I would add to your second, more appropriate implementation, would be to provide an example of how the try methods could be used. A perfect example would be a SQL Server deadlock. Keep trying to access the resource until a specified amount of time has passed and based off of the architecture decide the appropriate approach if the resource is still locked after that amount of time has passed. This shows that even an application as extremely concurrent and mission critical can and will fail and that there never is a true thread safe based application. We can do our best to try but thread safety will never be perfect.

  27. wekempf says:

    *sigh*

    "Besides swallowing potentially fatal exceptions, you’re version of the code doesn’t even guarantee that the latest version of the list is being accessed."

    1.  We’re not swallowing "potentially fatal exceptions".  The ONLY exception that the indexer can throw is ArgumentOutOfRangeException, which isn’t a fatal exception and conveys the same info your boolean does.  If you don’t trust the documentation on this, change the type in the catch, as I was just being brief.

    2.  Your TryGet doesn’t make any more guarantee about using the latest version of the list.

    Actually, there’s a bit of a mistake I made here.  List<T> doesn’t guarantee that the indexer is thread safe.  My comments here were actually meant to apply to the first version of a "thread safe" list that you claimed to be inferior, but I typed the response hastily.  List<T> can never be used without external synchronization.  Both versions of a "thread safe" list that you have here can only sometimes be used without external synchronization, and those sometimes are the same and are few.  That’s why it’s generally not a good idea to create such collections… not because it’s "hard".

    "I ask you to point out where my code will misbehave.  My code takes one path when the item is present an different path when it’s not.  The lack of an item does not represent a failure, it just represents a different scenario.  Both of which can be valid to the program."

    If you can’t tell me which path it will take, it’s a race condition.  Whether that race condition will be somehow "fatal" to the application is wholly dependent on facts not present in this discussion, making it impossible to point out.  However, only in the most trivial of circumstances would this race not be of concern.

    Your example usage uses an index of 0.  If we didn’t pass an index, but instead used a (Try)GetFirst(), then I might be able to agree with you.  We’d still have the potential for a race condition, but so long as your not ignoring the boolean return, the race is not likely to be fatal.  But we’re not talking about getting the first element.  We’re talking about getting an arbitrary element.  There’s not very many scenarios in which you can reason about the results here, even with the boolean return.  You can’t use this for iteration, because the collection can change between calls.  You can’t generally rely on it for getting the element at some known position (how would it be known, BTW?), because we can’t reason about how the collection may have changed. Two consecutive calls could both succeed, but return different objects, for instance.

    This class isn’t safe to use in most situations, with out external synchronization.

    "It’s not much different than calling TryGetValue on an arbitrary Dictionary<TKey,TValue> implementation."

    TryGetValue doesn’t make any guarantee about correctness with regards to multi-thread access.  I can’t reason about the result when there’s multiple threads (actually, I can’t even reason that the dictionary will not be corrupted in this case, since TryGetValue is only thread safe at the static level).  So, you’re right, your TryGet is the same.  It’s pointless when there’s multiple threads.  I can’t reliably reason about the behavior without external synchronization.

  28. wekempf says:

    @Gregory Kornblum, "We can do our best to try but thread safety will never be perfect."

    This is wrong.  There are models in which applications can be proven to be correct, even though they use multiple threads.  The key is to not share state.

  29. jaredpar says:

    @wekempf

    [wekempf] We’re not swallowing "potentially fatal exceptions".  The ONLY exception that the indexer can throw is ArgumentOutOfRangeException,

    That’s incorrect.  It’s only designed to throw ArgumentOutOfRangeException.  It can also throw StackOverflowException and OutOfMemoryException (among others) which are both *technically* fatal

    [wekempf] Your TryGet doesn’t make any more guarantee about using the latest version of the list.

    My example does.  There is no possibility of a pending delete which is already present on one thread’s memory to be possible while I’m accessing the list.  There may be a pending Remove call, but the actual operation hasn’t happened.  Your example cannot make that guarantee

    [wekempf] If you can’t tell me which path it will take, it’s a race condition.

    The fact that you can’t tell the path doesn’t make it a race condition.  I’ll give you a completely single threaded counter example.

    Dictionary<string,object> map = GetSomeDictionary();

    object value;

    if ( map.TryGetValue("foo", out value) ){  

    }

    You can’t know what path this will take but I’d hardly call it a race condition.  

    [wekempf] If we didn’t pass an index, but instead used a (Try)GetFirst(), then I might be able to agree with you.  We’d still have the potential for a race condition, but so long as your not ignoring the boolean return, the race is not likely to be fatal.  But we’re not talking about getting the first element.  We’re talking about getting an arbitrary element.

    This part is my fault.  I usually use this example based on a Stack/Queue vs. a List.  I switched it to a List for the blog and switched TryPeek -> TryGet.  Mainly to drive home the point that indexes and mutable collections are non-trivial operations.  The intention of the blog though was to show accessing the first element.

  30. wekempf says:

    "My example does.  There is no possibility of a pending delete which is already present on one thread’s memory to be possible while I’m accessing the list.  There may be a pending Remove call, but the actual operation hasn’t happened.  Your example cannot make that guarantee"

    Not true.  My example is just as safe as yours.  (Remember, my example is assuming the first version of the thread safe list.)

    "You can’t know what path this will take but I’d hardly call it a race condition."

    Of course it’s not a race condition… there’s a single thread.  I can reason about the behavior, even if I can’t know the outcome.  That’s a subtle but important distinction.  For instance, I can reason that calling TryGetValue two times in a row will return the same result.  I can reason that if TryGetValue return false, I can safely Add an entry with "foo" as the key.  I can go on and on about things I can reason about here.  In your case, you can ONLY reason that at the time you made the call, either there was an entry or there wasn’t.  Except in the most trivial of situations (Pop() on a stack would be such a situation) is that enough.  As long as TryGet takes an index, it’s not possible to reason about the results in MOST situations, and thus I have to externally synchronize.

    If you meant to show a Stack or Queue, then you should have… but in that case you’d have been better off going with a lock free implementation ;).

  31. jaredpar says:

    @wekempf

    [wekempf] For instance, I can reason that calling TryGetValue two times in a row will return the same result.  I can reason that if TryGetValue return false, I can safely Add an entry with "foo" as the key.  I can go on and on about things I can reason about here.  In your case, you can ONLY reason that at the time you made the call, either there was an entry or there wasn’t

    Exactly.  The point of the blog post was that you must minimize the APIs which allow you to make decisions on the state of the collection.  I refer to them as "decision properties" for lack of a better word/phrase/term.  This is why I removed Count, Contains, etc… because they make it all too easy to fall into the trap of reasoning about an un-reasonable object.  

    You can definitely make the mistake of using TryGet to incorrectly reason about a future TryGet.  Example: "I TryGet’d 1 so TryGet(0) must work".  But I’d argue that having functions where the volatility is called out in the name makes it less likely that people will make these types of reasoning errors.  More importantly, it won’t take them by surprise when it fails.  This is speculative though.  

    [wekempf] If you meant to show a Stack or Queue, then you should have…

    I definitely meant to use a List this time.  I just more commonly use a Stack and should have set a different bar for the sample.  

    [wekempf] but in that case you’d have been better off going with a lock free implementation ;).

    I prefer immutable 😉  Reason about them all you want.  

  32. RussellH says:

    @wekempf

    "It’s discussions like this that illustrate why the current threading models that "shared state" languages use need to be abandoned."

    Are you including languages like C and C++?  But haven’t people been writing very effective multi-threaded code in these languages for many years?  IIS or Oracle for example.

    I wish I could understand the motivation for doing multi-threading in .NET applications.  I don’t know if many developers will need to worry about it, multi-core processors notwithstanding.  Can anyone please give me an example where you would want to use a shared collection like a list in a real world .NET application?  Is it for client code?

  33. wekempf says:

    "You can definitely make the mistake of using TryGet to incorrectly reason about a future TryGet.  Exmaple: "I TryGet’d 1 so TryGet(0) must work".  But I’d argue that having functions where the volatility is called out in the name makes it less likely that people will make these types of reasoning.  More importantly, it won’t take them by surprise when it fails.  This is speculative though."

    Then by including the index in the signature of TryGet you’ve accidentally exposed a "decision property".  Give me ANY scenario in which you could call this function with anything other than 0, and effectively reason about the results.

    "I prefer immutable 😉  Reason about them all you want."

    Then why did you bother with a mutable list here?

  34. wekempf says:

    @RusselH, yes, I’m including languages like C and C++ (I wrote the Boost.Threads library for C++).  Yes, people have been writing effective MT code in these languages for years.  They’ve also been screwing up the entire time.  Even the experts screw up, and I mean people with deep knowledge of the concept that goes well beyond that of 99% of the developer community.  I can give you a LOT of examples of this.  Worse, a LOT of people THINK they understand the concept, because really, the APIs aren’t so tough, but they make mistakes like this one.  Or worse.  Much worse.  There’s a reason why there’s a big push in the industry right now to find better ways to do MT… the current shared state models are too complex, and in general are not possible to prove correct.  The fact that so much MT code is running, and apparently with out error, should actually scare anyone that understands the shared state threading model. :)

  35. jaredpar says:

    @wekempf

    [wekempf] Then by including the index in the signature of TryGet you’ve accidentally exposed a "decision property".

    I can see that argument.

    [wekempf] Then why did you bother with a mutable list here?

    Because customers use them.  The post is meant as an educational post about the dangers of "supposedly" thread safe mutable collections.  And a *first* step towards designing APIs that make mutable collections more manageable.  

    I’d love it if the world only used immutable collections for multi-threaded scenarios but the reality is people don’t.  There’s a good portion of the population who just want mutable thread safe collections (and several who have good reasons).  

    The next post includes a much more usable design for mutable collections.  

  36. terry says:

    @wekempf

    "Immutable collections don’t do away with "Count" and "Contains".  String is such a type in the .NET realm."

    I never said that it did. My comment was specifically about the final ThreadList implementation which *did* do away with the two methods I mentioned leaving it to the client code to figure it out somehow. Also, according to your own words and the code the ThreadList isn’t immutable so you can’t really compare it to the String class.

    "Your ThreadSafeList isn’t immutable, and thus is not thread safe."

  37. As some people have already pointed out, I don’t think the new proposal of a thread safe collection makes any difference. You’re just getting a bool instead of an exception.. so what?

    I’m with Keith who made one of the first comments in this post: The best you can do is just to have an external lock. Because the user of the list will be the one who really knows how to make it thread safe.

    So you want to make a thread-safe list easier to do? Okey then, just create a new kind of list which is like the first one but shares the internal lock object with the user, so that he doesn’t need to create an external one. What about that?

  38. Uh oh of course you can’t share exactly the same lock you use internally in the list with the user, otherwise if the user captures the lock because he is going to use the list, he can’t use the list because it’s locked. What I meant is that the list could come with *another* lock accessible by a getter like lock() so that the user itself doesn’t need to create it.

  39. jaredpar says:

    @Eduardo

    [Eduardo] I don’t think the new proposal of a thread safe collection makes any difference. You’re just getting a bool instead of an exception.. so what?

    The difference is in creating an API which makes it harder for the user to shoot themselves in the foot.  Having a count and an indexer is just asking the user to make an incorrect assumption about the state of the list.  My argument is that eliminating procedures/properties like this makes it harder for the user to get into a bad situation.  

    Note harder, not impossible.  Short of immutable collections it’s not possible to be completely safe.

    [Eduardo] Okey then, just create a new kind of list which is like the first one but shares the internal lock object with the user

    My next post goes into the ups and downs of this approach  

  40. Erik Guerrero says:

    @wekempf: You say "anyone who says writing threaded code is easy, doesn’t understand threading (or they are using a functional language… and event then…)"

    You are right, threading is difficult reading horrible threading books that just explain to write lines to console. I agree with you.

    I was trying to exploit multicore CPUs for many months now. It is really hard.

    However, if you follow my advice and read the code samples (if you don’t want to buy a book), you can download its code. You’ll see that this book is completely different from any other book about threading. That’s why I do recommend it. Because it made it possible for me (an average C# developer, not an expert) to understand threading!!! I must say that’s nearly a miracle.

    You can enter here: http://www.packtpub.com/support

    Download the code and tell me if I am wrong :) You’ll see, the examples are amazing.

    Neither the author nor the publisher pay me to promote "C# 2008 and 2005 threaded programming".

    Cheers,

    Erik Guerrero

  41. In my last post we discussed the problems with designing a safer API for mutable thread safe collections

  42. What I meant was that general purpose collections cannot be automatically made thread-safe, even with snapshots. The underlying collection could be sorted, items removed, etc and in that case you’re working on invalid data. Not to mention that if you then wanted to modify that snapshot, you can’t.

    Of course special-purpose collections can be designed to guarantee thread-safety as long as certain assumptions can be made about its use.

    For example, a thread-safe queue is pretty simple because you can assume that it’s being read in a forward-only fashion and once an item is read, it’s gone.

    But that’s a far cry from trying to make ArrayList "thread-safe" by calling a Synchronized() method which was my point earlier.

  43. @Erik Guerrero

    I just purchased the book…very good resource so far.

    @Wekempf

    I agree with a couple of your comments that if you think multithreading is easy, you don’t understand multithreading.  Furthermore, I agree if you want true collections that are thread-safe, use a functional language like F#.

    I think that there are some abstraction patterns (like the callback, backgroundthreadinvoker) that make beginner multithreading easy (the equivalent of drag and drop programming).  I am by no means an expert…I do understand a bit more than that, but there is a lot more to go.  I do think the Parallel Task library will help a lot of us out in .NET 4.0 and allow you to add parallelism in functional areas in your code (i.e. LINQ)

  44. Josh Cauble says:

    Not sure if any of you have seen this or not but the book Concurrent Programming on Windows by Joe Duffy (http://www.amazon.com/Concurrent-Programming-Windows-Microsoft-Development/dp/032143482X/ref=pd_bbs_sr_1?ie=UTF8&s=books&qid=1234991145&sr=8-1) has great information on threading.  He works on the Threading team at MS and gets into some really great stuff.

    I’d highly recommend it to anybody and I believe it should be on every developer’s bookshelf.  I already know that there are several companies where I live that will not hire you unless you have read this book and understand the threading concepts therein.  

    As to the primary discussion, threading is a pain; we have an app that runs over 70 threads and there are a lot of concurrency issues and race conditions.  I inherited this code base and with some locks placed in the proper places we were able to solve just about all of our problems, and believe it or not ALL our app’s data (almost 600MB) is loaded into memory / hashtables instead of going to the db to get it.  Don’t ask.  So I know exactly what you are talking about. In your example above we would have still run into a race condition in our code base.  We have had to essentially implement the Snapshot style to ensure we get something we expect.  However we have had to wrap our Getter to return null if the key no longer exists.  Our primary code then has to check for null.  It’s a royal pain.

    I’m reading this book now so I can see about getting something better worked out.  

  45. Rohith Rajan says:

    In order to make a class thread safe, I think we have to make the lock object static and the methods too. Did you forget to put static? Otherwise other threads will get a different lock object. Please correct me if I am wrong.

  47. Greg says:

    Any code that calls the thread safe class can cause undesirable effects (e.g., contention, overly long blocking, etc).

    The solution is to make the C#/C++/Framework classes thread safe regardless of how the user intends to use them.  This is a known performance penalty and is acceptable.  This argument can follow the same line as why one should use dynamically allocated memory versus the much faster array.

  48. H. Saleem says:

    I think what ‘wekempf’ is trying to say is that you need to consider thread safety at a higher call level. Consider this example: you are processing a list of files with two or more threads; the locking has to be at a higher level in the call (e.g. the thread procedure)

    if you call TryGet(… ) without synchronizing at the thread procedure level you haven’t adhered to ACID, because simply thread B can go in and call TryGet() and might get the same file, thus synchronization needs to be at a higher level

    more like the following:

    void ThreadProcedure

    {

    Lock(this)

    {

     obj file = ThreadSafeCollection.TryGet(…);

     ThreadSafeCollection.Remove(file);

    }

    // do stuff with file…

    }

    as you can see, the locking at the collection level ThreadSafe becomes obsolete and irrelevant

  49. bill says:

    Why do you lock on m_lock instead of m_list directly?

  50. jaredpar says:

    @bill  

    In general I like to have a completely separate object whose purpose is to lock on.  It makes your code more robust because it reduces the chances you will accidentally leak an object reference outside a class where it can be arbitrarily locked by another component.  

    In this small example it’s not actually necessary to do so but it’s still a good practice.  
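
    As a sketch of the failure mode (hypothetical code, not from this thread): if the object a collection locks on internally is also reachable by callers, say because the collection locks on itself, outside code can take that same lock and stall every method on the collection.

    // Hypothetical caller, assuming a collection that locks on "this" internally.
    lock (someSharedCollection) {
        // While this block runs, every internally synchronized method on
        // someSharedCollection is blocked for all other threads.
        System.Threading.Thread.Sleep(30000);
    }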

  51. supercat says:

    If the index for TryGet() is an ordinal 32-bit integer, it’s unclear what semantics it could have that wouldn’t be rendered meaningless by the possibility of items being deleted asynchronously.  On the other hand, it’s not uncommon to have a situation where data may be put into a collection by multiple threads and yet only removed by one; in such a situation, the one thread which is removing items might be able to safely use "decision" procedures if it alone would be able to invalidate the result (e.g. if the thread determines that list.Count is 20, it may be able to safely process the first 20 items).

    Further, I’m not quite clear why it wouldn’t be useful to have an IEnumerable object which would allow insertions or deletions, with the promises that:

    (1) Every item that existed throughout the lifetime of the enumeration, without change to the key, would be included exactly once.

    (2) No item would be included in the enumeration that did not exist at any time within its lifetime.

    (3) Items that exist for part of the enumerator’s lifetime may be included zero or one times.  For purposes of this rule, changing an item’s key would be regarded as creating a new item and deleting the old one, so an existing enumeration may legitimately return the old value, new value, both, or neither.

    (4) All items that are returned will be given in a valid sequence, though not necessarily the same sequence as would have occurred in the absence of insertions and deletions.

    (5) It should be possible to determine, e.g. using a 64-bit update counter, whether anything has happened to a collection since a time just before an enumeration began.

    Some data structures would have difficulty dealing with such constraints, but others would not.  Such behavior would seem more useful than always invalidating enumerators on any list change.