Hashtable and Dictionary thread safety considerations

Let’s start with the basics:

  • System.Collections.Hashtable is multiple-reader, single-writer threadsafe.
  • System.Collections.Generic.Dictionary<K,V> has no thread safety guarantees.

Consider a class that has a Dictionary member, and assume that multiple threads may access it simultaneously. Taking a lock is expensive, so what do you think about the following attempt to defer taking a lock?

class ElementManager {

    private Dictionary<String, Element> cache =
                                               new Dictionary<String, Element>();
    private Object o = new Object();

    // Not to ruin the punchline, but I don't want someone to copy
    // and paste this flawed code...
    // WARNING: BAD CODE! DON’T USE!

    public Element GetElement(String name) {
        Element theElement;
        bool result = cache.TryGetValue(name, out theElement);
        if (!result) {
            lock (o) {
                result = cache.TryGetValue(name, out theElement);
                if (!result) {
                    theElement = new Element();
                    cache.Add(name, theElement);
                }
            }
        }
        return theElement;
    }
}

GetElement only takes the lock if TryGetValue found no result. So from a performance perspective, it’s a nice try. But it’s not correct: the value returned may not actually correspond to the name. (That’s not just because value may be stale data; value may have never corresponded to name).

The problem occurs if a reader thread outside of the locked region gets an element while a writer thread adds an element and causes a resize in the Dictionary to occur. If the timing is just right (actually, just wrong) then the reader thread may index incorrectly into the Dictionary (internal implementation details) and return the wrong element.

This thread-safety difference is a little-known twist in our guidance to convert to using generic collections. If you’re converting from Hashtable to generic Dictionary, you’ll want to consider whether you may have taken a dependency on Hashtable’s MR-SW thread safety without knowing it. (This isn’t very common, but it has happened)

IMO, better concurrency support is one of the most compelling feature requests for the collections space. This is important for both performance and usability.

The usability argument may not be obvious, so here’s some background. Our non-generic collections provided synchronized wrappers, which inserted a lock around operations for thread safety. Synchronized wrappers weren’t included with the generic collections, with the argument that one often needs to synchronize at a higher level, over multiple operations (e.g. check if a stack is non-empty and then pop). However, we didn’t provide alternate framework solutions, meaning that users have to implement their own locking for generic collections. This is unnecessarily tedious in the common scenario when you don’t need to synchronize over multiple operations.

This doesn’t mean we should add back the synchronized wrappers; there are options with much better performance, but more on that later.