High CPU in .NET app using a static Generic.Dictionary


A couple of weeks ago I helped out on a high CPU issue in an ASP.NET application.

Problem description

Every so often they saw very slow response times, and in some cases the app didn’t respond at all. At the same time the w3wp.exe process was sitting at very high CPU usage (80-90%). This happened under high load, and to get the application responding again they had to restart IIS.

Debugging the problem

They gathered a few memory dumps during the high CPU situation for us to review. When we ran the sos.dll command ~* e !clrstack (in windbg) to see what all the threads were doing, we found that they were all stuck in callstacks similar to this one:

OS Thread Id: 0x27dc (124) 
  ESP       EIP     
2f77ed24 795b3c5c System.Collections.Generic.Dictionary`2[[System.Int32, mscorlib],[System.__Canon, mscorlib]].FindEntry(Int32)
2f77ed3c 795b3835 System.Collections.Generic.Dictionary`2[[System.Int32, mscorlib],[System.__Canon, mscorlib]].ContainsKey(Int32)
2f77ed40 209f1932 MyComponent.Settings.get_Current()
... SOME STACK FRAMES REMOVED AS THEY ARE NOT IMPORTANT FOR THIS ISSUE ...
2f77f0a4 209f7545 ASP.MyApp_default_aspx.ProcessRequest(System.Web.HttpContext)
2f77f0a8 65fe6bfb System.Web.HttpApplication+CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
2f77f0dc 65fe3f51 System.Web.HttpApplication.ExecuteStep(IExecutionStep, Boolean ByRef)
2f77f11c 65fe7733 System.Web.HttpApplication+ApplicationStepManager.ResumeSteps(System.Exception)
2f77f16c 65fccbfe System.Web.HttpApplication.System.Web.IHttpAsyncHandler.BeginProcessRequest(System.Web.HttpContext, System.AsyncCallback, System.Object)
2f77f188 65fd19c5 System.Web.HttpRuntime.ProcessRequestInternal(System.Web.HttpWorkerRequest)
2f77f1bc 65fd16b2 System.Web.HttpRuntime.ProcessRequestNoDemand(System.Web.HttpWorkerRequest)
2f77f1c8 65fcfa6d System.Web.Hosting.ISAPIRuntime.ProcessRequest(IntPtr, Int32)
2f77f3d8 79f047fd [ContextTransitionFrame: 2f77f3d8]
2f77f40c 79f047fd [GCFrame: 2f77f40c]
2f77f568 79f047fd [ComMethodFrame: 2f77f568]

In other words, the method MyComponent.Settings.get_Current() was calling ContainsKey on a Generic.Dictionary object and for some reason it was getting stuck when trying to find the entry.

Looking at the MyComponent.Settings.get_Current() method, we found that the Generic.Dictionary it was calling ContainsKey on was a static dictionary and that all threads were working on the same dictionary.

The MSDN documentation about Generic.Dictionary has the following information about the thread safety of Dictionary objects:

A Dictionary can support multiple readers concurrently, as long as the collection is not modified. Even so, enumerating through a collection is intrinsically not a thread-safe procedure. In the rare case where an enumeration contends with write accesses, the collection must be locked during the entire enumeration. To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization.

What is happening here, and what causes the high CPU, is that the FindEntry method walks through the entries of the dictionary, trying to find the key. If the dictionary is modified by one thread while other threads are doing this, a reader may end up in an infinite loop in FindEntry. That loop causes the high CPU behavior, and the process may hang.
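To make the failure mode concrete, here is a minimal sketch of a lookup that follows "next" links through a chain of entries. It is a simplified model and not the real Dictionary source; the names ChainWalkSketch, Entry, Walk, HealthyChain and CorruptedChain are made up for illustration. The corrupted chain is built by hand to stand in for the state an unsynchronized write might leave behind, and the step limit exists only so that the sketch terminates.

```csharp
using System;

// Simplified model of a bucket chain. This is NOT the real Dictionary code;
// all names here are hypothetical and exist only for illustration.
public static class ChainWalkSketch
{
    // Each entry holds a key and the index of the next entry in the same
    // chain. A Next value of -1 marks the end of the chain.
    public struct Entry
    {
        public int Key;
        public int Next;
    }

    // Follows the chain starting at 'head'.
    // Returns true if the key was found, false if the chain ended without a
    // match, and null if 'maxSteps' was exceeded (how a cycle shows up here).
    public static bool? Walk(Entry[] entries, int head, int key, int maxSteps)
    {
        int steps = 0;
        for (int i = head; i >= 0; i = entries[i].Next)
        {
            // Guard added for the sketch only; a plain chain walk has no
            // such limit and would keep spinning on a cycle.
            if (++steps > maxSteps) return null;
            if (entries[i].Key == key) return true;
        }
        return false;
    }

    // A well-formed chain: 0 -> 1 -> 2 -> end.
    public static Entry[] HealthyChain()
    {
        return new[]
        {
            new Entry { Key = 10, Next = 1 },
            new Entry { Key = 20, Next = 2 },
            new Entry { Key = 30, Next = -1 }
        };
    }

    // The same chain with the last link pointing back to the start:
    // 0 -> 1 -> 2 -> 0 -> ... (assumed shape of the corruption).
    public static Entry[] CorruptedChain()
    {
        Entry[] entries = HealthyChain();
        entries[2].Next = 0;
        return entries;
    }
}
```

In this model, looking up a key that is not present returns false on the healthy chain, while on the corrupted chain the walk never reaches an end marker and only stops because of the artificial step limit. Without that limit the thread would spin at full CPU, which matches the symptom seen in the dumps.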

Resolution

This type of timing issue with static collections is fairly common in ASP.NET apps under high load.

To resolve this timing issue you should take special care to synchronize (lock) around access to the dictionary if there is a possibility that you may have multiple writers working at the same time or if there is a possibility that you write while someone else is reading/enumerating through the same dictionary. 
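A minimal sketch of that kind of locking might look like the following. The class name SettingsCache and its members are hypothetical stand-ins, not the customer's actual MyComponent.Settings code. The point is only that every read and every write of the shared dictionary goes through the same lock object.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical cache class; the names are illustrative, not from the
// application discussed in this post.
public static class SettingsCache
{
    // One private lock object guards every access to the dictionary.
    private static readonly object Sync = new object();
    private static readonly Dictionary<int, string> Cache =
        new Dictionary<int, string>();

    // Looks up the value for 'key', creating and storing it if missing.
    // The lookup and the Add happen under the same lock, so no other thread
    // can observe or modify the dictionary in between.
    public static string GetOrAdd(int key, Func<int, string> create)
    {
        lock (Sync)
        {
            string value;
            if (!Cache.TryGetValue(key, out value))
            {
                value = create(key);
                Cache.Add(key, value);
            }
            return value;
        }
    }

    // Even a simple property read takes the lock.
    public static int Count
    {
        get { lock (Sync) { return Cache.Count; } }
    }
}
```

A call such as SettingsCache.GetOrAdd(42, k => "value for " + k) can then be made from any request thread. The trade-off is that readers also serialize on the lock, which is usually acceptable when the work done inside the lock is short.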

In general, I recommend always reading the thread safety information carefully when using static collections, as many of them require you to implement synchronization for concurrent read/write operations. This avoids this type of issue, as well as issues with, for example, Hashtables, where you may get exceptions like InvalidOperationException “Load Factor Too High”.
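Since the Dictionary documentation allows multiple concurrent readers as long as nobody writes, a reader/writer lock is one way to implement that synchronization for a dictionary that is read far more often than it is written. The sketch below assumes .NET 3.5 or later, where System.Threading.ReaderWriterLockSlim is available. The class ReadMostlyCache and its members are hypothetical names used only for illustration.

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical read-mostly cache; names are illustrative only.
public static class ReadMostlyCache
{
    private static readonly ReaderWriterLockSlim CacheLock =
        new ReaderWriterLockSlim();
    private static readonly Dictionary<int, string> Cache =
        new Dictionary<int, string>();

    // Many threads may hold the read lock at the same time, but never
    // while a writer holds the write lock.
    public static bool TryGet(int key, out string value)
    {
        CacheLock.EnterReadLock();
        try
        {
            return Cache.TryGetValue(key, out value);
        }
        finally
        {
            CacheLock.ExitReadLock();
        }
    }

    // A writer waits until all readers have left and then has exclusive
    // access for the duration of the update.
    public static void Set(int key, string value)
    {
        CacheLock.EnterWriteLock();
        try
        {
            Cache[key] = value;
        }
        finally
        {
            CacheLock.ExitWriteLock();
        }
    }
}
```

The try/finally blocks make sure the lock is released even if the dictionary call throws. Whether this performs better than a plain lock depends on the read/write mix, so it is worth measuring before switching.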

Have a good one,

Tess

Comments (29)

  1. Is it even a good idea to have a static dictionary?  Depending on what it is (and whether it’s huge and memory concerns would become an issue), would it not be better to replace the static dictionary with an expirable cached one instead?

  2. Alessandro says:

    Ahh! Quite interesting diagnosis and resolution. I enjoyed reading this post.

    Thanks!

  3. Tess says:

    Neil, could be, but even so, you could run into the same issue, since the problem occurs when you access the same dictionary on multiple threads at the same time.

  4. Davy Brion says:

    Hi Tess,

    interesting post, but i’m hoping you can also shed some light on this:

    suppose you have a dictionary with some metadata that is initialized at application startup, and then never, ever modified.  can you still run into problems when a high number of threads are iterating over the dictionary concurrently?  I would assume that this would be safe, provided that there are indeed no modifications to the dictionary, but i’m still not sure…

  5. Very good post. Until now I did not know that objects could have thread safety problems. Thanks, ma'am. I will be careful when implementing collections.

    Thanks,

    Thani

  6. Jack says:

    Compared with C/C++, .NET gives us a lot of convenience.

  7. @Tess yes, my mistake – I actually meant to totally separate the dictionary instances and not share them.  It only really works if they’re quite small though, I suppose.

  8. hi Tess,

    Nice post.

    Quite Interesting to read such a post. Keep writing on such issues.

  9. Sush says:

    Great post. Keep up the good work. All the best…

  10. hype8912 says:

    Thank you very much Tess. Lesson learned for me. I’ve used static dictionaries and single-instance dictionaries before in .NET apps for properties instead of creating numerous placeholder field variables. I guess this was a mistake, and now I know, but I usually don’t do this anymore given the new .NET 3.5 way of making auto properties.

  11. Gal Ratner says:

    I use a static Dictionary all the time and always lock it with ReaderWriterLockSlim(). It’s simply a great cache.

  12. Tess says:

    Davy,  that depends on how you iterate over it.  Using an enumerator to iterate over it is not a thread safe operation so if you use that you would still have to lock, otherwise the enumerators would just go back and forth.

  13. Adam Barnett says:

    We ran into exactly the same problem with a high-load WCF service that had the same symptoms. I wish I had known more about windbg at the time; we had to use remote debugging.

  14. Petter says:

    Hi Tess,

    "Even so, enumerating through a collection is intrinsically not a thread-safe procedure. In the rare case where an enumeration contends with write accesses, the collection must be locked during the entire enumeration."

    It seems that these sentences confuse others, too:

    http://stackoverflow.com/questions/511205/net-dictionary-is-only-enumerating-thread-safe

    It is still unclear for me: Is it safe to enumerate from multiple threads if it is guaranteed that no writes happen at the same time? If yes, why?

  15. Tess says:

    Petter,

    Don’t take my word for it, but my inkling is to say that it’s not thread-safe, as the current item in the enumerator may then be changed by multiple threads as you iterate through it.

  16. Adam says:

    If we have separate Enumerator in each thread, we will have separate current item so current items for each enumerator will be independent.

  17. Amir says:

    Hi Tess,

    I noticed a similar problem. From the dump call stack and the logs I found that a thread was stuck in FindEntry of Dictionary. BUT there is no enumeration in the code, and the dictionary is locked before any access to it. Do you think it’s the same problem?

    Thanks, Amir

  18. Tess says:

    could very well still be the same

  19. Nick Duane says:

    It would be nice to know just how a writer is able to cause a reader enumerating through the collection to enter an infinite loop.  I don't doubt that it's possible, but I would think it's also possible for IDictionary.Add() to be written in such a manner as to eliminate any chance of an enumerator ending up in an infinite loop.

    Thanks,

    Nick

  20. Damiox says:

    Hello Tess… I'm analyzing a similar case since we messed up with a static Dictionary and we are experiencing peaks of 100% CPU on a windows service application (C# .NET 3.5 server 64 bits)… my questions are:

    1) Is this always with Static Dictionaries? What about non-Static Dictionaries?

    2) I followed your guidelines but I found out that from 150 threads just 1 thread looks like the ones from your analysis:

    3) My module has several projects, I'm seeing from the Call Stacks several functions pointing to one of these assemblies, but in the function name I'm seeing "Unknown", what do you recommend to be able to see the real function name? I have the .pdbs, should I export them in windbg?

    I also have used the "Process Explorer" application from sysinternals when this problem happened, and I saw that not all the threads but like 20/30 were using about 8% of CPU… Problem is that I don't understand why suddenly my application gets screwed up, and if this is the problem why just one thread keeps consuming all the CPU instead of having the CPU usage distributed around several threads…

    What I have seen from windbg is that the thread #49 (from a total of 150 threads) the following:

    OS Thread Id: 0x1994 (49)

    Child-SP         RetAddr          Call Site

    000000001e44ef18 000006428019c870 System.Collections.Generic.Dictionary`2[[System.__Canon, mscorlib],[System.__Canon, mscorlib]].get_Count()

    000000001e44ef20 000006428019d1bd Amr.Threading.dll!Unknown

    000000001e44ef50 00000642782f173b Amr.Threading.dll!Unknown

    000000001e44f010 000006427838959d System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)

    000000001e44f060 000006427f602672 System.Threading.ThreadHelper.ThreadStart()

    I took another dmp after having the process in break-mode (in windbg) for 1 hour or so. Once I resumed in windbg I saw several "CLR exception first chance" messages, and then I took the dmp and analyzed it in windbg. The same thread (#40) looked fine (no dictionary method) and all the other threads looked to be working fine… Weird.

    Do you think I might have the same issue that you mentioned?

    What about my comment with what I saw in Process Explorer?

    Your blog looks very nice!

    Regards & thanks in advance

    Damiox

  21. Jeroen Mostert says:

    @Nick: Dictionary<TKey, TValue> works by maintaining an array of buckets (containing all values that hash to the same key), which are themselves linked lists implemented by arrays. The code for looking up an item goes like this (paraphrased):

     int bucketHash = this.comparer.GetHashCode(key) & 2147483647;

     for (int i = this.buckets[bucketHash % this.buckets.Length]; i >= 0; i = this.entries[i].next) { … }

    The problem is that "i = this.entries[i].next" can get caught in a cycle if the dictionary is updated while we're looking up and we're working from a stale entry — linked lists are inherently not thread-safe.

    Note that all this talk about enumerators is a bit of a red herring — just *reading a single entry* from the dictionary can get caught in a loop if the dictionary is updated! In fact, this is exactly what happened in the case Tess mentions. Dictionary enumerations in fact cannot end up in an infinite loop, because they're strictly bounded to the size of the dictionary and they simply skip over empty entries. In addition, enumerators check if the dictionary was modified between enumerations and throw an InvalidOperationException if it is. This check is not foolproof, however (it's possible the dictionary is modified right after the check is made), and if it fails the enumeration can return garbage. So even there you're not safe.

    Please note that all of the above isn't documented and you cannot depend on particular behavior. What is documented is that Dictionary is not thread-safe and if you don't synchronize any situation other than concurrent readers, bad things will happen.

    As for rewriting things: sure, it's possible. It's just not worth it. Thread-safe code is hard to write and typically loses performance in the common case. Thinking about how to prevent non thread-safe code from ending up in a problematic situation is exactly as difficult as writing a lock-free, thread-safe container.

    Note that "fixing" this without making it fully thread-safe means you end up with code that doesn't get caught in an infinite loop (good) but is still not guaranteed to retrieve the correct values (bad). In a sense, that's even worse because things are now silently corrupt. A hang is really conspicuous (as is an InvalidOperationException).

  22. Jeroen Mostert says:

    Wow, replying to a *really* old post. For some reason this entry came up in my RSS reader as updated and I failed to check the details (even though I know Tess has moved on from ASP.NET)…

    Well, at least I hope my post can be of some use to future readers.

  23. Jeff Smith says:

    Thanks for blogging this Tess, it saved me a TON of time.

    I had a vendor-supplied web service application which was hogging 100% CPU randomly until you reset IIS. The vendor had me reinstalling IIS, ASP.NET, the OS, and checking all kinds of random stuff (their team had never used WinDbg or crash dumps). They refused to believe it had anything to do with their app.

    I used WinDbg while it was hanging and found a bunch of threads stuck in Dictionary<>.FindEntry(). I poked around with Reflector and found the vendor's code was in fact calling TryGetValue on a static Dictionary<> just like you described above.

    I sent all this info back to the vendor and am waiting on a fix now. No way we would have ever gotten this fixed without this info.

    Thanks again!

  24. Amit Mundra says:

    Nice article, helpful for deciding what should and should not be used.

    Thanks for this.

  25. Gonzalo says:

    Great!

    Exact problem I was having under high concurrency scenario!

  26. Luke says:

    I think reading/writing a dict at the same time will throw an exception

  27. fangw says:

    I also ran into 100% CPU. What confuses me now is that I am unable to reproduce the scenario. What would you recommend?

  28. Matt says:

    Had just hit this same issue and wanted to say thanks!

  29. anchaljoshi says:

    Nice, I also had the same issue, but now it's gone. Thanks for sharing.
