Psychic debugging: Why your expensive four-processor machine is ignoring three of its processors


On one of our internal mailing lists, someone was wondering why their expensive four-processor computer appeared to be using only one of its processors. The chart on Task Manager's Performance tab showed that the first processor was doing all the work while the other three sat idle. Using Task Manager to set each process's processor affinity to use all four processors made the computer run much faster, of course. What happened that messed up all the processor affinities?

At this point, I invoked my psychic powers. Perhaps you can too.

First hint: My psychic powers successfully predicted that Explorer also had its processor affinity set to use only the first processor.

Second hint: Processor affinity is inherited by child processes.

Here was my psychic prediction:

My psychic powers tell me that

  1. Explorer has had its thread affinity set to 1 proc....
  2. because you previewed an MPG file...
  3. whose decoder calls SetProcessAffinityMask in its DLL_PROCESS_ATTACH...
  4. because the author of the decoder couldn't fix his multiproc bugs...
  5. and therefore set the process thread affinity to 1 to "fix" the bugs.

My first psychic prediction was correct. The others were wide of the mark, but they were on the right track and successfully guided further investigation to uncover the culprit.

The real problem was a third-party shell extension whose authors presumably weren't able to fix their multi-processor bugs, so they decided to mask them by calling the SetProcessAffinityMask function to lock the current process (Explorer) to a single processor. Woo-hoo, we fixed all our multi-processor bugs at one fell swoop! Let's all go out and celebrate!

Since processor affinity is inherited, this caused every program launched by Explorer to use only one of the four available processors.
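
To make the inheritance rule concrete, here is a toy model in Python. It is only a sketch: ToyProcess, launch, and usable_processors are made-up names for illustration, not the Windows API, and the model assumes nothing beyond "a child starts out with a copy of its parent's current mask." An affinity mask is treated as a bit field with one bit per processor.

```python
# Toy model only: this is NOT the Windows API, just the inheritance rule.
ALL_FOUR = 0b1111    # bits 0-3 set: may run on any of four processors
FIRST_ONLY = 0b0001  # bit 0 set: locked to the first processor


class ToyProcess:
    """Hypothetical stand-in for a process; tracks only a name and a mask."""

    def __init__(self, name, affinity_mask):
        self.name = name
        self.affinity_mask = affinity_mask

    def launch(self, child_name):
        # Assumption of the model: a child starts out with a copy of
        # its parent's current affinity mask.
        return ToyProcess(child_name, self.affinity_mask)

    def usable_processors(self):
        # Count the set bits in the mask.
        return bin(self.affinity_mask).count("1")


explorer = ToyProcess("explorer", ALL_FOUR)
before = explorer.launch("launched_before")

# The shell extension runs inside Explorer and narrows Explorer's own mask.
explorer.affinity_mask = FIRST_ONLY
after = explorer.launch("launched_after")

print(before.usable_processors())  # 4
print(after.usable_processors())   # 1
```

In this model, a program launched before the mask was narrowed keeps all four processors, while everything launched afterward (and everything those programs launch in turn) is confined to the first one.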

(Yes, the vendor of the offending shell extension has been contacted, and they claim that the problem has been fixed in more recent versions of the software.)

Comments (57)
  1. Reuben Harris says:

    That’s absolutely hysterical!

    I’m glad someone is logging this stuff… MS tech support increasingly looks like one of the most thankless jobs in the world.

    Sysinternals or someone should make a sort of culprit-identifying tool to detect and display the [module]name of the last caller to each of the global/process settings APIs….

  2. Dmitry Shaporenkov says:

    Nice! It’s absolutely the best tech story I’ve seen in recent weeks.

  3. Ben Hutchings says:

    Maybe processor affinity should be added to the list of inherited properties at http://msdn.microsoft.com/library/en-us/dllproc/base/child_processes.asp (along with anything else that’s not listed there).

  4. Dave says:

    SetProcessAffinityMask requires admin privileges (well, PROCESS_SET_INFORMATION rights) or it will fail. This is only an issue if people are viewing videos or using poorly written shell extensions while running as admin.

    Oh wait, just about ALL of us do that, since so many programs misbehave when you try to run them in a non-godlike fashion.

    Well thank goodness that Microsoft has never advocated this kind of chicanery. Uh …

    http://support.microsoft.com/kb/178650/EN-US/

  5. Raymond Chen says:

    Notice that the chicanery is listed as the third choice workaround, not the two actual resolutions. In other words, it’s only after the customer rejects the two "correct" fixes.

  6. I thought processor affinity was only a hint; there was nothing to stop the thread/process from being moved.

    Didn’t know about inheritance; that’s why MSDN has to be so thorough: it has to compete with a platform where you can look at the source at debug time.

  7. Why does processor affinity inherit?

  8. Raymond Chen says:

    I thought the reason processor affinity inherited was obvious – think about it. But apparently it’s not obvious enough. I’ll add it to the list of future topics.

    Steve Loughran: There are two types of processor affinity, "soft" (just a suggestion) and "hard" (absolute requirement). SetThreadAffinityMask sets hard affinity; SetThreadIdealProcessor sets soft affinity.

  9. If Explorer ran each window in a new process by default, instead of having that choice hidden in the options, this kind of thing wouldn’t happen because that inherited affinity would "go away" as soon as you closed that window… right?

    It would also be a lot more stable and reliable in general. Why isn’t that the default… is there some reason for running Explorer as a single process that I’m unable to figure out (my own psychic powers failing me here) or is it just institutional inertia?

  10. Ben Hutchings says:

    Raymond, there’s a problem with the Knowledge Base stylesheet (http://support.microsoft.com/common/css/default/xmlContent.css). It has a rule:

    .kb div pre {
      /* various other properties omitted */
      white-space: normal;
    }

    that causes sample code to be flowed in a single paragraph. IE seems to ignore the rule so it doesn’t show the problem. There was a similar problem in MSDN a while back. Can you pass the information on or let me know who I can contact about this? I don’t see an appropriate contact address on the web site.

  11. John Topley says:

    "is there some reason for running Explorer as a single process that I’m unable to figure out"

    Performance, I should imagine.

  12. Chris Lundie says:

    I used to have a problem where every time I previewed an AVI file in Explorer, it would never release the handle, so it became difficult to change or delete AVI files. Turned out to be a bug in RealPlayer, as far as I can tell.

  13. "If Explorer ran each window in a new process by default, instead of having that choice hidden in the options, this kind of thing wouldn’t happen because that inherited affinity would "go away" as soon as you closed that window… right?"

    Already available, just turned off by default.

    Open an Explorer window, Tools menu, Options, go to the View tab, scroll down, and check "Launch folder windows in a new process."

    "It would also be a lot more stable and reliable in general."

    And much, much slower due to the massive number of context switches that result — which is why it’s off by default. (Plus app compat reasons.) But it’s there.

  14. l says:

    If you pass the message along about that stylesheet that seems to deliberately break browsers that aren’t IE, could you please have whoever wrote it come back here and explain exactly why they did this?

    I mean, for the Opera ‘margin:-20px;’ thing (Google for ‘bork’) there was an explanation that sort-of made sense (it being a side-effect of a very ugly hack for a very different problem); but why anyone would want to set a PRE tag to … well, basically do the opposite of what a PRE tag is supposed to do? Even when that command is ignored in the only browser they apparently use?

    I know this is off topic; it’s just that this kind of idiocy annoys me to no end every time I need something from MSDN.

  15. Raymond Chen says:

    I’m trying to find the owner of that stylesheet. I suspect the reason is that it’s a simple oversight.

  16. David Walker says:

    In SQL 2000, using Data Transformation Services packages with ActiveX data transforms (now we’re getting technical), all the steps in the DTS package have to be set to run on the main thread if there are any package event handlers.

    From http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dtsprog/dtspapps_3xur.asp:

    If package event handlers coded in Visual Basic are being used, the ExecuteInMainThread property must be set TRUE. Visual Basic does not support free threading, which DTS uses.

    Someone else’s description of the reason: "Visual Basic is an apartment-threaded application and DTS is free-threaded and the two can conflict to produce the error. This is simply overcome by setting each step’s ExecuteInMainThread property to true."

    I realize that threading models and bugs are not the same thing, but there are cases where we’re told by Microsoft to run everything in an application on the same thread.

  17. David Walker says:

    Also, I know that threads and processor affinity aren’t the same thing either, but I’ll bet that thread errors show up much more when you’re on a multi-processor machine.

  18. Raymond Chen says:

    Microsoft is not a single entity. There’s the COM group, the shell group, the VB group… It’s entirely possible that advice you get from the VB group conflicts with advice you get from the COM group. In the same way you might decide not to follow the rules set down by your brother. When you say that "Microsoft" told you to do something, you need to be more specific which group was doing the talking.

  19. foxyshadis says:

    For any given browser, there are always at least 3 CSS solutions to any problem, one of which is a proprietary extension. (That includes our beloved FF.) Two of them will have unintended consequences in other browsers and require further workarounds, or giving up the feature, or simply living with/ignoring effects in other browsers. CSS compatibility is still nasty even across the so-called "fully compliant" browsers (Gecko, Opera, Safari). So don’t be too hard on the guy who did it.

    I won’t even get into how many times I’ve become angry at "Microsoft" only to discover that it was a bad shell extension some app installed that trashed my system. Ones that won’t release file hooks are infuriating; I just wish there was an easier way to manage them than the registry. (They’re a lot like browser plugins.) Maybe there is and I just don’t know it.

  20. For separate windows you can consider separate processes. For a single window, it’s very very (very!) hard to correctly synchronize message queues so many/most shell extensions just run in the same process.

    This is a path that we’re pursuing anyways just for better fault isolation but it’s not an easy fix. I haven’t thought about it enough to make a claim like "therefore the windowing system is fundamentally broken" so feel free to make such a claim after thoughtfully considering the same problem on X windows, NeWS, etc. AFAIK, there’s no support for a single window hierarchy that spans clients in X; only the WM aggregates windows from multiple clients. But it’s been almost 11 years since I’ve touched X…

  21. Norman Diamond says:

    Ryan Myers replied to Peter da Silva:

    >> "If Explorer ran each window in a new process by default, instead of having that choice hidden in the options,
    >
    > Already available, just turned off by default.

    Both wrong. There is a choice hidden in the options (as both said) but it doesn’t run each window in a new process, it only runs some windows in one separate process from some other windows.

    Remember in Windows 95/98, when an option was added to Internet Explorer to browse in a separate process, the frequency of complete hangs/crashes/etc. of the entire Windows system dropped by half? Maybe at some level this was considered a reduction in performance, but overall it was an improvement.

    Windows Explorer really needs the same thing, an option to put every invocation in a separate process.

  22. Ben Hutchings says:

    Michael Grier: It’s entirely possible for one X client to "swallow" another as a sub-window. For example I have a Mozilla plugin that can embed random applications in the browser in order to display file types that aren’t directly supported by the browser or a plugin. Those applications obviously run as separate processes.

  23. Phaeron says:

    Chris Lundie: I wrote an AVIFile handler once that would cause the same symptoms if you manually enabled its "proxy" mode, until I found the handle leak. The XP shell media extension is a magnet for third-party bugs, and unfortunately problems with it will extend into the common file dialog too. :(

    Speaking of unwarranted whacking of global settings, for some reason this story reminds me of when I used to use Windows 95. Seems a lot of driver writers at that time couldn’t find a good way to avoid the "browse for file" dialog on driver installation other than to change the cached Windows CD path to their install folder. You’d then blame Windows for being stupid for trying some weird directory in TEMP to install TCP/IP support.

  24. Aaargh! says:

    "Windows Explorer really needs the same thing, an option to put every invocation in a separate process."

    That would be a great improvement indeed. Another thing that would greatly improve Explorer is making it multi-threaded; it looks like it isn’t at the moment, or at least threads block each other for no reason.

    Try opening an SMB share on a computer that doesn’t respond: Explorer completely freezes while waiting for the timeout, and the UI doesn’t even redraw. Is there a separate UI thread, and if so, why is it blocked by the filesystem thread?

  25. Goran Pušić says:

    Hilarious! You made this geek’s day :-))

  26. Moi says:

    l – Hanlon’s Razor: Never attribute to malice that which can be adequately explained by stupidity.

  27. Eric says:

    I’m wondering how .NET manages this affinity mask… Does anybody know?

  28. Jonathan Wilson says:

    I too have seen Explorer freeze, really slow down, or hang because one window is hung doing something (e.g. accessing a slow or nonexistent network share).

    An option (or feature) so that each Explorer window is a different process (or thread), and one hung window won’t hang the other Explorer windows, would be very useful.

    Although I am sure that the explorer/shell guys at Microsoft have good reasons for not implementing this.

  29. dhiren says:

    http://blogs.msdn.com/oldnewthing/archive/2004/10/13/241725.aspx

    Raymond answered this when I asked it in the suggestion box a while back:

    (quoting the relevant piece of the article)

    "Why is explorer.exe monolithic? Why wasn’t there a desktop.exe, taskbar.exe, etc?"

    Processes are expensive.

    So there you have it. If each instance of explorer was in its own process, the system would crawl (or so I imagine)

  30. Aaargh! says:

    "So there you have it. If each instance of explorer was in its own process, the system would crawl (or so I imagine)"

    Maybe that was true on a p100 running win95. But why is it still there? With all the CPU time spent on visual gadgets in winXP, why not spend a little on fixing explorer?

    I’m not saying everything should be its own process, but a multithreaded explorer would be nice, so if a filesystem operation doesn’t respond I can at least cancel it and move on.

    btw, Konqueror on my linux system does it the right way, and I haven’t seen performance problems with it at all.

  31. vince says:

    Aargh said: "btw, Konqueror on my linux system does it the right way, and I haven’t seen performance problems with it at all."

    Well, on Linux creating processes is cheap. In fact, having a fast fork() is a major goal of the kernel design team. Whereas on Windows, as dhiren said above, "Processes are expensive".

    It’s a remnant of the assumptions that the designers built into the systems years ago, and with the advent of multi-processor systems the Linux decision is paying off.

  32. Raymond Chen says:

    Um, Explorer does put each window on its own thread. Fire it up under your favorite debugger if you don’t believe me.

  33. Joku says:

    What about having these 3rd party extensions run in some virtual explorer process? Rename the real one to something else etc. And have some stuff to manage what stuff has been installed into the virtual explorer. Oh and add a managed API for doing the most common extension stuff in XP too..

  34. Aaargh! says:

    "Um, Explorer does put each window on its own thread. Fire it up under your favorite debugger if you don’t believe me."

    I believe you; the problem is not each window having a different thread, the problem is that there are no separate threads for the UI and the filesystem stuff. So a non-responding network FS also freezes the UI.

  35. Raymond Chen says:

    Explorer tries to do heavy filesystem stuff on background threads but sometimes it messes up. But this has drifted far off-topic so I’ll let it go.

  36. Sriram says:

    This is slightly offtopic – but wouldn’t it be a good idea for Explorer to have add-in management like IE 6 SP2? I’ve seen tons of people have problems with shell extensions – and since it isn’t obvious how you can remember them, an options UI which lets you turn individual extensions on and off (everything from icon handlers to namespace extensions) might be a useful feature.

  37. Raymond Chen says:

    Yes, I read the top half. My point was that you extrapolated from the first half to the second half, concluding that "Microsoft" told you to do something, as if all of Microsoft agreed on the recommendation you received.

  38. David Walker says:

    "When you say that "Microsoft" told you to do something, you need to be more specific which group was doing the talking."

    Raymond, did you read the last half of my post and ignore the first half? I was completely explicit in the post as to who within Microsoft "told" me to do something. I even included the link in the post: I was quoting from http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dtsprog/dtspapps_3xur.asp

    You could probably tell better than I which group within Microsoft authored the content, if that’s what you want me to tell you, but I thought that including the link at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dtsprog/dtspapps_3xur.asp would provide 100% of the relevant information. I’m sure it was some part of the SQL Server group. That part is obvious from looking at the link that I provided, which is http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dtsprog/dtspapps_3xur.asp

    Yes, I did say in the post "there are cases where we’re told by Microsoft to run everything in an application on the same thread" when I could have reiterated that the example I posted was from the link which I included in its entirety, at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dtsprog/dtspapps_3xur.asp

    David Walker

  39. David Walker says:

    Aargh! I don’t know why the link repeated itself each time I included it, I didn’t do that on purpose. I would use a Preview button if the forum software had one…

  40. Justin Olbrantz says:

    "Well, on Linux creating processes is cheap."

    I’d read that in several pieces of literature (rather, that Unix is far superior to Windows as far as process efficiency, such as this), but I didn’t really know the technical details. Can anybody recommend some literature about how processes differ between Unix and Windows, and why Unix is so much better? Sorry for getting further off topic, but with my attention span, if I don’t ask now, I’ll forget about it (like I’ve done in the past on this topic) :P

  41. Ben Hutchings says:

    Justin: Arguably, process creation is cheap on Unix because it needs to be cheap. Concurrency and communication between programs are normally done with separate processes. Windows (in its 32-bit versions) was written with the assumption that programs would normally use multithreading for concurrency and DLLs for extensibility, so there wasn’t the same concern about the cost of creating processes.

  42. Wesha says:

    Another perfect example of the "sweeping the dust underneath the carpet" approach. Don’t know how to fix a pesky problem? Hide the visible symptoms!

  43. Jeff Atwood says:

    "I thought the reason processor affinity inherited was obvious – think about it. But apparently it’s not obvious enough. I’ll add it to the list of future topics."

    Isn’t it because cross-processor communication is extremely expensive? Thus you would typically want all the "children" of a given process to inherit the same CPU– unless you have specific reasons for them to be on other CPUs.

    Everything’s a bandwidth problem, if you dig deep enough..

  44. Tech support dude says:

    We’ve experienced many similar problems at the wetware level. Our software has extremely fine-grained internal locking, and has been pretty heavily tested on multiprocessor systems. It turns out that users are really good at creating rendezvous by trying to share single mutex-protected objects across all threads on all CPUs, creating pileups of threads behind the object and reducing the overall performance to that of a single-CPU system. Further questioning of various users who were having problems revealed that a large number of them really didn’t understand threads, ranging from complete incomprehension of the whole concept ("What, you mean you can have two thingies executing as part of the same program? Naaahhh, pull the other one") to simply not understanding how to manage objects in the presence of multiple threads. Because of this, our documentation now includes a Threads for Dummies-style section right at the start to tell programmers what threads are, how they work, how to manage objects in the presence of multiple threads, and so on.

  45. Stefan Kanthak says:

    Ryan Myers wrote:

    > And much, much slower due to the

    > massive number of context switches

    > that result — which is why it’s

    > off by default. (Plus app compat

    > reasons.) But it’s there.

    Please explain this "much, much slower" to your MSFT fellows who are proposing to run LUA: if the switch ain’t set you won’t be able to execute a second EXPLORER.EXE with administrative rights at all!

    So running in different processes is REALLY essential for running with different credentials.

    Here "speed" doesn’t matter, but security does.

    I’ve been working with this switch for years and have never found it to slow down my system(s).

  46. Raymond Chen says:

    I would have written "The DTS team at Microsoft says to run everything on one thread because of a conflict with VB." It’s a workaround not a preferred configuration.

  47. David Walker says:

    You’re right, Raymond, I should have clarified my comment by saying that "Microsoft in this case with this software says to run everything on one thread".

  48. That is, how not to fix bugs…

  49. I thought I’d share a support story with you from a very interesting case I have. My customer is running

  50. Dan McKinley says:

    Here’s a problem: You have an application that’s hanging permanently or temporarily. The hang does not…

  51. Recently I noticed that some processes on my Core2 Duo Windows XP machine were using only one processor core. Via Task Manager (right-click the process and choose "Zugehörigkeit festlegen", i.e. "Set affinity") I found that

  52. Last week I've resolved a simple "debugging" case by phone, and figured that it might benefit

Comments are closed.