Your profiling tools can manufacture performance issues where there were none


When analyzing the performance of a program, you must be mindful that your performance analysis tools can themselves affect the operation of the system you are analyzing. This is especially true if the performance analysis tool is running on the same computer as the program being studied.

People often complain that Explorer takes a page fault every two seconds even when doing nothing. They determine this by opening Task Manager and enabling the Page Faults column, and observing that the number of Page Faults increases by one every two seconds.

This got reported so often that I was asked to sit down and figure out what's going on.

Notice, though, that if you change Task Manager's Update Speed to High, then Explorer's page fault rate goes up to two per second. If you drop it to Low, then it drops to one every four seconds.

If you haven't figured it out by now, the reason is that Task Manager itself is causing those page faults. Mind you, they are soft faults and therefore do not entail any disk access. Every two seconds (at the Normal update rate), Task Manager updates the CPU meter in the taskbar, and it is this act of updating the CPU meter that is the cause of the page faults.
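The tell is that the fault rate follows the tool's refresh interval rather than anything the program is doing. Here is a rough sketch of that sanity check, using the Normal and Low figures quoted above. The function name and the one-minute sample counts are made up for illustration; they are not measurements.

```python
def faults_per_refresh(fault_count, seconds_observed, refresh_interval):
    """Return how many faults occurred per refresh of the monitoring tool."""
    refreshes = seconds_observed / refresh_interval
    return fault_count / refreshes

# Hypothetical one-minute observations at two Update Speed settings.
normal = faults_per_refresh(fault_count=30, seconds_observed=60, refresh_interval=2.0)
low = faults_per_refresh(fault_count=15, seconds_observed=60, refresh_interval=4.0)

# If faults per refresh stays constant while the interval changes,
# the monitoring tool itself is the likely cause.
tool_is_suspect = abs(normal - low) < 0.1
print(normal, low, tool_is_suspect)
```

If the ratio had changed between settings, the faults would more plausibly belong to the program being watched.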

No Task Manager, no animating taskbar notification icon, and therefore no page faults from Explorer when idle.
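To get a feel for how routine soft faults are, here is a minimal sketch. It is an analogy and not anything Task Manager or Explorer does: it assumes a POSIX system, where Python's resource module reports minor (soft) and major (hard) fault counts for the current process. The helper names are invented for illustration.

```python
import mmap
import resource

def fault_counts():
    """Return (soft, hard) page fault counts for this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_minflt, usage.ru_majflt

def touch_fresh_pages(num_pages):
    """Write one byte to each page of a new anonymous mapping."""
    page = mmap.PAGESIZE
    buf = mmap.mmap(-1, num_pages * page)  # anonymous memory, no file behind it
    for i in range(num_pages):
        buf[i * page] = 1  # the first write to a page typically faults it in
    buf.close()

before = fault_counts()
touch_fresh_pages(256)
after = fault_counts()

soft_delta = after[0] - before[0]
hard_delta = after[1] - before[1]
print("soft faults:", soft_delta, "hard faults:", hard_delta)
```

On a typical system the soft count climbs while the hard count stays put, since nothing has to be read from disk. The exact numbers depend on the operating system and its configuration.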

(A similar effect was discovered by Mark Russinovich when he found that Process Explorer's polling calls to the EnumServicesStatusEx function were triggering repeated registry access.)

Comments (14)
  1. herd says:

    Is it that each Icon update in the tray causes a page fault? If so, why on earth?

    And to make this entry complete, I’d like to hear why GDI+ apps cause hundreds of page faults when drawing an image.

    In the old days, a page fault at the wrong time would take down the entire system. Since NT, an application that became a page fault millionaire in a reasonable time was surely badly programmed: it fragmented the heap with too many new/delete or malloc/free calls in a loop, or fell into the famous CString trap.

    If I had a say at MS I’d demand to keep this profiling helper intact. It was very useful.

    wkr,

    herd

  2. Centaur says:

    Ahh! That’s it. The Heisenberg Principle of performance profiling. As soon as you add code that measures time spent by different functions in the program, it starts spending half the total time :)

  3. For those using Task Manager, kill that process and use Performance Monitor instead (under the Administrative Tools menu). It adds less overhead.

  4. ReuvenLax says:

    There’s something missing in this explanation. Why out of all the processes on the system is explorer.exe the only process that shows this behavior? I just tried it on my machine, and no other process showed this steadily-increasing page-count.

  5. rickbrew says:

    ReuvenLax, Because Explorer.exe is responsible for maintaining the tray icons. Task Manager updates its tray icon every time it refreshes.

  6. oldnewthing says:

    It’s the same code but a completely different icon each time.

  7. ReuvenLax says:

    Oh, of course. I didn’t properly read Raymond’s original article.

  8. Sean Barrett says:

    The implication here is that there is nothing wrong with these page faults: they are supposed to happen, they aren’t going to disk, they aren’t a performance issue.

    If that’s the case, then isn’t the task manager measuring the wrong thing?

  9. Tim Smith says:

    (very simplified)

    A soft page fault is satisfied from main memory, whereas a hard fault has to be read from the drive.

    Each process has a working set. When it needs more pages than its working set allows, "old" pages will be moved to either the standby list or the modified list. If those pages haven’t yet been reused for something else, then they can be faulted back into the original process. This is a soft fault. If the page must be read from the drive, then it is a hard fault.

    Soft faults are a fact of life and in general a good thing.

  10. Bryan says:

    But then, echoing what herd said, the question becomes:

    *Why?*

    Why does repainting the tray icon cause a pagefault in explorer.exe? Refreshing an on-screen bitmap should be a fairly simple process, unless I don’t understand everything that repainting an icon involves. It should be a simple memory block transfer (or perhaps a set of them). The target block of memory already exists, and the source block is given to the notification API call by the taskmgr.exe process, so it must exist also.

    So why is the pagefault happening?

    On a related note, I’m not sure what a "soft" pagefault is, compared to a "hard" one. Is it just that a "soft" one doesn’t cause a page to be loaded from disk? Is this difference explained somewhere online (perhaps in MSDN)?

  11. oldnewthing says:

    "Why does repainting the tray icon cause a pagefault?"

    Read the article again. It’s not the repaint that’s causing the pagefault. It’s the *update*. Every two seconds, task manager says "Hey, get rid of the old icon and use this new icon instead". Icons have their own subtleties. I’ll cover them next year.

  12. microbe says:

    "It’s not the repaint that’s causing the pagefault. It’s the *update*."

    Repaint or update, the point is that it’s the same code that is always executed, so why would invoking it cause a page fault?

    If I keep calling the same function, I don’t expect it to generate a page fault (assuming it reads/writes the same memory area too).

    It seems explorer has some weird memory access patterns that trigger these extra page faults.

    The bottom line is that you haven’t given enough explanation.

  13. foxyshadis says:

    "they aren’t a performance issue. "

    Um, task manager is only measuring a subset of system events, it’s not a performance monitor (though it can be used as one in a crude way). There’s a tool actually named performance monitor that has far more granular and useful details, though it still only tells you what’s happening, not what it means.

    Soft faults can still cause problems in high-performance tools (like audio/video processing), but typically don’t matter. But just because it’s not useful from a process profiling perspective doesn’t mean it’s not useful at all.

  14. To anyone not getting it: Explorer draws the icons, not the applications. The applications create an icon and send it to Explorer (internal shell magic here – IIRC it’s a WM_COPYDATA). It’s simpler this way, no synchronization needed, and it’s also why icons aren’t cleaned up immediately when the owner terminates (they are cleaned up as soon as Explorer can no longer find their owner window). When changing the icon you could probably rewrite the icon’s bits directly, since icons are shared objects, but it’s not guaranteed to work, not to mention very impolite.
