Cancelling the INamespace­Walk::Walk operation a little faster


We saw last time that you can stop an INamespace­Walk::Walk operation by returning a COM error code from the Enter­Folder or Found­Item callback. However, that may not be fast enough.

I noted some time ago that if you're going to enumerate the contents of a directory, you'd best do it all at once. And that's what INamespace­Walk::Walk does. After it enters a directory, it enumerates the whole thing at one shot, and then (optionally) sorts it, and then calls the Found­Item method for each item that was found.

If you happen to enter a large directory, then the "enumerate the whole thing at one shot" step can take a while. But there's a way to sneak in during the enumeration phase and cancel the operation: Implement the IAction­Progress interface on your INamespace­Walk­CB object. Note that this works only if you do not pass the NSWF_SHOW_PROGRESS flag. If you pass the NSWF_SHOW_PROGRESS flag, then the progress dialog's Cancel button controls the cancellation.

Assuming you don't pass the NSWF_SHOW_PROGRESS flag, the INamespace­Walk::Walk method will call IAction­Progress::Begin to get the party started, and IAction­Progress::End when it's all over. In between, it will call IAction­Progress::QueryCancel. If your IAction­Progress::QueryCancel method sets *pfCancelled = TRUE, then the INamespace­Walk::Walk operation will abandon the enumeration, unwind all the entered folders with Leave­Folder, and then return HRESULT_FROM_WIN32(ERROR_CANCELLED).
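
To make the shape of that protocol concrete, here is a toy model in portable C++. It does not use the shell at all: ProgressSink, WalkFolder, and WalkResult are hypothetical names invented for this sketch, and the once-per-item polling cadence is an assumption, since the real polling frequency is not something the walker promises.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the progress sink. The real interface is
// IActionProgress, which has more methods than are modeled here.
struct ProgressSink {
    std::function<bool()> queryCancel;  // plays the role of QueryCancel
    std::vector<std::string> log;       // records Begin/End for inspection
};

enum class WalkResult { Completed, Cancelled };

// Toy model of one folder: enumerate everything, then report each item.
// Assumption: the cancel poll happens once per enumerated item.
WalkResult WalkFolder(int itemsInFolder, ProgressSink& sink,
                      int& enumerated, int& reported)
{
    sink.log.push_back("Begin");
    WalkResult result = WalkResult::Completed;
    enumerated = 0;
    reported = 0;

    // Phase 1: enumeration. No item callbacks happen here, so the poll
    // is the only chance to stop early.
    for (int i = 0; i < itemsInFolder; ++i) {
        if (sink.queryCancel()) {
            result = WalkResult::Cancelled;
            break;
        }
        ++enumerated;
    }

    // Phase 2: report items, but only if enumeration ran to completion.
    if (result == WalkResult::Completed) {
        for (int i = 0; i < enumerated; ++i) {
            ++reported;
        }
    }

    sink.log.push_back("End");
    return result;
}
```

In this model, a sink whose queryCancel starts returning true on the fourth poll stops a 1000-item folder after three items have been enumerated and before any of them is reported. A walker that can cancel only from the item callbacks would have had to finish the entire enumeration first.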

Let's use this technique to cancel the INamespace­Walk::Walk operation a bit more quickly. Make the following changes to the program we had last time:

#define STRICT
#include <windows.h>
#include <shlobj.h>
#include <wrl/client.h>
#include <wrl/implements.h>
#include <stdio.h> // Horrors! Mixing stdio and C++!

namespace wrl = Microsoft::WRL;

class WalkCallback : public wrl::RuntimeClass<
  wrl::RuntimeClassFlags<wrl::ClassicCom>,
  INamespaceWalkCB,
  IActionProgress> // New interface!
{
public:
  // INamespaceWalkCB
  IFACEMETHODIMP FoundItem(IShellFolder *,
   PCUITEMID_CHILD) override
   { m_itemCount++; return TimeoutStatus(); }

  IFACEMETHODIMP EnterFolder(IShellFolder *,
   PCUITEMID_CHILD) override
   { m_folderCount++; return TimeoutStatus(); }

  IFACEMETHODIMP LeaveFolder(IShellFolder *,
   PCUITEMID_CHILD) override { return S_OK; }

  IFACEMETHODIMP InitializeProgressDialog(PWSTR *ppszTitle,
    PWSTR *ppszCancel) override
    { *ppszTitle = nullptr; *ppszCancel = nullptr;
      return E_NOTIMPL; }

  // IActionProgress - new interface!
  IFACEMETHODIMP Begin(SPACTION, SPBEGINF) override
  { return S_OK; }

  IFACEMETHODIMP UpdateProgress(ULONGLONG, ULONGLONG) override
  { return S_OK; }

  IFACEMETHODIMP UpdateText(SPTEXT, LPCWSTR, BOOL) override
  { return S_OK; }

  IFACEMETHODIMP QueryCancel(BOOL *pfCancelled) override
  { *pfCancelled = IsTimedOut(); return S_OK; }

  IFACEMETHODIMP ResetCancel() override { return S_OK; }
  IFACEMETHODIMP End() override { return S_OK; }

  int ItemCount() const { return m_itemCount; }
  int FolderCount() const { return m_folderCount; }

private:
  bool IsTimedOut()
    { return GetTickCount() - m_startTime > 1000; }

  HRESULT TimeoutStatus()
    { return IsTimedOut() ?
      HRESULT_FROM_WIN32(ERROR_CANCELLED) : S_OK; }

  DWORD m_startTime = GetTickCount();
  int m_itemCount = 0;
  int m_folderCount = 0;
};

int __cdecl wmain(int argc, PWSTR argv[])
{
  CCoInitialize coinit; // RAII CoInitialize/CoUninitialize helper carried over from the earlier program

  wrl::ComPtr<INamespaceWalk> walk;
  CoCreateInstance(CLSID_NamespaceWalker, nullptr,
    CLSCTX_INPROC_SERVER, IID_PPV_ARGS(&walk));

  wrl::ComPtr<IShellItem> root;
  SHCreateItemFromParsingName(argv[1], nullptr,
    IID_PPV_ARGS(&root));

  auto callback = wrl::Make<WalkCallback>();

  HRESULT hr = walk->Walk(root.Get(), NSWF_DEFAULT,
    100, callback.Get());

  printf("Walk completed with result 0x%08x\n", hr);
  printf("Found %d items and %d folders\n",
   callback->ItemCount(), callback->FolderCount());

  return 0;
}

All we did was add IAction­Progress support to our callback object. When asked if we want to cancel the operation, we report whether the operation has timed out.
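
A side note on the IsTimedOut helper: GetTickCount returns a 32-bit millisecond counter that wraps after roughly 49.7 days, and subtracting the start time from the current time in unsigned arithmetic gives the right elapsed time even across the wrap. The following portable sketch checks that claim. The two-argument IsTimedOut here is a hypothetical stand-in that takes the tick values as parameters instead of calling GetTickCount.

```cpp
#include <cstdint>

// Portable stand-in for the article's IsTimedOut(). `now` and `start`
// play the role of GetTickCount() values (32-bit millisecond counters).
constexpr bool IsTimedOut(std::uint32_t now, std::uint32_t start,
                          std::uint32_t limitMs = 1000)
{
    // Unsigned subtraction is modulo 2^32, so the difference is the true
    // elapsed time even if the counter wrapped between start and now.
    return static_cast<std::uint32_t>(now - start) > limitMs;
}

static_assert(!IsTimedOut(5000, 4500), "500 ms elapsed: not timed out");
static_assert(IsTimedOut(6001, 5000), "1001 ms elapsed: timed out");
// Counter wrapped: start is 100 ms before the wrap, now is 1000 ms after.
static_assert(IsTimedOut(1000, 0xFFFFFF9Cu), "1100 ms elapsed across wrap");
```

Comparing the raw tick values directly (now > start + 1000) would not survive the wrap, which is why the subtraction form is the one to use.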

Adding this extra support will not be noticeable when enumerating relatively small directories from relatively fast media.
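
If the timeout fires, the value the program prints for hr should be 0x800704c7, which is HRESULT_FROM_WIN32(ERROR_CANCELLED). Here is the arithmetic behind that number as a portable sketch. HresultFromWin32 and the two constants are re-created for illustration so the snippet compiles without the Windows headers; real code should use the macro from winerror.h.

```cpp
#include <cstdint>

// Re-created for illustration; the real definitions live in <winerror.h>.
constexpr std::uint32_t kFacilityWin32  = 7;     // FACILITY_WIN32
constexpr std::uint32_t kErrorCancelled = 1223;  // ERROR_CANCELLED (0x4C7)

constexpr std::uint32_t HresultFromWin32(std::uint32_t code)
{
    // Zero (success) passes through unchanged, as do values that already
    // have the failure bit set. Everything else gets the failure bit and
    // the Win32 facility stamped onto its low 16 bits.
    return (code == 0 || (code & 0x80000000u) != 0)
        ? code
        : ((code & 0x0000FFFFu) | (kFacilityWin32 << 16) | 0x80000000u);
}

static_assert(HresultFromWin32(kErrorCancelled) == 0x800704C7u,
              "ERROR_CANCELLED maps to 0x800704C7");
static_assert(HresultFromWin32(0) == 0, "success stays success");
```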

Comments (15)

  1. florian says:

    I found that enumerating file system contents has always been reasonably fast, even on slow network drives. However, retrieving additional information, such as icons from SHGetFileInfo() and related interfaces, can take up considerable time, and I usually do this from a background thread. I always wondered if there’s a way to cancel a pending SHGetFileInfo() call, maybe a way to (non-specifically) cancel all pending IO for the background icon thread?

    1. Joshua says:

      while(thread hasn’t acknowledged shutdown command)
      CancelIoEx(thread handle)

      1. Koro says:

        This sounds almost as bad as closing handles you don’t own or forcibly terminating the thread.

      2. florian says:

        But CancelIoEx() takes a hFile that only SHGetFileInfo() knows …

        1. florian says:

          Ah, I see, it’s CancelSynchronousIo(hThread) that could end the pending IO requests inside SHGetFileInfo(). Would that be good practice, say, like Windows Explorer is doing?

      3. Antonio Rodríguez says:

        As Koro says, it is the moral equivalent of a user saying “this file copy is so slow, I’m going to open the Task Manager and terminate Explorer.exe”. As we say in Spain, it’s using a cannon to kill a fly.

      4. Joshua says:

        Turns out this is a bad idea. I had neglected shell plugins.

    2. Antonio Rodríguez says:

      @florian: yes, SHGetFileInfo() is slow by default. It has to be. Reading a directory, even one with thousands of files, is a single operation on SMB. Extracting the icon requires opening the executable and reading the executable header, the resource directory and the actual icon. If the file is not an executable, in many cases you still have to check for alternate file streams or extra metadata that is not provided by the directory enumeration. These are several separate operations for each file. So you can expect it to be three or four orders of magnitude slower.

      But Raymond explained a few years ago how to trick SHGetFileInfo() into ignoring the actual file icon and metadata, and return the default icon looking only at the local registry database. Just pass the SHGFI_USEFILEATTRIBUTES flag. It means “just pretend the file exists locally and doesn’t have any strange attributes”, so it makes the SHGetFileInfo() call almost instant. See the details at https://blogs.msdn.microsoft.com/oldnewthing/20040601-00/?p=39073 .

      1. florian says:

        Thank you very much for all the interesting comments.

        I agree that calling CancelSynchronousIo() to abort SHGetFileInfo() seems like a terrible hack. At first I thought maybe it’s not quite as bad as calling TerminateThread(), but given that a thread also needs to call CoInitialize() before, and CoUninitialize() after using SHGetFileInfo(), there may be, well, … a lot of “undesired effects”.

        I’m aware of the differences between enumerating file system directories, querying “simple” file system meta data (such as file attributes), and retrieving advanced, possibly file content-dependent meta data (as does SHGetFileInfo()). I think that SHGFI_USEFILEATTRIBUTES is useful to get a quick initial view, say of a list view showing a directory. But once the background icon thread is running, it can only be terminated gracefully in between the calls to SHGetFileInfo().

        Here’s why this bothers me so much: on my (100-MBit) NAS, I keep backup copies of important downloads and disk snapshots. Many of the downloaded files are large SFX ZIP- or CAB-archives, with a .exe extension, and a custom icon (such as WindowsXPMode_N_en-us.exe, for example). SHGetFileInfo() takes much longer than I’d expect it would take to find and load the icon resource from such files, probably because Windows Defender checksums (and reads) the whole file, before allowing SHGetFileInfo() to read just the PE header, find the resource tables, and load the icon.

        To avoid the user having to wait for SHGetFileInfo() to return before navigating to a new directory on my NAS, it’s probably best to let the icon thread run as long as it needs (from some yet to be designed wrapper to take care of CloseHandle()’ing the thread handle when done), but synchronize access to the list view showing the directory, and have the icon thread check if it’s still on the same directory, before updating the list view icons retrieved through SHGetFileInfo().

        Or, disable the antivirus software? Is there any registry hack for this, anyone? Or maybe, buy faster LAN switches and mount new wires?

        1. florian says:

          I consider running my programs on slow infrastructure an interesting test to unveil if I have done something terribly wrong … That’s why I’ve been keeping an old PC with a floppy drive for a long time.

          Indeed, disabling Windows Defender makes SHGetFileInfo() return much faster in the case of the large SFX archive WindowsXPMode_N_en-us.exe. The lag it takes to calculate the checksum of this file with 3rd party command line utilities feels about the same as the lag caused by SHGetFileInfo() with Windows Defender enabled.

          So it may even be more complicated to cancel SHGetFileInfo(), as the antivirus checks are probably on another thread in another process, and “cancelling them” (in theory) would look like a malware attack?

          Anyway, what I don’t get is 1) why the antivirus checks only seem to be cached for the lifetime of the current session, and seem to be run in full length after the next login, and 2) why SHGetFileInfo() does not simply return information from Windows Explorer’s Icon Cache (the system image list icon index), but instead seems to “read” the underlying file in a way that triggers the antivirus checks.

          1. Antonio Rodríguez says:

            I don’t have definite answers for your questions, but I can make an educated guess about the cache lifetime. A cache with a bad policy is the same as a resource leak. Windows (and any other general purpose OS) has to work on portable computers, that may be connected to the office’s network in the morning, to the home’s one in the evening, and to two different clients’ networks the following day when you are doing field service. It has to draw a line somewhere, and a session is a natural border.

            Disabling Windows Defender (or any other antimalware) via a registry hack should not be possible, at least not without elevation. Else, malware would do it to defeat its detection.

            I’m not sure you have many options. I’d try to reorder the calls to SHGetFileInfo() so the more expensive ones happen last. That would make the results show faster. First, I’d get the icon for every file that is not an executable (or maybe get all default icons using the SHGFI_USEFILEATTRIBUTES flag), and then I’d make a second round for obtaining the executables’ icons, smaller files first. It wouldn’t make it faster, but it would make it more responsive. Also, if you have kept the view open long enough for it to process the larger files, chances are you aren’t going to leave/close it soon, so the block would be less noticeable.

            You may also want to take into account which icons are actually shown in the current view and get their icons first. If you are listing a lot of files, chances are that the list view shows a scroll bar, and there is no hurry in getting the icon for a file that we aren’t showing right now.

            If you open in Explorer the System32 folder (or any other with hundreds of executables) and look carefully, you’ll see that Explorer seems to implement most of these suggestions (except for the ordering by size).

      2. skSdnW says:

        Go one step lower and call IExtractIcon::GetIconLocation with GIL_ASYNC and fall back to GIL_DEFAULTICON.

        1. florian says:

          Thanks for the input.

          The current flow is:

          Background thread → call SHGetFileInfo() → wait for background thread to return from SHGetFileInfo() before clearing the directory list view and navigating to a new directory.

          With IExtractIcon, things would go like this:

          call IExtractIcon::GetIconLocation(GIL_ASYNC) → returns E_PENDING → background thread → call IExtractIcon::GetIconLocation(~GIL_ASYNC) → call IExtractIcon::Extract() → wait for background thread to return from IExtractIcon::Extract() before clearing the directory list view and navigating to a new directory.

          This would produce the same lag that bothers me right now, I suppose. I should not wait for the background thread to finish, before enabling navigation to a new directory.

          Windows Explorer exhibits the same lags when extracting icons for a directory full of large SFX archives (can also be experienced on a new machine, with the latest OS and updates, and SFX archives stored on the local hard drive, so it’s not just a network latency/caching issue), but it allows navigating to another directory, or closing the window, without waiting for the pending icon extractions. It’s really amazing how Windows Explorer handles these kinds of async background tasks, keeping the UI light-weight and responsive.

          1. skSdnW says:

            No, don’t wait, use GIL_DEFAULTICON to get a plain icon or if that is too slow, just cache the index of the folder and plain document in the system image list at startup and use those while populating the file list initially. Then you can just update the image index of each item as they complete on the background thread.

  2. florian says:

    So, as a summary, here are the methods to find the icons associated with files and folders (follow either path A or B):

    Get plain/default icons for a quick initial view:

    A1. SHGetFileInfo(SHGFI_USEFILEATTRIBUTES)
    B1. IExtractIcon::GetIconLocation(GIL_DEFAULTICON) → IExtractIcon::Extract()

    Get “fast” final icons:

    B2. IExtractIcon::GetIconLocation(GIL_ASYNC)==S_OK → IExtractIcon::Extract()

    Get “slow” final icons:

    B3. IExtractIcon::GetIconLocation(GIL_ASYNC)==E_PENDING → IExtractIcon::GetIconLocation(~GIL_ASYNC)==S_OK → IExtractIcon::Extract()

    Get “fast” and “slow” final icons:

    A2. SHGetFileInfo(~SHGFI_USEFILEATTRIBUTES)

    Steps B3. and A2. should run on a background thread.

    Both path A and path B would take the same total time to find all final icons, I would guess. But path B might make the UI feel quicker, in some cases.

    I feel that the order to retrieve the final icons (Windows Explorer-style: visible-first and visible-only vs. whole list view top-down following path A) does not impact usability too much, but the important thing is that the background tasks B3. and A2. must not block the UI.

    Did I get it?

    BTW: asking for a registry hack to disable antivirus checks was just kidding …
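
The pattern the thread converges on (show default icons at once, upgrade them from a background worker, and throw away results that arrive after the user has navigated elsewhere) can be sketched in portable C++. Everything here is hypothetical: IconView, RunIconWorker, and kDefaultIcon are names invented for the sketch, icons are plain integers, and slowLookup stands in for whichever expensive call (step A2 or B3) is used. No shell API is called.

```cpp
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <vector>

constexpr int kDefaultIcon = 0;  // placeholder shown before the real icon

// Hypothetical model of a list view that maps file names to icon indices.
class IconView {
public:
    // UI thread: show placeholders immediately and invalidate any results
    // still in flight for the previous directory. Returns the generation
    // number that a worker for this directory must present.
    unsigned Navigate(const std::vector<std::string>& files) {
        std::lock_guard<std::mutex> lock(m_lock);
        m_icons.clear();
        for (const auto& f : files) m_icons[f] = kDefaultIcon;
        return ++m_generation;
    }

    // Worker thread: hand over a finished icon. Returns false, and changes
    // nothing, if the view has moved on to another directory.
    bool Deliver(unsigned generation, const std::string& file, int icon) {
        std::lock_guard<std::mutex> lock(m_lock);
        if (generation != m_generation) return false;
        auto it = m_icons.find(file);
        if (it == m_icons.end()) return false;
        it->second = icon;
        return true;
    }

    int IconFor(const std::string& file) const {
        std::lock_guard<std::mutex> lock(m_lock);
        auto it = m_icons.find(file);
        return it == m_icons.end() ? -1 : it->second;
    }

private:
    mutable std::mutex m_lock;
    std::map<std::string, int> m_icons;
    unsigned m_generation = 0;
};

// Body of the background worker. It stops at the first stale delivery,
// so the UI never has to wait for it. Returns how many icons it delivered.
int RunIconWorker(IconView& view, unsigned generation,
                  const std::vector<std::string>& files,
                  const std::function<int(const std::string&)>& slowLookup)
{
    int delivered = 0;
    for (const auto& f : files) {
        if (!view.Deliver(generation, f, slowLookup(f))) break;
        ++delivered;
    }
    return delivered;
}
```

RunIconWorker is meant to be run on its own thread (for example via std::thread) with its own copy of the file list. The UI thread only ever calls Navigate and IconFor, neither of which waits for a lookup, so a slow icon extraction can delay the worker but not the navigation. The worker still finishes the one lookup it is in the middle of; the sketch discards that result instead of trying to cancel it.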
