Recursively Deleting a directory–with long filename support.


I was recently updating some test code to support long filenames (paths longer than MAX_PATH).

My initial cut at the function was something like the following (don’t worry about the VERIFY_ macros, they’re functionally equivalent to asserts):

const PCWSTR LongPathPrefix=L"\\\\?\\";

void RecursivelyDeleteDirectory(const std::wstring &strDirectory)
{
    //  Canonicalize the input path to guarantee it's a full path.
    std::wstring longDirectory(GetFullPath(strDirectory));

    //  If the path doesn't have the long path prefix, add it now before we instantiate the
    //  directory_list class.
    std::wstring strPath;
    if (longDirectory.find(LongPathPrefix) == std::wstring::npos)
    {
        strPath = LongPathPrefix;
    }
    strPath += longDirectory;
    strPath += L"\\*";

    directory_list dl(strPath);
    for (const auto && it : dl)
    {
        std::wstring str(longDirectory+L"\\"+it);

        //  It’s possible that the addition of the local filename might push the full path over
        //  MAX_PATH, so ensure that the filename has the LongPathPrefix.
        if (str.find(LongPathPrefix) == std::wstring::npos)
        {
            str = LongPathPrefix+str;
        }
        DWORD dwAttributes = GetFileAttributes(str.c_str());
        VERIFY_ARE_NOT_EQUAL(dwAttributes, INVALID_FILE_ATTRIBUTES); // Check for error.
        if (dwAttributes & FILE_ATTRIBUTE_DIRECTORY)
        {
            if (it != L"." && it != L"..")
            {
                RecursivelyDeleteDirectory(str);
            }
        }
        else
        {
            VERIFY_WIN32_BOOL_SUCCEEDED(DeleteFile(str.c_str()));
        }
    }
    VERIFY_WIN32_BOOL_SUCCEEDED(RemoveDirectory(longDirectory.c_str()));
}

The weird thing was that this code worked perfectly on paths shorter than MAX_PATH. But the call to GetFileAttributes failed 100% of the time as soon as the directory name got longer than MAX_PATH. It wasn’t that the GetFileAttributes API didn’t understand long filenames – it’s documented as working correctly with long filenames.

So what was going on?

I wrote a tiny little program that just had the call to GetFileAttributes and tried it on a bunch of input filenames.

Running the little program showed me that \\?\C:\Directory\Filename worked perfectly. But \\?\C:\Directory\. (note the trailing “.”) failed every time.

It took a few minutes but I finally remembered something I learned MANY decades ago: On an NTFS filesystem, the “.” and “..” directories don’t actually exist. Instead they’re pseudo directories inserted into the results of the FindFirstFile/FindNextFile API.

Normally the fact that these pseudo directories don’t exist isn’t a problem, since the OS canonicalizes the filename and strips off the “.” and “..” components before it passes the name on to the underlying API.

But if you use the long filename prefix (\\?\) the OS assumes that all filenames are canonical. And on an NTFS filesystem, there is no directory named “.”, so the API call fails!
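To make the difference concrete, here is a deliberately simplified model of that behavior. It is not how Windows actually implements path normalization: CollapseDotComponents and ModelWin32Normalization are hypothetical names, and the sketch assumes drive-letter paths with backslash separators only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not a Win32 API): lexically removes "." components and
// resolves ".." components in a path such as C:\Directory\Sub\..
// Empty components are dropped, so this is only meant for drive-letter paths.
std::wstring CollapseDotComponents(const std::wstring &path)
{
    std::vector<std::wstring> parts;
    std::size_t start = 0;
    while (start <= path.size())
    {
        std::size_t end = path.find(L'\\', start);
        if (end == std::wstring::npos)
        {
            end = path.size();
        }
        const std::wstring part = path.substr(start, end - start);
        if (part == L"..")
        {
            //  Never pop the first component (the drive, e.g. "C:").
            if (parts.size() > 1)
            {
                parts.pop_back();
            }
        }
        else if (part != L"." && !part.empty())
        {
            parts.push_back(part);
        }
        start = end + 1;
    }

    std::wstring result;
    for (std::size_t i = 0; i < parts.size(); ++i)
    {
        if (i != 0)
        {
            result += L'\\';
        }
        result += parts[i];
    }
    return result;
}

// Simplified model of the rule described above: paths that start with \\?\ are
// handed on untouched, everything else has "." and ".." collapsed first.
std::wstring ModelWin32Normalization(const std::wstring &path)
{
    const std::wstring prefix = L"\\\\?\\";
    if (path.compare(0, prefix.size(), prefix) == 0)
    {
        return path;
    }
    return CollapseDotComponents(path);
}
```

In this model C:\Directory\. becomes C:\Directory before anything touches the filesystem, while \\?\C:\Directory\. reaches the filesystem with the trailing “.” still attached.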

What was the fix? Simply reverse the check for “.” and “..” and put it outside the call to GetFileAttributes. That way we never ask the filesystem for these invalid directory names.
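The reordering can be sketched roughly as follows. This is not the actual test code: IsDotOrDotDot, EnsureLongPathPrefix and PathsToVisit are hypothetical helpers, and the directory listing is passed in as a plain vector so the sketch stays independent of the directory_list class and of the Win32 calls. The point is only that the “.” and “..” entries are dropped before any path is built for the filesystem.

```cpp
#include <cassert>
#include <string>
#include <vector>

//  Same prefix as in the function above, held as a std::wstring for convenience.
const std::wstring kLongPathPrefix = L"\\\\?\\";

//  Hypothetical helper: true for the "." and ".." pseudo-directory entries.
bool IsDotOrDotDot(const std::wstring &name)
{
    return name == L"." || name == L"..";
}

//  Hypothetical helper: adds the prefix only if the path does not already
//  start with it (a stricter test than searching for it anywhere in the path).
std::wstring EnsureLongPathPrefix(const std::wstring &path)
{
    if (path.compare(0, kLongPathPrefix.size(), kLongPathPrefix) == 0)
    {
        return path;
    }
    return kLongPathPrefix + path;
}

//  Returns the prefixed full path of every entry that is safe to hand to the
//  filesystem. The pseudo-entries are skipped first, so no path ending in
//  "\." or "\.." is ever produced.
std::vector<std::wstring> PathsToVisit(const std::wstring &directory,
                                       const std::vector<std::wstring> &entries)
{
    std::vector<std::wstring> result;
    for (const auto &name : entries)
    {
        if (IsDotOrDotDot(name))
        {
            continue;
        }
        result.push_back(EnsureLongPathPrefix(directory + L"\\" + name));
    }
    return result;
}
```

In the real function, each path that survives this filter would then go through the GetFileAttributes check and the recurse-or-delete branch exactly as before.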

Comments (9)

  1. Karellen says:

    Do FAT filesystems, e.g. as still found on USB sticks, or ISO-9660 filesystems, or UDF filesystems have "real" "." and ".." entries? So would the above code work on directories stored on those filesystems?

    If so, it seems kind of ugly to put the "." and ".." hack for NTFS in the Find{First,Next}File API and the non-long-filename codepath(s) above the VFS layer. Why not put the hack in the NTFS driver instead, and make the observed behaviour consistent within the OS and with other filesystems?

  2. FAT filesystems have real "." and ".." entries. However, I don't know if they support filenames longer than MAX_PATH (theoretically they might be able to).

    Here's the thing. Realistically "." and ".." are artifacts of the Win32 API layer. They're not a guaranteed part of the filesystem. When you use \\?\ you are telling the OS "I am no longer using Win32 filename semantics, I want to use the filename semantics of the underlying filesystem". And in the case of NTFS, "." and ".." are invalid directory names. It's also important to realize that special casing "." and ".." works with *all* filesystems, NTFS or FAT (or any other filesystem with *nix style directory name conventions).

    When you use \\?\ you're essentially taking the training wheels off. Part of that means that you have to know what you're doing.

  3. Karellen says:

    Whether or not FAT filesystems support filenames longer than MAX_PATH, FAT filesystems can still be accessed with \\?\ paths.

    But still, it seems odd to not have a unified set of filename semantics, because that means that if you simply need to support long paths, you need to know how to handle the filename semantics of *every possible* underlying filesystem your code might run on – including those of filesystems that haven't been written yet!

    Compared to that, it would seem to make much more sense to require filesystems to support a common set of semantics, and require that the NTFS driver handle "." and ".." entries whether or not they exist on disk.

    Also, I'd note that you had *already* special-cased the "." and ".." entries in your original code. Just not in a way that worked, due to the inconsistent semantics.

  4. You have to special case "." and ".." in a recursive deletion algorithm, otherwise you either recurse into the current directory or you walk up the directory hierarchy to the top. That's just a consequence of "." and "..".

    The only change necessary is a mental switch to consider the "." and ".." directory entries as pseudo-directory entries and not attempt to perform filesystem operations on them. If I had called PathCchCanonicalize on the path (as recommended when using \\?\) they would have been stripped, but doing that would have had its own set of issues. Moving the special case was far easier.

  5. Peter says:

    "When you use \\?\ you're essentially taking the training wheels off. Part of that means that you have to know what you're doing."

    Thanks for this post, Larry. The MSDN docs make it seem like "\\?\" is just a way to circumvent the MAX_PATH length limitation and don't make the broader implications clear at all.

  6. mvadu says:

    Welcome back to blogging, Larry! It's been a while.

  7. Bartosz Wójcik says:

    Did you get out of jail Larry? I haven't seen a post from you since when, 2012? 🙂

  8. Karellen says:

    OK, I've been looking around a bit, and I was wondering if there's a long-filename equivalent of Find{First,Next}File() which only returns the entries which exist in a directory – i.e. would not return "." and ".." entries on NTFS filesystems. I can't find one. Have I just been looking in the wrong place, or not hard enough, or is there no such API?

    (Yes, I know I still need to deal with race conditions where a file is reported as existing, but is then deleted (and maybe even created anew) by another process before I get around to accessing it.)

  9. Karellen: There is no such long-filename equivalent. The problem is that there are too many apps that depend on FFF/FNF returning the "." and ".." entries for such an API to be successful. The FindFirstFileEx API supports long filenames (FindFirstFile is not documented as supporting long filenames).
