Superstition: Why is GetFileAttributes the way old-timers test file existence?


If you ask an old-timer how to test for file existence, they'll say, "Use GetFileAttributes." This is still probably the quickest way to test for file existence, since it requires only a single call. Other methods such as FindFirstFile or CreateFile require a separate FindClose or CloseHandle call, which triggers another network round-trip, which adds to the cost.
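Here is a minimal sketch of the single-call test, written in Python with ctypes purely for illustration. The helper name path_exists is made up, and the non-Windows branch that falls back to os.stat is an assumption added only so the sketch runs on other systems. Note that it treats any failure as "does not exist", which is the usual shortcut but not the whole story (access-denied errors, for example, land in the same bucket).

```python
import ctypes
import os
import sys
import tempfile

# Value that GetFileAttributes returns when the call fails.
INVALID_FILE_ATTRIBUTES = 0xFFFFFFFF


def path_exists(path: str) -> bool:
    """Hypothetical helper: True if `path` names an existing file or directory."""
    if sys.platform == "win32":
        get_attrs = ctypes.windll.kernel32.GetFileAttributesW
        get_attrs.argtypes = [ctypes.c_wchar_p]
        get_attrs.restype = ctypes.c_uint32  # DWORD
        # A single call, and no handle to close afterwards.
        return get_attrs(path) != INVALID_FILE_ATTRIBUTES
    # Assumption: off Windows, os.stat stands in so the sketch still runs.
    try:
        os.stat(path)
        return True
    except OSError:
        return False


# Small demonstration using a throwaway directory.
with tempfile.TemporaryDirectory() as scratch:
    target = os.path.join(scratch, "File.txt")
    before = path_exists(target)   # file not created yet
    open(target, "w").close()
    after = path_exists(target)    # file now present
    print(before, after)
```

By contrast, an open-based test would need a second call to release the handle, and that second call is the extra round-trip mentioned above.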

But back in the old days, the preference for GetFileAttributes wasn't just a performance tweak. If you tried to open the file to see if it existed, you could get the wrong answer!

Some network providers had a feature called a "data path". You could add directories to the data path, and if any attempt to open a file failed, the network provider would try again in all the directories in the data path. For example, suppose your data path was \\server1\backups\dir1;\\server1\backups\dir2 and you tried to open the file File.txt. If the file File.txt could not be found in the current directory, the network provider would try again, looking for \\server1\backups\dir1\File.txt, and if it couldn't find that file, it would try again with \\server1\backups\dir2\File.txt.

All this extra path searching happened behind Windows's back. Windows had no idea it was happening. All it knew was that it issued an open call, and the open succeeded.

As a result, if you used the "open a file to see if it exists" algorithm, you would get the wrong result for any file that existed on the data path but not in the current directory. These network providers didn't search the data path in response to GetFileAttributes, however, so a call to GetFileAttributes would fail if the file didn't exist in the current directory, regardless of whether it existed on the data path.
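To make the difference concrete, here is a toy model of such a provider in Python. Everything in it is hypothetical: the ToyProvider class, its method names, and the current directory C:\work are invented for illustration, and no real network provider API is being shown. Only the data path entries and the file name come from the example above.

```python
from typing import List, Optional, Set


class ToyProvider:
    """Toy model (not a real API) of a provider that retries opens along a data path."""

    def __init__(self, files: Set[str], data_path: List[str], current_dir: str):
        self.files = files
        self.data_path = data_path
        self.current_dir = current_dir

    def get_attributes(self, name: str) -> bool:
        # Attribute queries look only in the current directory.
        return self.current_dir + "\\" + name in self.files

    def open(self, name: str) -> Optional[str]:
        # Opens try the current directory first, then every data path entry.
        for directory in [self.current_dir] + self.data_path:
            candidate = directory + "\\" + name
            if candidate in self.files:
                return candidate  # the caller only learns that the open succeeded
        return None


provider = ToyProvider(
    files={r"\\server1\backups\dir2\File.txt"},
    data_path=[r"\\server1\backups\dir1", r"\\server1\backups\dir2"],
    current_dir=r"C:\work",  # hypothetical current directory
)

found_by_open = provider.open("File.txt")
exists_by_attributes = provider.get_attributes("File.txt")

print("open-based test says it exists:", found_by_open is not None)
print("attribute-based test says it exists:", exists_by_attributes)
```

In this model the open-based test reports that File.txt exists even though nothing by that name is in the current directory, while the attribute-based test reports that it does not.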

I don't know if any network providers still implement this data path feature, but Windows code in general sticks to GetFileAttributes, because it's better to be safe than sorry.

Comments (45)
  1. andy says:

    In general this behavior seems to be confusing: you save a file to somewhere in the “data path”. Later you try to access it, but Windows returns “file not found” instead.

    So, I’m wondering if some context, or more background, is missing here? But there is a good chance that I’m just stupid too :)

    [Not sure which behavior “this behavior” is. If you save a file to c:\dir\file.txt and try to access it as c:\dir\file.txt, it will succeed. What’s so confusing about that? -Raymond]
  2. Tim Smith says:

    Wow…  I’ve been using that trick for years for the wrong reason.  :)

    It is still a cute performance tweak.

  3. Dave says:

    The "data path" adds uncertainty to a file’s real location, similar to the way Vista’s "folder virtualization" does. :-) There isn’t an easy way out of the "writing to Program Files" mess.

  4. That sounds more like a "best practice" than a "superstition."

  5. manyirons says:

    I must be really old, because I always used access() to determine file existence.  Now I have to add a stupid leading underscore which makes it look spiky, but it still seems to work.

  6. Tim Smith says:

    _access uses GetFileAttributes. :)

    Many old fart Windows programmers avoid using the CRTL for things that can easily be done in Win16/32 to avoid the extra code.

  7. me says:

    As long as you don’t use access/GetFileAttributes to check for file existence before opening the file (yes, I’ve seen crap like this).

  8. Illuminator says:

    I’m only 27, does using GetFileAttributes still make me an old fart?

  9. Bill Arnette says:

    Hmmm…what are we who use PathFileExists()

  10. Wow, this "data path" stuff sounds scary.  I sure hope it’s a thing of the past.

    Thanks for sharing this interesting piece of history Raymond.

  11. Wow, Windows 2003 still has APPEND.EXE in it, which has been doing exactly this since DOS 2 (probably).

  12. rick says:

    We used to use GetFileAttributes and switched to CreateFile, cause GetFileAttributes lies.

  13. Arno says:

    > We used to use GetFileAttributes and switched to CreateFile, cause GetFileAttributes lies.

    I am curious: when does it lie?

    Arno

  14. waleri says:

    >> We used to use GetFileAttributes and switched to CreateFile

    And how about permissions?

  15. Tim Lesher says:

    >> That sounds more like a "best practice" than a "superstition."

    What’s the difference between a "best practice" and a superstition?

    My definition: if you can remember the reason you do it, and you’ve validated that the reason is still valid, then it’s a "best practice."

    Otherwise, it’s superstition, with varying degrees of probable truthiness.

  16. poochner says:

    The data path concept is analogous to the execution path in %PATH%, with the exception that it was/is specific to a given network provider and worked for any kind of file open.  And confused people no end, slowed things down, and caused more trouble than it was worth.

  17. jared says:

    Rick,

    I also am curious: when does GetFileAttributes lie?  Do you mean that you want to check if the file is accessible?  Because in this case, I could see using CreateFile so you would have the handle.  

    Typically, I don’t care if I have access… instead I just want to see if the file exists so I call GetFileAttributes

    jared

  18. Zirak says:

    Isn’t using GetFileAttributes to check for file existence almost always a race condition?

    You rarely want to know does_file_exist_right_now.  Almost always, you want to know open_file_if_it_doesnt_exist or open_file_if_it_exists.

    Does someone have an example of GetFileAttributes usage to check for file existence, that doesn’t lead to a race condition?

  19. Jim says:

    I remember those data paths (roughly analogous to DOS’s PATH environment variable). I can remember they made it merry hell trying to fix problems related to file versions/missing files etc.

    Also thinking back it seems as though there’d be quite a potential for things like escalation of privilege/trojan horse type attacks if your admin isn’t very careful about setting the folder permissions to all those places on the data path? Perhaps one reason why you don’t hear of them anymore.

  20. A. Skrobov says:

    Zirak: for UI, e.g. to disable buttons that would result in a "File not found" message otherwise.

  21. Zirak says:

    A. Skrobov,

    > Zirak: for UI, e.g. to disable buttons that would result in a "File not found" message otherwise.

    But those buttons must do something, presumably with the contents of the file.  Rarely would UI care if the file exists at that specific point in time.  So shouldn’t the program just CreateFile instead?  GetFileAttributes is another API that returns an answer that is immediately out of date.

    You should be happy, Raymond, I learned this line of thinking from reading this blog!

  22. ender says:

    > Wow, Windows 2003 still has APPEND.EXE in it, which has been doing exactly this since DOS 2 (probably).

    Heh, I just noticed it’s there in Windows XP x64 – as a DOS executable (which doesn’t work in x64; there’s also sysedit.exe, which is a 16bit NE executable). Remnants of the past?

  23. Mo says:

    APPEND.EXE will still exist for compatibility. It’ll more than likely do nothing for Win32 applications, but other DOS apps (possibly only others in the same VDM? I’d have to test to find out) would presumably be affected by it.

    I do remember confusing Win95 with the help of SUBST (one of the most useful utilities ever, back in the day) a long time ago. I can’t remember how NTOS-based systems behave (I’m guessing the same as I outlined with APPEND)…

    Or does the VDM do something sneaky and clever?

  24. andy says:

    With regards to my comment at the top:

    "this behavior" was to be understood as a user saving a file somewhere and when later trying to access it received a "file not found" error, because GetFileAttributes doesn’t find it.

    But, good chance that I have misunderstood how the "data path" feature works. I was thinking more in the lines of symbolic links. After reading your reply, and the other comments, I see that I should’ve thought about the call sequence a little instead.

    Thanks!

  25. BryanK says:

    I *assume* that the data path didn’t come into play if you specified a full path to the file, though, right?  Because I can’t see any possible way to concatenate an element in the data path with either "\\testserver\share\file1.txt" or "c:\whatever\file1.txt" that would make any sense at all.

    In both cases, you’d end up getting a file handle from a completely different machine than the one you’re expecting…

  26. Dean Harding says:

    Mo: SUBST.EXE works as expected. You even get a new drive letter in your "My Computer" folder!

    I don’t know how APPEND.EXE works… it doesn’t seem to exist in Vista (x64 at least)

  27. Dan says:

    This sounds just like the behavior of Windows and DOS with regards to the PATH environment variable… except PATH only applies to commands, not all files.

    From my complete lack of knowing anything about APPEND until now and my razor-sharp "programname /?" skills, APPEND looks closer in functionality to this behavior… although it is DOS only, as evidenced by lack of long file name support (run it from a long file name directory and you see Windows dump you into the equivalent short file name directory) and subsequent analysis:

    C:\WINDOWS\system32>trid append.exe

    TrID/32 – File Identifier v2.02 – (C) 2003-06 By M.Pontello

    Definitions found:  2637

    Analyzing…

    Collecting data from file: append.exe

    47.7% (.EXE) Generic Win/DOS Executable (2002/3)

    47.7% (.EXE) DOS Executable Generic (2000/1)

     3.5% (.TAR) TAR archive (149/5)

     0.9% (.CPT) Corel Photo Paint (41/41)

    After this I was hoping it was also a secret TAR archive, but it wasn’t. :(

  28. Worf says:

    I’m certain those executables were rewritten to use the right Win32 calls to simulate old DOS behavior. I.e., they really affect the Win32 environment. (SUBST does this via some mapping call in Win32 or other…)

    After all, many old DOS users used those commands for various tricks, and part of compatibility is user muscle compatibility.

  29. KeithMo says:

    (Slightly OT, but…)

    GetFileAttributes() is very fast — there’s a well-greased fast path between the I/O subsystem and the file system. This fast path allows IO to query the attributes without the overhead of creating a file object, creating a file handle, etc.

  30. Leo Tsarev says:

    Also we have another reason to use GetFileAttributes(..): it’s faster. CreateFile(..) may involve pre-caching of the file. Even if the FS driver doesn’t pre-read the file, it may decide to read the FAT or B-extent. GetFileAttributes(..) requires only a read from the directory/MFT/i-node.

    I have a question: why don’t we have IsFileExists(..)? The API would become clearer and more to the point. The question "what should I use to check if a file exists" would have only one answer. .NET has File.Exists(..), for example.

    Also, reading file attributes on NTFS may require an additional disk seek. IsFileExists would need to check only the directory, whereas reading file attributes needs the MFT too.

  31. SUBST says:

    > SUBST (one of the most useful utilities ever, back in the day)

    Is/Would be useful still today.

    Sadly it interacts badly with USB removable drives (win assigns the same letter for some unknown reason).

  32. bob says:

    @Dan:

    By looking at the APPEND.EXE binary I draw two conclusions: a) it’s an MS-DOS 5.0 executable and b) it is entirely coded in assembly language natively (by hand) and not with the use of a C or other high-level language compiler (wrong?)

  33. error doesn't compute says:

    The file could be gone before GetFileAttributes returns. To base any conclusions on a previous file existence is a fundamentally wrong thing to do.

  34. Dean Harding says:

    error doesn’t compute: Like anything to do with race conditions like this, the reason for doing it is usually as a courtesy to the user. For example, the "File Open" common dialog has an option to check whether the file exists before returning. In this case, using GetFileAttributes makes sense (because it’s so unlikely that a file will disappear from the user’s My Documents folder at that point).

  35. Neil says:

    > SUBST (one of the most useful utilities ever, back in the day)

    > Is/Would be useful still today.

    > Sadly it interacts badly with USB removable drives (win assigns the same letter for some unknown reason).

    I think it was possibly covered in an earlier blog post. The same issue happens with network drives. Something to do with subst and network drives being a per-user setting and so invisible to the drive letter allocator. That’s why XP maps network drives working down from Z: so that it’s less likely to conflict with removable drives working up from D:.

  36. Chris J says:

    Being a slightly older fart(?) I’ve always used _fstat() to test for existence, mostly as that’s what you’d check in a UNIX env where I started. There’s nothing in the MSDN page for _fstat and its ilk that says whether GetFileAttributes() should be used, or indeed whether they are identical. Are there caveats with using _fstat() (and any other CRT call) over the native Win32 equivalent?

  37. Name required says:

    I’m still confused as to why I would want to call a function called CreateFile when I actually want to open an existing file.

    [I already answered this question. -Raymond]
  38. Xepol says:

    Actually, the worst does file exist routine I have seen is in Virtual Server 2005.

    To check if the save state file exists, it tries to create a folder of the same name.  Problem is that they don’t remove the folder if it gets created, and then, lucky you, it can’t create the save state file and the virtual machine doesn’t start up.

    And unbelievably, the people who wrote that code WORK for Microsoft.

    [Norman, is that you? I don’t know what’s so unbelievable about that. People make mistakes. Microsoft employees are people. -Raymond]
  39. ulric says:

    > There’s nothing in the MSDN page for _fstat and its ilk that says whether GetFileAttributes() should be used, or indeed whether they are identical.

    Hey, the source code of the C Run Time is all  included in VC++, so you can check there!  No need to ask.

  40. KJK::Hyperion says:

    Leo: CreateFile isn’t inherently slower from the point of view of the filesystem. If you only ask for FILE_READ_ATTRIBUTES access (instead of the GENERIC_READ catch-all), it should be about the same.

    The reason GetFileAttributes is much faster is that it performs open/read attributes/close in a single call to kernel mode, plus it short-circuits some internal steps (for example, the file object used in the whole operation isn’t a real file object that you could have a handle to, just a structure that resembles one closely enough)

  41. Igor says:

    Q: Why is GetFileAttributes the way old-timers test file existence?

    A: Because DoesFileExists() doesn’t exist.

  42. Xepol says:

    No, I’m not Norman.  

    It’s hard to believe because :

    a) The code is relatively new, so valid techniques for file exists are already well known.

    b) The method is SO destructive, it makes you wonder how it got missed.

    c) MS employees might be people, but they work with the people who wrote the OS.  I would assume that not only are they floating in code examples, but they have a reasonable chance of asking someone who knows what they are doing how to accomplish something correctly.

    d) MS develops in teams, so basically this code was missed by EVERYONE who reviewed it, as a team failure.  

    So ya, MS employees are people too, people with more resources and expertise available to them than your average programmer and as such, the bar is set just a little bit higher than the rest of us.

    That said, even if the bar was set exactly level, code this bad in a production version is still surprising considering the beta cycle and team development setting.

  43. I was reading Raymond’s Superstition: Why is GetFileAttributes the way old-timers test file existence?

  44. GregM says:

    Q: Why is GetFileAttributes the way old-timers test file existence?

    A: Because DoesFileExists() doesn’t exist.

    That may have been true then, and is technically true now only because the function is called PathFileExists().

  45. strict says:

    This blog post doesn’t clarify if GetFileAttributes() is endorsed by Microsoft to test for file existence.
