What makes a valid Windows file name?


A common question for people starting to program on Windows is, “What makes a valid Windows file name?” You want to use this information to make simplifying assumptions in your code: that names can be no longer than MAX_PATH, that two names won’t differ only by case, etc. Unfortunately, the answer to what makes a valid file name in Windows is not simple.


Due to the layering of Windows architecture, the definition of a “legal” file name may vary depending upon the component of the operating system you are dealing with.


·          NTFS and the POSIX subsystem have the most permissive definition of a “legal” name. A path may be up to roughly 32,767 Unicode characters long (each individual component is limited to 255 characters). The name can contain trailing periods and trailing spaces, and two files may have names that differ only in case (e.g., README.TXT and readme.txt).


·          The Win32 subsystem enforces additional constraints on legal file names. The name can be at most MAX_PATH characters long (defined in windef.h as 260 characters) and may not have trailing dots or spaces. File names are case preserving but not case sensitive: if two files exist with names that differ only in case, you will only be able to manipulate one of them through Win32 APIs.


·          DOS and 16-bit Windows applications are still limited to “8.3” names.


See Inside Windows 2000, pages 729ff, for more information on the different constraints on file names.
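
As a rough illustration of the Win32 constraints listed above, the following Python sketch flags paths that an ordinary Win32 call would probably not be able to reach. The function names win32_path_problems and case_collisions are hypothetical helpers invented for this example, and upper() is only an approximation of the case folding Windows actually performs. It covers only the rules mentioned in this post, not every restriction the system imposes.

```python
MAX_PATH = 260  # value quoted above from windef.h; the count includes the terminating NUL


def win32_path_problems(path):
    """Return reasons why `path` may be unreachable through ordinary Win32 calls.

    Hypothetical helper: it checks only the constraints named in the text.
    """
    problems = []
    # Windows counts UTF-16 code units, not Python code points.
    utf16_units = len(path.encode("utf-16-le", "surrogatepass")) // 2
    if utf16_units > MAX_PATH - 1:  # leave room for the terminating NUL
        problems.append("longer than MAX_PATH")
    for part in path.replace("/", "\\").split("\\"):
        if part in ("", ".", ".."):
            continue  # separators and relative markers are not file names
        if part.endswith((".", " ")):
            problems.append("trailing dot or space in %r" % part)
    return problems


def case_collisions(names):
    """Group names that differ only by case (approximated with str.upper)."""
    groups = {}
    for name in names:
        groups.setdefault(name.upper(), []).append(name)
    return [group for group in groups.values() if len(group) > 1]
```

A management tool could run checks like these over a directory listing to decide which files need special handling before it tries to open or delete them.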


These differences have practical consequences for any code that attempts to manage files that could be created by another program. If your management code uses DOS (heaven forbid!) or Win32 APIs to manipulate files, it is possible for an untrusted program to create files that your program cannot open or manipulate. For example, a user connected to a POSIX-based FTP server could create files with names longer than MAX_PATH. If the administrator uses a Win32-based program to manage the FTP upload directory, the administrator will not be able to open, delete, or otherwise manipulate the files with long names.


If you are writing a Win32-based program that manages arbitrary files, consider prepending “\\?\” to the start of file names before you call CreateFile(), DeleteFile(), MoveFile(), etc. This escape sequence at the start of a file name instructs the Win32 subsystem to bypass its normal name checking functions, and you will be able to use any valid NTFS name from your Win32 program. Because the usual path processing is skipped, the prefix only works with fully qualified paths passed to the Unicode versions of these functions.
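
Here is a minimal sketch of the prefixing step, written in Python so that the string handling can be tried on any platform. The helper name to_extended_path is hypothetical. The UNC form shown (“\\?\UNC\server\share”) is the documented counterpart of the prefix for network paths, which the text above does not cover. The sketch assumes the caller supplies a fully qualified path and does not resolve “.” or “..” components.

```python
import ntpath

EXTENDED_PREFIX = "\\\\?\\"          # the literal characters \\?\
UNC_EXTENDED_PREFIX = "\\\\?\\UNC\\"  # the literal characters \\?\UNC\


def to_extended_path(path):
    """Prepend the \\\\?\\ escape to a fully qualified Windows path.

    Hypothetical helper for illustration; relative paths are rejected
    because the prefix turns off the processing that would resolve them.
    """
    # Already prefixed, or a device path such as \\.\name: leave untouched.
    if path.startswith((EXTENDED_PREFIX, "\\\\.\\")):
        return path
    # Forward slashes are not translated once the prefix is present.
    path = path.replace("/", "\\")
    if path.startswith("\\\\"):
        # \\server\share\file becomes \\?\UNC\server\share\file
        return UNC_EXTENDED_PREFIX + path[2:]
    drive, rest = ntpath.splitdrive(path)
    if len(drive) != 2 or drive[1] != ":" or not rest.startswith("\\"):
        raise ValueError("need a fully qualified path, got %r" % path)
    return EXTENDED_PREFIX + path
```

On Windows, the returned string can be passed to calls such as open() or os.remove(), or from C to the Unicode entry points such as CreateFileW(), to reach names with trailing dots, trailing spaces, or excessive length.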


 


Comments (9)

  1. Aspirer says:

    You’ve got a blog fan already. I will put a link to your blog soon :) and thanks for the pretty handy tips.

  2. ericnewton76.at.hotmail.com says:

    Why does the POSIX subsystem get longer max paths? I’ve had an issue where an open FTP site was letting hackers/freakers/whatever store movie files on the FTP server (nt-ftp hosted). They essentially created files that, to this day, I wasn’t able to delete without reformatting the drive… and I couldn’t just leave them because it was a ripped Spiderman movie (a couple of days before it actually came out) and occupied 1.2GB, times 5 for the 4 other movies.

    It was the most absurd thing that Explorer couldn’t delete a file in the file system, and "del spiderman*" wouldn’t work either…

  3. billg says:

    ya da da

    ya da da-dah!

    ya da da

    ya da da-dah!

    windows, so broken, why do I bother?

    windows, so broken, why do we care?

  4. rmorris says:

    Hey all, this is great information. I was wondering if anyone reading this has knowledge that could help me with a problem I’m having.

    I’m developing a file-based settings system for an application I’m writing. To support multiple windows user accounts, I’m creating a separate file for each user (ie WinXP profiles) in a shared directory.

    To do this, I’m creating a "<user name>.dat" file. The problem I’m having is, some windows user names (as returned from GetUserName() in the Win32 api) are invalid file names. Any ideas? I can take invalid chars and "safe" them, eg "bob*foo" -> "bob_foo". But I’m not sure what characters I need to safe.

    Other ideas?

  5. Paul says:

    FYI, the system puts a lot more restrictions on valid file names than is mentioned above or listed in the Windows book. I don’t have time to go into it in any detail, though.

  6. Trey wacker says:

    cute euphemism, "Due to the layering of Windows architecture"

  7. Broken Molds says:

    rmorris,

    An easy solution to your dilemma is to create a hash of the user name and use that for your file name. MD5 (which produces a 128-bit digest) would work great for this.

    -Jason