Path Format Overview


As promised, here is the start of a look into paths on Windows. I'll keep things simple at first and layer on complexity in later posts. This post is a top-level overview of the path formats in use on Windows.

DOS Paths

Let's start with the DOS path format that has been with us since DOS 2.0 (subdirectories did not exist in 1.0):

C:\Test\Foo.txt

The path is made up of three components that are broken up by the backslash character (technically a component separator, but usually called a directory separator). The first component is the volume, or drive. The second component is a directory name. The final component is a file name.
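To make the three components concrete, here is a minimal sketch using Python's standard `pathlib.PureWindowsPath`, which applies Windows path rules on any OS and never touches the disk. The variable names are mine, chosen for illustration.

```python
from pathlib import PureWindowsPath

# PureWindowsPath only parses the string; no file needs to exist.
path = PureWindowsPath(r"C:\Test\Foo.txt")

volume = path.drive            # drive letter plus the volume separator
components = path.parts        # root (volume + separator), then each component
directory = path.parent.name   # the directory component
file_name = path.name          # the final component

print(volume, components, directory, file_name)
```

Note that `parts` reports the volume together with its trailing separator as the first element, which lines up with the "full volume name" idea described next.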

A fully qualified or absolute DOS path must be composed of at least a full volume name. For DOS paths this means a drive letter (a-z), a volume separator (:), and a directory separator. If a path doesn't start with all three characters, it is considered partially qualified: relative to the current directory in some way (or to a prior current directory on another drive).
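The three-character rule is easy to express in code. The following is a sketch, not how Windows itself implements the check; the function name is hypothetical, and accepting `/` as an alternate separator is my assumption on top of what is described above. It only covers DOS paths, so other formats that are fully qualified by their own rules will return `False` here.

```python
import string

# Assumption: "/" is treated as an alternate directory separator.
SEPARATORS = ("\\", "/")


def is_fully_qualified_dos(path: str) -> bool:
    """Return True when path starts with drive letter, ':' and a separator."""
    return (
        len(path) >= 3
        and path[0] in string.ascii_letters   # drive letter a-z / A-Z
        and path[1] == ":"                    # volume separator
        and path[2] in SEPARATORS             # directory separator
    )


for candidate in (r"C:\Test\Foo.txt", "C:Foo.txt", r"\Test\Foo.txt", "Foo.txt"):
    print(candidate, is_fully_qualified_dos(candidate))
```

Only the first candidate is fully qualified. `C:Foo.txt` is relative to the current directory of drive C:, `\Test\Foo.txt` is relative to the current drive, and `Foo.txt` is relative to the current directory.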

UNC Paths

The next path format, the Universal Naming Convention (UNC), came (via LanManager) from the need to access network resources:

\\Server\Share\Test\Foo.txt

UNCs are identified by the fact that they start with two separators. The first component is the host name (server), which is followed by the share name; the two together make up the volume. Server names can be NetBIOS machine names, fully qualified domain names (FQDNs), or IP addresses (both IPv4 and IPv6 are supported). The rest of the path works the same way as in a DOS path.

If a UNC doesn't contain a full server and share, it is not relative; it is simply invalid. You can't set the current directory to a UNC (you can, however, map a given UNC to a drive letter to use relative paths with shares).
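A small sketch of the server/share rule, assuming backslash separators only. The function name `split_unc` is hypothetical, and this is plain string handling rather than anything Windows does internally. It rejects paths that lack a server or share instead of treating them as relative, and it refuses a server component of `.` or `?`, since those prefixes mark a different path format.

```python
from typing import Tuple


def split_unc(path: str) -> Tuple[str, str, str]:
    """Split a UNC path into (server, share, rest of path)."""
    if not path.startswith("\\\\"):
        raise ValueError(f"not a UNC path: {path!r}")
    # Drop the two leading separators, then split on the remaining ones.
    parts = path[2:].split("\\")
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError(f"UNC path needs a server and a share: {path!r}")
    if parts[0] in (".", "?"):
        # "\\.\" and "\\?\" introduce device paths, not a server name.
        raise ValueError(f"device path, not a UNC: {path!r}")
    server, share = parts[0], parts[1]
    rest = "\\".join(parts[2:])
    return server, share, rest


print(split_unc(r"\\Server\Share\Test\Foo.txt"))
```

The standard library agrees on what the volume is: `pathlib.PureWindowsPath(r"\\Server\Share\Test\Foo.txt").drive` returns the server and share together.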

DOS Device Paths

Windows NT (every current Windows OS is NT-based) has a unified object model that points to all resources, including files. These NT object paths are not directly accessible from the Windows APIs (and consequently from the CMD shell, File Explorer, etc.). They are, however, exposed to the Win32 layer through a special folder of symbolic links that legacy DOS and UNC paths are mapped to. This special folder is accessed via the DOS device path syntax, which is one of:

\\.\C:\Test\Foo.txt
\\?\C:\Test\Foo.txt

The \\.\ or \\?\ identifies the path as a DOS device path. The next component (C: in this case) is a symbolic link to the "real" NT device object. There is a specific link for UNCs called, not surprisingly, "UNC".

\\.\UNC\Server\Share\Test\Foo.txt
\\?\UNC\Server\Share\Test\Foo.txt

Like UNCs, DOS device paths are fully qualified by definition. Current directories never enter into their usage.
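Putting the three formats side by side, the leading characters alone are enough to tell them apart. The sketch below is pure string handling under my own assumptions: the function names and the returned labels are hypothetical, only backslash separators are recognized, and no normalization is attempted. The second function rewrites a fully qualified DOS or UNC path into the equivalent DOS device syntax shown above.

```python
import string

DEVICE_PREFIXES = ("\\\\.\\", "\\\\?\\")   # \\.\ and \\?\


def classify(path: str) -> str:
    """Name the path format based only on how the path starts."""
    if path.startswith(DEVICE_PREFIXES):
        return "dos-device"
    if path.startswith("\\\\"):
        return "unc"
    if (len(path) >= 3 and path[0] in string.ascii_letters
            and path[1] == ":" and path[2] == "\\"):
        return "dos-fully-qualified"
    return "dos-relative"


def to_device_path(path: str, prefix: str = "\\\\?\\") -> str:
    """Rewrite a fully qualified DOS or UNC path in DOS device syntax."""
    kind = classify(path)
    if kind == "dos-device":
        return path                          # already in device syntax
    if kind == "unc":
        # \\Server\Share\... becomes <prefix>UNC\Server\Share\...
        return prefix + "UNC\\" + path[2:]
    if kind == "dos-fully-qualified":
        return prefix + path                 # C:\... becomes <prefix>C:\...
    raise ValueError(f"cannot convert a relative path: {path!r}")


print(to_device_path(r"C:\Test\Foo.txt"))
print(to_device_path(r"\\Server\Share\Test\Foo.txt"))
```

Relative paths are rejected on purpose: device paths never consult the current directory, so there is nothing sensible to convert them to without resolving them first.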

Terminology around DOS device paths and explanations of how they work are seriously lacking. I'll go into how these and all of the other path formats translate into the final NT path in later posts.

Up Next

Normalization. Most paths get normalized, which includes processing partially qualified paths and relative components (. and ..). Tune in next time for the deep dive.

Stupid DOS Tricks

A small fraction of the ways you can refer to the same file:

C:\>dir c:\test\foo.txt

 Volume in drive C is OS
 Volume Serial Number is 0000-0000

 Directory of c:\test

04/20/2016  07:00 PM                13 Foo.txt
               1 File(s)             13 bytes
               0 Dir(s)  56,278,192,128 bytes free

C:\>dir \\127.0.0.1\c$\test\foo.txt

 Volume in drive \\127.0.0.1\c$ is OS
 Volume Serial Number is 0000-0000

 Directory of \\127.0.0.1\c$\test

04/20/2016  07:00 PM                13 Foo.txt
               1 File(s)             13 bytes
               0 Dir(s)  56,278,192,128 bytes free

I'll spare you the output on the rest of these:

C:\>dir \\LOCALHOST\c$\test\foo.txt
C:\>dir \\.\c:\test\foo.txt
C:\>dir \\?\c:\test\foo.txt
C:\>dir \\.\UNC\LOCALHOST\c$\test\foo.txt
C:\>dir \\127.0.0.1\c$\test\foo.txt

Comments (4)

  1. Chris Guzak says:

    “Like UNCs, DOS device paths are fully qualified by definition. Current directories never enter into their usage.”

    in my experiment I also see that parent directory (..) and empty (.) segments don’t work either.

    1. JeremyKuhne says:

      The parent directory and current directory segments are evaluated as long as you don’t use \\?\ or \??\. \\.\ or //?/ (as it isn’t canonical) will hit the normalization.

  2. One question: what’s the notation for IPv6 addresses in UNC paths? I have two guesses:
    \\[::1]\c$\test\foo.txt (with square brackets, like URLs use to avoid any ambiguity about where the host part ends)
    \\::1\c$\test\foo.txt (with no brackets, because colons can’t appear immediately after the host in a UNC path anyway?)
    But I accidentally found out the answer when checking whether UNC paths can include a port number (or use a colon there at all): they can’t, but they chose the first option above anyway. (They can, however, use a colon or two after the filename to delimit an optional stream name and more-optional stream type.)

    And now I’m wondering how Windows decides what protocol to use for any given path, since apparently SMB isn’t necessarily the only option…

    1. JeremyKuhne says:

      Sorry, don’t know the logic there.
