This is the fourth in a series of blog posts on the Windows Subsystem for Linux (WSL). For background information you may want to read the architectural overview, introduction to pico processes, and WSL system calls blog posts.
Posted on behalf of Sven Groot.
One of the key goals for the Windows Subsystem for Linux is to allow users to work with their files as they would on Linux, while giving full interoperability with files the user already has on their Windows machine. Unlike a virtual machine, where you have to use network shares or other solutions to share files between the host and guest OS, WSL has direct access to all your Windows drives to allow for easy interop.
Windows file systems differ substantially from Linux file systems, and this post looks into how WSL bridges those two worlds.
File systems on Linux
Linux abstracts file system operations through the Virtual File System (VFS), which provides both an interface for user mode programs to interact with the file system (through system calls such as open, read, chmod, stat, etc.) and an interface that file systems have to implement. This allows multiple file systems to coexist, providing the same operations and semantics, with VFS giving a single namespace view of all these file systems to the user.
File systems are mounted on different directories in this namespace. For example, on a typical Linux system your hard drive may be mounted at the root, /, with directories such as /dev, /proc, /sys, and /mnt/cdrom all mounting different file systems which may be on different devices. Examples of file systems used on Linux include ext4, rfs, FAT, and others.
VFS implements the various system calls for file system operations by using a number of data structures such as inodes, directory entries and files, and related callbacks that file systems must implement.
The inode is the central data structure used in VFS. It represents a file system object such as a regular file, directory, symbolic link, etc. An inode contains information about the file type, size, permissions, last modified time, and other attributes. For many common Linux disk file systems such as ext4, the on-disk data structures used to represent file metadata directly correspond to the inode structure used by the Linux kernel.
While an inode represents a file, it does not represent a file name. A single file may have multiple names, or hard links, but only one inode.
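The relationship between names and inodes can be observed from user mode. The following is a minimal sketch, assuming a Linux system and a temporary directory on a file system that supports hard links; the file names and the helper function are made up for illustration.

```python
import os
import tempfile


def hard_link_demo():
    """Give one file two names and report (same_inode, link_count)."""
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "first_name")
        second = os.path.join(tmp, "second_name")
        with open(first, "w") as f:
            f.write("hello")
        # Add a second name (hard link) that refers to the same inode.
        os.link(first, second)
        a, b = os.stat(first), os.stat(second)
        return a.st_ino == b.st_ino, a.st_nlink


same_inode, link_count = hard_link_demo()
print(same_inode, link_count)
```

Both names report the same inode number, and the inode's link count reflects how many names point at it.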
File systems provide a lookup callback to VFS which is used to retrieve an inode for a particular file, based on the parent inode and the child name. File systems must implement a number of other inode operations such as chmod, stat, open, etc.
VFS uses a directory entry cache to represent your file system namespace. Directory entries only exist in memory, and contain a pointer to the inode for the file. For example, if you have a path like /home/user/foo, there is a directory entry for home, user, and foo, each with a pointer to an inode. Directory entries are cached for fast lookup, but if an entry is not yet in the cache, the inode lookup operation is used to retrieve the inode from the file system so a new directory entry can be created.
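To make the interplay between the directory entry cache and the lookup callback concrete, here is a toy model. It is only a sketch: the class names, the dictionary-based cache, and the inode numbers are assumptions made for illustration and do not mirror the real kernel data structures.

```python
class Inode:
    """Toy inode: a number plus children that stand in for on-disk data."""

    def __init__(self, number, children=None):
        self.number = number
        self.children = children or {}


class ToyFileSystem:
    """Hypothetical file system that only implements the lookup callback."""

    def __init__(self, root):
        self.root = root
        self.lookup_calls = 0

    def lookup(self, parent, name):
        # Retrieve the child inode from the parent inode and the child name.
        self.lookup_calls += 1
        return parent.children[name]


class ToyVfs:
    def __init__(self, fs):
        self.fs = fs
        self.dentry_cache = {}  # (parent inode number, name) -> Inode

    def resolve(self, path):
        inode = self.fs.root
        for name in filter(None, path.split("/")):
            key = (inode.number, name)
            if key not in self.dentry_cache:
                # Cache miss: ask the file system, then remember the entry.
                self.dentry_cache[key] = self.fs.lookup(inode, name)
            inode = self.dentry_cache[key]
        return inode


foo = Inode(4)
user = Inode(3, {"foo": foo})
home = Inode(2, {"user": user})
fs = ToyFileSystem(Inode(1, {"home": home}))
vfs = ToyVfs(fs)
vfs.resolve("/home/user/foo")  # three cache misses, three lookup calls
vfs.resolve("/home/user/foo")  # served entirely from the cache
```

The second resolve never reaches the file system, which is the point of caching directory entries.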
When an inode is opened, a file object is created for that file which keeps track of things like the file offset and whether the file was opened for read, write or both. File systems must provide a number of file operations such as read, write, sync, etc.
Applications refer to file objects through file descriptors. These are numeric values, unique to a process, that refer to any files the process has open. File descriptors can refer to other types of objects that provide a file-like interface in Linux, including ttys, sockets, and pipes. Multiple file descriptors can refer to the same file object, e.g. through use of the dup system call.
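Because the file offset lives in the file object rather than in the descriptor, two descriptors created with dup share it. A small sketch of this, using a temporary file whose name and contents are made up for the example:

```python
import os
import tempfile


def dup_shares_offset():
    """Read through one descriptor and report the offset seen by its duplicate."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.txt")
        with open(path, "wb") as f:
            f.write(b"0123456789")
        fd1 = os.open(path, os.O_RDONLY)
        fd2 = os.dup(fd1)  # second descriptor, same underlying file object
        try:
            os.read(fd1, 4)  # advances the shared file offset by 4 bytes
            return os.lseek(fd2, 0, os.SEEK_CUR)  # query the offset via fd2
        finally:
            os.close(fd1)
            os.close(fd2)


offset_seen_by_duplicate = dup_shares_offset()
print(offset_seen_by_duplicate)
```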
Special file types
Besides just regular files and directories, Linux supports a number of additional file types. These include device files, FIFOs, sockets, and symbolic links.
Some of these files affect how paths are parsed. Symbolic links are special files that refer to a different file or directory, and following them is handled seamlessly by VFS. If you open the path /foo/bar/baz and bar is a symbolic link to /zed, then you will actually open /zed/baz instead.
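The /foo/bar/baz example can be reproduced inside a scratch directory. This sketch assumes a Linux system where the current user may create symbolic links; the directory layout mirrors the example above but lives under a temporary directory instead of the real root.

```python
import os
import tempfile


def resolve_through_symlink():
    """Open foo/bar/baz where bar is a symlink to zed; return the real path."""
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.realpath(tmp)
        os.makedirs(os.path.join(root, "zed"))
        os.makedirs(os.path.join(root, "foo"))
        with open(os.path.join(root, "zed", "baz"), "w") as f:
            f.write("data")
        # bar is a symbolic link that points at the zed directory.
        os.symlink(os.path.join(root, "zed"), os.path.join(root, "foo", "bar"))
        resolved = os.path.realpath(os.path.join(root, "foo", "bar", "baz"))
        return os.path.relpath(resolved, root)


resolved_path = resolve_through_symlink()
print(resolved_path)
```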
Similarly, a directory may be used as a mount point for another file system. In this case, when a path crosses this directory, all inode operations below the mount point go to the new file system.
Special and pseudo file systems
Linux uses a number of file systems that don’t read files from a disk. TmpFs is used as a temporary, in-memory file system, whose contents will not be persisted. ProcFs and SysFs both provide access to kernel information about processes, devices and drivers. These file systems do not have a disk, network or other device associated with them, and instead are virtualized by the kernel.
File systems on Windows
Windows generalizes all system resources into objects. These include not just files, but also things like threads, shared memory sections, and timers, just to name a few. All requests to open a file ultimately go through the Object Manager in the NT kernel, which routes the request through the I/O Manager to the correct file system driver. The interface that file system drivers implement in Windows is more generic and enforces fewer requirements. For example, there is no common inode structure or anything similar, nor is there a directory entry; instead, file system drivers such as ntfs.sys are responsible for resolving paths and opening file objects.
File systems in Windows are typically mounted on drive letters like C:, D:, etc., although they can be mounted on directories in other file systems as well. These drive letters are actually a construct of Win32, and not something that the Object Manager directly deals with. The Object Manager keeps a namespace that looks similar to the Linux file system namespace, rooted in \, with file system volumes represented by device objects with paths like \Device\HarddiskVolume1.
When you open a file using a path like C:\foo\bar, the Win32 CreateFile call translates this to an NT path of the form \DosDevices\C:\foo\bar, where \DosDevices\C: is actually a symbolic link to, for example, \Device\HarddiskVolume4. Therefore, the real full path to the file is actually \Device\HarddiskVolume4\foo\bar. The Object Manager resolves each component of the path, similar to how VFS would in Linux, until it encounters the device object. At this point, it forwards the request to the I/O Manager, which creates an I/O Request Packet (IRP) with the remaining path, which it sends to the file system driver for the device.
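The translation steps can be sketched with plain string handling. This is a toy model, not how the Object Manager is implemented: the link table, the function names, and the volume number (taken from the example above) are all assumptions for illustration.

```python
# Hypothetical Object Manager symbolic link table for this example only.
OBJECT_MANAGER_SYMLINKS = {
    r"\DosDevices\C:": r"\Device\HarddiskVolume4",
}


def win32_to_nt_path(win32_path):
    """Prefix a drive-letter path the way the text describes for CreateFile."""
    return "\\DosDevices\\" + win32_path


def resolve_nt_path(nt_path):
    """Return (device object path, remaining path for the file system driver)."""
    for link, target in OBJECT_MANAGER_SYMLINKS.items():
        if nt_path == link or nt_path.startswith(link + "\\"):
            return target, nt_path[len(link):]
    raise LookupError("no symbolic link matches " + nt_path)


nt_path = win32_to_nt_path(r"C:\foo\bar")
device, remaining = resolve_nt_path(nt_path)
print(nt_path, device, remaining)
```

The remaining path is what would travel in the IRP to the file system driver for the device.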
When a file is opened, the object manager creates a file object for it. Instead of file descriptors, the object manager provides handles to file objects. Handles can actually refer to any object manager object, not just files.
When you call a system call like NtReadFile (typically through the Win32 ReadFile function), the I/O manager again creates an IRP to send down to the file system driver for the file object to perform the request.
Because there are no inodes or anything similar in NT, most operations on files in Windows require a file object.
Windows only supports two file types: regular files and directories. Both files and directories can be reparse points, which are special files that have a fixed header and a block of arbitrary data. The header includes a tag that identifies the type of reparse point, which must be handled by a file system filter driver, or for built-in reparse point types, the I/O manager itself.
Reparse points are used to implement symbolic links and mount points. In these cases, the tag indicates that the reparse point is a symbolic link or mount, and the data associated with the reparse point contains the link target, or volume name for mount points. Reparse points can also be used for other functionality such as the placeholder files used by OneDrive in Windows 8.
Unlike Linux, Windows file systems are by default case preserving, but not case sensitive. In actuality, Windows and NTFS do support case sensitivity, but this behavior is not enabled by default.
File systems in WSL
The Windows Subsystem for Linux must translate various Linux file system operations into NT kernel operations. WSL must provide a place where Linux system files can exist, with all the functionality required for that including Linux permissions, symbolic links and other special files such as FIFOs; it must provide access to the Windows volumes on your system; and it must provide special file systems such as ProcFs.
To facilitate this, WSL has a VFS component that is modeled after the VFS on Linux. The overall architecture is shown below.
When an application calls a system call, this is handled by the system call layer, which defines the various kernel entry points such as open, read, chmod, stat, etc. For these file-related system calls, the system call layer has very little functionality; it basically just forwards the call to VFS.
For operations that use paths (such as open or stat), VFS resolves the path using a directory entry cache. If an entry is not in the cache, it calls into one of several file system plugins to create an inode for the entry. These plugins provide inode operations like lookup, chmod, and others, similar to the inode operations used by the Linux kernel. When a file is opened, VFS uses the file system’s inode open operation to create a file object, and returns a file descriptor for that file object. System calls operating on the file descriptor (such as read, write or sync) call file operations defined by the file systems. This system is deliberately very close to how Linux behaves, so WSL can support the same semantics.
VFS defines several file system plugins: VolFs and DrvFs are used to represent files on disk, and the remainder are the in-memory file system TmpFs and pseudo file systems such as ProcFs, SysFs, and CgroupFs.
VolFs and DrvFs are where Linux file systems meet Windows file systems. They are how WSL interacts with files on your disks, and serve two different purposes: VolFs is designed to provide full support for Linux file system features, and DrvFs is designed for interop with Windows.
Let’s look at these file systems in more detail.
VolFs
The primary file system used by WSL is VolFs. It is used to store the Linux system files, as well as the content of your Linux home directory. As such, VolFs supports most features the Linux VFS provides, including Linux permissions, symbolic links, FIFOs, sockets, and device files.
VolFs is used to mount the VFS root directory, using %LocalAppData%\lxss\rootfs as the backing storage. In addition, a few additional VolFs mount points exist, most notably /root and /home which are mounted using %LocalAppData%\lxss\root and %LocalAppData%\lxss\home respectively. The reason for these separate mounts is that when you uninstall WSL, the home directories are not removed by default, so any personal files stored there will be preserved.
Note that all these mount points use directories in your Windows user folder for storage. Each Windows user has their own WSL environment, and can therefore have Linux root privileges and install applications without affecting other Windows users.
Inodes and file objects
Since Windows has no equivalent of the inode, VolFs must keep a handle to a Windows file object in an inode. When VFS requests a new inode using the lookup callback, VolFs uses the handle from the parent inode and the name of the child to perform a relative open and get a handle for the new inode. These handles are opened without any read/write access to the files, and can only be used for metadata requests.
When a file is opened, VolFs creates a Linux file object that points to the inode. It also reopens the inode’s file handle with the requested read/write access and stores the new handle in the file object. This handle is then used to satisfy file operations like read and write.
Emulating Linux features
As discussed above, Linux diverges from Windows in several ways for file systems. VolFs must provide support for several Linux features that are not directly supported by Windows.
Case sensitivity is handled by Windows itself. As mentioned earlier, Windows and NTFS actually support case sensitive operations, so VolFs simply asks the Object Manager to treat paths as case sensitive regardless of the global registry key controlling this behavior.
Linux also supports nearly all characters as legal characters in file names. NT has more restrictions, where some characters are not allowed at all and others may have special meanings (such as ‘:’ denoting an alternate data stream). To support all Linux file names, VolFs escapes illegal characters in file names.
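The post does not spell out the escaping scheme, so the following is only a sketch of how a reversible escape could work. The '#' plus four hex digits format, the set of escaped characters, and the function names are assumptions for illustration, not a description of what VolFs does on disk.

```python
# Assumption: an illustrative set of characters to escape. '#' is included
# so that a literal '#' in a Linux name cannot be mistaken for an escape.
ESCAPED_CHARACTERS = set('\\:*?"<>|#')


def escape_name(name):
    """Replace each problematic character with '#' and its 4-digit hex code."""
    return "".join(
        f"#{ord(c):04X}" if c in ESCAPED_CHARACTERS or ord(c) < 32 else c
        for c in name
    )


def unescape_name(escaped):
    """Reverse escape_name by decoding every '#XXXX' sequence."""
    out = []
    i = 0
    while i < len(escaped):
        if escaped[i] == "#":
            out.append(chr(int(escaped[i + 1:i + 5], 16)))
            i += 5
        else:
            out.append(escaped[i])
            i += 1
    return "".join(out)


print(escape_name("a:b"))
```

Whatever the concrete encoding, the important property is the round trip: every legal Linux name must map to a legal NT name and back again without loss.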
Linux has some different semantics surrounding unlinking and renaming. Specifically, a file can be unlinked even if there are open file descriptors to the file. Similarly, a file can be overwritten as the target of a rename operation even if it’s still open. In Windows, if a file is requested to be deleted, it will only be deleted once the last handle to that file is closed, leaving the name visible in the file system until then. To support Linux unlink semantics, VolFs renames unlinked files to a hidden temporary directory before requesting deletion.
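The Linux behavior that VolFs has to emulate can be seen with a few calls. This sketch assumes it runs on Linux (or another POSIX system); on native Windows the unlink call would typically fail while the file is still open.

```python
import os
import tempfile


def read_after_unlink():
    """Unlink an open file, then keep using it through the descriptor."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"still here")
        os.unlink(path)  # the name disappears immediately
        name_gone = not os.path.exists(path)
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 100)  # the data stays reachable via the open fd
        return name_gone, data
    finally:
        os.close(fd)  # the storage is released once the last fd is closed


name_gone, data = read_after_unlink()
print(name_gone, data)
```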
Inodes in Linux have a number of attributes which don’t exist in Windows, including their owner and group, the file mode, and others. These attributes are stored in NTFS Extended Attributes associated with the files on disk. The following information is stored in the Extended Attributes:
- Mode: this includes the file type (regular, symlink, FIFO, etc.) and the permission bits for the file.
- Owner: the user ID and group ID of the Linux user and group that own the file.
- Device ID: for device files, the device major and minor number of the device. Note that WSL currently does not allow users to create device files on VolFs.
- File times: the file accessed, modified and changed times on Linux use a different format and granularity than on Windows, so these are also stored in the EAs.
In addition, if a file has any file capabilities, these are stored in an alternate data stream for the file. Note that WSL currently does not allow users to modify file capabilities for a file.
The remaining inode attributes, such as inode number and file size, are derived from information kept by NTFS.
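The mode value combines the file type and the permission bits in a single integer. The sketch below shows how such a value can be taken apart using Python's stat module. The LinuxAttributes record is hypothetical and is not the actual EA layout used by VolFs, which this post does not describe.

```python
import stat
from dataclasses import dataclass


@dataclass
class LinuxAttributes:
    """Hypothetical record of the attributes the text lists, not the EA format."""
    mode: int  # file type and permission bits
    uid: int   # owning user ID
    gid: int   # owning group ID

    def file_type(self):
        return stat.S_IFMT(self.mode)   # keeps only the file type bits

    def permissions(self):
        return stat.S_IMODE(self.mode)  # keeps only the permission bits

    def ls_string(self):
        return stat.filemode(self.mode)  # same form as the first column of ls -l


fifo = LinuxAttributes(mode=stat.S_IFIFO | 0o640, uid=1000, gid=1000)
print(fifo.ls_string(), oct(fifo.permissions()))
```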
Interoperability with Windows
While VolFs files are stored in regular files on Windows in the directories mentioned above, interoperability with Windows is not supported. If a new file is added to one of these directories from Windows, it lacks the EAs needed by VolFs, so VolFs doesn’t know what to do with the file and simply ignores it. Many editors will also strip the EAs when saving an existing file, again making the file unusable in WSL.
Additionally, since VFS caches directory entries, any modifications to those directories that are made from Windows while WSL is running may not be accurately reflected.
DrvFs
To facilitate interoperability with Windows, WSL uses the DrvFs file system. WSL automatically mounts all fixed drives with supported file systems under /mnt, such as /mnt/c, /mnt/d, etc. Currently, only NTFS and ReFS volumes are supported.
DrvFs operates in a similar fashion to VolFs. When creating inodes and file objects, handles are opened to Windows files. However, in contrast to VolFs, DrvFs adheres to Windows rules (with a few exceptions, noted below). Windows permissions are used, only legal NTFS file names are allowed, and special file types such as FIFOs and sockets are not supported.
Linux usually uses a simple permission model where a file allows read, write or execute access to either the owner of the file, the group, or everyone else. Windows instead uses Access Control Lists (ACLs) that specify complex access rules for each individual file and directory (Linux does also have the ability to use ACLs, but this is not currently supported in WSL).
When opening a file in DrvFs, Windows permissions are used based on the token of the user that executed bash.exe. So in order to access files under C:\Windows, it’s not enough to use “sudo” in your bash environment, which gives you root privileges in WSL but does not alter your Windows user token. Instead, you would have to launch bash.exe elevated to gain the appropriate permissions.
In order to give the user a hint about the permissions they have on files, DrvFs checks the effective permissions a user has on a file and converts those to read/write/execute bits, which can be seen for example when running “ls -l”. However, there is not always a one-to-one mapping; for example, Windows has separate permissions for the ability to create files or subdirectories in a directory. If the user has either of these permissions, DrvFs will report write access on the directory, while in fact some operations may still fail with access denied.
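The lossy nature of this mapping can be illustrated with a toy function for directories. Only the write rule comes from the text above; the right names, and the choice to map listing to read and traversal to execute, are assumptions made for this sketch.

```python
READ, WRITE, EXECUTE = 0o4, 0o2, 0o1


def directory_bits(rights):
    """Map a set of hypothetical Windows right names to rwx bits."""
    bits = 0
    if "list_directory" in rights:  # assumption: listing maps to read
        bits |= READ
    # From the text: either create permission is reported as write access.
    if "add_file" in rights or "add_subdirectory" in rights:
        bits |= WRITE
    if "traverse" in rights:  # assumption: traversal maps to execute
        bits |= EXECUTE
    return bits


# A user who may create subdirectories but not files still sees the w bit,
# so creating a file in this directory would fail despite what ls -l shows.
print(oct(directory_bits({"list_directory", "traverse", "add_subdirectory"})))
```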
Since your effective access to a file may differ depending on whether bash.exe was launched elevated or not, the file permissions shown in DrvFs will also change when switching between elevated and non-elevated bash instances.
When calculating the effective access to a file, DrvFs takes the read-only attribute into account. A file with the read-only attribute set in Windows will show up in WSL as not having write permissions. Chmod can be used to set the read-only attribute (by removing all write permissions, e.g. “chmod a-w some_file”) or clear it (by adding any write permissions, e.g. “chmod u+w some_file”). This behavior is similar to the CIFS file system in Linux, which is used to access Windows SMB shares.
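The chmod rule described above reduces to a single test on the write bits. A minimal sketch, with hypothetical function names:

```python
WRITE_BITS = 0o222  # write permission for owner, group, and others


def readonly_attribute_after_chmod(new_mode):
    """Read-only is set when no write bit remains, and cleared otherwise."""
    return (new_mode & WRITE_BITS) == 0


def shows_write_bit(acl_allows_write, readonly_attribute):
    """A file appears writable only if the ACL allows it and it is not read-only."""
    return acl_allows_write and not readonly_attribute


print(readonly_attribute_after_chmod(0o444))  # like chmod a-w: attribute set
print(readonly_attribute_after_chmod(0o644))  # like chmod u+w: attribute cleared
```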
Since the support is there in Windows and NTFS, DrvFs supports case sensitive files. This means it’s possible to create two files whose names differ only by case in DrvFs. Note that many Windows applications may not be able to handle this situation, and may not be able to open one or both of the files.
Case sensitivity is disabled on the root of your volumes, but is enabled everywhere else. So in order to use case sensitive files, do not attempt to create them under /mnt/c, but instead create a directory where you can create the files.
While NT supports symbolic links, we could not rely on this support because symbolic links created by WSL may point to paths like /proc which have no meaning in Windows. Additionally, NT requires administrator privileges to create symbolic links. So, another solution had to be found.
Unlike VolFs, we could not rely on EAs to indicate a file is a symbolic link in DrvFs. Instead, WSL uses a new type of reparse point to represent symbolic links. As a result, these links will work only inside WSL and cannot be resolved by other Windows components such as File Explorer or cmd.exe. Note that since ReFS lacks support for reparse points, it also doesn’t support symbolic links in WSL. NTFS however now has full symbolic link support in WSL.
Interoperability with Windows
Unlike VolFs, DrvFs does not store any additional information. Instead, all inode attributes are derived from information used in NT, by querying file attributes, effective permissions, and other information. DrvFs also disables directory entry caching to ensure it always presents the correct, up-to-date information even if a Windows process has modified the contents of a directory. As such, there is no restriction on what Windows processes can do with the files while DrvFs is operating on them.
DrvFs also uses Windows delete semantics for files, so a file cannot be unlinked if there are any open file descriptors (or handles from Windows processes) to the file.
ProcFs and SysFs
Like in Linux, these special file systems do not show files that exist on disk, but instead represent information kept by the kernel about processes, threads, and devices. These files are dynamically generated when read. In some cases, the information for the files is kept entirely inside the lxcore.sys driver. In other cases, such as the CPU usage of a process, WSL queries the NT kernel for this information. However, there is no interaction here with Windows file systems.
WSL provides access to Windows files by emulating full Linux behavior for the internal Linux file system with VolFs, and by providing full access to Windows drives and files through DrvFs. As of this writing, DrvFs enables some of the functionality of Linux file systems, such as case sensitivity and symbolic links, while still supporting interoperability with Windows.
In the future, we will continue to improve our support for Linux file system features, not only in VolFs but also in DrvFs. The goal is to reduce the number of scenarios that require you to stay in the VolFs mounts with all the limitations on interoperability that entails. These improvements are driven by the great feedback we get from the community on GitHub and UserVoice to help us target the most important scenarios.
Sven Groot and Seth Juarez explore WSL file system support.