Windows and Ubuntu Interoperability

Article
10/19/2016

This is the fifth in a series of blog posts on the Windows Subsystem for Linux (WSL). For background information you may want to read the architectural overview, introduction to pico processes, WSL system calls, and WSL file system blog posts.

Seth Juarez and Ben Hillis talk about Windows and Ubuntu interoperability

Introduction

This post will cover the design and implementation of the top requested feature for the Windows Subsystem for Linux - interoperability with Win32 applications. Since the earliest days that our feature was available to Windows Insiders, our users have been crucial in pointing out issues and suggesting key scenarios that they wish to leverage using WSL. It has been a gratifying and humbling experience to have so many people actively using WSL and providing feedback. We appreciate all the bug reports and suggestions; they have been an invaluable tool to prioritize features for future versions of Windows. It did not take long for one such feature to bubble to the top - Interoperability with Win32 applications.

First, let’s define what is meant by interoperability. In a broad sense, interoperability is the ability to mix and match NT and Linux binaries from within the same shell. For example, using bash.exe to navigate your file system and launching an NT graphical text editor from the current working directory. Another aspect is the ability for NT and Linux binaries to redirect their input and output. For example, filtering the output of a Windows command with the Linux grep binary. This functionality allows the Windows Subsystem for Linux to accomplish something that was previously not possible: seamlessly running your favorite native Windows and Linux tools from the shell of your choice. All of this functionality is available with Windows build 14951 and later.

Launching Processes in WSL

The following diagram describes how the various pieces fit together. When launching a process, there are a few pieces of state that need to be shared. The first is the current working directory. Second are the file objects that represent StandardInput, StandardOutput, and StandardError so output from the new process goes to the right place. For example, piping output to another process, redirecting output to a file, or making sure that the contents show up in the right console environment. The following sections will go into more details how this sharing is accomplished.

Launching /bin/bash

bash.exe uses the internal CreateLxProcess COM API to talk to the LXSS Manager service.
The LXSS Manager service translates the current working directory to a WSL path and marshals NT handles provided by bash.exe representing StandardInput, StandardOutput, StandardError, and the console with LxCore.sys.
The LXSS Manager sends a message via LxBus to our custom /init daemon.
The init daemon parses the message and unmarshals StandardInput, StandardOutput, StandardError, and the console; setting them to file descriptors 0, 1, and 2 and creating a tty file descriptor.
/init forks and execs the /bin/bash binary and goes back to listen for the next create process message.

Launching an NT binary from within a WSL instance

/bin/bash forks and execs an NT binary. LxCore.sys finds a binfmt_misc registration that handles Windows PE images and invokes the interpreter (/init).
/init (running as the binfmt_misc interpreter) translates the current working directory and marshals file descriptors 0, 1, and 2 (representing stdin, stdout, and stderr) with LxCore.sys.
/init sends a create NT process message to bash.exe.
Bash.exe unmarshals the file descriptors, which creates NT handles, and calls CreateProcess specifying the NT handles as StandardInput, StandardOutput, and StandardError.

Under the covers – LxBus

To discuss the mechanism by which some of the higher-level goals are accomplished, we will need some details on WSL’s inter-process communication mechanism, LxBus. LxBus is the communication channel between NT and WSL processes (primarily the LXSS Manager service and /init). It is a socket like protocol that also allows sending messages and sharing state between NT and WSL processes. LxBus has a lot of functionality and will likely be the topic of a future blog post, but let’s talk about a few of the things it can do.

Inside of a WSL instance, LxBus is accessed by opening the /dev/lxss device. This device is locked down to the /init daemon. There is also a /dev/lxssclient device which supports a subset of the functionality of the /dev/lxss device. On Windows, LxBus is accessed via the handle that represents the running WSL Instance. Once a file descriptor or handle to one of these devices is opened it supports several ioctls. The most relevant to our discussion are the “register server” and “connect to server” ioctls. A NT or WSL process that wishes to accept connections uses the “register server” ioctl to register a named LxBus server with the LxCore.sys driver. When the registration completes, a file descriptor or handle representing the server is returned to the caller. We call this object a LxBus ServerPort. You can think of the register server ioctl as the bind call in socket terminology. Once the server is registered, ServerPort objects support a “wait for connection” ioctl which functions similarly to a socket listen call.

When a client wishes to connect to a registered server, it uses the “connect to server” ioctl which specifies the name of the server it wishes to connect. If the server is waiting for a connection, a connection is established between the NT and WSL process. The NT process receives a handle that represents the connection and the WSL process receives a file descriptor. These reference an underlying file object called a LxBus MessagePort. MessagePorts support read and write for message passing. They also implement several ioctls that can accomplish various goals, primarily involving sharing resources between NT and WSL processes. For example: One ioctl allows a NT process to share a handle with a WSL process. It is this ability that allows us to redirect the output of a WSL process back to Windows.

IO Redirection from bash.exe

Initially when we released WSL, bash.exe only supported console input and output. This was fine if you launched bash.exe and treated it like an ssh connection to a remote machine, but quickly became inadequate if you wanted to incorporate WSL into scripts or Win32 applications. Our users were quick to point this out, in fact the #2 issue on our GitHub reports this limitation. Coming soon to Windows Insiders, you'll be able pipe to and from WSL processes as well as use NT files as input or output.
When launching a WSL process via the LxssManager’s COM interface, there are a variety of arguments. There are the obvious arguments like the path to the binary to be executed and the current working directory. Other arguments include the NT handles to be used as StandardInput, StandardOutput, and StandardError for the WSL process. These handles are passed up to the Lxss Manager service to be shared with the WSL process that is being created. We call the sharing of these handles marshalling. Marshalling is implemented as an ioctl on a LxBus MessagePort object that takes in a handle and returns a unique identifier. This identifier is then sent to the /init daemon along with the other create process parameters. Init parses this message and begins “unmarshalling” the handles so they can be used as input handles for the new WSL process. Unmarshaling is implemented as another MessagePort ioctl that takes the identifier and returns a file descriptor. Using this ioctl, /init unmarshals the three handles and calls the dup2 system call to replace file descriptors 0, 1, and 2. There is a fourth handle that is also marshalled, the console handle. This is used to create a tty device inside of a WSL instance. All the WSL processes spawned from a single bash.exe window will have the same tty device. Once all the handles are unmarshaled, /init uses the fork and exec system calls to create a child process.

If you're familiar with bash, you probably know about the -c argument which allows you to supply a command string to run (instead of running bash in interactive mode). Bash.exe sends the command line arguments along to /bin/bash so invoking commands can be done in a similar way.

Launching Win32 applications from within WSL

At the time of posting the most recent Windows Insider build now supports the ability to launch Win32 binaries from within the bash shell! This functionality was so highly desired that Xilun, one of our amazing GitHub users, created his own open source project called cbwin to accomplish this goal. His project works well, but since the WSL team own the entire pipeline we've come up with an integrated solution that doesn't require a launcher binary inside WSL or an additional Windows service. Turns out getting rid of the launcher binary is something that native Linux is also able to do.

Linux has a capability called binfmt_misc, that allows user-mode applications to register executable file formats that they can handle with the kernel. Java, Mono, and Wine are a few examples of applications that use this functionality to register file types with the Linux kernel. A registration string is colon-separated and has seven fields, some of them optional. To register a binfmt_misc interpreter a user-mode application writes registration string to the binfmt_misc register file (which is normally located in /proc/sys/fs/binfmt_misc/register). For example:

sudo echo ":WSLInterop:M::MZ::/init:" > /proc/sys/fs/binfmt_misc/register

WSLInterop - name of the registration. This must be unique.
M - type of registration (either Magic Byte Array or Extension).
MZ - Magic byte sequence for Windows PE images.
/init - Full path to the interpreter.

An interpreter is a script or binary on the system that knows how to handle the specified type of files. The exec system iterates over the list of registered binfmt_misc interpreters and looks for one that matches the file being executed. if a match is found the exec system call essentially puts the path to the interpreter in Argv[0] and shifts the rest of the arguments to the right by one (Argv[0] becomes Argv[1] and so on). If you're familiar with the "shebang" (#!) operator that you often see at the beginning of bash or Perl scripts this will probably sound familiar to you. Binfmt_misc is a more flexible way to accomplish the same goal without the limitation of having to add extra text to the header of a file, which makes it perfect for binaries.

As you noticed in the above example, we're using /init as our binfmt_misc interpreter. In WSL, /init is a multi-purpose binary that was written by Microsoft and is shipped as a binary resource contained in LxssManager.dll. When /init is launched, it first checks its PID. If the PID is 1, /init will run in "daemon mode" where it essentially the Lxss Manger service’s endpoint inside a WSL instance. If the PID is not 1, /init runs as "binfmt_misc interpreter mode" which allows launching NT binaries. In the future if we begin running other Ubuntu daemons we will likely switch our daemon from replacing /sbin/init to simply being another daemon on the system.

A couple of notes and caveats

NT processes will be launched with the same permission as the bash.exe window they are launched from.
NT binaries can only run out of DrvFs paths (for example: /mnt/c/Windows/System32).
Current working directory of launched NT processes will be inherited by NT processes if it is somewhere inside a DrvFs mount. Otherwise the NT process will inherit the current working directory that bash.exe was launched from.
In our initial implementation, NT environment variables and WSL environment variables are two disparate entities. This includes the $PATH environment variable. In development is a way to sharing of the user’s NT %PATH% environment variable with WSL processes – so stay tuned!
One of the first NT commands you'd probably think of trying is the Windows 'dir' command. Unlike Linux where '/bin/ls' is an actual binary, dir is a built-in command. So, if you want to run dir you'll have to run "/mnt/c/Windows/System32/cmd.exe /c dir". That's kind of a mouthful, but luckily bash allows you to create command aliases, for example:
alias dir='/mnt/c/Windows/System32/cmd.exe /c dir'
/init being used as the binfmt_misc interpreter is currently how things are implemented but this is subject to change.

IO Redirection to Win32

Launching NT processes from within bash is great, but in order to support NT command-line utilities there needs to be a way for NT binaries to access the file descriptors that represent stdin, stdout, and stderr inside a WSL process. This was achieved by introducing a set of API’s to marshal VFS file to an NT process. Let's walk through how this works.

If you're familiar with the CreateProcess API, you probably know that you can provide custom handles for StandardInput, StandardOutput, and StandardError. The problem is, we don't have a handle, we have a file descriptor inside of a WSL instance. Enter VFS file marshalling. A WSL process (the binfmt_misc interpreter) decides that it wants to marshal the stdin, stdout, and stderr file descriptors over to an NT process. It uses a special "marshal VFS file" ioctl on a LxBus MessagePort which takes in a file descriptor as an argument. The LxCore.sys driver handles this ioctl and uses the provided file descriptor to reference the underlying VFS file object. This object is then added onto a per MessagePort list of "marshalled VFS file objects". The ioctl call returns a unique token that identifies the VFS file object that was marshalled. The WSL process then sends this identifier over the MessagePort to the NT process vi an LxBus message.
When the NT process (bash.exe) receives this message, it uses the "unmarshal VFS file" ioctl which takes the identifier as an argument. LxCore.sys looks up this identifier in the list of marshalled file objects, and if found, removes the entry from the list and returns an NT handle to the VFS file object to the caller. This handle can now be used as one of the standard handles for the CreateProcess API. Since the handle references a file object within a WSL instance, rundown protection is acquired for the duration of each read or write operation (to ensure that WSL instance specific state does not get torn down during an NT read or write).

Conclusion

This post has covered Windows Interop; the most highly requested feature of Windows Subsystem. Interoperability between NT and Linux processes is a feature that allows users of Windows Subsystem for Linux to mix and match NT and Linux binaries from within the same shell, all running natively on the NT kernel. We briefly covered LxBus, the underlying IPC mechanism which makes communicating between NT and WSL processes possible. Armed with this knowledge the post then covered how LxBus is used to allow sharing of NT and VFS file objects between NT and WSL processes.

New features like this are greatly influenced by the feedback we get from the community via GitHub and User Voice, so please continue to make your voices heard! We at the Windows Subsystem for Linux team are excited to see how you use this functionality. Thanks for reading.