WSL Networking


Posted on behalf of Sunil Muthuswamy 

Disclaimer

The information presented in this blog post reflects the current design and is subject to change. Readers are expected to be familiar with the Overview of the Windows Subsystem and WSL System Calls blog posts.

Background

In this information age of “roam-anywhere, always connected” devices, networking plays a very important role. Users expect to be able to take advantage of complex capabilities and assume that they will be present at any point in time. This places a large burden on the networking stack to allow Internet access, data exchange, and state information to be communicated between distinct processes that live on the same device or a device across the world.

Overview

This post discusses networking within WSL. It explains how WSL configures networking within the subsystem, how it keeps that information up to date as things change, and how it implements the various Linux/BSD socket address families. The design decisions were made with the specific goal of providing full binary compatibility[1] to Linux applications and presenting Linux administrators and users with the same networking view within WSL that they are accustomed to.

Introduction

A few key networking concepts within the operating systems[2] are covered in this post: network interfaces, the Domain Name System (DNS), and sockets.

A network interface can be thought of as an identifiable gateway that interconnects two or more endpoints with an established interface for communication. Network resources on the Internet are usually identified by a domain name, such as www.microsoft.com. Though these names are more human-readable, they are not understood by the network systems. Under the hood, network identities have a different address format. For INET domains, an endpoint is identified by an IP address. DNS provides the mapping from the human-readable address format to the machine-understood IP address. Without proper DNS resolution, Internet traffic addressed by name would not reach its destination. Sockets are a set of APIs for communicating between two endpoints, and can be thought of as a conduit for communication.

Networking in Linux

This section discusses how the networking concepts from the Introduction apply to, and are made available in, Linux.

Network interfaces

The list of network interfaces available on a particular system can be accessed in a couple of different ways in Linux, both through syscalls. The older, but still fully supported, way is to use socket IOCTLs such as SIOCGIFNAME. The newer and recommended way, which provides the same view, is through NETLINK sockets, using the NETLINK_ROUTE family. All of the pertinent information about the various network interfaces is kept by the kernel and made available through these syscalls.
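
As a rough illustration of the older IOCTL route, the sketch below asks for the name of an interface given its index using SIOCGIFNAME. It is written in Python for brevity. The 40-byte request layout (a 16-byte name followed by a union whose first field is the index) assumes a 64-bit Linux `struct ifreq`, and the helper names are invented for this example.

```python
import fcntl
import socket
import struct

SIOCGIFNAME = 0x8910          # value from <linux/sockios.h>
IFNAMSIZ = 16                 # size of ifr_name
IFREQ_FORMAT = "16si20x"      # ifr_name, ifr_ifindex, remainder of the union (assumed 64-bit layout)


def build_ifreq(index):
    """Pack a struct ifreq that carries only an interface index."""
    return struct.pack(IFREQ_FORMAT, b"", index)


def parse_ifreq_name(buf):
    """Extract the NUL-terminated interface name from a struct ifreq."""
    return buf[:IFNAMSIZ].split(b"\0", 1)[0].decode()


def ifname_from_index(index):
    """Issue the SIOCGIFNAME ioctl on a throwaway socket and return the name."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        reply = fcntl.ioctl(sock.fileno(), SIOCGIFNAME, build_ifreq(index))
    return parse_ifreq_name(reply)
```

Calling `ifname_from_index(1)` on a Linux system returns whatever name is assigned to index 1, which is often but not always the loopback interface. Python's own `socket.if_nameindex()` gives the same view with less ceremony.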

Domain Name Resolution (DNS)

DNS is supported in Linux with the help of the resolver services, configured using /etc/resolv.conf in combination with /etc/hosts. The hosts file contains a static map of hostnames to their corresponding IP addresses, while resolv.conf contains (amongst other things) a list of domain name servers that are capable of resolving a given hostname to its IP address. The resolver APIs encapsulate all of this and provide DNS service to applications.
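
To make the roles of the two files concrete, here is a small Python sketch that reads name servers from resolv.conf-style text and a hostname map from hosts-style text. The sample contents use documentation-only addresses and are made up for illustration. Real resolvers honor more directives (search, options and so on) than this sketch does.

```python
SAMPLE_RESOLV_CONF = """\
# example only
nameserver 192.0.2.53
nameserver 2001:db8::53
search example.com
"""

SAMPLE_HOSTS = """\
127.0.0.1   localhost
192.0.2.10  build-box build-box.example.com   # lab machine
"""


def parse_nameservers(resolv_conf_text):
    """Return the nameserver addresses in the order they are listed."""
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped[0] in "#;":   # blank line or comment
            continue
        fields = stripped.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
    return servers


def parse_hosts(hosts_text):
    """Return a {hostname: address} map; the first entry for a name wins."""
    table = {}
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()    # drop trailing comments
        if len(fields) >= 2:
            address = fields[0]
            for name in fields[1:]:
                table.setdefault(name, address)
    return table
```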

Sockets

Sockets are an API that allows inter-process communication (IPC). The two endpoints that want to communicate with each other each open a socket at their end. Once a socket is bound to a given address, other sockets can discover it, depending on the scope of the address. For connection-less sockets such as datagram sockets, this is sufficient for sending and receiving data. For connection-oriented sockets such as stream sockets, a connection must first be established between the two peers before data can be sent or received.

Opening a socket requires the caller to provide the address family ‘AF’ (also referred to as the protocol family ‘PF’ or domain), the type (for example, datagram or stream) and the protocol. Sockets are broadly categorized by their domain. The type (and the protocol) helps identify the sub-category of the socket. The domains most commonly used by *NIX applications are AF_INET, AF_UNIX (also known as AF_LOCAL) and AF_NETLINK. AF_INET provides access to the Internet protocol. AF_UNIX is used for communicating between processes that live on the same system[3]. Lastly, AF_NETLINK sockets are used for communicating between user mode and the kernel. They can also be used for IPC between two user-mode processes.
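
The (family, type, protocol) triple maps directly onto the arguments of the ‘socket’ call. A minimal Python sketch, with a helper name chosen just for this example, opens a socket and reports back the triple it was created with:

```python
import socket


def open_and_describe(family, sock_type, protocol=0):
    """Open a socket with the given triple and return it in readable form."""
    with socket.socket(family, sock_type, protocol) as sock:
        return sock.family.name, sock.type.name, sock.proto


if __name__ == "__main__":
    # Protocol 0 asks for the default protocol of the family/type pair.
    print(open_and_describe(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP))
    print(open_and_describe(socket.AF_INET, socket.SOCK_STREAM, socket.IPPROTO_TCP))
    print(open_and_describe(socket.AF_UNIX, socket.SOCK_STREAM))
```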

Networking in WSL

This section provides an overview of how the above Linux networking concepts are implemented within WSL for full binary and interface compatibility.

Network interfaces and DNS

When the first instance of bash.exe is launched, the LXSS session manager service[4] queries the list of network interfaces and the DNS servers from Windows. Using the LXBus, the service passes this information to the Linux Subsystem driver. The driver caches all the information about the various network interfaces locally. As for the DNS entries, the Linux Subsystem driver populates resolv.conf with the list. The cached network interface information kept by the Linux Subsystem driver is accessible through the aforementioned socket IOCTLs. During the first launch of bash.exe, the hosts file for the particular system is also auto-generated[5].

The LXSS session manager service also registers with Windows for notifications of any updates to the network interfaces (for example, moving from wireless to wired Ethernet) or to the DNS entries. Windows notifies the service of any change to the monitored network information by calling the registered callback handler. When notified of such a change, the callback handler (part of the service) uses the same information flow described above to notify the Linux Subsystem driver, which updates resolv.conf with the new DNS entries. This way, WSL keeps the network components within the system up to date and in sync with Windows.
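
The notification flow can be pictured with a toy Python model: a callback receives the current DNS server list and regenerates the resolv.conf text. This is only a sketch of the idea, not the service or driver code. The class and function names, the generated comment line and the de-duplication step are all assumptions made for this example.

```python
def render_resolv_conf(dns_servers):
    """Build resolv.conf text from an ordered list of DNS server addresses."""
    lines = ["# Generated from the host's current DNS settings"]
    seen = set()
    for address in dns_servers:
        if address not in seen:           # keep order, drop duplicates
            seen.add(address)
            lines.append(f"nameserver {address}")
    return "\n".join(lines) + "\n"


class ResolvConfUpdater:
    """Stand-in for the notification path: a callback that rewrites the file."""

    def __init__(self, write_file):
        # The writer is injected so the sketch has no side effects of its own.
        self._write_file = write_file

    def on_network_change(self, dns_servers):
        """Hypothetical handler invoked whenever the host reports a change."""
        self._write_file(render_resolv_conf(dns_servers))
```

In the sketch, passing `written.append` (for some list `written`) as the writer makes it easy to see what would have been written after each simulated change.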

Sockets

Currently, WSL provides implementations for the AF_INET, AF_UNIX and AF_NETLINK address families. All of the socket implementation in WSL is provided in kernel mode, from within the Linux Subsystem driver. This is essentially because all of the BSD socket APIs map one-to-one directly to syscalls[6].

AF_INET (Internet domain)

As discussed previously, sockets created with the AF_INET domain provide access to the information and services hosted on the Internet. Within the INET domain, there are a few different supported socket types, namely DGRAM, STREAM and RAW. DGRAM sockets, more commonly referred to as UDP sockets, are connection-less sockets with no reliability guarantees. STREAM sockets, more commonly referred to as TCP sockets, are connection-oriented sockets with some reliability and ordering guarantees provided by the protocol. RAW sockets support many different protocols, such as ICMP (used by ping), as well as a RAW protocol that allows the protocol to be implemented entirely in user space.

Win32 has a user-mode adaptation of BSD sockets called Winsock. It would seem pretty straightforward to leverage Winsock for providing the socket implementation within WSL. But, as mentioned previously, all of the WSL socket implementation is in kernel mode, in the WSL socket library (WslSocket.lib), which is part of the Linux Subsystem driver. That rules out the possibility of using Winsock. Fortunately, NT has a kernel-mode network programming interface called Winsock Kernel (more commonly, WSK).

WSK

WSK is a publicly documented API set. It is a very thin, low-level NT API set that provides fast and easy access to the data from the TCP/IP driver, with little or no overhead of its own. The WSK API differs significantly from the BSD APIs, even though it uses the same underlying constructs such as “sockets”. For example, WSK does not provide any buffering of data[7]. Another example is the difference in the operating model of WSK and BSD sockets. WSK supports both synchronous and asynchronous modes of retrieving data from the TCP/IP driver, with the asynchronous mode being more efficient. BSD sockets also support synchronous and asynchronous I/O (through epoll), but their semantics are vastly different. WSL (*not* WSK) provides support for the BSD socket APIs by translating them to the WSK APIs and, wherever needed (such as for data buffering), bridging the differences with the necessary infrastructure and implementation within WSL (see Figure 1. AF_INET sockets in WSL).

The following sections go into the details of the individual socket types, such as TCP and UDP, and explain how WSL implements them underneath.

Figure 1. AF_INET sockets in WSL

DGRAM or UDP socket type

As mentioned previously, UDP sockets are lightweight, connection-less sockets with no delivery guarantees. Once a socket has been created, it can be used to send data immediately. Receiving data requires other sockets to be able to identify it using its address. Once the socket binds to an address, other sockets can send data to it without any further delay. Any UDP socket can send data to any other UDP socket, as long as it is able to identify (or locate) it using its address.
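
From the application's point of view, that sequence looks like the following Python sketch, which sends a single datagram between two sockets on the loopback address. The function name and the 5-second timeout are choices made for this example only.

```python
import socket


def udp_roundtrip(payload):
    """Send one datagram to a bound socket on loopback and return what arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))     # port 0: let the stack pick a free port
        receiver.settimeout(5.0)            # avoid blocking forever if the datagram is lost
        # The sender never binds or connects; it only needs the receiver's address.
        sender.sendto(payload, receiver.getsockname())
        data, source = receiver.recvfrom(4096)
        return data, source
```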

Figure 2. WSL INET UDP file context

When a WSL UDP application creates a BSD socket using the ‘socket’ syscall, the ‘WSL socket library’ creates a context (see Figure 2. WSL INET UDP file context) and attaches it to the file descriptor (using the VFS file object) that is returned by the syscall. As part of the same ‘socket’ syscall, the driver also creates a WSK UDP socket and stores it in the context. On every further operation by the user-mode app on that socket, the driver extracts the associated context and the WSK socket within it. Any data sent over the BSD UDP socket is sent directly over the WSK socket.

When the user-mode application binds the BSD socket, the ‘WSL socket library’ binds the corresponding WSK socket and registers a WSK ‘receive from’ callback handler with WSK. Any time data is available to be read on the WSK socket, WSK calls the registered handler. The handler stores the data provided by WSK (from TCP/IP) in the receive buffer, along with some metadata such as the size of the data (packet) and the ‘receive from’ address. When the application calls ‘recvfrom’ on the BSD socket, the socket library is able to satisfy the request using the data from the ‘receive buffer’.
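
The buffering described above can be modelled in a few lines of Python. This is a conceptual toy, not the driver's implementation. The class name, the byte-count capacity and the drop-when-full policy are assumptions made for the sketch.

```python
from collections import deque


class UdpFileContext:
    """Toy model of a per-socket context: a bounded receive buffer fed by a callback."""

    def __init__(self, capacity_bytes=65536):
        self._capacity = capacity_bytes
        self._used = 0
        self._queue = deque()             # entries: (payload, source_address)

    def on_receive_from(self, payload, source_address):
        """Plays the role of the 'receive from' handler. Returns False if dropped."""
        if self._used + len(payload) > self._capacity:
            return False                  # assumed policy: drop when the buffer is full
        self._queue.append((bytes(payload), source_address))
        self._used += len(payload)
        return True

    def recvfrom(self, bufsize):
        """Hand the oldest datagram to the caller, truncated to bufsize; None if empty."""
        if not self._queue:
            return None
        payload, source_address = self._queue.popleft()
        self._used -= len(payload)
        return payload[:bufsize], source_address
```

Keeping each datagram together with its source address preserves message boundaries, which is what lets a later ‘recvfrom’ report where the data came from.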

STREAM or TCP socket type

As mentioned previously, TCP sockets are connection-oriented sockets with some delivery guarantees. A connection must first be established between the two sockets, usually referred to as the ‘client’ and ‘server’ sockets. The ‘server’ socket binds to a well-known address and listens for incoming connections using the ‘listen’ socket call. At this point, the ‘server’ socket is capable of accepting connections, and clients can connect to it using the ‘connect’ socket call. Once a connection is established, the server retrieves it using the ‘accept’ socket call. The accept call returns a new socket that is connected to the client socket and can be used to send and receive data to and from the client socket. The original server socket remains free to accept more incoming connections.
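
The bind, listen, connect and accept sequence can be seen end to end in this short Python sketch over the loopback address. It runs both ends in one thread, which works because the stack completes the connection before ‘accept’ is called. The function name and timeouts are illustrative.

```python
import socket


def tcp_roundtrip(payload):
    """Connect a client to a listening socket on loopback and move payload across."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))       # port 0: let the stack pick a free port
        server.listen(1)
        server.settimeout(5.0)
        with socket.create_connection(server.getsockname(), timeout=5.0) as client:
            conn, peer = server.accept()    # new socket dedicated to this client
            with conn:
                client.sendall(payload)
                client.shutdown(socket.SHUT_WR)   # signal end of data to the peer
                received = b""
                while True:
                    chunk = conn.recv(4096)
                    if not chunk:           # empty read: peer finished sending
                        break
                    received += chunk
        return received, peer
```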

The mechanism by which the ‘WSL socket library’ creates and manages TCP BSD sockets is very similar to that for UDP BSD sockets, but is more involved. When a WSL TCP application creates a BSD TCP socket using the ‘socket’ syscall, the ‘WSL socket library’ creates a context (see Figure 3. WSL INET TCP file context) and attaches it to the file descriptor (using the VFS file object) that is returned by the syscall. As part of the same ‘socket’ syscall, the driver also creates a WSK TCP socket and stores it in the context. The one noticeable difference between how the ‘WSL socket library’ handles UDP and TCP sockets is the point at which it registers for WSK callbacks. As we saw earlier, in the case of UDP, the callback is registered during bind. For TCP sockets, the WSK callbacks are not registered until later, because the library first needs to know whether the TCP socket will be used for accepting connections or for send/receive (a TCP socket can only be one of the two). When the application calls the ‘listen’ socket call, the ‘WSL socket library’ registers its ‘WSK accept’ callback handler with WSK. For a send/recv socket, that is, whenever an application calls ‘connect’ and for every accepted socket, the ‘WSL socket library’ registers the ‘WSK receive’ and ‘WSK disconnect’ callback handlers. With all the right callback handlers registered with WSK, the ‘WSL socket library’ is well equipped to deal with any event related to that socket.

Figure 3. WSL INET TCP file context

In the case of the ‘listening’ socket, whenever there is an incoming connection request on a socket, WSK calls the ‘WSK accept’ handler registered for that socket. The ‘WSL socket library’ can then decide whether to accept or reject that connection, based on the list of already accepted connections. If the connection is accepted, the accepted WSK socket is stored in the list of accepted sockets. Whenever the application calls ‘accept’, the ‘WSL socket library’ takes the next connection from the list, creates a new ‘WSL TCP file context’, stores the corresponding WSK socket within it, and returns a new socket file descriptor to the user. The new file descriptor can be used for send/recv.

For data transfer, once the connection is established, TCP sockets can send and receive data. In the case of send, when the WSL TCP application requests data to be sent over the socket, the ‘WSL socket library’ queues the data in the ‘send buffer’, logs a pending request with WSK to send the data, and returns immediately, so that the send can complete asynchronously. The case of receive is similar to that of UDP. WSK calls the registered ‘WSK receive’ callback whenever there is data available on that socket. The ‘WSL socket library’ buffers the incoming data in the internal ‘receive buffer’, at which point the data is available to the user-mode TCP application. The application can receive the data using the ‘recv/recvmsg/read’ socket calls.
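
The listening-socket bookkeeping and the receive buffering described above can be sketched as a toy Python model. It is not the driver's code. The class names are invented, and comparing the accepted list against a backlog limit is only an assumed way of deciding whether to accept or reject.

```python
from collections import deque


class TcpDataContext:
    """Toy context for a connected socket: a byte-stream receive buffer."""

    def __init__(self, lower_socket):
        self.lower_socket = lower_socket      # stands in for the underlying WSK socket
        self._receive_buffer = bytearray()

    def on_receive(self, data):
        """Plays the role of the 'receive' handler: append to the stream buffer."""
        self._receive_buffer += data

    def recv(self, bufsize):
        """Return up to bufsize buffered bytes; TCP keeps no message boundaries."""
        data = bytes(self._receive_buffer[:bufsize])
        del self._receive_buffer[:bufsize]
        return data


class TcpListenContext:
    """Toy context for a listening socket: a bounded list of accepted connections."""

    def __init__(self, backlog):
        self._backlog = backlog
        self._accepted = deque()

    def on_accept(self, lower_socket):
        """Plays the role of the 'accept' handler. Returns False to reject."""
        if len(self._accepted) >= self._backlog:   # assumed rejection policy
            return False
        self._accepted.append(lower_socket)
        return True

    def accept(self):
        """Hand the oldest accepted connection to the application; None if empty."""
        if not self._accepted:
            return None
        return TcpDataContext(self._accepted.popleft())
```

Unlike the UDP model, the receive buffer here is a single byte stream, so two incoming chunks can be returned by one ‘recv’ or split across several.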

RAW socket type

The case of the ‘RAW’ socket is very similar to that of the ‘UDP’ socket (see DGRAM or UDP socket type), because of the similarities in the data transfer protocol. The one major difference is that the underlying WSK socket stored in the associated ‘file context’ is of type ‘RAW’. Currently, WSK only supports RAW sockets of the ‘ICMP’ protocol.

AF_UNIX or AF_LOCAL

AF_UNIX domain sockets are used for inter-process communication between processes that live within the same domain (or system). The ‘WSL socket library’ provides and manages the implementation of AF_UNIX domain sockets wholly within the subsystem, without any involvement from WSK (see Figure 4. AF_UNIX sockets in WSL).
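
For reference, this is what AF_UNIX usage looks like from an application, as a small Python sketch using a connected pair of stream sockets. In a real program the two ends would usually live in different processes, for example across a fork. The function name is chosen for this example.

```python
import socket


def unix_roundtrip(payload):
    """Move payload across a connected pair of AF_UNIX stream sockets."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with left, right:
        left.sendall(payload)
        received = b""
        while len(received) < len(payload):   # a stream may deliver data in pieces
            chunk = right.recv(4096)
            if not chunk:
                break
            received += chunk
        return received
```

The payload should stay small in this single-threaded sketch, since ‘sendall’ would block once the socket buffer fills with nobody reading.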

Figure 4. AF_UNIX sockets in WSL

AF_NETLINK

AF_NETLINK sockets can be used for communication between the kernel and user-mode processes, and also between multiple user-mode processes. AF_NETLINK supports multiple protocols for kernel and user-mode communication, including NETLINK_ROUTE, NETLINK_FIREWALL and NETLINK_KOBJECT_UEVENT. WSL has implemented support for user-mode calls to the kernel in the NETLINK_ROUTE protocol, which handles the querying and configuration of network parameters such as network interfaces, IP addresses and routing tables. The decision to prioritize NETLINK_ROUTE was based on targeted telemetry gathered from WSL users, which showed that NETLINK_ROUTE was the most commonly used Netlink protocol among the applications executed in WSL. Examples of Linux applications using NETLINK_ROUTE include ip, traceroute and whois.

Support for AF_NETLINK sockets is implemented inside the Linux Subsystem driver. Specifically, the NETLINK_ROUTE protocol is implemented by calling the Windows NETIO APIs and translating the information provided by the NETIO APIs into the format expected by the NETLINK_ROUTE messages. The following table describes which NETLINK_ROUTE message types are currently supported in WSL, as well as the equivalent NETIO API used in the Linux Subsystem driver to implement its behavior, and the typical Linux user-mode utility usage. We are actively working to expand this table to more NETLINK_ROUTE message types.

NETLINK_ROUTE Message Type | Windows NETIO API          | Linux User-mode Usage Example
RTM_GETLINK                | GetIfTable2()              | ip link
RTM_GETADDR                | GetUnicastIpAddressTable() | ip addr show
RTM_GETROUTE*              | GetIpForwardTable2()       | ip route show

*Coming soon
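
To give a feel for what a NETLINK_ROUTE message looks like on the wire, the Python sketch below builds the RTM_GETLINK dump request that a tool such as ‘ip link’ would send. The constants come from the Linux netlink headers. The helper names are invented for this example, and the sketch leaves the reply attributes unparsed.

```python
import socket
import struct

NETLINK_ROUTE = 0
RTM_GETLINK = 18
NLM_F_REQUEST = 0x001
NLM_F_DUMP = 0x300                       # NLM_F_ROOT | NLM_F_MATCH
NLMSG_HDR = struct.Struct("=IHHII")      # nlmsghdr: len, type, flags, seq, pid
IFINFOMSG = struct.Struct("=BxHiII")     # ifinfomsg: family, pad, type, index, flags, change


def build_getlink_request(seq=1):
    """Return the bytes of an RTM_GETLINK request asking for all interfaces."""
    body = IFINFOMSG.pack(socket.AF_UNSPEC, 0, 0, 0, 0)
    header = NLMSG_HDR.pack(NLMSG_HDR.size + len(body), RTM_GETLINK,
                            NLM_F_REQUEST | NLM_F_DUMP, seq, 0)
    return header + body


def parse_header(message):
    """Decode the nlmsghdr at the start of a netlink message."""
    length, msg_type, flags, seq, pid = NLMSG_HDR.unpack_from(message)
    return {"len": length, "type": msg_type, "flags": flags, "seq": seq, "pid": pid}


def dump_links_raw():
    """Send the request over an AF_NETLINK socket and return the first reply datagram."""
    with socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, NETLINK_ROUTE) as sock:
        sock.bind((0, 0))                # (pid, groups): let the kernel assign the port id
        sock.send(build_getlink_request())
        return sock.recv(65536)
```

Each reply message from `dump_links_raw()` starts with the same header layout, so `parse_header` can be applied to the reply to see its message type.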

Socket options

Linux provides plenty of socket options that are available to the user application through the set/getsockopt syscalls.

AF_INET socket options

The INET socket options are layered at different levels, following the OSI networking model. This allows the application to apply a socket option at different layers, such as the TCP (or UDP) layer, the IP layer or the socket layer, providing a high level of control to the application. WSL manages this by assigning clear ownership of the socket options. Some socket options, such as the send/receive buffer sizes (SO_SNDBUF/SO_RCVBUF) and the send/receive timeouts (SO_SNDTIMEO/SO_RCVTIMEO), are fully managed by the “WSL socket library”. Most other socket options, including all of the TCP/IP socket options, are applied to the WSK socket using the WskControlSocket API, and the “WSL socket library” merely acts as a pass-through for those options.
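
The layering is visible in the ‘level’ argument of setsockopt/getsockopt. The Python sketch below sets one option at each of the socket, TCP and IP levels on a fresh TCP socket and reads the values back. The particular options and values are arbitrary picks for illustration, and the function name is invented.

```python
import socket


def demo_layered_options():
    """Set one option per layer on a TCP socket and return the values read back."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 64 * 1024)   # socket layer
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)        # TCP layer
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, 32)             # IP layer
        return {
            "SO_RCVBUF": sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF),
            "TCP_NODELAY": sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY),
            "IP_TTL": sock.getsockopt(socket.IPPROTO_IP, socket.IP_TTL),
        }
```

The buffer size read back may differ from the value requested, since the stack is free to round or adjust it.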

AF_UNIX socket options

All of the AF_UNIX socket options are fully managed by the “WSL socket library”.

Credits

Many thanks to the Windows networking team for implementing the necessary features required to support networking in WSL, and for their guidance.

Footnotes

[1] From a networking perspective

[2] In reference here to the Linux and Windows OS.

[3] or in the same network namespace

[4] Refer to the “WSL Components” diagram in the “Windows Subsystem for Linux Overview” blog post

[5] Can be disabled by the user.

[6] For details on how the syscall redirection works, refer to the blog post on ‘WSL System Calls’

[7] Which should not be confused with Winsock.

Sunil Muthuswamy and Seth Juarez explore networking on WSL


Comments (7)

  1. Zephya says:

    Thanks – interesting article.

    Are you intending to permit sysconfig/sysctl settings to be made natively inside wsl ?
    eg sysctl networking parameters / network scripts / configuration of nics / ethn devices / iptables firewall rules ?

    Please can you describe how WSL firewall rules/selinux contexts will be handled/ integrated (or not) with AV/Windows firewalls ?

    Would I be correct in thinking ipv6 will be supported in addition to IPv4?

    If so, can we prevent inexperienced linux users from inadvertently and unknowingly exposing their machines and data to risk/linux exploits ? ( eg by simply not knowing they need to configure iptables and iptables6 …. or making the attempt but failing to lock wsl down enough, misconfiguring items, disabling selinux and/or following bad advice from the internet )

    1. Sunil Muthuswamy says:

      @Zephya – Really good questions. Yes, we are planning on permitting syscalls (including sysconfig/sysctl) to allow network enumeration/configuration such as adding links, addresses and interfaces. Some of that is already available in the latest Insider build, and more is coming. Supporting firewalls through WSL is not currently our priority and we want to continue to support Windows firewall solutions. As far as IPv6 is concerned, it is already supported in the Insider builds.
      The caveat in all of the above is that any operation from bash is tied to the Windows ACL of the bash.exe. For example, if any operation requires Admin privileges on Windows, the same will be applied to the bash.exe.
      Nice suggestion on some documentation around configuration, and we will definitely take that into consideration. A lot of documentation is already present on our MSDN page. But, we will see if there is room for more.

      Hope this helps and thanks for the feedback.

  2. Mikhail Oshibkovich says:

    And thus the title in the RSS reader for any blogpost became “Disclaimer” forever.

  3. Yury says:

    as a network engineer I like to use Linux networking diagnostic tools. Unfortunately, neither one of them works at the moment. Some require IP_MTU_DISCOVER, others RAW sockets support 🙁

    Getting a lot of such errors:
    ping: icmp open socket: Permission denied
    traceroute: setsockopt IP_MTU_DISCOVER: Invalid argument
    tcptraceroute: libnet_init() failed: libnet_open_raw4(): SOCK_RAW allocation failed: Protocol not supported
    mtr: unable to get raw sockets.

    Is there any plans to support this Linux functionality?

    1. Sunil Muthuswamy says:

      Thanks for the post. Yes, IP_MTU_DISCOVER has been implemented for datagram sockets and support for raw sockets has also been extended. But, these fixes will be available in the next Windows Insider build. They will also be available in the next major release of Windows (a.k.a Creators Update). You can follow the progress on ‘IP_MTU_DISCOVER’ more closely here https://github.com/Microsoft/BashOnWindows/issues/170. Also, you can see the progress on other issues such as raw sockets in our GitHub https://github.com/Microsoft/BashOnWindows/issues. Please head out there and post any issues that you might be facing. There is a lot of support there. We want to unblock and support you as much as we can.

      1. Yury says:

        that is fantastic news, Sunil! Thanks a lot!

  4. tdifilippo says:

    Awesome article. I have been developing network software since the 80’s and this is an excellent discussion and comparison (in some ways) of Linux networking vs. Windows networking. It’s really useful to have the above in mind when developing for DSL/Windows since it may impact a test that I execute on this platform, and I will be interested to look at performance of DSL vs. straight Linux distributions. Thank you for this information….GREAT!!!!!
