WebSockets in Windows Consumer Preview

In Windows 8 Consumer Preview and Server Beta, IE10 and all other Microsoft WebSocket client and server features now support the final version of the IETF WebSocket Protocol. In addition, IE10 implements the W3C WebSocket API Candidate Recommendation.

WebSockets are stable and ready for developers to start creating innovative applications and services. This post provides a simple introduction to the W3C WebSocket API and its underlying WebSocket protocol. The updated Flipbook demo uses the latest version of the API and protocol.

In my previous post, I introduced WebSocket scenarios:

WebSockets enable Web applications to deliver real-time notifications and updates in the browser. Developers have faced problems in working around the limitations in the browser’s original HTTP request-response model, which was not designed for real-time scenarios. WebSockets enable browsers to open a bidirectional, full-duplex communication channel with services. Each side can then use this channel to immediately send data to the other. Now, sites from social networking and games to financial sites can deliver better real-time scenarios, ideally using same markup across different browsers.

Since that September 2011 post, the working groups have made significant progress. The WebSocket protocol is now an IETF proposed standard protocol. In addition, the W3C WebSocket API is a W3C Candidate Recommendation.

Introduction to the WebSocket API Using an Echo Example

The code snippets below use a simple echo server created with ASP.NET’s System.Web.WebSockets namespace to echo back text and binary messages that are sent from the application. The application allows the user to type in text to be sent and echoed back as a text message or draw a picture that can be sent and echoed back as a binary message.

Screen shot of WebSockets echo sample application.

For a more complex example that allows you to experiment with latency and performance differences between WebSockets and HTTP polling, see the Flipbook demo.

Details of Connecting to a WebSocket Server

This simple explanation is based on a direct connection between the application and the server. If a proxy is configured, then IE10 starts the process by sending a HTTP CONNECT request to the proxy.

When a WebSocket object is created, a handshake is exchanged between the client and the server to establish the WebSocket connection.

Diagram illustrating an HTTP GET Upgrade Request from an HTTP Client to an HTTP Server.

IE10 starts the process by sending a HTTP request to the server:

GET /echo HTTP/1.1

Host: example.microsoft.com

Upgrade: websocket

Connection: Upgrade

Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==

Origin: https://microsoft.com

Sec-WebSocket-Version: 13

Let’s look at each part of this request.

The connection process starts with a standard HTTP GET request which allows the request to traverse firewalls, proxies, and other intermediaries:

GET /echo HTTP/1.1

Host: example.microsoft.com

The HTTP Upgrade header requests that the server switch the application-layer protocol from HTTP to the WebSocket protocol.

Upgrade: websocket

Connection: Upgrade

The server transforms the value in the Sec-WebSocket-Key header in its response to demonstrate that it understands the WebSocket protocol:

Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==

The Origin header is set by IE10 to allow the server to enforce origin-based security.

Origin: https://microsoft.com

The Sec-WebSocket-Version header identifies the requested protocol version. Version 13 is the final version in the IETF proposed standard:

Sec-WebSocket-Version: 13

If the server accepts the request to upgrade the application-layer protocol, it returns a HTTP 101 Switching Protocols response:

Diagram illustrating an HTTP 101 Switching Protocols Respsonse from an HTTP Server Client to an HTTP Client.

HTTP/1.1 101 Switching Protocols

Upgrade: websocket

Connection: Upgrade

To demonstrate that it understands the WebSocket Protocol, the server performs a standardized transformation on the Sec-WebSocket-Key from the client request and returns the results in the Sec-WebSocket-Accept header:

Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=

IE10 then compares Sec-WebSocket-Key with Sec-WebSocket-Accept to validate that the server is a WebSocket server and not a HTTP server with delusions of grandeur.

The client handshake established a HTTP-on-TCP connection between IE10 and server. After the server returns its 101 response, the application-layer protocol switches from HTTP to WebSockets which uses the previously established TCP connection.

HTTP is completely out of the picture at this point. Using the lightweight WebSocket wire protocol, messages can now be sent or received by either endpoint at any time.

Diagram illustrating WebSocket messages being sent between two Web sockets.

Programming Connecting to a WebSocket Server

The WebSocket protocol defines two new URI schemes which are similar to the HTTP schemes.

  • "ws:" "//" host [ ":" port ] path [ "?" query ] is modeled on the “http:” scheme. Its default port is 80. It is used for unsecure (unencrypted) connections.
  • "wss:" "//" host [ ":" port ] path [ "?" query ] is modeled on the “https:” scheme. Its default port is 443. It is used for secure connections tunneled over Transport Layer Security.

When proxies or network intermediaries are present, there is a higher probability that secure connections will be successful, as intermediaries are less inclined to attempt to transform secure traffic.

The following code snippet establishes a WebSocket connection:

var host = "ws://example.microsoft.com";

var socket = new WebSocket(host);

ReadyState – Ready … Set … Go …

The WebSocket.readyState attribute represents the state of the connection: CONNECTING, OPEN, CLOSING, or CLOSED. When the WebSocket is first created, the readyState is set to CONNECTING. When the connection is established, the readyState is set to OPEN. If the connection fails to be established, then the readyState is set to CLOSED.

Registering for Open Events

To receive notifications when the connection has been created, the application must register for open events.

socket.onopen = function (openEvent) {

document.getElementById("serverStatus").innerHTML = 'Web Socket State::' + 'OPEN';

};

Details Behind Sending and Receiving Messages

After a successful handshake, the application and the Websocket server may exchange WebSocket messages. A message is composed as a sequence of one or more message fragments or data “frames.”

Each frame includes information such as:

  • Frame length
  • Type of message (binary or text) in the first frame in the message
  • A flag (FIN) indicating whether this is the last frame in the message

Diagram illustrating the contents of WebSocket frames.

IE10 reassembles the frames into a complete message before passing it to the script.

Programming Sending and Receiving Messages

The send API allows applications to send messages to a Websocket server as UTF-8 text, ArrayBuffers, or Blobs.

For example, this snippet retrieves the text entered by the user and sends it to the server as a UTF-8 text message to be echoed back. It verifies that the Websocket is in an OPEN readyState:

function sendTextMessage() {

if (socket.readyState != WebSocket.OPEN)

return;

 

var e = document.getElementById("textmessage");

socket.send(e.value);

}

This snippet retrieves the image drawn by the user in a canvas and sends it to the server as a binary message:

function sendBinaryMessage() {

if (socket.readyState != WebSocket.OPEN)

return;

 

var sourceCanvas = document.getElementById('source');

// msToBlob returns a blob object from a canvas image or drawing

socket.send(sourceCanvas.msToBlob());

// ...

}

Registering for Message Events

To receive messages, the application must register for message events. The event handler receives a MessageEvent which contains the data in MessageEvent.data. Data can be received as text or binary messages.

When a binary message is received, the WebSocket.binaryType attribute controls whether the message data is returned as a Blob or an ArrayBuffer datatype. The attribute can be set to either “blob” or “arraybuffer.”

The examples below use the default value which is “blob.”

This snippet receives the echoed image or text from the websocket server. If the data is a Blob, then an image was returned and is drawn in the destination canvas; otherwise, a UTF-8 text message was returned and is displayed in a text field.

socket.onmessage = function (messageEvent) {

if (messageEvent.data instanceof Blob) {

var destinationCanvas = document.getElementById('destination');

var destinationContext = destinationCanvas.getContext('2d');

var image = new Image();

image.onload = function () {

destinationContext.clearRect(0, 0, destinationCanvas.width, destinationCanvas.height);

destinationContext.drawImage(image, 0, 0);

}

image.src = URL.createObjectURL(messageEvent.data);

} else {

document.getElementById("textresponse").value = messageEvent.data;

}

};

Details of Closing a WebSocket Connection

Similar to the opening handshake, there is a closing handshake. Either endpoint (the application or the server) can initiate this handshake.

A special kind of frame – a close frame – is sent to the other endpoint. The close frame may contain an optional status code and reason for the close. The protocol defines a set of appropriate values for the status code. The sender of the close frame must not send further application data after the close frame.

When the other endpoint receives the close frame, it responds with its own close frame in response. It may send pending messages prior to responding with the close frame.

Diagram illustrating the Close frame message and response.

Programming Closing a WebSocket and Registering for Close Events

The application initiates the close handshake on an open connection with the close API:

socket.close(1000, "normal close");

To receive notifications when the connection has been closed, the application must register for close events.

socket.onclose = function (closeEvent) {

document.getElementById("serverStatus").innerHTML = 'Web Socket State::' + 'CLOSED';

};

The close API accepts two optional parameters: a status code as defined by the protocol and a description. The status code must be either 1000 or in the range 3000 to 4999. When close is executed, the readyState attribute is set to CLOSING. After IE10 receives the close response from the server, the readyState attribute is set to CLOSED and a close event is fired.

Using Fiddler to See WebSockets Traffic

Fiddler is a popular HTTP debugging proxy. There is some support in the latest versions for the WebSocket protocol. You can inspect the headers exchanged in the WebSocket handshake:

Screen shot of Fiddler showing a WebSocket request.

All the WebSocket messages are also logged. In the screenshot below, you can see that “spiral” was sent to the server as a UTF-8 text message and echoed back:

Screen shot of Fiddler showing a WebSocket response.

Conclusion

If you want to learn more about WebSockets, you may watch these sessions from the Microsoft //Build/ conference from September 2011:

If you’re curious about using Microsoft technologies to create a WebSocket service, these posts are good introductions:

I encourage you to start developing with WebSockets today and share your feedback with us.

—Brian Raymor, Senior Program Manager, Windows Networking