HTTP duplex messaging improvements in Silverlight 4

One of the new features in the WCF stack in Silverlight 4 is an improvement of our PollingDuplex binding, which allows duplex communication over HTTP. HTTP is a request/reply communication medium, so some tricks are needed to make it look like a duplex transport, where both the client and the server can send messages to each other.

Check out our MSDN tutorial for a walkthrough on how to build a service using PollingDuplex in Silverlight 4 using Visual Studio 2010. If you prefer a video, check out this episode of Silverlight TV, where our dev lead Tomasz demonstrates the feature (you can ignore the net.tcp part).

Silverlight 4 adds a new mode to the existing PollingDuplex binding, which enables significantly higher throughput in low-latency scenarios. The throughput in some scenarios can even be compared to TCP’s throughput, although the server-side scalability still does not match that of TCP.

We call the new mode MultipleMessagesPerPoll mode. To explain what this new mode does, we should first go through the wire protocol used by PollingDuplex in SL2 and SL3. The following diagram illustrates that protocol.

[Diagram: pd_single]

Requests from the client are sent in a straightforward manner, and the server’s response is delivered on the same HTTP connection where the request came in. This is how client-to-server communication works. To enable the server to send messages to the client, the client polls the server, and the server uses the poll response to send a single message to the client. The client cannot reply to these messages. This is how server-to-client messaging works.

One thing to note in this model is that the server will send only a single message on every poll response. That means that if the messages for the client are being queued up quickly on the server (a few per second), a backlog might start to appear, because the server has to wait for a new poll to come in to return every message. Basically the overhead of establishing HTTP requests becomes the bottleneck for streaming messages to the client.
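To make the backlog argument concrete, here is a small back-of-the-envelope sketch in Python. It is not part of WCF or Silverlight; the function name and the numbers (5 messages per second, a 0.5 second poll round trip) are made up for illustration. The only thing it takes from the protocol is that each poll response carries exactly one message.

```python
def backlog_after(seconds, arrival_rate, poll_round_trip):
    """Estimate how many messages are still queued on the server.

    arrival_rate    -- messages queued for the client per second (assumed)
    poll_round_trip -- seconds needed to complete one poll (assumed)
    """
    # One message per poll response, so delivery is capped by the poll rate.
    delivered_per_second = 1.0 / poll_round_trip
    growth_per_second = max(0.0, arrival_rate - delivered_per_second)
    return growth_per_second * seconds


# 5 messages/s arrive, but a 0.5 s poll cycle delivers only 2 messages/s,
# so the queue grows by 3 messages every second.
backlog = backlog_after(seconds=10, arrival_rate=5, poll_round_trip=0.5)
no_backlog = backlog_after(seconds=10, arrival_rate=1, poll_round_trip=0.5)
print(backlog, no_backlog)
```

As long as messages arrive more slowly than polls complete, the queue stays empty; once the arrival rate exceeds the poll rate, the backlog grows linearly with time.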

We have addressed this problem in the new MultipleMessagesPerPoll mode. We use the HTTP response streaming capability in the underlying networking stack to send multiple messages on a single poll response, as shown in this diagram:

[Diagram: pd_multiple]

The messages will be separated by a lightweight framing protocol. We will try to write all messages that are available on the server on the current poll response, and we’ll try to keep that response open for as long as possible. Eventually the response will be terminated (sometimes due to network intermediaries), after which the client establishes the new poll, and again we try to hold on to the response and write to it for as long as possible.
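The idea of framing several messages on one streamed response can be sketched as follows. The actual framing format used by PollingDuplex is not described in this post, so the 4-byte length prefix below is purely an assumption chosen for illustration, as are the function names and sample payloads.

```python
import io
import struct


def write_frame(stream, payload):
    """Write one message as a 4-byte big-endian length followed by the body."""
    stream.write(struct.pack(">I", len(payload)))
    stream.write(payload)


def read_frames(stream):
    """Yield complete messages until the stream ends or is cut off."""
    while True:
        header = stream.read(4)
        if len(header) < 4:
            return  # response ended; the client would start a new poll here
        (length,) = struct.unpack(">I", header)
        payload = stream.read(length)
        if len(payload) < length:
            return  # response was terminated in the middle of a message
        yield payload


# Stand-in for a single poll response that stays open for several messages.
response = io.BytesIO()
for message in (b"quote:MSFT", b"quote:AAPL", b"chat:hello"):
    write_frame(response, message)

response.seek(0)
received = list(read_frames(response))
print(received)
```

The point of the sketch is only that the reader can pull message boundaries out of one continuous byte stream, which is what lets a single poll response carry many messages.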

One interesting side effect of the new behavior is that responses to client requests will also be sent on the poll response, instead of on the HTTP connection where we received the request. Requests from the client will now return immediately with an empty HTTP 200 response; the actual server response message will arrive on the poll response.
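A toy model of this routing, written in Python, may help. The class and method names are hypothetical and do not correspond to any WCF type; the sketch only mirrors the two facts stated above: the request itself gets an empty 200, and the reply travels on the poll response alongside server-initiated messages.

```python
from collections import deque


class ToyDuplexServer:
    """Hypothetical stand-in for the server side of one duplex session."""

    def __init__(self):
        self.outbound = deque()  # everything destined for the poll response

    def handle_request(self, body):
        # The reply is queued for the poll response, not returned here.
        self.outbound.append("reply to " + body)
        return 200, ""  # empty HTTP 200 on the request connection

    def push(self, message):
        # Server-initiated message, also delivered on the poll response.
        self.outbound.append(message)

    def drain_poll(self):
        # Everything queued so far goes out on the open poll response.
        messages = list(self.outbound)
        self.outbound.clear()
        return messages


server = ToyDuplexServer()
status, body = server.handle_request("GetQuote")
server.push("PriceChanged")
delivered = server.drain_poll()
print(status, repr(body), delivered)
```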

To enable the new mode, no changes are needed to the PollingDuplex service implementation itself. Just use the following binding config:

  <customBinding>
    <binding name="pollingDuplexBinding">
      <pollingDuplex duplexMode="MultipleMessagesPerPoll" maxOutputDelay="00:00:01" />
      <binaryMessageEncoding />
      <httpTransport transferMode="StreamedResponse" />
    </binding>
  </customBinding>

Note the use of the duplexMode attribute on the <pollingDuplex /> binding element, and the use of the transferMode attribute on the <httpTransport /> binding element.

There is one interesting consideration around buffering. WCF will buffer streamed responses in chunks of 16KB if the service is self-hosted, or 32KB chunks if the service is hosted in IIS. This is a built-in behavior that cannot be overridden by users. So as the service call queues up messages for the client, those will be “flushed” out of WCF in 16/32KB chunks before they are pulled off the stream by the Silverlight client.

If you are sending small messages (smaller than this buffer size), you may end up in a situation where a message is sitting inside the WCF pipeline, waiting to be flushed out to the stream. If the timeliness of the delivery in such scenarios is important, check out the maxOutputDelay knob shown above. This knob determines the maximum amount of time a message will sit in the WCF pipeline before we force it to flush. Unfortunately the way to flush the message is to close the streamed response connection, which eliminates some of the benefits of this new mode. This is especially true since the default value of the setting is 200 ms. So the golden rule to take maximum advantage of the streaming behavior is: always go in and change maxOutputDelay to the maximum delay that is acceptable in your particular scenario. In a chat application, for example, that number is probably around 1-3 seconds. In a news ticker scenario, the number could be on the order of tens of seconds.
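The trade-off can be illustrated with a deliberately simplified Python model. It is not how WCF is implemented; the function name, the workload (one 200-byte message every 100 ms for 10 seconds) and the timer logic are assumptions. The model only encodes what is described above: messages wait until the buffer fills, and a flush forced by maxOutputDelay closes the streamed response.

```python
def count_forced_closes(arrivals_ms, message_size, buffer_size, max_output_delay_ms):
    """Count how often the response is closed to flush waiting messages.

    arrivals_ms -- sorted message arrival times in milliseconds (assumed workload)
    """
    forced = 0
    buffered = 0
    oldest = None  # arrival time of the oldest message still waiting
    for t in arrivals_ms:
        # The oldest waiting message hit the delay limit: force a flush,
        # which in this model means closing the response.
        if oldest is not None and t - oldest >= max_output_delay_ms:
            forced += 1
            buffered = 0
            oldest = None
        if oldest is None:
            oldest = t
        buffered += message_size
        # A full buffer is flushed without closing the response.
        if buffered >= buffer_size:
            buffered = 0
            oldest = None
    if oldest is not None:
        forced += 1  # whatever is left is eventually flushed by the timer
    return forced


arrivals = range(0, 10000, 100)  # 100 small messages over 10 seconds
default_delay = count_forced_closes(arrivals, 200, 16 * 1024, 200)
tuned_delay = count_forced_closes(arrivals, 200, 16 * 1024, 2000)
print(default_delay, tuned_delay)
```

In this toy workload, the 200 ms delay closes the response 50 times, while a 2 second delay closes it only 5 times, which is why raising maxOutputDelay to the largest value you can tolerate matters for small messages.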

I hope this sheds some light on this performance optimization that we have added in Silverlight 4. MultipleMessagesPerPoll mode should be the new default setting to use in all new PollingDuplex applications.

Cheers,
-Yavor Georgiev
Program Manager, WCF