Implementing custom load balancing with “sticky” connections on Windows Azure

Authors: Jonathan Doyon (Genetec), Konstantin Dotchkoff (Microsoft)


Running an app in a cloud environment is all about scalability and using the economics of the cloud. Scaling out (or scaling back) provides great flexibility to adapt to workload changes. Windows Azure provides load-balancing capabilities that can transparently distribute workload across multiple instances of an app. However, what should you do if your application is not completely stateless and requires session affinity (a.k.a. "sticky" sessions)? There are several solutions to this problem. In this blog post we describe a design pattern that combines load balancing of a client workload with the need for sticky connections.

This pattern was implemented in the Stratocast solution by Genetec based on the specific application requirements. However, the pattern itself is a common one and applicable to a broader range of scenarios that call for sticky sessions.

Stratocast is a security solution that allows users to view live and recorded video, safely stored in Windows Azure, from a laptop, tablet or smartphone. Users can monitor multiple live or recorded video streams. Once a camera is connected to the cloud service, it keeps streaming video. If a camera gets disconnected for some reason, it should reconnect to the same server component.

Even if we set cameras and video streaming aside, you will find the same type of requirements in a lot of different scenarios. For the rest of this post we will use "client" to refer to a client application or device (such as a camera) that accesses a server-side component running on multiple compute instances in Windows Azure.

So let's drill down into the details. Initially, a client will need to establish a connection to a server component running in Azure (in the case of Stratocast to the Azure-based video recorder). It will initiate an HTTPS call to the public endpoint of the Azure Cloud service. The endpoint is load balanced by the Azure load balancer and the call will reach one of the running instances.
The server component that receives the request will perform custom load balancing based on application-specific business logic. In the case of Stratocast, for example, there are three important requirements:

  • Each camera is scored on its expected workload (i.e. "weight" associated with the client workload). Based on the expected incremental workload and the current utilization of the server pool, the custom load balancer should direct the video stream to the server that has enough capacity to accept it.
    Note that this is a very important requirement: if different clients generate unequal workloads, a simple round-robin distribution of the load might not be the best choice.
  • The second requirement for custom load balancing in this scenario is to reconnect dropped connections to the same server instance. For that reason the app maintains a simple Azure storage table that records which client is (or was) connected to which server instance. A fast lookup determines whether the incoming request needs to be routed to a particular server instance that already served that client.
  • And lastly, if the client opens multiple connections, those need to go to the same server instance. (Typically a camera opens one connection for video streaming and one for control purposes.)

While the first requirement is about load distribution, the second and third ones are related to the stateful nature of the communication.
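
The three requirements above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Stratocast implementation: the `Server` class, the `choose_server` function, the capacity numbers and the "most free capacity" selection rule are all hypothetical, and a plain dictionary stands in for the Azure storage table that records client-to-instance assignments.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Server:
    """One server instance in the pool (all fields are illustrative)."""
    instance_id: str
    capacity: float      # total workload the instance can carry
    load: float = 0.0    # workload currently assigned to it

    @property
    def free(self) -> float:
        return self.capacity - self.load


def choose_server(client_id: str, weight: float,
                  servers: Dict[str, Server],
                  affinity: Dict[str, str]) -> Optional[Server]:
    """Pick a server for a client, honoring sticky assignments first."""
    # Requirements 2 and 3: a client that is already recorded goes back to
    # the same instance, as long as that instance is still in the pool.
    known = affinity.get(client_id)
    if known in servers:
        return servers[known]

    # Requirement 1: only consider servers with enough free capacity for
    # the client's expected weight, then take the one with most headroom
    # (an assumed policy; the real scoring is application specific).
    candidates = [s for s in servers.values() if s.free >= weight]
    if not candidates:
        return None  # no instance can accept this workload right now
    best = max(candidates, key=lambda s: s.free)
    best.load += weight
    affinity[client_id] = best.instance_id
    return best
```

Calling `choose_server` twice with the same client identifier returns the same instance without adding the weight a second time, which also covers a camera that opens a second connection for control purposes.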

After performing the custom routing logic, the server instance will redirect (using HTTP redirect) the client to the selected server instance by using its direct port endpoint. Once redirected the client will establish a connection that is stateful in nature (hence, all the requirements around session affinity). In the case of Stratocast, the connection will stay open for video streaming.
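
As a rough sketch of the redirect step, the handler below answers a request on the load-balanced endpoint with a redirect to the chosen instance's direct port. Everything specific here is an assumption for illustration: the host name, the port numbers, the `client` query parameter, the placeholder `pick_instance` function, and the use of status 307 (the post only says "HTTP redirect").

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlsplit

# Hypothetical values: the public host name of the cloud service and the
# public port assigned to each instance's direct port endpoint.
PUBLIC_HOST = "example.cloudapp.net"
INSTANCE_PORTS = {"server_0": 10100, "server_1": 10101}


def pick_instance(client_id: str) -> str:
    """Placeholder for the custom routing logic (here: a stable hash)."""
    names = sorted(INSTANCE_PORTS)
    return names[sum(client_id.encode()) % len(names)]


def redirect_location(client_id: str, path: str) -> str:
    """Build the URL of the selected instance's direct port endpoint."""
    port = INSTANCE_PORTS[pick_instance(client_id)]
    return f"https://{PUBLIC_HOST}:{port}{path}"


class RedirectHandler(BaseHTTPRequestHandler):
    """Runs on every instance behind the load-balanced input endpoint."""

    def do_GET(self) -> None:
        parts = urlsplit(self.path)
        client_id = parse_qs(parts.query).get("client", [""])[0]
        if not client_id:
            self.send_error(400, "missing client id")
            return
        # 307 keeps the original method on the follow-up request.
        self.send_response(307)
        self.send_header("Location", redirect_location(client_id, parts.path))
        self.send_header("Content-Length", "0")
        self.end_headers()


if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), RedirectHandler).serve_forever()
```

The redirect costs one extra round trip per connection, which matters little when the redirected connection then stays open for a long time.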

Let us expand a bit more on the details behind this solution. The Azure compute role hosting the server (in the case of Stratocast, the component responsible for recording video) has two public endpoints: an input endpoint and an instance input endpoint.

The input endpoint is a regular public endpoint automatically load balanced by Azure. Cameras always initiate communication on this port and then the system will issue an HTTP redirect to the appropriate instance input endpoint as determined by the custom load balancing logic explained earlier.

The instance input endpoint (a.k.a. direct port endpoint) is used to communicate directly with individual load-balanced role instances. It requires a port range in the configuration file instead of just one public port (see Define Direct Port Endpoints for a Role for more details). Azure will automatically assign one port in the range to each instance. This endpoint can be used for direct communication with the instance - Azure does not load balance requests coming in on the instance input port. When a role instance starts up, code in the role reads the public port of the instance using the Azure API and records it in an Azure storage table. This information is used later by the custom load balancing algorithm to redirect calls to a specific server instance. The solution also monitors the health of each instance and keeps the list of server instances in table storage up to date.
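
To make the bookkeeping more concrete, here is a minimal sketch of such an instance list. It is not the Stratocast code: the `InstanceRegistry` class is hypothetical, an in-memory dictionary stands in for the Azure storage table, and the heartbeat-with-timeout scheme is an assumed way of monitoring health (the actual health checks are custom to the application). The public port is passed in as a plain number rather than read through the Azure API.

```python
import time
from typing import Dict, Optional


class InstanceRegistry:
    """In-memory stand-in for the storage table of server instances."""

    def __init__(self, ttl_seconds: float = 30.0) -> None:
        # An instance is considered unhealthy once its last heartbeat
        # is older than ttl_seconds (assumed policy).
        self.ttl_seconds = ttl_seconds
        self._rows: Dict[str, Dict[str, float]] = {}

    def register(self, instance_id: str, public_port: int,
                 now: Optional[float] = None) -> None:
        """Called at role start-up to record the instance's public port."""
        seen = time.time() if now is None else now
        self._rows[instance_id] = {"port": public_port, "seen": seen}

    def heartbeat(self, instance_id: str,
                  now: Optional[float] = None) -> bool:
        """Refresh an instance's timestamp; False if it is not listed."""
        if instance_id not in self._rows:
            return False
        self._rows[instance_id]["seen"] = time.time() if now is None else now
        return True

    def healthy_ports(self, now: Optional[float] = None) -> Dict[str, int]:
        """Drop stale instances and return {instance_id: public_port}."""
        current = time.time() if now is None else now
        stale = [name for name, row in self._rows.items()
                 if current - row["seen"] > self.ttl_seconds]
        for name in stale:
            del self._rows[name]
        return {name: int(row["port"]) for name, row in self._rows.items()}
```

The custom load balancer would consult `healthy_ports()` before issuing a redirect, so that clients are only sent to instances that have reported in recently.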

The following graphic shows the conceptual flow of the pattern:

Figure 1 – Custom load balancing pattern overview

This is certainly not the only way to solve this challenge, but after reviewing and trying out several approaches, this pattern was found to be very useful for the described requirements. Using IIS Application Request Routing (ARR) is one alternative that can be considered. In comparison, ARR would provide the following:

  • Proxy-based routing with HTTP forwarding instead of HTTP redirect
    The internal forwarding of HTTP requests is extremely efficient and much faster than an HTTP redirect. For apps with a high rate of requests and no need to keep the connection open, ARR will provide a faster experience. In the case of Stratocast the HTTP redirect overhead is negligible since it happens only at the beginning; once the connection is established it stays open and there is no need to route additional requests for that connection.
    On the other hand, running IIS ARR introduces some extra cost in terms of resources on the server (which was avoided in the case of Stratocast).
  • Health monitoring
    ARR provides predefined ways to monitor the health of the servers. In the case of Stratocast, the app monitors the health of servers in a custom way and manages the list of available servers using a custom Azure storage table.
  • URL rewrite
    URL rewrite makes it possible to hide complexity and internal details from the client; the requesting client will never see rewritten URLs. This aspect was not relevant for Stratocast, but it might be a helpful feature for your application.

For a detailed list of ARR features you can take a look at Using the Application Request Routing Module.

In the case of Stratocast, due to the very dynamic nature of the routing information, an ARR-based solution appeared to be more complex in terms of deployment automation, configuration and management.

In summary, the described pattern includes the following elements:

  • Configure the compute role to use a public load balanced endpoint as well as direct port endpoints that will be used to directly communicate with the role instances.
  • Allow initial requests to be load balanced through the Azure load-balancer.
  • One of the instances will perform the custom load balancing to ensure appropriate distribution of the load and "sticky" connections. We provided an example of the requirements for this; however, the routing logic will be specific to each application.
  • After the "right" server component is identified, the client will be redirected to that instance of the service using the instance input port.
  • The client will establish a session that will stay open for a longer period of time until it is closed.

In this blog post, using Stratocast as an example, we described a pattern for implementing custom load balancing in Windows Azure. We also discussed which requirements led to this design and what should be considered when evaluating potential alternatives such as IIS ARR.

Comments (1)

  1. As per your blog " Azure will automatically assign one port in the range to each instance. This endpoint can be used for direct communication with the instance – Azure does not load balance requests coming on the instance input port."

    What happens to a request for a specific instance port and that instance is down ? Does the LB handle this situation gracefully ?


