I recently came across a case in which a customer reported that the load balancer for his cloud service was not distributing requests in a round-robin fashion. He had noticed that a few instances were receiving many more requests than the others. To analyse the issue, we collected the IIS logs from all of the instances (the customer had 7) and parsed them. The logs confirmed that 2 instances were receiving far fewer requests than the other 5.
How does the load balancer work?
By default, the load balancer uses a 5-tuple hash (source IP, source port, destination IP, destination port, protocol type) to map traffic to the available servers. It provides stickiness only within a transport session: packets in the same TCP or UDP session are directed to the same datacenter IP (DIP) instance behind the load-balanced endpoint. When the client closes and reopens the connection, or starts a new session from the same source IP, the source port changes and the traffic may go to a different DIP endpoint.
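To make the 5-tuple idea concrete, here is a minimal Python sketch. It is only an illustration: the hash function Azure actually uses is not described in this post, so SHA-256 stands in for it, and the function name `pick_instance`, the instance names, and the IP addresses are all hypothetical.

```python
import hashlib

# Hypothetical names for the 7 role instances (DIPs) behind the endpoint.
INSTANCES = [f"instance_{i}" for i in range(7)]


def pick_instance(src_ip, src_port, dst_ip, dst_port, protocol, instances=INSTANCES):
    """Map a 5-tuple to one instance.

    SHA-256 is an assumption used for illustration only; it is not
    the hash the real load balancer uses.
    """
    key = f"{src_ip}|{src_port}|{dst_ip}|{dst_port}|{protocol}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(instances)
    return instances[index]


if __name__ == "__main__":
    # Same 5-tuple (same TCP session): always the same instance.
    first = pick_instance("203.0.113.10", 50001, "198.51.100.5", 80, "TCP")
    again = pick_instance("203.0.113.10", 50001, "198.51.100.5", 80, "TCP")
    print(first, again)

    # New connections from the same client get new source ports,
    # so they can hash to different instances.
    for port in range(50002, 50007):
        print(port, pick_instance("203.0.113.10", port, "198.51.100.5", 80, "TCP"))
```

The point of the sketch is that the source port is the only part of the tuple that changes between connections from one client, so a client that never opens a new connection never gets a chance to land on a different instance.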
If most of the load goes to a single instance, the most common cause is a test client that creates a TCP connection once and keeps reusing it. The Azure load balancer does round-robin load balancing for new incoming TCP connections, not for new incoming HTTP requests. When a client makes its first request to the cloudapp.net URL, the load balancer sees an incoming TCP connection and routes it to the next instance in the rotation, and the TCP connection is then established between the client and that instance. Depending on the client application, subsequent HTTP traffic from that client may go over the same TCP connection or over a new one.
To spread traffic across other Azure role instances, the client must close the TCP connection and establish a new one. Load balancing individual HTTP requests would mean tearing down existing TCP connections and creating new ones. Establishing a TCP connection is relatively resource intensive, so reusing the same connection for subsequent HTTP requests is an efficient use of the channel.
If the client application is modified to open new TCP connections rather than send all of its HTTP requests over one connection (for example, by using multiple browser instances on the same client machine), the new connections will be spread across the Azure instances in a round-robin fashion.
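The difference between "HTTP requests" and "TCP connections" can be observed locally. The sketch below is not tied to Azure in any way: it starts a throwaway HTTP/1.1 server on 127.0.0.1 using Python's standard library, sends the same number of requests with and without connection reuse, and counts how many TCP connections the server accepted. The names `CountingHandler` and `count_connections` are made up for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # HTTP/1.1 keeps connections alive by default
    connections = 0
    lock = threading.Lock()

    def setup(self):
        # setup() runs once per accepted TCP connection, not once per request.
        super().setup()
        with CountingHandler.lock:
            CountingHandler.connections += 1

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def count_connections(request_count, reuse):
    """Send request_count GETs and return how many TCP connections the server saw."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    conn = None
    try:
        for _ in range(request_count):
            if conn is None or not reuse:
                conn = http.client.HTTPConnection(host, port, timeout=5)
            conn.request("GET", "/")
            conn.getresponse().read()
            if not reuse:
                conn.close()  # force a new TCP connection for the next request
        if conn is not None:
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return CountingHandler.connections


if __name__ == "__main__":
    print("reused connection:", count_connections(5, reuse=True))
    print("new connection per request:", count_connections(5, reuse=False))
```

With reuse, five requests arrive over a single connection, so a load balancer in front of this server would have made only one routing decision. Without reuse, it would have made five.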
So depending on how the clients are establishing TCP connections, requests may be routed to the same instance.
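A small simulation shows how this produces the kind of skew seen in the IIS logs. It takes the round-robin-per-connection behaviour described above at face value, and the request counts are invented for illustration. They are not the customer's numbers.

```python
def distribute(connections, instance_count):
    """Assign each TCP connection to the next instance in rotation.

    connections: list of HTTP request counts, one entry per TCP
    connection, in arrival order. Returns requests seen per instance.
    """
    per_instance = [0] * instance_count
    for i, requests_on_connection in enumerate(connections):
        per_instance[i % instance_count] += requests_on_connection
    return per_instance


if __name__ == "__main__":
    # 7 clients, each holding one persistent connection:
    # 5 busy clients (200 requests each) and 2 quiet ones (10 each).
    persistent = [200, 200, 200, 200, 200, 10, 10]
    print(distribute(persistent, 7))

    # The same 1020 requests, but every request on its own connection.
    one_per_request = [1] * sum(persistent)
    print(distribute(one_per_request, 7))
```

In the first case the instances see 200, 200, 200, 200, 200, 10 and 10 requests, even though the connections themselves were distributed perfectly evenly. In the second case each instance sees 145 or 146 requests.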
By default, BasicHttpBinding sends a Connection HTTP header with a Keep-Alive value in its messages, which enables clients to establish persistent connections to services that support them. This configuration improves throughput because previously established connections can be reused to send subsequent messages to the same server.
However, connection reuse may cause clients to become strongly associated with a specific server within the load-balanced farm, which reduces the effectiveness of round-robin load balancing. If this behavior is undesirable, HTTP Keep-Alive can be disabled on the server using the KeepAliveEnabled property with a CustomBinding or user-defined Binding. The following example shows how to do this using configuration.
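<?xml version="1.0" encoding="utf-8" ?>
<configuration><system.serviceModel>
  <protocolMapping><add scheme="http" binding="customBinding" /></protocolMapping>
  <!-- Configure a CustomBinding that disables keepAliveEnabled-->
  <bindings><customBinding><binding><httpTransport keepAliveEnabled="false" /></binding></customBinding></bindings>
</system.serviceModel></configuration>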
Note: This applies to a WCF application whose services are hosted in the cloud and used by multiple clients.
Load Balancing - https://msdn.microsoft.com/en-us/library/ms730128(v=vs.110).aspx