The suggestions in this blog post come from many reviews of IIS server farms for performance-related issues. Analyzing the IIS logs from those farms revealed patterns in how a load balancer can create additional load on the farm and degrade the performance of the web applications running on the IIS servers.
A hardware or software load balancer is an essential component for an IIS web farm. The load balancer provides scalability to a web application by properly distributing the load to the servers in the farm and high availability by allowing an IIS server to be taken offline in a controlled manner without jeopardizing the user's session. There are several configuration settings for a load balancer that can impact the IIS environment and here are a few suggestions.
Load Balancer Health Check Request Configuration:
The health check request feature in a load balancer is essential in determining if a web application and/or server is online. The health check request configuration may vary per load balancer manufacturer, but the principles are the same. The health check request feature is configured to request a page within the web application and expects an HTTP response code, usually a 200, or a value within the HTML markup to indicate success.
All too often, I have seen the health check request set to the root of a web site, or to the IP address of the IIS server, rather than to a specific page. You can confirm this by running a Log Parser query for all requests where cs-uri-stem = '/' on the website. Sometimes, when the root of a web application is requested without a specific page, the IIS server returns a 403.14 (Forbidden: directory listing denied) error. This error occurs when the website has no default document and directory browsing is disabled, so IIS refuses to list the directory's files in the browser because doing so would expose the site's contents. This can burden a server handling a heavy load, as IIS must iterate the default document list for the website, scan the hard drive for the pages, determine a default document is not present, log the error, and return the error code to the requestor.
Specify a page to call within the website and search for text within the page to ensure the page is properly returned by IIS. This confirms that the Windows Process Activation Service (WAS) is running and that the most basic operation, returning a static HTML page, works, and it prevents the 403.14 HTTP error from being logged.
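To make the recommendation concrete, here is a minimal Python sketch of what a load balancer's probe effectively does: request a specific page and accept it only if the status is 200 and an expected marker string is present. The function names, the URL, and the marker text are hypothetical; a real load balancer implements this in its own configuration.

```python
from urllib.request import urlopen
from urllib.error import URLError


def is_healthy(status, body, marker):
    """Return True only for an HTTP 200 whose body contains the marker text."""
    return status == 200 and marker in body


def probe(url, marker, timeout=15.0):
    """Request a specific page (not the site root) and check its content."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return is_healthy(resp.status, body, marker)
    except (URLError, OSError):
        # HTTP errors such as 403, refused connections, and timeouts
        # all count as a failed health check.
        return False


# Hypothetical usage; the host, page, and marker are placeholders:
# probe("http://web1.example.com/health.htm", "SERVER_OK")
```

Checking for marker text as well as the status code guards against a 200 response that carries an error page instead of the expected content.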
See the Log Parser queries at the end of this post to determine the traffic created by the load balancer.
For a more advanced setup, see the next recommendation.
Health Check Page to Request on IIS:
Most IIS servers host dynamic content through an ISAPI extension, such as ASP.NET, to return data from a web service or database. To effectively monitor a web application hosting dynamic content, add a very basic ASP.NET page with simple code that executes and returns a value in the page, which the load balancer reads to confirm the web application is responsive. This exercises the ASP.NET compiler, and the request moves past the static file handler into the ISAPI extension and the application pool, making it a deeper test than just returning a static page.
Take care if the code in the health check page calls into a database or web service, because the load balancer's requests could overwhelm those resources during burst load periods. It is best to coordinate with the development team to determine the best way to test that the web application is responsive without impacting the application.
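One possible way to limit that risk, offered as an illustration and not as a finding from the reviews above, is to cache the result of the expensive dependency check for a short time so that each probe does not hit the database. The sketch is in Python purely for brevity, and the `CachedCheck` class and its parameters are hypothetical names. The same idea would need to be adapted to the actual health check page.

```python
import time


class CachedCheck:
    """Run an expensive dependency check at most once per `ttl` seconds."""

    def __init__(self, check, ttl=30.0, clock=time.monotonic):
        self._check = check        # callable returning True when healthy
        self._ttl = ttl            # how long a result is reused, in seconds
        self._clock = clock        # injectable clock, useful for testing
        self._last_result = None
        self._last_time = 0.0

    def status(self):
        now = self._clock()
        expired = now - self._last_time >= self._ttl
        if self._last_result is None or expired:
            try:
                self._last_result = bool(self._check())
            except Exception:
                # A failing dependency check means "unhealthy", not a crash.
                self._last_result = False
            self._last_time = now
        return self._last_result
```

With a 30-second cache, a load balancer probing every 5 seconds would trigger roughly one real dependency check in six, at the cost of reporting a dependency failure up to 30 seconds late.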
Frequency of Requests for the Health Check Page:
There is no specific formula to determine the exact setting for the health check request frequency, and most load balancers will offer to run the health check request every 5 seconds. It is easy to assume a 5-second request interval is the right choice for the web site, as you don't want the users of your web application to see the server offline. However, being too aggressive with this setting can cause unnecessary load on the server by turning the load balancer into the largest consumer of the website instead of a monitoring utility. If the load balancer is issuing more than 20% of the total requests on the site during business hours, it is time to consider a less aggressive health check interval.
The goal is to ensure the load balancer is not adding excessive load to the web site and worsening the web site's response time when it should just be monitoring that the web site is online and responsive. This requires analyzing the IIS logs to determine the user load patterns (peak, burst, normal) and setting the interval accordingly.
Generally, a 30-second health check interval with a 15-second timeout is sufficient for most web applications. Again, this depends on the organization, the criticality of the web site, and the interval the organization feels comfortable with. The point is not to overload the web application with requests from the load balancer to the point where they impede the web site's ability to service requests from users.
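As a rough back-of-the-envelope check of the 20% guideline, the sketch below estimates how many probes one load balancer sends per hour at a given interval and what share of total traffic that represents. The figure of 2,500 user requests per hour is a made-up example, and the function names are hypothetical. Real numbers should come from the IIS logs.

```python
def probes_per_hour(interval_seconds, load_balancers=1):
    """Health check requests per hour, per server, at a fixed interval."""
    return 3600 / interval_seconds * load_balancers


def load_balancer_share(probe_requests, user_requests):
    """Fraction of all requests that came from the load balancer."""
    total = probe_requests + user_requests
    return probe_requests / total if total else 0.0


user_requests_per_hour = 2500  # hypothetical business-hours volume
for interval in (5, 30):
    probes = probes_per_hour(interval)
    share = load_balancer_share(probes, user_requests_per_hour)
    print(f"{interval:>2}s interval: {probes:.0f} probes/hour, {share:.1%} of traffic")
```

Under these assumed numbers, a 5-second interval produces 720 probes per hour, about 22.4% of the traffic, while a 30-second interval produces 120 probes per hour, about 4.6%.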
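The trade-off for a longer interval is slower failure detection, which can be estimated with simple arithmetic. The sketch below assumes probes start at a fixed interval, that a failed server hangs until the timeout expires, and that the load balancer marks a server down after a configurable number of consecutive failures. These are assumptions, since actual behavior varies by load balancer, and the function name is hypothetical.

```python
def worst_case_detection_seconds(interval, timeout, failures_required=1):
    """Longest time between a server failing and being marked down.

    Worst case: the server fails just after a probe succeeded, so the
    first failing probe starts one full interval later, and the final
    failing probe takes the full timeout to be declared failed.
    """
    if timeout > interval:
        raise ValueError("this estimate assumes timeout <= interval")
    return failures_required * interval + timeout


print(worst_case_detection_seconds(30, 15))     # single failed probe
print(worst_case_detection_seconds(30, 15, 3))  # three consecutive failures
```

With a 30-second interval and a 15-second timeout, the estimate is 45 seconds for a single failed probe and 105 seconds if three consecutive failures are required.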
Is the Web Application Really Load Balanced:
Yes, having a load balancer in the environment is the first step in creating a highly available web farm. However, if the load balancer is set to use persistent (sticky) sessions for a web application, the web application is not truly load balanced, because each user's session is locked onto the same server instead of the load balancer choosing the server with the least activity. This can become an issue if a web application requires in-proc session state and, during a burst load period, a set of users runs highly intensive or long-running requests while the load balancer continues to send those users to the same IIS server. The web application's request queue can back up, and the users could experience timeouts and other issues. A web application must use out-of-proc session state, or no session state, for the load balancer to distribute requests across the servers efficiently and help the web application scale. As an IIS admin, always ask the development team whether the web application can use out-of-proc session state, and set up the load balancer to distribute load without sticky sessions.
It is also a good idea to understand the type of session state the application uses (SQL Server or the ASP.NET State Service) and where the session state service is located, in case that server needs to be rebooted, and how a reboot could impact the IIS farm.
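The effect of sticky sessions can be illustrated with a toy comparison. In the sketch below, every request carries a made-up "cost", two heavy users happen to be pinned to the same server, and the alternative policy sends each request to the server with the least accumulated cost. That policy is a simplification of a least-connections algorithm, and all names and numbers are hypothetical.

```python
def assign_sticky(requests, servers, pinned):
    """Every request from a user goes to the server that user is pinned to."""
    load = {s: 0 for s in servers}
    for user, cost in requests:
        load[pinned[user]] += cost
    return load


def assign_least_loaded(requests, servers):
    """Each request goes to whichever server has the least work so far."""
    load = {s: 0 for s in servers}
    for _user, cost in requests:
        target = min(servers, key=lambda s: load[s])
        load[target] += cost
    return load


servers = ["web1", "web2"]
pinned = {"alice": "web1", "bob": "web1", "carol": "web2"}
requests = [("alice", 10), ("bob", 10), ("carol", 1)] * 2

print(assign_sticky(requests, servers, pinned))   # {'web1': 40, 'web2': 2}
print(assign_least_loaded(requests, servers))     # {'web1': 21, 'web2': 21}
```

With sticky sessions, web1 absorbs nearly all of the work while web2 sits idle. Without them, the same requests split evenly, which is only possible when the application does not depend on in-proc session state.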
Log Parser Queries:
Determine the total number of requests from the load balancer on the server (replace xx.xxx.xx.xxx with the load balancer's IP address):
Logparser "SELECT count(c-ip) AS Count from *.Log where c-ip = 'xx.xxx.xx.xxx' GROUP BY c-ip ORDER BY count(c-ip) DESC" -i:IISW3C -o:datagrid
Number of requests per hour to the site root ("/"), grouped by status code, which typically reveals load balancer health checks aimed at the root:
Logparser "SELECT Date, TO_LOCALTIME(QUANTIZE(TO_TIMESTAMP(date,time),3600)) as TheHour, COUNT(*) as ReqCount, sc-status from C:\Logs\*.log WHERE cs-uri-stem = '/' GROUP BY Date, TO_LOCALTIME(QUANTIZE(TO_TIMESTAMP(date,time),3600)), sc-status ORDER BY [ReqCount] DESC" -o:datagrid -i:iisw3c
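Where Log Parser is not available, the first query can be approximated with a short Python script. This is a sketch, not a replacement for Log Parser. It assumes W3C-format logs with a `#Fields:` header that includes `c-ip`, and the sample log lines, IP addresses, and function names are made up for illustration.

```python
from collections import Counter


def count_by_client(lines):
    """Count requests per client IP in IIS W3C log lines."""
    fields = []
    counts = Counter()
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names the space-separated columns that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        counts[row.get("c-ip", "unknown")] += 1
    return counts


def client_share(counts, client_ip):
    """Fraction of all logged requests that came from one client IP."""
    total = sum(counts.values())
    return counts[client_ip] / total if total else 0.0


sample = """#Fields: date time cs-uri-stem c-ip sc-status
2015-03-01 09:00:00 /health.htm 10.0.0.1 200
2015-03-01 09:00:02 /default.aspx 192.168.1.20 200
2015-03-01 09:00:05 /health.htm 10.0.0.1 200
2015-03-01 09:00:07 /orders.aspx 192.168.1.21 200
""".splitlines()

counts = count_by_client(sample)
print(counts["10.0.0.1"], client_share(counts, "10.0.0.1"))  # 2 0.5
```

In this made-up sample the load balancer at 10.0.0.1 accounts for half of the requests, well above the 20% guideline discussed earlier.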