Why clicking a hyperlink can result in multiple requests to web server(s)


Question:


Dear Experts,


If you could share your opinions or point me to some reference links, I would appreciate it!


My web server: IIS and SharePoint Portal server 2003


My question is: after I click one hyperlink in a page (ASP.NET site) using IE, why are there 3 requests recorded in the IIS log file?


Example:
2006-04-25 20:16:54 (IP) GET /_layouts/1033/Viewer.aspx contentId=acbdef.htm 80 - (IP) Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.2;+SV1;+.NET+CLR+1.1.4322) SITESERVER=ID=(server_id) http://localhost/Manager.aspx 401 2 2148074254 1912 15


2006-04-25 20:17:02 (IP) GET /_layouts/1033/Viewer.aspx contentId=acbdef.htm 80 - (IP) Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.2;+SV1;+.NET+CLR+1.1.4322) SITESERVER=ID=(server_id) http://localhost/Manager.aspx 401 1 0 2148 15


2006-04-25 20:17:20 (IP) GET /_layouts/1033/Viewer.aspx contentId=acbdef.htm 80 (domain)\(my_id) (IP) Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.2;+SV1;+.NET+CLR+1.1.4322) SITESERVER=ID=(server_id) http://localhost/Manager.aspx 302 0 0 835 17671


Thank you very much!


Answer:


The IIS log file contains three entries because the browser actually made three requests on behalf of your single click of a hyperlink. Modern browsers are “smart” and do many things “behind the scenes” to ease the end-user experience.


In other words, you cannot assume that every click of a hyperlink translates into exactly one request to a web server.


Here are some examples of situations where you make one logical click of a hyperlink but the web browser makes multiple requests to web server(s) to satisfy it.


Consider what happens when you click on a link to…


An HTML page which includes links to Pictures


You may have only clicked on the link to the HTML page, but the browser parses the HTML page, notices 10 pictures in IMG tags, makes 10 additional requests to web server(s) to retrieve those resources, and finally renders the HTML page composed with those pictures.


The same happens for other resources that an HTML page references, such as CSS stylesheets and client-side JavaScript, because the browser must retrieve those URLs to properly render the page.


Clearly, to satisfy your single click of a hyperlink pointing to a single HTML page, the browser has to make MANY requests on your behalf to give you the user experience of a properly rendered HTML page.
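
To make that fan-out concrete, here is a minimal Python sketch that counts the URLs a browser would have to request for one page. The sample HTML, the class name SubresourceCounter, and the helper count_requests are all made up for illustration, and a real browser looks at far more tags and attributes than the three handled here.

```python
from html.parser import HTMLParser


class SubresourceCounter(HTMLParser):
    """Collect URLs a browser would fetch in addition to the page itself."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Pictures and client-side scripts are referenced through "src".
        if tag in ("img", "script") and attrs.get("src"):
            self.urls.append(attrs["src"])
        # Stylesheets are referenced through <link rel="stylesheet" href=...>.
        elif tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            self.urls.append(attrs["href"])


# Hypothetical page: one click on its link, several follow-up requests.
SAMPLE_PAGE = """
<html><head>
<link rel="stylesheet" href="/site.css">
<script src="/menu.js"></script>
</head><body>
<img src="/logo.gif"><img src="/banner.jpg">
</body></html>
"""


def count_requests(html):
    """Return (total requests including the page itself, extra URLs)."""
    parser = SubresourceCounter()
    parser.feed(html)
    return 1 + len(parser.urls), parser.urls


total, extra = count_requests(SAMPLE_PAGE)
print(total, extra)  # 5 requests for a single click on this sample page
```

Each of those extra URLs would show up as its own entry in the web server log, even though the user only clicked once.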


An HTML page that results in a 302 Redirection


You may have only clicked on the link which causes a 302 redirection response to be returned, but do you expect the browser to:



  1. Ask you whether to follow the 302 redirection?
  2. Automatically make a new request to follow the 302 redirection, assuming it is “safe”, and keep following 302 redirections automatically until a non-redirection response is retrieved?

Most browser implementations default to #2, which means that for a single click, the web browser may automatically make MANY requests on your behalf, transparently, to resolve the chain of redirections.


Users expect browsers to follow redirections and render the final result, not repeatedly ask for permission to follow redirections and not render the end result.
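
As a rough sketch of option #2, the following Python snippet follows a chain of 302 responses against a canned table instead of a real server. The URLs in FAKE_SERVER, the function name follow_redirects, and the hop limit of 10 are assumptions made for the example; they do not describe any particular browser.

```python
# Hypothetical canned responses: URL -> (status, Location header or None).
FAKE_SERVER = {
    "/old": (302, "/moved"),
    "/moved": (302, "/final"),
    "/final": (200, None),
}


def follow_redirects(url, max_hops=10):
    """Return (URLs requested, final status) for one 'click' on url."""
    requested = []
    for _ in range(max_hops):
        requested.append(url)
        status, location = FAKE_SERVER[url]
        if status != 302:
            return requested, status
        url = location  # the browser silently issues the next request
    # Arbitrary safety limit so a redirect loop cannot run forever.
    raise RuntimeError("too many redirects")


chain, final_status = follow_redirects("/old")
print(chain, final_status)  # three requests, ending in a 200
```

One click on /old therefore produces three log entries on the server side: two 302 responses and the final 200.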


An HTML page that requires Authentication


You may have only clicked on the link which requires authentication, but do you expect the browser to:



  1. Show you the 401 response since the URL requires authentication?
  2. Immediately show you the username/password dialog?
  3. Depending on security settings, either automatically attempt authentication via acceptable protocols to the web server or show the username/password dialog, correctly authenticate to the web server, and retrieve and display the secured resource?

Most browser implementations default to #3, which means that for a single click of a hyperlink to a resource that requires authentication, the web browser may automatically make additional requests and interpret their responses to negotiate authentication with the web server in reaction to getting a 401 response with WWW-Authenticate headers.


Users expect browsers to eliminate as many of the username/password popup dialogs as possible by automatically making additional requests to authenticate and retrieve the resource for them wherever it makes configured security sense… and not just repeatedly ask for user credentials to access resources.
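
The difference between options #1 and #3 can be sketched in a few lines of Python. This is a toy model only: fake_server, browser_get, the "Negotiate" scheme string, and the "valid-token" credential are invented for the example, and the exchange is collapsed to a single retry, whereas a real Integrated Authentication handshake can take more round trips.

```python
def fake_server(path, authorization=None):
    """Toy server returning (status, headers). Not a real NTLM/Kerberos exchange."""
    if authorization == "Negotiate valid-token":
        return 200, {}
    # Anonymous or bad credentials: challenge the client.
    return 401, {"WWW-Authenticate": "Negotiate"}


def browser_get(path, can_auto_authenticate):
    """Return the list of status codes the server would log for one click."""
    statuses = []
    status, headers = fake_server(path)  # the first request is anonymous
    statuses.append(status)
    if status == 401 and can_auto_authenticate and "WWW-Authenticate" in headers:
        # The security settings allow it, so retry with credentials
        # using the scheme the server asked for.
        scheme = headers["WWW-Authenticate"]
        status, headers = fake_server(path, f"{scheme} valid-token")
        statuses.append(status)
    return statuses


print(browser_get("/secure", can_auto_authenticate=True))   # [401, 200]
print(browser_get("/secure", can_auto_authenticate=False))  # [401]
```

With automatic authentication allowed, the user sees only the page, while the server log shows the 401 followed by the successful request.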


Conclusion


In your case, you are observing the fact that:



  1. The /_layouts/1033/Viewer.aspx resource requires authentication. (In your log entries, /Manager.aspx is the Referer, meaning the page that contained the link you clicked.)
  2. Thus, when you clicked on the link, the browser made an anonymous request to the server and got a 401.2 response back.

    Making an anonymous request is the default behavior for a browser and makes sense. A browser has no way to know WHAT authentication protocols a given website requires BEFORE making that first request (software cannot tell the future… yet 😉 ), so the most obvious choice is to make an anonymous request and see what the web server does in response.
  3. Since the website is in a security zone that allows the web browser to automatically authenticate, the web browser attempts to authenticate using the protocol(s) specified by the web server. My guess is that only Integrated Authentication is enabled on the website containing /_layouts/1033/Viewer.aspx, so you see the second leg of that protocol negotiation sequence and a 401.1 response.
  4. Finally, the client successfully authenticates with IIS and IIS starts to execute the /_layouts/1033/Viewer.aspx resource (no more 401 errors logged), but that resource ends up sending a 302 redirection, so a 302 response is sent back to the client. At this point, the browser probably automatically follows that redirection to something else, and that process continues.
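
If you want to pull that sequence out of the log yourself, a small Python sketch like the one below will do. It uses the three entries from the question, each joined onto one line. The field positions are an assumption based on how these sample lines read, and the helper name summarize is made up; a real parser should take the positions from the #Fields: header of the log file.

```python
# Shared pieces of the three sample entries, split out to keep lines short.
UA = "Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.2;+SV1;+.NET+CLR+1.1.4322)"
PREFIX = "(IP) GET /_layouts/1033/Viewer.aspx contentId=acbdef.htm 80"
SUFFIX = f"(IP) {UA} SITESERVER=ID=(server_id) http://localhost/Manager.aspx"

LOG_LINES = [
    f"2006-04-25 20:16:54 {PREFIX} - {SUFFIX} 401 2 2148074254 1912 15",
    f"2006-04-25 20:17:02 {PREFIX} - {SUFFIX} 401 1 0 2148 15",
    f"2006-04-25 20:17:20 {PREFIX} (domain)\\(my_id) {SUFFIX} 302 0 0 835 17671",
]


def summarize(line):
    """Extract time, user name, and status.substatus from one entry.

    Indexes 1, 7, 12, and 13 are assumed positions for these sample lines.
    """
    fields = line.split()
    return {
        "time": fields[1],
        "user": fields[7],  # "-" means the request was anonymous
        "status": f"{fields[12]}.{fields[13]}",
    }


for entry in LOG_LINES:
    print(summarize(entry))
```

Read top to bottom, the output shows two unauthenticated attempts (401.2, then 401.1) followed by an authenticated request that returns the 302.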

Hopefully, you now see that modern browsers do many things transparently to make the end user’s life simple and easy. Logical actions that you perform with a single click, like browsing a normal HTML page with pictures, accessing an authenticated URL, or following a 302 redirection, all require additional requests and negotiation logic that the browser performs on your behalf.


Thus, one may be surprised to see that a single logical click of a hyperlink results in multiple requests to servers… but I can only say that it is perfectly normal. :-)


//David

Comments (4)

  1. P.L. says:

    Thanks for your detailed explanation. It is very clear to me, so far I know what happened on my web server.

  2. David Wang says:

    I finally have enough blog entries about various portions of IIS6 request processing that I can stitch…

  3. FrankT says:

    My question is this: If you want to redirect to a logon URL inside of an ISAPI filter, and the logon page that you are redirecting to has stylesheets, JS file includes, etc. that cause the other HTTP GETs you mentioned above, how do you not get into an infinite loop? When my ISAPI filter sees the first request, it redirects to the logon page, but the logon page fetches images, stylesheets and some JS files, which once again executes the ISAPI filter and redirects to the URL again… and again… and again… Is there any variable in the server variables collection, etc. that can tell me which "original URL" the other GETs are related to, so that I can avoid this infinite loop problem? I hope this makes sense…

    Thanx in advance.

  4. Ken says:

    Is there any way to cut those multiple requests by web browsers down to one single request? I've experienced this as well, but my case is worse than that of the person who asked the question. Many users are downloading big files (100MB) from my HTTP web server, and instead of 1 request, every user makes about 5. Every single bit of data is transferred 5 times, by GET command, not HEAD. To download 100MB of data, 400MB of additional bandwidth is wasted for nothing, per user. This happens on IE only; FF and Chrome don't do this. IE might think it is smart, but I don't :(