Bugs in IE8’s Lookahead Downloader


All bugs mentioned in this post are now fixed. 


Internet Explorer has a number of features designed to render pages more quickly. One of these features is called the “Lookahead Downloader”; it quickly scans the page as it comes in, looking for the URLs of resources which will be needed later in the rendering of the page (specifically, JavaScript files). The lookahead downloader runs ahead of the main parser and is much simpler: its sole job is to hunt for those resource URLs and get requests into the network request queue as quickly as possible. These download requests are called “speculative downloads” because it is not known whether the resources will actually be needed by the time that the main parser reaches the tags containing the URLs. For instance, inline JavaScript runs during the main rendering phase, and such script could (in theory) actually remove the tags which triggered the speculative downloads in the first place. However, this “speculative miss” corner case isn’t often encountered, and even if it happens, it’s basically harmless: the speculative request simply downloads a file which is never used.


IE8 Bugs and their impact
Unfortunately, since shipping IE8, we’ve discovered two problems in the lookahead downloader code that cause Internet Explorer to make speculative requests for incorrect URLs. Generally this has no direct impact on the visitor’s experience, because when the parser actually reaches a tag that requires a subdownload, if the speculative downloader has not already requested the proper resource, the main parser will at that time request download of the proper resource. If your page encounters one of these two problems, typically:



  • The visitor will not notice any problems like script errors, etc

  • The visitor will have a slightly slower experience when rendering the page because the speculative requests all “miss”

  • Your IIS/Apache logs will note requests for non-existent or incorrect resources

If your server is configured to respond in some unusual way (e.g. logging the user out) upon request of a non-existent URL, the impact on your user-experience may be more severe.


The BASE Bug

Update: The BASE bug is now fixed.

The first problem is that the speculative downloader “loses” the <BASE> element after its first use. This means that if your page at URL A contains a tag sequence as follows:



<html><head><base href=B><script src=relC><script src=relD><script src=relE><body>


which requests 3 JavaScript files from the path specified in “B”, IE8’s speculative downloader will incorrectly request download of URLs “B+relC”, “A+relD”, and “A+relE”. Correct behavior is to request download of URLs “B+relC”, “B+relD”, and “B+relE”. Hence, in this case, two incorrect requests are sent, usually resulting in 404s from the server. Of course, when the main parser gets to these script tags, it will determine that “B+relC” is already available, but “B+relD” and “B+relE” have not yet been requested, and it will request those two correct URLs and complete rendering of the page.
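To make the difference concrete, here is a minimal Python sketch of the two behaviors described above. It is only a model of the description in this post, not IE's actual code; the function names and the example.com URLs are made up for illustration.

```python
from urllib.parse import urljoin


def correct_requests(page_url, base_href, script_srcs):
    """Resolve every script URL against the BASE href, as the main parser does."""
    base = urljoin(page_url, base_href)
    return [urljoin(base, src) for src in script_srcs]


def buggy_lookahead_requests(page_url, base_href, script_srcs):
    """Model of the bug as described: BASE is honored once, then 'lost'."""
    base = urljoin(page_url, base_href)
    urls = []
    for index, src in enumerate(script_srcs):
        # Only the first script sees the BASE href; later ones fall back
        # to the URL of the page itself (URL "A" in the text above).
        effective_base = base if index == 0 else page_url
        urls.append(urljoin(effective_base, src))
    return urls


# Hypothetical page "A", base "B", and three relative script URLs.
page = "http://example.com/app/page.html"
base = "http://example.com/static/"
scripts = ["relC.js", "relD.js", "relE.js"]

for url in buggy_lookahead_requests(page, base, scripts):
    print(url)
```

With these made-up inputs, only the first speculative request lands under /static/; the other two are resolved against the page's own directory, which is where the stray 404s come from.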


At present, there is no simple workaround for this issue. Technically, the following syntax will result in proper behavior:



 <html><head><base href=B><script src=relC><base href=B><script src=relD><base href=B><script src=relE><body>


…but this is not standards-compliant and is not recommended. If the page removes its reliance upon the BASE tag, the problem will no longer occur.

Remember: The BASE bug is now fixed.


The Missing 4k Bug


Update: The 4k bug is now fixed. 


The second problem is significantly more obscure, although a number of web developers have noticed it and filed a bug on Connect. Basically, the problem here is that there are a number of tags which will cause the parser and lookahead downloader to restart scanning of the page from the beginning. One such tag is the META HTTP-EQUIV Content-Type tag which contains a CHARSET directive. Since the CHARSET specified in this tag defines what encoding is used for the page, the parser must restart to ensure that it is parsing the bytes of the page in the encoding intended by the author. Unfortunately, IE8 has a bug where the restart of the parser may cause incorrect behavior in the Lookahead downloader, depending on certain timing and network conditions.


The incorrect behavior occurs if your page contains a JavaScript URL which spans exactly the 4096th byte of the HTTP response. If such a URL is present, under certain timing conditions the lookahead downloader will attempt to download a malformed URL consisting of the part of the URL preceding the 4096th byte combined with whatever text follows the 8192nd byte, up to the next quotation mark. Web developers encountering this problem will find that their logs contain requests for bogus URLs with long strings of URLEncoded HTML at the end.
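If you want to check whether one of your pages is a candidate for this problem, a rough Python sketch like the one below may help. It assumes the 4096-byte boundary is counted in the response body, uses a deliberately simplified regular expression that only understands double-quoted src attributes, and rebuilds the bogus URL purely from the description above. All function names are hypothetical.

```python
import re

# Assumed buffer size; the text above only names the 4096th and 8192nd bytes.
BOUNDARY = 4096

# Simplified matcher for <script ... src="..."> with a double-quoted URL.
SCRIPT_SRC = re.compile(rb'<script\b[^>]*?\bsrc="([^"]*)"', re.IGNORECASE)


def spanning_script_url(body: bytes):
    """Return (start, end) offsets of a script URL straddling BOUNDARY, or None."""
    for match in SCRIPT_SRC.finditer(body):
        start, end = match.span(1)
        if start < BOUNDARY < end:
            return start, end
    return None


def simulated_bogus_url(body: bytes):
    """Rebuild the malformed URL as described: the part of the URL before the
    boundary, followed by whatever comes after twice the boundary, up to the
    next quotation mark."""
    span = spanning_script_url(body)
    if span is None:
        return None
    start, _ = span
    prefix = body[start:BOUNDARY]
    tail_start = 2 * BOUNDARY
    quote = body.find(b'"', tail_start)
    tail = body[tail_start:quote] if quote != -1 else body[tail_start:]
    return prefix + tail
```

Feed it the raw response body as bytes (for example, saved from a network capture). A result of None only means this simplified matcher found no script URL across the boundary; it is not proof that the page is unaffected.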


As with the previous bug, end users will not typically notice this problem, but examination of the IIS logs will show the issue.


For many instances of this bug, a workaround is available. The problem only appears to occur when the parser restarts, so by avoiding parser restarts, you can avoid the bug. By declaring the CHARSET of the page using the HTTP Content-Type header rather than specifying it within the page, you can remove one cause of parser restarts.


So, rather than putting



<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8">


in your HEAD tag, send the following HTTP response header instead:



Content-Type: text/html; charset=utf-8


Note that specification of the charset in the HTTP header results in improved performance in all browsers, because the browser’s parsers need not restart parsing from the beginning upon encountering the character set declaration. Furthermore, using the HTTP header helps mitigate certain XSS attack vectors.
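How you set the header depends on your server platform. As one illustration, here is a minimal sketch of a Python WSGI application that declares the charset in the HTTP header and leaves the META tag out of the markup. The page content and the name `app` are placeholders, not a recommendation for any particular stack.

```python
# Placeholder page: note there is no META Content-Type tag in the markup.
PAGE = "<html><head><title>Example</title></head><body>Hello</body></html>"


def app(environ, start_response):
    """WSGI application that declares the charset in the response header."""
    body = PAGE.encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

To try it locally, you could serve it with the standard library, e.g. `wsgiref.simple_server.make_server("", 8000, app).serve_forever()`, and confirm the header in a tool such as Fiddler.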


Unfortunately, however, suspension of the parser (e.g. when it encounters an XML Namespace declaration) can also result in this problem, and it’s not feasible for a web developer to avoid suspension of the parser.


But, remember: The 4k bug is now fixed. 


Summary
While these problems are significant, they are not so dire as some readers will conclude at first glance. The second bug, in particular, is quite rarely encountered due to its timing-related nature and the requirement that the page have a JavaScript URL spanning a particular byte in the response. The second issue is not nearly as prevalent as some web developers believe; for instance, we’ve heard claims that IE6, 7, and Firefox all have this problem, which is entirely untrue. Readers can easily determine if a page is hitting either bug by examining server logs, or watching network requests with Fiddler.
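For the log-examination route, a small script can do a first pass. The Python sketch below is a heuristic only: it assumes Apache-style access log lines (IIS W3C logs would need different field extraction), and it flags any requested path that decodes to text containing both "<" and ">", which matches the "URL-encoded HTML" symptom of the 4k bug but could also flag unrelated requests. The function names and the sample log lines are invented for illustration.

```python
import re
from urllib.parse import unquote

# Assumed Apache-style request field, e.g. "GET /path HTTP/1.1".
REQUEST_LINE = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+"')


def looks_like_encoded_html(path):
    """Heuristic: the decoded path contains both angle brackets."""
    decoded = unquote(path)
    return "<" in decoded and ">" in decoded


def suspicious_requests(log_lines):
    """Return the requested paths that look like URL-encoded HTML fragments."""
    hits = []
    for line in log_lines:
        match = REQUEST_LINE.search(line)
        if match and looks_like_encoded_html(match.group(1)):
            hits.append(match.group(1))
    return hits


# Invented sample lines, not taken from any real log.
sample = [
    '203.0.113.5 - - [10/Oct/2009:13:55:36 -0700] "GET /scripts/app.js HTTP/1.1" 200 2326',
    '203.0.113.5 - - [10/Oct/2009:13:55:37 -0700] "GET /scr%3C/div%3E%3Cdiv%20class= HTTP/1.1" 404 209',
]
for path in suspicious_requests(sample):
    print(path)
```

Requests caused by the BASE bug look like ordinary paths in the wrong directory, so this heuristic will not find them; for those, look for 404s on script names that exist elsewhere on the site.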


The IE team will continue our investigation into these bugs and, as with any reported issues, may choose to make available an IE8 update to resolve the issues.


Remember: All bugs mentioned in this post are now fixed. 


Apologies for the inconvenience, and thanks for reading!


-Eric

Comments (116)

  1. Scott S. says:

    Thank you for tracking this issue.  We are experiencing this regularly (approx once every 2 hours) on our high volume ASP.NET site.

    Outside of the charset change you mentioned, is there anything else web developers can do to prevent parser restarts in IE8?

    Any insight is appreciated.

  2. ieblog says:

    @ScottS: I assume you’re saying you hit the 4k issue, and not the "BASE" issue, as the latter will occur reliably on every request.

    If you can send me the HTML (or a network capture: http://www.fiddler2.com) of the affected page, I’ll have a look to see if there’s any other cause for the parser restart. Email me at microsoft.com, username ericlaw

  3. baltazzarr says:

    The second bug is an issue for all developers who align their code to 4096 bytes in order to make it run faster on Win98 :o)

  4. DmitryK says:

    Thank you Eric for such a comprehensive analysis. I emailed you a few weeks ago about this issue. This is great to see that you guys found the root cause of this problem that is indeed obscure.

  5. Tim McGookey says:

    Wow!  That Missing 4K bug seems to be exactly the problem I’ve been trying to solve on my high traffic ASP site.  I’ve spoken to Eric Lawrence during the IE8 expert chats, and am incredibly glad he was able to figure this guy out.

  6. Olivier Jaquemet says:

    Glad to see you tracked down this bug. Thanks for your detailed explanation on all this.

    Do you have an estimation on when the fix will be available ?

  7. Olivier Jaquemet says:

    Also beware that the problem can also occur for resources other than JavaScript.

    For example CSS and shortcut icon :

    <link rel="stylesheet" href="path-to-stylesheet.css" type="text/css" media="all" />

    <link rel="shortcut icon" href='path-to-favicon.ico' />

  8. FasyHack says:

    I had

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

    in my HEAD tag, and  using Wireshark I could see IE8 requesting the external JavaScript + CSS files only to immediately send a "TCP Reset" to kill the sockets, and then requesting the files again….

    changing the meta to :

    <meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>

    fixed the problem….

  9. CG Monroe says:

    The lack of standard BASE compliance (in a browser touted as being more compliant) is a show stopper bug for us.

    I just spent a LOT of time trying to figure out why our authentication system was returning a 404 error… and it’s because of this bug.

    Here’s a common situation that this bug breaks:

    A secure site with only files in the /login directory viewable by the public.

    When a secure file is asked for, the user is redirected to a login page via a "/servlet/login" request.  

    The login page has a BASE tag to get all the CSS, images, etc from the /login directory.  All of which have no security attached.

    However, the IE8 (let me tell you where to go) optimistic code, uses the "/servlet/login" to create a secured non-existent URL, e.g. /servlet/login/login.css.

    AND IT MAKES THIS REQUEST BEFORE the user entered URL is requested.

    So, when the user asks for /foo.html, IE8 actually asks the system for /servlet/login/login.css first.  The security system does what it’s supposed to… says, I need you to authenticate before I can confirm or deny this file’s existence.

    The user logs in, thinking they are going to /foo.html, but the authentication system has been given a non-existing URL first.

    The user sees a 404 file not found error, and our support lines light up because they have been told to get /foo.html (which they can if they request it after authenticating).

    Sorry to sound bitter but this has caused us a lot of grief, especially since IE8 is a highly recommended update.  

    IMHO, this would be marked as a SHOWSTOPPER or CRITICAL bug in anyone’s issue system.  PLEASE fix it MS.

    Before you say "just fix the login page…"  we use this in 80+ sites used by our clients (mostly fortune 500 companies).  Time is money… and IE8 is costing us by not being compliant with the standards.

  10. @CGMonroe: The BASE issue has nothing in common with standards compliance in general. As stated, this bug is a regression, pure and simple.

    I’m not sure I understand the rest of your message. IE makes an invalid speculative download request, which fails. Then, later, it makes a non-speculative download request for the proper URL.

    It’s not at all clear why this is a "show stopper" bug for you. Does your 404 response delete the user’s cookies or something like that?

    If you have a public URL, I’d be happy to take a look.

  11. @CG: Comments with URLs in them are often blocked as spam.

    However, before you resend, please understand two factors:

    1> This is a bug. A plain old boring bug. Bug. You need not bother quoting the HTML spec to show that it’s a bug. Everyone agrees that it’s a bug.

    2> You seem to assume that the user will see or be redirected to an error page due to this bug. That’s not the case, as the result from the failed lookahead download is simply discarded because the parser uses the proper URL.

  12. CG Monroe says:

    Something in my long reply is causing the comment system to choke.. probably as designed.

    Anyway, here a URL to what I was trying to post here:

    http://dl.getdropbox.com/u/1701672/IE8-Base-bug.txt

  13. As noted, please see my comment above.

  14. CG Monroe says:

    Sigh… but I have documented and personally seen that in the case of a secure site, this bug WILL cause the display of a 404 to users… because of this bug.  But, you’ve made up your mind.. so I’ll move on.

  15. @CG: I’m more than happy to look at any repros, but if you see a redirection to a 404 page, it’s not related to this bug.

    http://www.fiddlercap.com shows how to gather a network capture for me to look at, if it’s not a public site.

  16. CG Monroe says:

    In thinking on how to collect info on this bug, I realized that this problem can be reproduced using the standard J2EE web app security constraints.

    Here’s a demo site with downloadable code (since none of it’s proprietary).

    http://www.skylarking.org:8180/ie8-bug

    Note the port number and that since it’s running on my personal server via RR, I may pull the plug on this in a few weeks or so.  

  17. @CG: I’m not sure I understand your repro.  Here’s the HTTP Traffic:

    ——–

    GET http://www.skylarking.org:8180/ie8-bug/private/

    200 OK

    GET http://www.skylarking.org:8180/ie8-bug/login/login.js

    304 Not Modified

    [**** INCORRECT LOOKAHEAD DOWNLOAD HERE ****]

    GET http://www.skylarking.org:8180/ie8-bug/private/login.css

    200 OK

    [*** CORRECT URL RETRIEVAL ***]

    GET http://www.skylarking.org:8180/ie8-bug/login/login.css

    304 Not Modified

    [*** YOUR CODE REDIRECTS TO AN ERROR PAGE ***]

    POST http://www.skylarking.org:8180/ie8-bug/login/j_security_check

    302 Moved Temporarily to http://www.skylarking.org:8180/ie8-bug/private/login.css

    GET http://www.skylarking.org:8180/ie8-bug/private/login.css

    404 Not Found

    ——–

    As you can see, the "j_security_check" page redirects, using a HTTP header, to an invalid URL. Hence, it is your server that is navigating the browser to the incorrect target URL.

    Why it does that, I cannot tell for sure. My assumption would be that there’s a session variable on the server that keeps track of the last "protected URL" requested and navigates the user to that page after login, but such a feature is under the control of the server, not the client. Such an architecture is not common, and as I noted above: "If your server is configured to respond in some unusual way (e.g. logging the user out) upon request of a non-existent URL, the impact on your user-experience may be more severe." From the looks of it, this applies to your design.

  18. CG Monroe says:

    Yes, the session tracking the last protected URL is the behaviour that is happening.  You get the same behaviour in other browsers if you put a link statement to a protected css file on the login page.

    But some important points here are:

    First, the authentication method is not MY code… this example is based on Tomcat’s implementation of the Java Servlet API’s security definitions. Most likely, the same error will occur on any standard Java webapp platform, e.g. JBoss, Websphere, and the like.

    The use of a BASE tag for "view" elements like the login.jsp example is not that uncommon.

    The HTML generated does not refer in any way to the invalid login.css URL… It’s written correctly to all standards. The tracking of the last called protected URL is not the bug, it is IE’s disregard of the Base tag that is causing the problem.

    Finally, you asked a while back, why should this be considered a show stopper or critical bug… hopefully, this discussion has shown that this bug is not just a "nuisance" bug but it can cause problems for a wide set of web applications that require security.  I would hope that this discussion finds its way into the MS bug priority setting and that this problem will be fixed in an update cycle.

    Oh, should have said this sooner.. Thank you for finding the underlying IE bug in the first place!

  19. @CG: Thanks for the clarification.

    If the architecture used on your site was widespread, it’s unlikely that it would have taken this long to discover; it would have likely been discovered in one of the beta cycles.

    As it stands, it’s still a significant bug, both for the corner cases where there is an end-user impact, but also for the performance implications. Lookahead downloading is intended to make the browser faster, not slower. 🙂

  20. Harry Mantheakis says:

    My beautiful, long-in-the-making, highly admired, highly sophisticated, super secure business application has suddenly been rendered unusable by the ‘base tag’ bug.

    At least there is Firefox…

    Is there going to be a fix (?)

  21. OliD says:

    CG: before the days of sticking everything in session variables, typically you redirected people to /login?ref=/secureUrl.

    This is also causing us issues where it’s causing an extra 20 requests ( for all the .js files ) to our app servers for each page request.

  22. hofstee says:

    The Missing 4k Bug:

    The article states: "By declaring the CHARSET of the page using the HTTP Content-Type header rather than specifying it within the page, you can remove one cause of parser restarts."

    Eric in an email:

    "Unfortunately, another known cause of parser restarts is use of XML namespaces, which your site appears to use." So if you use XHTML the 4K issue can occur!

  23. David says:

    The BASE-tag bug is causing our customers much grief as well. Our system is based on the BASE tag as our pages are rendered by a servlet which is located in another directory than the css- and js-files that we refer to in the head-tag on all pages.

    The third-party servletexec that we use is logging every request for a non-existing file as an invalid call for a non-existing class. The user is not affected by this, but our server is being bogged down by the enormous amount of logging that we can not disable. Introducing extra BASE-tags in our pages is not an alternative, and non-standard at that.

    A "smart" downloader is only smart if the number of requests sent to the server is the same or reduced, but this generates a lot more requests than necessary.

  24. <<A "smart" downloader is only smart…>>

    David, let me reiterate because clearly you missed it:

    This is a bug. A plain old boring bug. Bug. Bug. Bug bug bug.

  25. David says:

    No, I didn’t miss the "This is a bug"-part…

    So, when are we going to see a fix for this, not only boring bug, but a bug killing business for a lot of people?

  26. Bob says:

    So much drama here.

    Harry/David: If one bug in one browser "kills" your business, I think your business suffers from some more fundamental problems.

    It sounds like you already know how to workaround this problem, and refuse to do so.

  27. David says:

    @Bob – Why should we not "refuse" to redesign our solution and deploy new code to our customers that doesn’t follow standard in order to counter a problem in a faulty browser?

  28. Bob says:

    David: You’re missing the point.

    There are two possibilities:

    1> This bug doesn’t "kill" your business, and you were just being sensational/dramatic.

    2> This bug does somehow "kill" your business. In this case allowing your business to die because you’re married to "standards" ("to death do you part") is an option, although a rather unusual choice.

  29. CG Monroe says:

    Drama or not… Possible to work around the bug or not…

    The underlying issue is still if or when MS will fix this. As with any bug, the question of priority comes up and that is based on a lot of factors including impact on your users.

    There is also a question of commitment to your own company/product’s stated goals.  There was much fanfare with IE about it being more standards compliant and developers not needing to do all the tricks they needed to do to get things to "behave" the same across different browsers.

    There always are fixes to browsers misbehaving.. but it has been illustrated here that IE8 has yet another quirk that developers have to either know about or find the hard way when code written to standards doesn’t work.

    I hope the people who rank and prioritize bugs are not being blinded by the drama, but are thinking about MS’s stated commitment to have IE be standards compliant (so we all start complaining about FF/Chrome compliance bugs 🙂 ) and the fact that it’s been shown to be more than some extra 404 entries in logs.

  30. Dan says:

    The ISSUE is how much time you waste in trying to determine if the bug is in the browser, or your software, and then develop a fix.

    IE yet again proves to be a browser with some evil bad gotchas, which waste valuable developer time.

    IE 8 is just another release which proves the old adage "Let’s make the site work for standards compliant browsers, and then we will fix it for IE"

    Too bad you can’t invoice MS for your wasted hours.

    I’m having to track down a different IE-8 related bug… 😛

  31. Tim McGookey says:

    Eric-

    is there a more exhaustive list of tags that can cause the lookahead downloader to reset?  Our web-app doesn’t use the BASE tag, and rarely uses the meta tag to set the charset, which we’re already specifying in the HTTP headers.  I’d love to reduce the amount of spam our error tracking software generates, thanks to the 404s, and every little bit helps.

  32. EricLaw [MSFT] says:

    @Tim: Unfortunately, as @hofstee mentioned, I believe that XML Namespace declarations (commonly used in XHTML) also trigger the restart logic.

  33. Tim McGookey says:

    @EricLaw

    Are the base tags, the meta tags, and the XML declarations all that we know so far?  What I’d love to be able to do is go to my developers with a list of problematic tags and say "fix these, and the e-mails will stop."

  34. EricLaw [MSFT] says:

    @Tim: Unfortunately, I wouldn’t feel confident in suggesting that XML Namespaces and META tags are the only cause of restarts, because I know very little about the overall parsing architecture.

    The BASE tag is obviously the biggest cause of incorrect requests, because the incorrect speculative requests due to BASE are not at all related to timing.

    While unfortunately I’m not able to make any statements or speculations about IE code fixes (either availability or timeframe) I can say that this is an issue that we’re getting a significant amount of customer escalations about because the workarounds are unappealing.

  35. Tim McGookey says:

    @EricLaw:

    I appreciate everything you’re doing for this problem.  Thanks for your help.

  36. Randy Syring says:

    And this bug will be fixed **when**?

    Thanks.

  37. EricLaw [MSFT] says:

    Sorry Randy, as mentioned previously:

    Unfortunately I’m not able to make any statements or speculations about IE code fixes (either availability or timeframe). I can say that this is an issue that we’re getting a significant amount of customer escalations about because the workarounds are unappealing.

  38. Randy Syring says:

    Eric,

    Sorry, I didn’t catch that.  Thanks for responding.

  39. Martin K says:

    Eric:

    I appreciate you taking the time to investigate this issue and help the community.  I saw a post you made several months ago like so:

    "If you can send me the HTML (or a network capture: http://www.fiddler2.com) of the affected page, I’ll have a look to see if there’s any other cause for the parser restart. Email me at microsoft.com, username ericlaw"

    Here is my issue, we are definitely getting the 4096 K bug.  Our site is ASP.net 3.5 and we get the invalid viewstate error on DecryptStringIV, etc.  I can tell you for a fact this ALWAYS seems to focus around ScriptResource.axd and WebResource.axd.  We have a standard master page, css, etc.

    The issue I’m having is it’s intermittent, occurs on various pages, so I can’t send you a single HTML snapshot of the problem, since I also personally can’t get it to fail.

    Can you provide any insight, we do have a development server with a public ip where the full application is running.  Should I send this to you via email? Can you help me? Given the facts I have presented what is the best way for us to combat this error.

    Thanks

  40. @Martin: Does your page use a META CHARSET tag that specifies a character set?  Do you specify any namespaces?

    In terms of workarounds, I can think of several unappealing ones (e.g. use a comment at the top of the page that pushes all of the relevant script tags out of the 4096th byte).

  41. eeyore145 says:

    Eric:

    Thanks for your reply.  We did a very simplistic/clean overall design for the web site.  We basically have a base aspx page which all other pages derive from.  We have a single Master Page that is used for all pages, etc.  We have all logic encapsulated within User Controls, ascx, etc.

    Regarding your question.  In our Master Page this is defined:

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"

    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

    <html xmlns="http://www.w3.org/1999/xhtml" >

    <!--start content-->

    <head runat="server">

    Does this answer your question? It’s fairly vanilla; we use standard CSS and JQuery as well, if that matters.

    So is there anything we can do or do you see anything in the Master Page top declaration we could or should change that won’t affect anything else?

    What would be the least unappealing workaround?  This is actually a part of a big eCommerce site and we use ELMAH and IIS logging so we are flooded with this error for IE 8 users.

    Thanks

    Sorry, eeyore145, but I’m pretty sure the XMLNS declaration ends up triggering a parser restart as well.

    In terms of "least unappealing workarounds", I’m afraid I no longer have any.  I took a look at sticking a huge buffer comment just inside the HEAD tag to push the first script URL outside of the 4096th byte, but it looks (not surprisingly) like this impacts the 8192nd byte as well.

  43. John Martelli says:

    Thank you Eric for clarifying the issue for us. As stated by many others I am also requesting that a fix be made available as soon as possible. Most high volume sites log errors and we are being flooded with these, making our logging a pain to analyze. At the very least IE should include the "lookahead downloader" in the user agent string.

  44. Joe Rattz says:

    I am seeing the missing 4k bug about every two minutes on one of my sites.  Sure would like to see a fix.

    Thanks.

  45. MrMotts says:

    Would putting an IE conditional that re-states the base tag before each resource be considered standards-compliant or problematic?

    <!--[if IE 8]><base href="blah.com" /><![endif]-->

    <script type="text/javascript" language="javascript" src="foo/bar.js"></script>

    <!--[if IE 8]><base href="blah.com" /><![endif]-->

    <script type="text/javascript" language="javascript" src="foo/bar2.js"></script>

    <!--[if IE 8]><base href="blah.com" /><![endif]-->

    <link rel="stylesheet" href="foo/bar.css" type="text/css" />

  46. @MrMotts: That’s a pretty unappealing workaround, but if you were to do it, you’d definitely want to use a proper URL as the HREF.

  47. Bill says:

    We’re seeing this very frequently on three of our sites.  I’ve tried all the fixes that I can find suggestions for online, but nothing works.  Any update on where MS is with this would be huge.

  48. Ben S says:

    I’m getting this too. Come on MS, please give us a fix. I’m potentially missing out on real error notifications because of all the spam generated by this problem.

  49. Bill Rowell says:

    I just found an article that mentioned using this in the head of the HTML document to stop the IE8 issue:

    <meta http-equiv="X-UA-Compatible" content="IE=7" />

    I put that in for my 3 sites and it seems to be having a positive effect for now.  Is this a valid work around or should I stay clear of this?  Thanks!

  50. @Bill: Generally speaking, I would expect that to *worsen* the problem rather than making it better. Using the X-UA-Compatible flag likely causes a parser restart.

  51. Bill Rowell says:

    Interesting.  I’ll leave it be for now and see how often the error comes through.  I think I’ve only seen 1 since I put it in, but that could just be coincidence.

    Are there any other work arounds I could try?  I’ve tried the Content-Type fix among others.  None of them seemed to do the trick.

    Thanks Eric!

    -Bill

  52. JG says:

    Thought I’d chime in as well.  We have 2 sites all using ms ajax.  Not high volume.

    We are seeing the 4k bug pretty frequently.  Like everybody else, have tried the workarounds suggested.

    No joy.  

    Out of desperation I am giving the last suggestion above (<meta http-equiv="X-UA-Compatible" content="IE=7" />) a blast.

    No data to report yet but we’ll see…

    Come on MS this has been going on for too long.

  53. tc says:

    Has anyone heard when or if a fix will be provided?

    thanks

  54. The BASE bug was fixed in today’s Cumulative IE8 Update. http://blogs.msdn.com/ieinternals/archive/2009/10/13/Using-Meddler-to-Simulate-HTTP.aspx

    @tc, to reiterate on your question about the 4k issue: "Unfortunately I’m not able to make any statements or speculations about IE code fixes (either availability or timeframe). I can say that this is an issue that we’re getting a significant amount of customer escalations about because the workarounds are unappealing."

  55. Jenn says:

    Our users are seeing the 4K bug – it’s causing 404 errors – that I cannot reproduce.  Please fix!  My client doesn’t want to hear that I can’t fix it and we’re losing business.

  56. EricLaw [MSFT] says:

    @Jenn: Unless your client is actively monitoring their network traffic, it’s unlikely that they are able to observe this problem.

    For instance, if they’re seeing 404 Error Pages, they’re not hitting this issue– your problem is elsewhere.

  57. JG says:

    We’ve been running our sites for a full day without any errors using this fix suggested by Bill Rowell.

    <meta http-equiv="X-UA-Compatible" content="IE=7" />

    Needs to be put before any linked files in the head.

    I guess that for some people this may have side-effects but it is working a treat for us.

  58. Paul says:

    Our site has always had this meta tag (first tag under the head tag):

    <meta http-equiv="X-UA-Compatible" content="IE=7" />

    The problem still continues.

  59. JG says:

    RE: <meta http-equiv="X-UA-Compatible" content="IE=7" />

    Okay a few more days have passed and we have had a few errors but it seems to have been vastly reduced.

    So this is not a fix but somehow mitigates the problem slightly.

  60. Thomas says:

    @JG: What you are seeing is probably some of your javascript references being pushed out of the problem area by the IE7 meta tag. This could be seen as a workaround but not a long term solution since the content in your pages changes anyway with future updates of your site.

    You can achieve the same result with an HTML-comment of the same length as your IE7 meta tag.

  61. AC says:

    We have had some code in the past that was moving ViewState to the end of the page form to help SEO.

    I’ve been applying the same to the WebResource and ScriptResource script tags and can move them to the end of the form to get them away from the 4096th byte, but because they are now loaded after, say, some Ajax on the page, the Ajax doesn’t work.

    Trying a few things to get this working but for anyone interested this is what we’re trying to do…

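                   // Find the first WebResource.axd script tag in the rendered HTML.
                   int startPoint2 = html.IndexOf("<script src=\"/WebResource.axd?");
                   int closeTag2 = startPoint2 >= 0 ? html.IndexOf("</script>", startPoint2) : -1;
                   if (closeTag2 >= 0)
                   {
                       int endPoint2 = closeTag2 + "</script>".Length;
                       string webresourceInput = html.Substring(startPoint2, endPoint2 - startPoint2);
                       // Only move the tag when </form> comes after it, so the index arithmetic stays valid.
                       int formEndStart2 = html.IndexOf("</form>", endPoint2);
                       if (formEndStart2 >= 0)
                       {
                           html = html.Remove(startPoint2, endPoint2 - startPoint2);
                           html = html.Insert(formEndStart2 - (endPoint2 - startPoint2), webresourceInput);
                       }
                   }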

  62. meta says:

    For the "Missing 4K bug", I used Response.AddHeader("Content-Type", "text/html; charset=utf-8") in the Page_Load event. But when I view source, I don’t see anything about Content-Type.

    Was I doing it right, or that is just expected?

    Thanks.

  63. Thomas says:

    @meta: You’re right, it’s the expected behavior. The Content-Type directive is added to the HTTP header which can be viewed with excellent tools such as Fiddler (written by the writer of this blog), Firebug or Live HTTP headers among others.

  64. John Martelli says:

    This does not work for us: we are still getting the same number of errors per day

  65. John Martelli says:

    <meta http-equiv="X-UA-Compatible" content="IE=7" /> does not work for us: we are still getting the same number of errors per day

  66. jeff says:

    Is there an easy way via Fiddler/whatever to figure out exactly where the 4096 byte is in the response?

    Thanks!

  67. EricLaw [MSFT] says:

    @Jeff: Sure thing.

    1> Update to the latest version of Fiddler.

    http://www.fiddler2.com/dl/fiddler2betasetup.exe

    2> For any given response, go to the HexView tab and scroll down until the status bar indicates "4096 (0x1000) of body"

  68. jeff says:

    Might be a stupid question, but is there an easy way to translate where you are in the hex view to where that corresponds to in the Raw/Text/whatever view?

  69. EricLaw [MSFT] says:

    @Jeff: Not really. The problem is that the other views interpret CRLF (end of line) as just that, while in the hex view, they’re rendered as encoded bytes.
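
    For readers who want to see the text around each 4096-byte mark without counting in the hex view, a short script can do the lookup. This is only a sketch under assumptions: `show_boundaries` and its parameters are names invented for illustration, and it expects the raw, decompressed response body bytes (headers excluded) saved from whatever tool you use.

```python
def show_boundaries(body: bytes, step: int = 4096, context: int = 40):
    """Return (offset, snippet) pairs for every multiple of `step` inside `body`.

    The snippet shows `context` bytes on each side of the boundary, with a
    "|" marking the exact position of the boundary byte offset.
    """
    results = []
    for offset in range(step, len(body), step):
        start = max(0, offset - context)
        # errors="replace" keeps the sketch safe if a slice cuts a multi-byte character.
        before = body[start:offset].decode("utf-8", errors="replace")
        after = body[offset:offset + context].decode("utf-8", errors="replace")
        results.append((offset, before + "|" + after))
    return results


if __name__ == "__main__":
    # Hypothetical body: a script tag that straddles byte 4096.
    sample = b"a" * 4090 + b'<script src="/x.js"></script>' + b"b" * 100
    for offset, snippet in show_boundaries(sample):
        print(offset, snippet)
```

    Because it works on bytes, CRLF pairs count as two bytes each, which matches how the hex view counts them.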

  70. Gerard van de Ven says:

    We are constantly hitting this problem. We don’t use the “base” tag and also don’t have the META tag in our page and fiddler shows we have “Content-Type: text/html; charset=utf-8” so we must be hitting one of the other causes of this bug.

    Thanks for looking at it though. Hopefully you find a solution. In the meantime I am putting some special code into our exception handling mechanism so that this bug is no longer reported, as it is spamming my mailbox with exception emails.

  71. Chris says:

    Hi Eric,

    We are seeing this problem, but ONLY in our production environment.  I am unable to reproduce in any other environment.

    IIS configuration is identical on production and staging/dev.  The only difference is that Production is load balanced.  I’ve verified that the machineKey information is identical on both machines and session variables are stored in a Session database on SQL.

    Any thoughts?

  72. Jim T says:

    We are in the same situation as Chris.  Our production servers are load balanced.  We are seeing a significant number of 404 errors due to the bug.

    Just like others here we cannot reproduce the bug on our developer servers.  

  73. Chris says:

    I also noticed that when I review my IIS logs, I find entries that don’t necessarily match the date/time stamp of the error in the event log.

    When I am able to find the errors, I see that these requests return a 200 status????

  74. Chris says:

    Jim T, are you using Google Campaign tracking by any chance?  I’m seeing that most of my requests that contain this information are throwing the exception in the logs, but normal generic traffic to the site is not. I’m hoping this explains why DEV/STAGE and Prod are showing two different behaviours

  75. EricLaw [MSFT] says:

    @Chris/JimT: As our developers investigated the problem, we found that there are a number of timing scenarios which can result in the buggy behavior seen when the parser restarts.

    Unfortunately, this means that in practice, there isn’t any viable workaround for web content that would resolve all forms of the problem.

  76. Mark D says:

    @Chris

    We are in the same situation.

    Load balanced servers in production, same machine keys across the web farm, and we see this consistently in production, but we HAVE also seen it in our QA/Staging environments.

    Very few times though compared to production.

    We have a very robust logging system in our application that captures all of these events, and they are registered as many different types of errors:

    Invalid Viewstate

    Invalid character in a Base-64 string

    Invalid length for a Base-64 char array

    Length of the data to decrypt is invalid

    This is an invalid script resource request

    This is an invalid webresource request

    I actually reproduced the issue by chance myself, attached the debugger that IE 8 provides, and found that a validator that’s required on a page was erroring out since the script.axd didn’t load properly.

    I wasn’t able to confirm that it eventually ‘did’ load, and that the error I saw was during the pre-parsing, but nevertheless, the error WAS customer facing and this issue is causing a serious headache for us.

    Eric,

    Any update on the status of a possible fix? Or do you have any suggestions for us?

    Could we insert a block "IF IE 8" that would push content past the span of bytes that cause the issue?

  77. EricLaw [MSFT] says:

    @Mark, as noted in the comments above, unfortunately I’m not able to make any statements or speculations about IE code fixes (either availability or timeframe). I can say that this is an issue that we’re getting a significant amount of customer escalations about because the workarounds are unappealing.

    In terms of "least unappealing workarounds"– I’m afraid I no longer have any.  I took a look at sticking a huge buffer comment just inside the HEAD tag to push the first script URL outside of the 4096th byte, but it looks (not surprisingly) like this impacts the 8192nd byte as well, and so on.

  78. Aaron says:

    We never really bothered with this since we’re using a powerful logger that is able to filter out these kinds of errors.

    Recently however we’ve been receiving reports like the one below:

    Requested URL: /TemplateModule/Web/Scripts/TemplateBaseContr.aspx?ID=6036&Action=Delete
    URL Referrer: http://XXXX/COMModule/ComViewArticle.aspx?MODULE=CONTENT&ID=6036

    It’s the same type of error but the requested URL is disturbing! Doesn’t this mean that the Lookahead downloader could form a valid request to a page that deletes stuff from the database without us knowing?

  79. @Aaron: Sure, it seems fairly unlikely, but it’s technically possible.

    A "valid" URL could be incorrectly retrieved if the pre-parser stream broke the first script URL at exactly the right byte and then skipped to continue the URL from exactly the right byte later in the page.
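
    To picture how a spliced URL could still look valid, here is a toy model, not IE's actual code. It assumes the lookahead scanner loses a 4096-byte span of the stream, as the reports in this thread suggest; `spliced_view`, `first_script_url`, and the sample page are all invented for illustration.

```python
import re

# Matches the src attribute of a script tag (double-quoted values only).
SCRIPT_SRC = re.compile(r'<script[^>]*\ssrc="([^"]*)"', re.IGNORECASE)


def spliced_view(html: str, break_at: int, skip: int = 4096) -> str:
    """Return the page as a scanner would see it if `skip` characters
    starting at `break_at` were dropped (assumed failure mode)."""
    return html[:break_at] + html[break_at + skip:]


def first_script_url(html: str):
    """Return the first script src found in `html`, or None."""
    match = SCRIPT_SRC.search(html)
    return match.group(1) if match else None


# Hypothetical page: the script URL breaks right after "/Scripts/", and the
# text exactly 4096 characters later happens to continue it plausibly.
head = '<script src="/Scripts/'
rest = 'site.js"></script>'
tail = 'Other.aspx?ID=1">link</a>'
page = head + rest + "x" * (4096 - len(rest)) + tail

if __name__ == "__main__":
    print(first_script_url(page))
    print(first_script_url(spliced_view(page, break_at=len(head))))
```

    In this contrived layout the intact page yields `/Scripts/site.js`, while the spliced view yields `/Scripts/Other.aspx?ID=1`. Both alignments have to line up exactly, which is why the case is unlikely but possible.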

  80. Michal Zielinski says:

    Hello!

    I’ve been tracing this bug from the beginning.

    It cost me plenty of time to get information about it:

    building test applications to try to recreate the bug, or implementing solutions I read about in forums and posts.

    First of all I want to thank Eric for his first helpful text concerning this issue. SERIOUSLY

    Now I only have to send the linked text to our customers (some of them get error reports by mail) and they believe me that site users don’t experience any problems and that it’s not my responsibility, or within my ability, to fix it.

    Thanks Eric…

    Since I spent so much time investigating, it’s funny that this error now saves investigation time.

    Some portals generated hundreds of error mails a day.

    Since nobody reads them anymore (neither our customers nor me), less time is spent investigating bugs than before this bug appeared (real bug reports get lost in the huge amount of "placebo bugs"). Now we just react to errors reported by website users by phone or mail.

    That bug information is typically more precise.

    So one more time, Eric, I want to THANK YOU for having the courage to report information on this issue from an authoritative source.

    But I’m wondering how this bug could remain unsolved for about a year.

    I’m a developer and love MS for the .NET Framework and especially ASP.NET…

    I work with many MS technologies/products, like MS-SQL, VS.NET or the Office suite.

    In my opinion the mentioned apps are more complex than a browser. (That’s just my thoughts)

    The funny thing is that, since I started developing in 2000, no bug / problem in an MS product has hit me this often without even having a clear workaround.

    What do these IE8 dev guys do the whole day? Working on IE9 only…

    Or is there just one poor guy spending 90% of his time responding to emails who can only use 10% for developing..

    A solution seems so simple to me…

    e.g.

    If the "IE8’s Lookahead Downloader" feature malfunctions, it could be disabled with the next IE8 update

    OR

    In my understanding this feature starts preloading JS data from URLs before document parsing is done.

    Maybe a simple check for a preceding and following " char (quotation and similar escape chars) would fix it for well-formed HTML.

    OR

    at least a simple

    if not (LookAheadUrl contains "webresource.axd" or "scriptresource.axd") then

    call DoLookAhead(LookAheadUrl)

    end if

    would help in my case (sounds like I’m an egoist).

    Maybe these solutions are stupid and not complex enough, but I came up with them at 4:00 AM while writing this comment, after I found this blog during a repeated search for a solution to this issue.

    Along the way I always pick up other interesting information about unrelated things, so I didn’t waste my time..

    Thank you Eric and greetings from cologne, germany

  81. EricLaw [MSFT] says:

    >In my opinion the mentioned apps are more complex than a browser. (That’s just my thoughts)

    It’s really, really hard to measure complexity, but I wouldn’t assume that the other projects you mention are more complicated than the browser. If nothing else, all of the other types of software you mention can be implemented and run inside a browser. 🙂

  82. Dean says:

    We have some users experiencing a lockup in IE8 on a certain page, and that page also tends to be the one that shows up in our logs as having this issue.  Have you heard of any correlation?  I’m trying to find the cause of the lockups, and this is about the only thing I have left that isn’t eliminated.

  83. Brett J says:

    Eric, can we get an update confirmation that this bug is still not fixed?

    I’m surprised no one has added "Don’t let your users use IE8 because it’s going to generate lots of wasted requests" to the lists of how to optimize your web applications.

    My app gets about 60 of these errors a day (I receive an email about each and every one unfortunately).

    Had I known 6 months ago that this bug wasn’t going to be resolved (ever?!) perhaps I would have added a filter to my logging code to throw away errors with URLs of ScriptResource and WebResource at least.
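
    A logging filter of the kind described above might look roughly like this. It is a sketch, not a recommendation: `is_probable_lookahead_noise` and `NOISY_HANDLERS` are invented names, and the check assumes that IE8 identifies itself with a `Trident/4.0` token in the User-Agent, including in compatibility view where it reports MSIE 7.0.

```python
from urllib.parse import urlsplit

# ASP.NET resource handlers whose mangled requests flood the error log.
NOISY_HANDLERS = ("/webresource.axd", "/scriptresource.axd")


def is_probable_lookahead_noise(url: str, user_agent: str) -> bool:
    """Return True when a failed request is likely a bad speculative request:
    it came from an IE8 engine and targets one of the .axd resource handlers."""
    path = urlsplit(url).path.lower()
    from_ie8 = "trident/4.0" in user_agent.lower()
    return from_ie8 and path.endswith(NOISY_HANDLERS)


if __name__ == "__main__":
    ie8 = "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0)"
    print(is_probable_lookahead_noise("/ScriptResource.axd?d=abc", ie8))
```

    The trade-off is that a filter like this also discards genuine errors on those handlers from IE8 clients, so it may be safer to count the suppressed entries than to drop them silently.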

  84. EricLaw [MSFT] says:

    BrettJ: Correct– there’s no publicly available fix for the 4k bug at this point in time.

    (FWIW, even in spite of this issue, IE8’s network performance will soundly trump that of IE7 in virtually every non-contrived case).

  85. EricLaw [MSFT] says:

    @Dean: There’s no known way for this issue to cause a hang of any sort. Does the page in question have any ActiveX content (e.g. Flash) on it?

  86. Dean says:

    No, no flash.  Our client having problems with IE freezing found that the problem doesn’t exist when they create a new user profile.  Test users having problems have wiped their profiles, created new ones, and are using them successfully.  However, this isn’t an acceptable solution for all their users.  We’re continuing to look for a cause.

  87. Michael says:

    Using fiddler, I can see the lookahead base tag bug.  I was thrilled to see a fix for this.  However, after trying to download Windows6.1-KB974455-x86.msu and install, I get…

    "The update is not applicable to your computer."

    I’m running Win7 Professional.  Please help..

    Mike

  88. @Michael: You may have this update installed already– if you’re up-to-date on patches for Windows, then it’s already installed.

    If not, the problem is probably that you’re trying to install the x86 package on a 64-bit computer. You should try to install the update via WindowsUpdate, and if you must install directly, make sure you install the proper bitness.

  89. Michael says:

    Thanks for the quick reply!  

    I have checked the patches on my machine using appwiz.cpl and it doesn’t show that I have 974455 installed.  

    I have spoken with my domain admin and WSUS 3.x tells him I don’t require this update.  

    Using windows updates, it does not offer 974455 and I don’t have any critical updates waiting to be installed.

    I have checked system properties and it shows that my system type is "32-bit operating system"

    The real issue is I’m still seeing many 404’s from what I assume is the Lookahead Downloader not using my base tag.

    Thanks again for the help…

    Mike

  90. @Michael: Sorry, it looks like the IE8+Win7 version of this fix hasn’t yet been released. The prior fix went out for XP/Vista. To answer the obvious question, sorry, no, I’m not allowed to make any statements about timelines for unreleased fixes, but I understand the urgency in getting this fixed.

  91. Michael says:

    @EricLaw:  Thanks for your help.  Is there anything I can do to help push the release?  Something on Connect that I and other users can vote up? Do you know whether, when it is released, it will be part of a critical fix so everyone will get it?

    Thanks

    Mike

  92. EricLaw [MSFT] says:

    Note: The BASE issue was fixed for Windows 7 users in today’s cumulative update, so the IE8 BASE bug is now resolved on all platforms.

  93. Joe S. says:

    @EricLaw: Unfortunately the bug seems only to be fixed for base tag hrefs that start with http:// – if the base tag points to a "file://" directory on the local computer, the "old" behaviour is still there. If I change the compatibility mode for that page to IE7 it works fine with a base tag that has a file:/// href but when switching to IE8 mode it does NOT work!

    When will this issue be fixed?

  94. EricLaw [MSFT] says:

    @Joe: No, you are seeing a different issue.

    Your FILE base tag is getting ignored because your FILE URI is invalid. Please .ZIP up a repro of the problem and email it to me (ericlaw@microsoft).

  95. Michael says:

    @EricLaw:  Not sure if you had anything to do with getting this pushed faster…  But it looks like it is fixed.  Thanks for your help.

    Mike

  96. Joe S. says:

    @EricLaw: I sent you the details by e-mail. If you place a base tag like this in the head of a document:

    <base href="file://C:/Users/User Name/Desktop/" />

    An image that is referenced like this <img src="test.png" /> will not be found. IE8 (if set to IE8 Standards mode) will look for the image file in the same path as the HTML file. In other words the BASE tag is ignored. Actually it should find the image on the Desktop in this case (same problem with any other path!). Also using file:///C|/Users/… does not help.

    Any help would be appreciated!

  97. Sam says:

    We also just started seeing these ViewState errors in our ASP.NET application after we switched to new servers. Our codebase is the same; only the servers are new. We have not been able to figure out whether the new servers have some new patches or the old servers had some patches that prevented this issue. But the same ASP.NET application that used to work on the old servers is throwing these ViewState 4K-byte errors on the new servers.

    Can anyone please shed some light on whether there is a server patch that would prevent this issue?

    Thanks.

  98. EricLaw [MSFT] says:

    @Sam: If your ViewState errors are caused by the 4k bug, then no, there’s no server patch that will prevent that client bug.

  99. Sam says:

    @Eric: This is what surprised me as well. We are all discussing this as a client-side issue, and it must be client side, but we started seeing these errors only when we switched to new servers. We are having this exact issue with the ScriptResource.axd file: an invalid "d" parameter causes an Invalid ViewState error, and it’s happening at the 4K byte position of the HTTP response. However, this does not happen for every single request. It’s just intermittent.

    So, I was expecting there should be some server level fix or configuration as we started seeing this issue after we moved to new server with same ASP.NET codebase.

    Only difference so far I have seen between our new servers and old are in machine.config settings.

    Our new servers were missing these machine.config settings:

    <system.net>

       <connectionManagement>

         <add address="*" maxconnection="48"/>

       </connectionManagement>

    </system.net>

    <processModel maxWorkerThreads="100" maxIoThreads="100" minWorkerThreads="50"/>

    We used to have the above configuration on our old servers. However, we haven’t applied these changes to the new servers as we are still in the process of investigating this issue.

    So, could the processModel thread numbers and the connectionManagement connection limit somehow cause this issue? Is it possible that low resource allocation on the server side could cause it?

  100. EricLaw [MSFT] says:

    @Sam: As discussed above, the issue is timing related. If the bytes are received by the client with different timings, you’ll potentially see different results. However, you don’t really have control over this from the server because you don’t control the full path to the client.

  101. Sam says:

    Thanks Eric.

    I know we don’t have control over timing and the client, but I just wanted to mention in this thread that we have had this issue only since we switched to new servers. So, I don’t think it has anything to do with the ASP.NET code. The same codebase that used to work on the old servers is not working on the new servers and throws this Invalid ViewState error.

  102. Just to reiterate for folks who aren’t reading the comment thread:

    Unfortunately I’m not able to make any statements or speculations about IE code fixes (either availability or timeframe). I can say that this is an issue that we’re getting a significant amount of customer escalations about because the workarounds are unappealing.

  103. @Cindy: The problem you describe doesn’t sound likely to be related to the lookahead issue in any way.

  104. user says:

    Since we switched our charset from iso-8859-1 to utf-8 (on an Apache server with Perl CGI), we have been flooded with damaged requests in the server’s error log.

    We analyzed the problem for several days. Nearly all requests come from IE 8, a few from IE 7. As I understand now, these are probably IE 8 in compatibility mode reporting user agent IE 7.

    The affected requests in our served pages are mainly for .js or .css files.

    They sit in the HTML head section, near the beginning of the page, after some meta tags (description, keywords etc).

    These resource links were ‘trashed’, i.e. overwritten with content from the HTML page which was placed exactly 4096 bytes further down in the page.

    We serve a few million of page requests per day, about 25% to IE8 clients.

    We had a few hundred such trashed requests.

    We have had some success in reducing the number of trashed requests by moving the requests closer to the beginning of the page, before the meta tags. There, they were hit less often by the ‘drop 4k bug’.

    Tonight I found this blog via google, and I removed the line

    meta http-equiv="Content-Type" content="text/html; charset=utf-8"

    from the html pages.

    We serve the correct charset info in the HTTP header anyway.

    Since I removed this line, about two hours ago, the trashed requests have stopped appearing in the logfile.

  105. figz says:

    Microsoft’s comment on Bug 467062 (https://connect.microsoft.com/IE/feedback/details/467062/bug-ie8-4k-dropped-invalid-viewstate-when-loading-scriptresource-axd-or-webresource-axd-asp-net)

    ————–

    Thank you for submitting your feedback on Internet Explorer 8. The information in this bug has been reviewed carefully by the team and will help inform future releases of Internet Explorer. We are now closing down the Internet Explorer 8 Feedback Program in preparation for soliciting feedback for our next version and as such, per the Microsoft Connect guidelines, all remaining bugs for IE8 will be resolved as “Postponed” for now. When the feedback program for the next release is ready, we will look at transferring this bug, along with any applicable status update.

    Thanks again for all your support and feedback.

    The IE Team

    ————–

    Does this mean they are starting IE9 and have stopped patching IE8?

    Thanks,

    Eric

  106. EricLaw [MSFT] says:

    @figz: No. While it’s true that IE9 development is well underway, Microsoft continues to issue updates for browsers for ~10 years after they are released.

  107. Paul says:

    Just to inform you that some of us (see the URL submitted with this post) find that this bug prevents us from using the Silverlight interfaces of SharePoint 2010 on XP SP3 client machines.

  108. I’ve updated this bug to reflect that both Lookahead Downloader issues are fixed.

    Note: There’s still an active bug on the BASE element failing to be respected when the containing page runs in Standards mode and the BASE URI specifies a HREF that uses the FILE:// protocol.

    This is totally unrelated to the preparser issue discussed in this post.

  109. Mike says:

    Any idea when we can expect a patch for the base href/file URI bug Eric ?

  110. Just to reiterate for folks who aren’t reading the comment thread above:

    Unfortunately I’m not able to make any statements or speculations about IE code fixes (either availability or timeframe).

  111. Luke Ahead says:

    Just one question… Which version of IE is this fixed as of?  What KB etc do we need to tell our clients to install?

  112. Luke Ahead says:

    Ah, finally found your other post regarding the IE8 cumulative update.

  113. EricLaw [MSFT] says:

    @Luke: The bug only existed in IE8, and was fixed in the 3/30 Cumulative Update. Of course, clients should be installing all Cumulative Updates (which are, well, cumulative), but the 3/30 update was KB980182 and until June, that’s the latest cumulative update.

  114. Kenza says:

    Thanks Eric for continually keeping us up to date with regards to the 4k issue. You are a star!

    I noticed the following link is no longer accessible. Has the URL been updated?

    "https://connect.microsoft.com/IE/feedback/details/467062/bug-ie8-4k-dropped-invalid-viewstate-when-loading-scriptresource-axd-or-webresource-axd-asp-net"

  115. EricLaw [MSFT] says:

    @Kenza: Nope, that URL still works fine for me. I’ve heard a few reports from folks that have had problems viewing reports on CONNECT.

    FWIW: There’s no useful data in the CONNECT bug that you won’t find here.

  116. Andreas says:

    I am using MSIE 9 and encountered exactly the same bug. The difference to the described scenario is only, that the whole document is inside an iframe. Any ideas why this isn't fixed?
