Loading XML Files from Behind a Proxy Server


I recently received a number of bug reports that RSS Bandit would report a proxy authentication error when fetching certain RSS feeds through a proxy server. Most feeds worked fine, but a particular set of feeds produced the following message:



The remote server returned an error: (407) Proxy Authentication Required.


Examples of sites that had problems include the feeds for Today on Java.net, Martin Fowler’s bliki, and Wired News. It dawned on me that the one thing all these feeds had in common was that they referenced a DTD. Although I was using an instance of the System.Net.IWebProxy interface in combination with an HttpWebRequest when fetching the RSS feed, I had not told the XmlValidatingReader used to process the feed to use the same proxy information when resolving DTDs.


This is where things got less intuitive. All XmlReaders have an XmlResolver property used to retrieve resources external to the file. However, the XmlResolver class does not provide a way to specify proxy information, only authentication information. To solve this problem I had to create a subclass of the XmlResolver class that uses the proxy connection when retrieving external resources. It seems I’m not the only person who has come across this problem; a solution was presented on the microsoft.public.dotnet.xml newsgroup a while ago in the thread entitled XmlValidatingReader, XmlResolver, Proxy Authentication, Credentials, Remote schema. That post shows how to create a custom XmlResolver that uses proxy information and how to use the class to prevent the errors I was seeing.
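
For readers who want a concrete picture, here is a minimal sketch of such a resolver. It is not the code from the newsgroup thread or from RSS Bandit. The names ProxyAwareXmlResolver and FeedLoader are hypothetical, and the sketch assumes it is enough to subclass XmlUrlResolver (the standard XmlResolver implementation) and route only http and https requests through the supplied IWebProxy, leaving everything else to the base class.

```csharp
using System;
using System.IO;
using System.Net;
using System.Xml;

// Hypothetical class name; an illustration only, not the RSS Bandit source.
public class ProxyAwareXmlResolver : XmlUrlResolver
{
    private readonly IWebProxy proxy;

    public ProxyAwareXmlResolver(IWebProxy proxy)
    {
        this.proxy = proxy;
    }

    // Called by the reader whenever it needs an external resource such as a DTD.
    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        if (absoluteUri == null)
            throw new ArgumentNullException("absoluteUri");

        bool isWeb = absoluteUri.Scheme == Uri.UriSchemeHttp
                  || absoluteUri.Scheme == Uri.UriSchemeHttps;

        // Local files, or no proxy configured: keep the default behaviour.
        if (!isWeb || proxy == null)
            return base.GetEntity(absoluteUri, role, ofObjectToReturn);

        // Fetch the resource through the same proxy the feed request uses.
        WebRequest request = WebRequest.Create(absoluteUri);
        request.Proxy = proxy;
        return request.GetResponse().GetResponseStream();
    }
}

// Hypothetical helper showing where the resolver is plugged in.
public static class FeedLoader
{
    public static XmlValidatingReader CreateReader(Stream feed, IWebProxy proxy)
    {
        XmlValidatingReader reader = new XmlValidatingReader(new XmlTextReader(feed));
        reader.XmlResolver = new ProxyAwareXmlResolver(proxy);
        return reader;
    }
}
```

The idea is to pass CreateReader the same IWebProxy instance (including any credentials set on it) that was assigned to the HttpWebRequest for the feed, so that the feed and its DTD are fetched through the same proxy.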


I checked in the fix to RSS Bandit this morning, so very soon a number of users of the most sophisticated news aggregator on the Windows platform will be very happy campers seeing this annoying bug fixed.  


Comments (7)

  1. Stephane Rodriguez says:

    Funny, that’s exactly why I think there is so much more that .NET run-times have to do before it becomes a reliable platform with built-in "executability".

    You can be confident that your software works fine, and indeed you can pass it to anyone around in your network company, and they all respond how great your app is. And then, you make a package and start distributing, that’s when troubles begin.

  2. Stephane,

    This is a problem with all software development. If I had tested the software exhaustively from behind a proxy connection I would have caught this problem.

    In most cases the application does work behind a proxy, since HttpWebRequest uses the proxy settings from Internet Explorer by default, but in the cases where they are incorrect (e.g. the user actually uses another browser, so the IE proxy info isn’t set) problems occur.

  3. Stephane Rodriguez says:

    "since HttpWebRequest uses the proxy settings from Internet Explorer by default"

    This design is highly questionable. Why not require developers to assign a proxy property before the xxxrequest instance can be used? This would not solve problems per se, but at least show people some direction.

    In addition, I don’t see many code snippets for this around, which is sad when you know this is mandatory.

    Again, software development so far has been about controlling what was going on, not making big bets.

  4. It’s always easy to be an armchair critic. Why should an object require that you set a property that is rarely used before the class can be utilized? What would end up happening is that people would often set that to some dummy value and it would become common for people to cut & paste that around.

    Anyway, the real problem is that XmlUrlResolver doesn’t have a way for you to set the proxy property. I talked to the PM who owns the class and we’ll see if there’s a way to fit this in for Whidbey.

  5. Stephane Rodriguez says:

    "Why should an object require that you set a property that is rarely used before the class can be utilized?"

    I think proxy settings are not rarely used. Quite the opposite, you won’t find a single professional internet-related software without versatile proxy options. There is a good reason for that: it’s impossible to have a program run otherwise. Write stuff at home, test at work: does not work. Write stuff at work, doesn’t work at home. How many times did this happen to you? Never? Ho.

    Also, stop thinking that everything revolves around Internet Explorer options. How does the UDP-based Real player work?

    "Anyway, the real problem is that XmlUrlResolver doesn’t have a way for you to set the proxy property. I talked to the PM who owns the class and we’ll see if there’s a way to fit this in for Whidbey."

    Thanks. This and a little code snippet in MSDN in the appropriate place, it’s all fine.