Setting up a Reverse Proxy using IIS, URL Rewrite and ARR

Today there was a question in the IIS.net Forums asking how to expose two different Internet sites from another site making them look like if they were subdirectories in the main site.

So for example the goal was to have a site: www.site.com expose a www.site.com/company1  and a www.site.com/company2 and have the content from “www.company1.com” served for the first one and “www.company2.com” served in the second one. Furthermore we would like to have the responses cached in the server for performance reasons. The following image shows a simple diagram of this:

Reverse Proxy Sample 

This sounds easy since its just about routing or proxying every single request to the correct servers, right? Wrong!!! If it only it was that easy. Turns out the most challenging thing is that in this case we are modifying the structure of the underlying URLs and the original layout in the servers which makes relative paths break and of course images, Stylesheets (css), javascripts and other resources are not shown correctly.

To try to clarify this, imagine that a user requests using his browser the page at https://www.site.com/company1/default.aspx, and so based on the specification above the request is proxied/routed to https://www.company1.com/default.aspx on the server-side. So far so good, however, imagine that the markup returned by this HTML turns out to have an image tag like “ <img src=/some-image.png /> ”, well the problem is that now the browser will resolve that relative path using the base path on the original request he made which was https://www.site.com/company1/default.aspx resulting in a request for the image at https://www.site.com/some-image.png instead of the right “company1” folder that would be https://www.site.com/company1/some-image.png .

Do you see it? Basically the problem is that any relative path or for that matter absolute paths as well need to be translated to the new URL structure imposed by the original goal.

So how do we do it then?

Enter URL Rewrite 2.0 and Application Request Routing

URL Rewrite 2.0 includes the ability to rewrite the content of a response as it is getting served back to the client which will allow us to rewrite those links without having to touch the actual application.

Software Required:

Steps

  1. The first thing you need to do is enable Proxy support in ARR.
    1. To do that just launch IIS Manager and click the server node in the tree view.
    2. Double click the “Application Request Routing Cache” icon
    3. Select the “Server Proxy Settings…” task in the Actions panel
    4. And Make sure that “Enable Proxy” checkbox is marked. What this will do is allow any request in the server that is rewritten to a server that is not the local machine will be routed to the right place automatically without any further configuration.
  2. Configure URL Rewrite to route the right folders and their requests to the right site. But rather than bothering you with UI steps I will show you the configuration and then explain step by step what each piece is doing.
  3. Note that for this post I will only take care of Company1, but you can imagine the same steps apply for Company2, and to test this you can just save the configuration file below as web.config and save it in your inetpub\wwwroot\  or in any other site root and you can test it.

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <system.webServer>
        <rewrite>
            <rules>
                <rule name="Route the requests for Company1" stopProcessing="true">
                    <match url="^company1/(.*)" />
<conditions>
                        <add input="{CACHE_URL}" pattern="^(https?)://" />
</conditions>
                    <action type="Rewrite" url="{C:1}://www.company1.com/{R:1}" />
<serverVariables>
                        <set name="HTTP_ACCEPT_ENCODING" value="" />
</serverVariables>
                </rule>
            </rules>
            <outboundRules>
                <rule name="ReverseProxyOutboundRule1" preCondition="ResponseIsHtml1">
                    <match filterByTags="A, Area, Base, Form, Frame, Head, IFrame, Img, Input, Link, Script" pattern="^http(s)?://www.company1.com/(.*)" />
<action type="Rewrite" value="/company1/{R:2}" />
</rule>
                <rule name="RewriteRelativePaths" preCondition="ResponseIsHtml1">
                    <match filterByTags="A, Area, Base, Form, Frame, Head, IFrame, Img, Input, Link, Script" pattern="^/(.*)" negate="false" />
<action type="Rewrite" value="/company1/{R:1}" />
</rule>
                <preConditions>
                    <preCondition name="ResponseIsHtml1">
                        <add input="{RESPONSE_CONTENT_TYPE}" pattern="^text/html" />
</preCondition>
                </preConditions>
            </outboundRules>
        </rewrite>
    </system.webServer>
</configuration>

Setup the Routing

                <rule name="Route the requests for Company1" stopProcessing="true">
                    <match url="^company1/(.*)" />
<conditions>
                        <add input="{CACHE_URL}" pattern="^(https?)://" />
</conditions>
                    <action type="Rewrite" url="{C:1}://www.company1.com/{R:1}" />
<serverVariables>
                        <set name="HTTP_ACCEPT_ENCODING" value="" />
</serverVariables>
                </rule>

The first rule is an inbound rewrite rule that basically captures all the requests to the root folder /company1/*, so if using Default Web Site, anything going to https://localhost/company1/* will be matched by this rule and it will rewrite it to www.company1.com respecting the HTTP vs HTTPS traffic.

One thing to highlight which is what took me a bit of time is the “serverVariables” entry in that rule that basically is overwriting the Accept-Encoding header, the reason I do this is because if you do not remove that header then the response will likely be compressed (Gzip or deflate) and Output Rewriting is not supported on that case, and you will end up with an error message like:

HTTP Error 500.52 - URL Rewrite Module Error.
Outbound rewrite rules cannot be applied when the content of the HTTP response is encoded ("gzip").

Also note that to be able to use this feature for security reasons you need to explicitly enable this by allowing the server variable. See enabling server variables here.

 

The last two rules just rewrite the links and scripts and other resources so that the URLs are translated to the right structure. The first one rewrites absolute paths, and the last one rewrites the relative paths. Note that if you use relative paths using “..” this will not work, but you can easily fix the rule above, I was too lazy to do that and since I never use those when I create a site it works for me :)

Setting up Caching for ARR

A huge added value of using ARR is that now we can with a couple of clicks enable disk caching so that the requests are cached locally in the www.site.com, so that not every single request ends up paying the price to go to the backend servers.

  1. To do that just launch IIS Manager and click the server node in the tree view.
  2. Double click the “Application Request Routing Cache” icon
  3. Select the “Add Drive…” task in the Actions panel.
  4. Specify a directory where you want to keep your cache. Note that this can be any subfolder in your system.
  5. Make sure that “Enable Disk Cache” checkbox is marked in the Server Proxy Settings mentioned above.

As easy as that now you will see caching working and your site will act as a container of other servers in the internet. Pretty cool hah! :)

So in this post we saw how with literally few lines of XML, URL Rewrite and ARR we were able to enable a proxy/routing scenario with the ability to rewrite links and furthermore with caching support.