Setting up a Reverse Proxy using IIS, URL Rewrite and ARR


Today there was a question in the IIS.net forums asking how to expose two different Internet sites from another site, making them look as if they were subdirectories of the main site.

So, for example, the goal was to have a site, www.site.com, expose www.site.com/company1 and www.site.com/company2, with the content from “www.company1.com” served for the first one and “www.company2.com” served for the second one. Furthermore, we would like to have the responses cached on the server for performance reasons. The following image shows a simple diagram of this:

[Figure: Reverse Proxy Sample]

This sounds easy since it's just about routing or proxying every single request to the correct servers, right? Wrong! If only it were that easy. It turns out the most challenging part is that in this case we are modifying the structure of the underlying URLs and the original layout on the servers, which makes relative paths break, so images, stylesheets (CSS), JavaScript files and other resources are not shown correctly.

To clarify, imagine that a user requests the page at http://www.site.com/company1/default.aspx in the browser, and based on the specification above the request is proxied/routed to http://www.company1.com/default.aspx on the server side. So far so good. However, imagine that the HTML returned by that page has an image tag like “<img src=/some-image.png />”. The problem is that the browser now resolves that relative path against the URL of the original request, http://www.site.com/company1/default.aspx, which results in a request for the image at http://www.site.com/some-image.png instead of inside the “company1” folder, which would be http://www.site.com/company1/some-image.png.

Do you see it? Basically, the problem is that any relative path, and for that matter any absolute path as well, needs to be translated to the new URL structure imposed by the original goal.

So how do we do it then?

Enter URL Rewrite 2.0 and Application Request Routing

URL Rewrite 2.0 includes the ability to rewrite the content of a response as it is served back to the client, which allows us to rewrite those links without having to touch the actual application.

Software Required: IIS, URL Rewrite 2.0, and Application Request Routing (ARR).


Steps

  1. The first thing you need to do is enable proxy support in ARR.
    1. Launch IIS Manager and click the server node in the tree view.
    2. Double-click the “Application Request Routing Cache” icon.
    3. Select the “Server Proxy Settings…” task in the Actions panel.
    4. Make sure that the “Enable Proxy” checkbox is checked. With this enabled, any request that is rewritten to a server other than the local machine is routed to the right place automatically, without any further configuration.
  2. Configure URL Rewrite to route the right folders and their requests to the right site. Rather than bothering you with UI steps, I will show you the configuration and then explain step by step what each piece is doing.
  3. Note that for this post I will only take care of Company1, but the same steps apply for Company2. To test this, save the configuration below as web.config in your inetpub\wwwroot\ folder, or in any other site root.
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <system.webServer>
        <rewrite>
            <rules>
                <rule name="Route the requests for Company1" stopProcessing="true">
                    <match url="^company1/(.*)" />
                    <conditions>
                        <add input="{CACHE_URL}" pattern="^(https?)://" />
                    </conditions>
                    <action type="Rewrite" url="{C:1}://www.company1.com/{R:1}" />
                    <serverVariables>
                        <set name="HTTP_ACCEPT_ENCODING" value="" />
                    </serverVariables>
                </rule>
            </rules>
            <outboundRules>
                <rule name="ReverseProxyOutboundRule1" preCondition="ResponseIsHtml1">
                    <match filterByTags="A, Area, Base, Form, Frame, Head, IFrame, Img, Input, Link, Script" pattern="^http(s)?://www.company1.com/(.*)" />
                    <action type="Rewrite" value="/company1/{R:2}" />
                </rule>
                <rule name="RewriteRelativePaths" preCondition="ResponseIsHtml1">
                    <match filterByTags="A, Area, Base, Form, Frame, Head, IFrame, Img, Input, Link, Script" pattern="^/(.*)" negate="false" />
                    <action type="Rewrite" value="/company1/{R:1}" />
                </rule>
                <preConditions>
                    <preCondition name="ResponseIsHtml1">
                        <add input="{RESPONSE_CONTENT_TYPE}" pattern="^text/html" />
                    </preCondition>
                </preConditions>
            </outboundRules>
        </rewrite>
    </system.webServer>
</configuration>
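
Step 1 above is done through the IIS Manager UI. As a rough sketch, assuming a default ARR installation, the “Enable Proxy” checkbox corresponds approximately to the following server-level setting in applicationHost.config. This fragment is an illustration and is not part of the web.config sample above:

```xml
<!-- applicationHost.config (server level); assumes ARR is installed -->
<configuration>
    <system.webServer>
        <!-- Equivalent of checking "Enable Proxy" in Server Proxy Settings -->
        <proxy enabled="true" />
    </system.webServer>
</configuration>
```

Editing the setting through the UI is the safer route, since it writes to the correct configuration level for you.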

Set Up the Routing

<rule name="Route the requests for Company1" stopProcessing="true">
    <match url="^company1/(.*)" />
    <conditions>
        <add input="{CACHE_URL}" pattern="^(https?)://" />
    </conditions>
    <action type="Rewrite" url="{C:1}://www.company1.com/{R:1}" />
    <serverVariables>
        <set name="HTTP_ACCEPT_ENCODING" value="" />
    </serverVariables>
</rule>

The first rule is an inbound rewrite rule that captures all requests under the root folder /company1/. So, if you are using the Default Web Site, anything going to http://localhost/company1/* is matched by this rule and rewritten to www.company1.com, preserving the original scheme (HTTP or HTTPS). The condition on {CACHE_URL} captures the scheme as {C:1}, and {R:1} carries the rest of the path matched after company1/.
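
Since the same steps apply for Company2, here is a minimal sketch of what the analogous inbound rule might look like. The rule name is an assumption; the host www.company2.com comes from the scenario described at the start of the post. It would go inside the same <rules> element:

```xml
<!-- Hypothetical counterpart of the Company1 rule; place it inside <rules> -->
<rule name="Route the requests for Company2" stopProcessing="true">
    <match url="^company2/(.*)" />
    <conditions>
        <!-- Captures the scheme (http or https) of the incoming request as {C:1} -->
        <add input="{CACHE_URL}" pattern="^(https?)://" />
    </conditions>
    <action type="Rewrite" url="{C:1}://www.company2.com/{R:1}" />
    <serverVariables>
        <!-- Clear Accept-Encoding so the backend response is not compressed -->
        <set name="HTTP_ACCEPT_ENCODING" value="" />
    </serverVariables>
</rule>
```

Note that this covers only the inbound side. The outbound rules in the sample prepend /company1/ to links, so Company2 responses would need their own outbound rules, which this sketch does not attempt.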

One thing to highlight, which took me a bit of time to figure out, is the “serverVariables” entry in that rule, which overwrites the Accept-Encoding header. I do this because if you do not remove that header the response will likely be compressed (gzip or deflate), and outbound rewriting is not supported in that case. You will end up with an error message like:

HTTP Error 500.52 – URL Rewrite Module Error.
Outbound rewrite rules cannot be applied when the content of the HTTP response is encoded (“gzip”).

Also note that, for security reasons, you need to explicitly enable this feature by allowing the server variable. See enabling server variables here.
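
As a sketch of what allowing the server variable amounts to, assuming URL Rewrite 2.0 and that the allow list is kept at the server level, the entry in applicationHost.config would look roughly like this:

```xml
<!-- applicationHost.config: allow rewrite rules to set this server variable -->
<configuration>
    <system.webServer>
        <rewrite>
            <allowedServerVariables>
                <add name="HTTP_ACCEPT_ENCODING" />
            </allowedServerVariables>
        </rewrite>
    </system.webServer>
</configuration>
```

Without an entry like this, the rule that sets HTTP_ACCEPT_ENCODING is rejected rather than silently ignored, so it is worth checking first if the site starts returning errors after adding the rule.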

 

Outbound Rewriting to fix the Links

The last two rules rewrite the links, scripts and other resources so that the URLs are translated to the right structure. The first one rewrites absolute URLs that point to www.company1.com, and the second one rewrites root-relative paths (those starting with “/”). Note that relative paths using “..” will not work with these rules, but you can easily extend the rule above to handle them. I was too lazy to do that, and since I never use those when I create a site it works for me 🙂

Setting up Caching for ARR

A huge added value of using ARR is that with a couple of clicks we can enable disk caching, so that responses are cached locally on www.site.com and not every single request pays the price of going to the backend servers.

  1. Launch IIS Manager and click the server node in the tree view.
  2. Double-click the “Application Request Routing Cache” icon.
  3. Select the “Add Drive…” task in the Actions panel.
  4. Specify a directory where you want to keep your cache. Note that this can be any subfolder on your system.
  5. Make sure that the “Enable Disk Cache” checkbox is checked in the Server Proxy Settings mentioned above.

As easy as that, you will now see caching working, and your site will act as a container for other servers on the Internet. Pretty cool, huh? 🙂

So in this post we saw how, with literally a few lines of XML, URL Rewrite and ARR let us enable a proxy/routing scenario with the ability to rewrite links and, furthermore, with caching support.

Comments (31)

  1. Anonymous says:

    CarlosAg, your instructions are on the money but I have a question. In your rewrite rule, the action to rewrite for qualifying a website such as "www.google.com/" works like a charm, but what if the URL was "www.google.com/user/login.jsp" instead?

    I am very new at this and need help resolving this error.

    Thanks,

    Musillah

  2. Anonymous says:

    simple examples of proxying to a web site are all over. I can't find ONE single example for the condition above my post and rewriting to a server/FOLDER/file.ext instead of just to another server.

    A lot of people are asking but no answers seem to be available. When anyone asks this question, all the "experts" disappear. Can someone help or is this a representation of the crappy world of windows? I can do this in apache on linux with TWO LINES. No wonder microsoft sucks ….

  3. Anonymous says:

    You need to add a mime type for the extension in iis.  It works fine then.

  4. Anonymous says:

    Hi

    How do we make sure the outbound rule will apply only to the request that are rewritten if i have multiple things on the same site.

    <match filterByTags="A, Area, Base, Form, Frame, Head, IFrame, Img, Input, Link, Script" pattern="^/(.*)" negate="false" />

    The above rule is capturing every response.

    Thanks for your time.

    1. Johnny says:

      I’m wondering the same thing as Bala. How do we scope the outbound rule to only those responses for requests that were rewritten? As it stands now, all responses will be inspected.

  5. Anonymous says:

    I wonder if it would be simpler to key off Referer..

  6. Anonymous says:

    Hello and thanks for the article, I have a question

    Does ARR work with back-end servers other than IIS, like Apache, Sun ONE, Domino, etc.?

    Thanks

    Louay  

  7. CarlosAg says:

    @LouayA-CA , Yes it will work against any HTTP server.

  8. Anonymous says:

    Thanks for the answer! I would really appreciate it if you could share the Microsoft link for the IIS7 support matrix.

    Thanks!

    Louay

  9. Anonymous says:

    This works great for everything within the html of the page served but it doesn't handle the links in files that aren't html, in my case the image links in my css are broken. Any way around this?

  10. Anonymous says:

    Left this comment on the IIS blog, but Carlos could you post an example of how you'd identify and rewrite relative paths?  Say they have something like src = ../images/image.jpg or  ./scripts/script.js?

    Thanks,

    Jess

  11. Anonymous says:

    Hi,

    When we add a server to farm, we get a pop up asking for automatic setting up of redirection. What does this option do?

    Thanks.

  12. Anonymous says:

    I have IIS 7 with ARR and Apache Tomcat behind it.  All is operating fine but Apache is not getting the IP info from the client hits on the IIS front end.  Can anyone help me with getting IIS to pass the IP to Tomcat?

  13. CarlosAg says:

    @Paul, you might want to check this video that talks about how to achieve that with ARR (in the backend using ARR Helper) weblogs.asp.net/…/arr-helper-week-33.aspx

  14. Anonymous says:

    Hello sir, I moved my hosting from PHP to SharePoint. I need to redirect requests for a specific page to a new server; an example is below. Can you please specify the rule? I appreciate your feedback.

    (/en/my-account-mobile.php) to ( http://new server.com )

  15. Anonymous says:

    Hello,

    Does anyone know why I keep getting Permission Denied when trying to add the rules on my WHS 2011 box?

    I'm running the GUI as Admin and logged in as Administrator, trying to add them to the Default Web Site.

    Thanks

  16. Anonymous says:

    Hi,

    I have adapted your web.config file (dl.dropbox.com/…/web.config), for a generic multi-purpose reverse proxy (http://localhost/proxy/www.microsoft.com/en-US/default.aspx) works like many other domain names. Is that correct? What can be done to improve it (e.g support 302/301 redirects)?

    Thanks in advance,

    Jesse

  17. Anonymous says:

    Also, the HTTP_ACCEPT_ENCODING does not seem to hold on posts and I get the 500.52 gzip error.

    And last question: what would be needed to force https traffic between the client and IIS, without altering the targeted traffic?

    Cheers

  18. Anonymous says:

    Is there a way to prevent internal ip exposure? I'm using a reverse proxy for access to OWA but the internal ip gets exposed in the login dialog window.

    IE: The server http://www.XXXXXX.com at int.ern.al.ip requires a username and password.

    Chrome: The server http://www.XXXXXX.com requires a username and password. Server message: int.ern.al.ip

  19. Anonymous says:

    Hi, thanks for the detailed post. I have tried this. The routing works fine for me if the destination URL is on the same machine as ARR, but it does not work if the destination URL is an external one. For example, the configuration below did not work for me:

    <?xml version="1.0" encoding="UTF-8"?>

    <configuration>

    <system.webServer>

    <rewrite>

    <rules>

    <rule name="or_rule_1" enabled="true">

    <match url=".*" />

    <action type="Rewrite" url="http://www.cnn.com" />

    </rule>

    </rules>

    </rewrite>

    </system.webServer>

    </configuration>

    Also, the caching did not work. Do we need to do any other setup for this?

  20. Anonymous says:

    hi,

    I have Apache behind IIS, and my PHP website runs on Apache. But when I use $_SERVER['SERVER_NAME'] and $_SERVER['HTTP_HOST'] in a PHP file, I do not get the correct values; they return "localhost".

    I also set SERVER_NAME and HTTP_HOST to the values I want in the web.config file.

    What can I do?

  21. Anonymous says:

    Your example shows two subfolders of a public IP being redirected to two separate internal hosts, but is there a way, using host headers or another similar concept, to redirect two URLs (and subfolders) such as:

    1:  hostname1.domain.com/*   ->  InternalHost.internaldomain.local/hostname1/*

    2: hostname2.domain.com/*  ->  internalhost2.internaldomain.local/applicationname/*

    The concept is that I want to overload the one external IP, but redirect to various hosts (windows, Linux, or microcontroller, etc) on the inside.  I can't seem to get it to work with anything I've tried extrapolating from your instructions.  

    Thanks!

    Steve

  22. Anonymous says:

    Hello Carlos,

    Thank you very much for your advice. I followed it and successfully exposed internal http based audio streams to the public. However I have a problem.  IIS7 seems to be heavily buffering the output of the reverse proxy. In my case, it is taking around 5 minutes for the redirected audio stream to begin playing on the client.  Do you happen to know if there's a setting I can adjust to reduce this delay?   Thank you very much in advance!

  23. Anonymous says:

    Hi,

    We are using a reverse proxy for one of our internal applications. The issue is that when I try to submit a form in this application (mvc.net), I get an empty page. Is it because it is trying to post to the main application?

  24. Anonymous says:

    I have checked the box to 'enable proxy' in ARR, and added the following rule to the top of my <rewrite> section in my site webconfig (this is a site i have added, not the default site)

    <rule name="Route the requests for blogtest" stopProcessing="true">

                       <match url="^blogtest/(.*)" />

                       <conditions>

                           <add input="{CACHE_URL}" pattern="^(https?)://" />

                       </conditions>

                       <action type="Rewrite" url="{C:1}://www.externalblog.com/{R:1}" />

                       <serverVariables>

                           <set name="HTTP_ACCEPT_ENCODING" value="" />

                       </serverVariables>

                   </rule>

    However when I visit http://www.mysite.com/blogtest or anything inside blogtest I simply get our 404 error page. If I switch the action type from rewrite to redirect, the rule kicks in and redirects me to the external site properly.

  25. Anonymous says:

    Hi Carlos,

    This blog helped me a lot with setting up ARR. I have a minor issue: when I try to open Excel or Word documents, it tries to open them from the ARR server instead of company1. If I keep cancelling, after 3 times it eventually opens the documents. Please help.

  26. Anonymous says:

    Same issue as Eric in post from 12th of Feb.

    If the url is  http://…../mysite/index.php  it works fine.

    But not if   http://…./mysite/

    I'm guessing the regular expression only works if there is something after the slash?

  27. Anonymous says:

    I ended up tweaking the expression a little:

    <match url="^mysite/?(.*)" />

  28. Anonymous says:

    Hi, I have tried to follow your step to setup. However, the ARR Proxy seems not functioning.

    a) I tried setting up a rule with the wildcard match *iisstart.htm* and tested it with the URL http://localhost/iisstart.htm. It won't redirect; instead it hits the iisstart page.

    b) I tried the wildcard match *service1.svc* and tested it with the URL http://localhost/service1.svc. It won't redirect, giving me a 404 error in which MapRequestHandler states the resource was not found.

    Please help.

  29. Configure Drupal/web.config to run behind a reverse proxy server to get real IP addr? says:

    Hello,

    I have a Drupal 7 site on a (godaddy) shared server.

    It's based on IIS 7 and it means (as far as I know) IIS ignores setting.php and we have to use web.config

    My goal: I need to get the (real) IP address of the client machine.

    The functions I tested (including ip_address()) are returning the value of "REMOTE_ADDR", when in my scenario the real IP address of the user is in the X-Forwarded-For header.

    The solution in case of using setting.php is this:

    // Tell Drupal that we are behind a reverse proxy server $conf['reverse_proxy'] = TRUE; // List of trusted IPs (IP numbers of our reverse proxies) $conf['reverse_proxy_addresses'] = array( '127.0.0.1', );

    Then my question is: How can we make these changes/settings using IIS/web.config file?

    I tried to create a rule (web.config) to rewrite the value of REMOTE_ADDR with the X-Forwarded-For value, but it doesn't work for me. REMOTE_ADDR is not taking the real IP (X-Forwarded-For) of the user. Maybe I need to add something to override this value before this rule runs?

    I really appreciate any help with this.

    Regards,

    Leo.

  30. Any ideas why the preserveHostHeader option is set to False by default? Output rules are a terribly clunky way to fix URLs in links and images.