Child URL Execution and SSI #exec, Redux


OK, I have now gotten questions about SSI #exec behavior on IIS6 for both ISAPI (ASP) and CGI (Perl) resources, so here is a consolidation of it all…


Question:


Hi David,


I have searched Google to help me figure out a problem I am having, and every search result I get brings up pages on your site. Unfortunately, I haven’t found a solution, although your site helped me isolate the problem.


The CGI I am having trouble with is written in Perl and uses the CGI Perl module with the get() function. The CGI is used as a server-side include (exec method). It is a page-hits type of utility. It uses the CGI get() function to get the URL of the page that the CGI was included in. Pages using the CGI served under Windows 2000 with IIS 5 worked fine for the past 4 years, but now I am trying to migrate the CGI to Windows Server 2003 and IIS 6 and it is not working properly. Something changed in IIS 6 that prevents the server-side include CGI from getting the proper string using the CGI get() function. The string returned is that of the CGI rather than the page the CGI is included in. I have tried making IIS 6 run in IIS5 isolation mode and it still doesn’t work.


Could you provide any insight into why this is happening (I assume stronger security) and whether there is any way I could get it to work?


Answer:


The answer to your question is actually in a recent blog post of mine.


I did some searching and have not found any public specification of the proper environment for #exec … but if you do find a specification that would alter our decisions for #exec, please do post a comment and let me know.


Basically, SSI #exec on IIS relies on “child URL execution” for functionality. In IIS5, this ability was hacked into the core and resulted in the behavior you observed – the child URL inherited the server variable environment of the parent. In IIS6, this ability became a first-class concept and was exposed to ISAPI Extensions, and since its behavior is intrinsically tied to the server implementation, its behavior is also different – child URLs now have their own scope and have no idea of the parent URL.


The net effect of the change is as you observed – your CGI probably used the “URL” server variable to obtain the parent URL and maintain its count. This worked on IIS5 because #exec gave the child the parent’s server variable environment. This does not work on IIS6 because #exec gives the child URL its own server variable environment.


There is no way to configure or modify this behavior since it is intrinsic to how the IIS6 server core works.


One way to work with this on IIS6 is to modify your CGI to take in a querystring or other parameter from the URL, and to have each SHTML page containing an #exec statement pass the exact URL whose page counter you want to increment.
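
A minimal sketch of the CGI side of this workaround follows. It is written in Python rather than Perl for brevity, and the parameter name `page` and the script path are hypothetical, not anything IIS defines. The SHTML page would name itself in the directive, for example <!--#exec cgi="/cgi-bin/pagehits.pl?page=/docs/index.shtm" -->, and the CGI would read that parameter instead of relying on the URL server variable:

```python
import os
from urllib.parse import parse_qs


def page_to_count(environ):
    """Return the page whose hit counter should be incremented.

    Prefers the hypothetical "page" querystring parameter supplied by the
    SHTML page; falls back to the URL server variable, which on IIS6 is
    the CGI's own URL rather than the parent page.
    """
    query = parse_qs(environ.get("QUERY_STRING", ""))
    pages = query.get("page")
    if pages:
        return pages[0]
    return environ.get("URL", "")


if __name__ == "__main__":
    # Emit a trivial CGI response naming the page that would be counted.
    print("Content-Type: text/plain")
    print()
    print(page_to_count(os.environ))
```

Because the page is named explicitly in the querystring, the counter no longer depends on which server variable environment the child URL receives.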


//David

Comments (13)

  1. Jon says:

    Hi David,

    Thank you for the great reply. Unfortunately, I was told by upper management that we are not allowed to modify the way the cgi is called within the ssi exec statement of the shtm files (thousands of pages). This would require our upgrade project to get committee approvals and cause end users to get upset etc. Is there any way to use ISAPI to detect the cgi ssi call in a web page (filter text) and grab the current environment (including URL) and rewrite the page with the hit count data (using logic to retrieve and store values based on the URL)?

    If this is not possible, is it possible to run IIS 5 on Windows 2003? 😉

    Thanks again,

  2. David Wang says:

    Jon – That’s interesting… you need committee approval to change otherwise static web pages, but you do not need committee approval (or it is somehow easier) to write and add ISAPI Filters to dynamically change the web server behavior?

    To me, from the perspective of causing end users grief, the ISAPI Filter route is far more likely to cause the web server to crash or severely slow down while doing what you are suggesting (rewriting the page content with the hit count data). Changing static text in thousands of pages once, using a little Perl script with the same rewrite logic as the filter, is far easier by comparison.

    Namely, it is a simple string substitution in Perl that is run exactly once, but in an ISAPI Filter you will have to learn how to buffer the entire response stream in C, use far less capable string-manipulation mechanisms to do the replacement CONSTANTLY on every single request, and then flush that buffer. Basically, nothing is cacheable on this server, it is constantly running user ISAPI Filter code, and you need to learn HTTP.
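
    A minimal sketch of that one-time substitution, in Python rather than Perl. The script path /cgi-bin/pagehits.pl and the parameter name page are hypothetical, and the pattern assumes the directive is written with a double-quoted cgi attribute:

```python
import re

# Matches an #exec directive that calls the (hypothetical) counter script
# with no querystring yet, so pages already rewritten are left alone.
EXEC_CGI = re.compile(
    r'(<!--\s*#exec\s+cgi="/cgi-bin/pagehits\.pl)(")',
    re.IGNORECASE,
)


def add_page_parameter(html, page_url):
    """Append ?page=<page_url> to each matching #exec directive in html."""
    return EXEC_CGI.sub(
        lambda m: f"{m.group(1)}?page={page_url}{m.group(2)}",
        html,
    )
```

    Run once over each file with that file's own URL as page_url; running it a second time changes nothing, since the rewritten directive no longer matches.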

    Anyways, your choice…

    //David

  3. Jon says:

    Hi David,

    I agree, writing a simple perl script to search and replace/modify the include statements in thousands of pages to include parameters would be far easier than rewriting the content using ISAPI etc., but unfortunately changing the cgi’s syntax and modifying current html pages is not an option. This is a technical upgrade that needs to be transparent. As for ISAPI, I agree, I don’t think it is a great option. The only reason I mentioned it is that it would probably be one of the only options for making any changes transparent to the end users. They would still use the old cgi syntax and old pages would still work because they would be rewritten internally to something else. It appears I will have to migrate to new hardware and stick with Windows 2000 and IIS 5. Would you agree? Are you sure there is no secret registry setting I could add? 😉

    Thanks again for your replies!

  4. David Wang says:

    Jon – I see… what you mean by "transparent" is that all users will continue to use the original #exec syntax, even when they create new pages from now on (because using perl to change existing pages is by definition "transparent" to end users).

    Unless you change the CGI and its syntax, you have no way to use an ISAPI Filter as a work-around and are forever tied to IIS5. There is no "secret" here because you are basically relying on an arbitrary behavior of the server core.

    I am not certain about the goal of your "technical upgrade", but if it involves migrating to a supported OS, you will eventually have a problem since:

    1. Your web pages require Windows 2000 Server

    2. Windows 2000 Server is quickly becoming unsupported (no new service packs anymore)

    3. All newer versions of IIS do not and will not support your syntax

    Thus, at some point in the near future, you will not be able to maintain your #exec syntax for users and remain on a supported OS… so you will have to choose between running the pages on unsupported servers or changing the syntax for users that write new web pages.

    //David

  5. Maru says:

    Thank you very much. Very informative.

  6. Mario says:

    Hi David,

    I am in the same position as the user who started this question. Believe it or not 🙁 After reading the thread, there may be no option but to try to create an ISAPI filter that will do what you were suggesting not to. My question is which event handlers to use. Currently my pages are already mapped to the ssiinc.dll that will handle the includes and other ssi specific commands. What event notifications could I safely use to ensure that I have access to all the complete data, just before it gets sent to the user?

    Thank you again

  7. David.Wang says:

    Mario – Unfortunately, there is no general "solution" to this. Everyone has a specific problem tied to a specific, unspecified behavior of IIS5. Since this is not supported, I cannot say how to actually make a "compatibility" ISAPI Filter which works around the issue.

    For certain, you will need to change existing code, possibly significantly, because you will not be able to rewire IIS server variables without changing execution results. Thus, it begs the question as to whether it is better to simply rewrite those pages because it will be expensive to hack them.

    //David

  8. Mario says:

    Thank you David for your prompt and informative response.

    I feared that would be your answer, and while playing with some ISAPI code, it seems what you're saying is true. The level of work is a monster, and then there is still the daunting task of production-level approval, which will not fly due to the performance impact. I did play with a few "radical" approaches, and would love to get your input on them:

    1. Create a folder-specific 404, and then make the 404 redirect to the cgi file. This has the benefit of forcing IIS 6 to put the parent page in the querystring, which would then make it available to the script.

    2. Remap the page to the .NET engine, and figure out how to allow only certain folders in the infrastructure to place ASP code in them. Since .NET still interprets the include statement, it will honor the including of scripts and possibly give more variables in the mix.

    Thoughts?

    Mario

  9. David.Wang says:

    Mario –  I think #1 is reasonably easy. If you have ways to deploy "versions" of the website, you could use the 404 to redirect between the versions while intentionally not deploying the production website, to force their access to become 404.

    The Custom Error approach has one caveat (depends on whether you have SP1-or-later installed) — if your website uses multiple Application Pools, then the 404 requests which are NOT in the same Application Pool of the 404 will fail.

    This is because URL CustomError is functionally like a SSI #exec, which is functionally like HSE_REQ_EXEC_URL and the internal Child URL Execution – which inherits the same "cannot execute cross Application Pool" limitation.

    Since the behavior is somewhat unexpected, we made an exception for CustomErrors to allow their URL execution to occur in the originating Application Pool as well. This happened in WS03SP1.

    I am not a fan of approach #2.

    //David

  10. Mario Bonito says:

    Thank you David,

    Once again prompt and informative. I actually took your advice, and got drawn to one of your other articles in your blog about the XSL ISAPI Filter. I had a few discussions with some developers and they liked the approach of basing the whole site on XML. Finally, it seems that this little issue of Child Exec gave my original recommendation from a while back a nice push in the right direction.

    The funny part of all this is that Microsoft now seems to have pulled the XSL ISAPI Filter. Go figure, this is just my luck!

    Any ideas where I can find a download for this filter?

    Thank you again David, this is a great blog and I recommend many people to it, since it was instrumental in bridging me quickly to the internal workings of IIS. Keep up the great blog.

    Mario

  11. Mario says:

    Hi David,

    Well it seems that after a few attempts and a lot of hair pulling, a solution has come to light. I thought I would share it since I can see that I am not alone in this situation. I would love your feedback. And if you know how to answer the snag I have encountered, that would be stellar. Here goes:

    – Taking your advice David, I basically mapped all the .htm and .html files to the aspnet_isapi.dll filter.

    – Created a class that would use the HttpContext to handle the calls to the html files.

    – This class intercepts the URL request and then loads the file (html) as a literal. This was important, because I still have to make sure that webmasters can create html pages but not be able to put ASP code in the pages.

    – My snag: I am currently trying to call the ssiinc.dll from the class to parse the file. It is not as simple as I thought it would be. I need the complete html, so if I could pass the file path, let ssiinc.dll do its thing to get all the includes and so on propagated, and have it hand me back the complete html, that would save a whole lot of hair on my head.

    – After the contents are returned, I can use some string manipulation techniques to finalize the output to the clients.

    Like I said, thank you again David, your direction and advice was a cornerstone to this approach.

    If you could offer a small snippet of code in C# or VB.NET showing how to call the ssiinc.dll on a file from a class in aspnet_isapi.dll, that would help enormously, and I would gladly post the code for what I have to all others. Take advantage of this 🙂

    Mario

  12. David.Wang says:

    Mario – unfortunately, the IIS request pipeline does not (and cannot) work like what you imagine, so I cannot offer any code snippet to do what you are proposing.

    Also, it is not clear why you introduce Managed code into the picture with aspnet_isapi.dll because it will surely impact performance.

    //David

  13. Mario says:

    Thank you David, for your prompt response.

    The rationale you are doubting is simply that I do not have many options. In my world, ASP pages and code are considered applications. Once you tag a page like this, the process to publish comes to a staggering halt. I am tasked with finding a safe way of allowing HTML developers to continue publishing HTML pages with luxuries like SSI, and everything was fine until the above discussion of how CHILD EXEC in IIS 6 now does not know who its parent is.

    This was a critical piece of functionality that a lot of applications (asp pages) we built relied on to allow transparent ssi include calls for HTML developers, so that they could easily call different HTML output to fill their respective templates.

    The selling point of this system was that we (the web development team) would maintain these pages, and we used the IIS 5 logic to discover what page was calling, parse the parent page's raw HTML to get certain elements, and so know how to dynamically generate the correct HTML for the calling page. As you can see, transparent and elegant.

    Since IIS 6 now does not do the same thing as IIS 5, I am stuck duplicating this same functionality in IIS 6. This is the rationale for all my input into this thread.

    As you mentioned, I am trying to do what you "wished" by trying to add a PARENT_URL type of header to children, so HTML developers can keep on doing what they do best, and my team can live happily in the backend.

    So, I am at the stage where I am now introducing managed code into the mix to completely take over the HTTP request and do what I need, because from our prior discussions, the ISAPI Filter approach was much more difficult to do.

    Does that make sense?

    //Mario