Using WCF Data Service With Restricted Characters as Keys

If you host your WCF Data Service on IIS with ASP.NET and WCF, you may discover that certain characters cause the server to throw when they appear in entity keys. The result is either a 400 Bad Request or a 404 Not Found. In VS 2010 RC it is possible to configure the server to support these characters, but first, let’s see which characters are considered “special”:

%, &, *, :, <, >, +, #, /, ?, \

If any of the above characters is used inside an entity's string key, querying for that entity results in an error, whether you escape the URI or not.

There are several reasons for these failures, and the main one is, of course, security. Allowing these characters can lead to URI injection attacks and other security holes that leave your service vulnerable. So before you go ahead and allow restricted characters, evaluate the security risks involved: these characters are restricted for good reasons.

The very first thing you can do is turn off ASP.NET request filtering. You can do that by adding the following to the <system.web> section of web.config:

<httpRuntime requestPathInvalidCharacters="" requestValidationMode="2.0"/>
<pages validateRequest="false"/>

This allows the first six characters (%, &, *, :, <, >) in the path of the request URI. If you want to allow only some of the six, list the ones that should stay blocked in “requestPathInvalidCharacters”.
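
As a sketch of the single-character case, the fragment below lets only ‘:’ through. It assumes the .NET 4 default list for requestPathInvalidCharacters (<, >, *, %, &, :, \ and ?), which is not stated above, so check it against your framework version. Note that ‘<’, ‘>’ and ‘&’ have to be written as XML entities inside the attribute value.

```xml
<system.web>
  <!-- Assumed .NET 4 default list with ':' removed; every other character stays blocked. -->
  <httpRuntime requestPathInvalidCharacters="&lt;,&gt;,*,%,&amp;,\,?" requestValidationMode="2.0" />
  <pages validateRequest="false" />
</system.web>
```

Keeping ‘\’ and ‘?’ in the list means the later workarounds for those two characters would still be blocked at this layer, so drop them from the list as well if you need them.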

The next character is ‘+’, which causes the server to return 404 Not Found when used inside a key (IIS reports it as substatus 404.11). The reason is that the IIS request filter rejects double-escaped URLs. You can allow them by adding the following to the web.config file:

<system.webServer>
  <security>
    <requestFiltering allowDoubleEscaping="true" />
  </security>
</system.webServer>

The next character is a bit trickier. A ‘#’ inside a string literal causes WCF to truncate the URI from that character onward (this may actually be a problem inside System.Uri), leaving the server with a broken link. You might think that escaping the character gets you out of trouble; unfortunately, the underlying host (either IIS or ASP.NET) unescapes the URI. Without a custom host, the data service normally sits two layers above ASP.NET, with the WCF host in between. WCF therefore sees the unescaped URI handed over by ASP.NET, and by the time it passes the URI on to the service layer, it has already been truncated. Luckily, you can still retrieve the original request URI through ASP.NET’s HttpContext.Current.Request.Url. To get around the problem, we have to manufacture our own properly escaped URI and pass it to the data service through the IncomingMessageProperties discussed here. Basically, we bypass the WCF host when retrieving the request URI.

The following code is a simple proof-of-concept parser, and a very hacky one too. It will not work in all scenarios and is not meant for production services. More importantly, it may totally screw up your server and cause severe damage to your data. So if you must go ahead with this workaround, spend some time writing a more comprehensive parser. I should also note that this is a problem with the WCF host sitting below the data service layer. If you are using a custom host, you probably won’t see this problem; skip this workaround and go directly to the Uri configuration.

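// Needs: using System; using System.ServiceModel; using System.Text; using System.Web;
public WcfDataService1()
{
    // Start from the request URI as ASP.NET received it, before the WCF host truncates it.
    string uri = HttpContext.Current.Request.Url.OriginalString;
    StringBuilder replaceUri = new StringBuilder();

    // True while the scan is inside a single-quoted string literal (a key value).
    bool inquote = false;
    for (int i = 0; i < uri.Length; ++i)
    {
        switch (uri[i])
        {
            case '\'':
                // A quote opens or closes a literal; copy it through unchanged.
                replaceUri.Append(uri[i]);
                inquote = !inquote;
                break;
            case '#':
            case '\\':
            case '/':
            case '?':
                if (inquote)
                {
                    // Percent-encode inside a literal; X2 always emits two hex digits.
                    replaceUri.AppendFormat("%{0:X2}", (int)uri[i]);
                }
                else
                {
                    // Outside a literal these are real URI delimiters, so keep them.
                    replaceUri.Append(uri[i]);
                }

                break;
            default:
                replaceUri.Append(uri[i]);
                break;
        }
    }

    // Hand the re-escaped URI to the data service in place of the one WCF would supply.
    OperationContext.Current.IncomingMessageProperties["MicrosoftDataServicesRequestUri"] = new Uri(replaceUri.ToString());
}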

You may have noticed that this also takes care of the remaining characters (\, / and ?). They fail because of a different but similar problem: the underlying host unescapes the URI and confuses the parser. However, the code alone is not enough to get these three characters working. You also have to tell the Uri parser to allow slashes and question marks. This can be done by adding the following at the very top of the web.config file, just under the opening <configuration> element:

 

<configSections>
    <section name="uri" type="System.Configuration.UriSection, System, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089"/>
</configSections>
<uri>
    <schemeSettings>
        <add name="http" genericUriParserOptions="DontUnescapePathDotsAndSlashes"/>
        <add name="https" genericUriParserOptions="DontUnescapePathDotsAndSlashes"/>
    </schemeSettings>
</uri>
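
For orientation, this is roughly how the three fragments fit into a single web.config. It is only a sketch that rearranges the settings already shown in this post; anything else your service needs (service model configuration, connection strings and so on) is left out. The one ordering constraint is that configSections has to remain the first child of configuration.

```xml
<configuration>
  <!-- Must come first: declares the uri section used below. -->
  <configSections>
    <section name="uri" type="System.Configuration.UriSection, System, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089" />
  </configSections>

  <!-- \, / and ?: Uri parser option from this post, applied to both schemes. -->
  <uri>
    <schemeSettings>
      <add name="http" genericUriParserOptions="DontUnescapePathDotsAndSlashes" />
      <add name="https" genericUriParserOptions="DontUnescapePathDotsAndSlashes" />
    </schemeSettings>
  </uri>

  <!-- %, &, *, :, less-than and greater-than: turn off ASP.NET path and request validation. -->
  <system.web>
    <httpRuntime requestPathInvalidCharacters="" requestValidationMode="2.0" />
    <pages validateRequest="false" />
  </system.web>

  <!-- +: let IIS request filtering accept double-escaped URLs. -->
  <system.webServer>
    <security>
      <requestFiltering allowDoubleEscaping="true" />
    </security>
  </system.webServer>
</configuration>
```

Configuration alone does not cover ‘#’, ‘\’, ‘/’ and ‘?’; those still depend on the constructor workaround.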

 

And there you have it! You can now use any character inside those string IDs and the server won’t choke on it. You’ll also be happy to know that your service is now extremely vulnerable to URI attacks. So, if after evaluating the security risks you decide to allow only a subset of these characters, you only have to enable the settings that target them.