Don’t use .NET System.Uri.UnescapeDataString in URL Decoding


URL encoding should encode a Space as “+” or “%20”, and URL decoding should decode “+” or “%20” back into a Space. However, by design, System.Uri.UnescapeDataString doesn’t decode “+” into a Space.


 


The MSDN remarks for Uri.UnescapeDataString say:


 “Many Web browsers escape spaces inside of URIs into plus (“+”) characters; however, the UnescapeDataString method does not convert plus characters into spaces because this behavior is not standard across all URI schemes.”


 


The issue arises when your web application has a query string like:


 


http://www.ms.com/default?Comment=just+do+it


 


If you use System.Uri.UnescapeDataString to decode the query string value “just+do+it”, the result is “just+do+it” instead of “just do it”. When a downstream application needs to URL encode the value again, it becomes “just%2bdo%2bit”. The final URL will look like:


 


http://www.ms.com/default?Comment=just%2bdo%2bit


 


The spaces are lost, and the application could interpret the value as “just+do+it” instead of “just do it”.
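The round trip described above can be reproduced in a few lines. This is a minimal sketch, assuming a .NET 6 or later console project (top-level statements) in which System.Web.HttpUtility is available; the variable names are illustrative only.

```csharp
using System;
using System.Web;

// Query string value as it arrives in ?Comment=just+do+it
string raw = "just+do+it";

// UnescapeDataString leaves "+" untouched by design.
string unescaped = Uri.UnescapeDataString(raw);

// UrlDecode treats "+" as an encoded space.
string decoded = HttpUtility.UrlDecode(raw);

// A downstream component that re-encodes the undecoded value
// now escapes the literal "+" characters.
string reencoded = Uri.EscapeDataString(unescaped);

Console.WriteLine(unescaped);  // just+do+it
Console.WriteLine(decoded);    // just do it
Console.WriteLine(reencoded);  // just%2Bdo%2Bit
```

Uri.EscapeDataString is used for the re-encoding step here because it emits uppercase hex digits; HttpUtility.UrlEncode escapes the same characters, shown in this post with lowercase digits.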


 


Detailed discussion:


 


RFC 2396 defines reserved characters such as &, $, and +, and excluded characters such as Space, %, <, and >. These characters must be escaped (URL encoded) when used as values in the query string of a URL so that they keep their original meaning.


 


For example: to pass information such as


 


Products: Windows&Office
Price: $200
Comment: In Stock
Sign: +


 


The URL could be


http://www.ms.com/default.aspx?Products=Windows%26Office&Price=%24200&Comment=In%20Stock&sign=%2b


or


http://www.ms.com/default.aspx?Products=Windows%26Office&Price=%24200&Comment=In+Stock&sign=%2b


 


A URL may itself be used as a return URL value inside another URL. In that case, the whole URL needs to be encoded, and characters that were already encoded become double encoded.


 


http%3a%2f%2fwww.ms.com%2fdefault.aspx%3fProducts%3dWindows%2526Office%26Price%3d%2524200%26Comment%3dIn%2520Stock%26sign%3d%252b


 


or


 


http%3a%2f%2fwww.ms.com%2fdefault.aspx%3fProducts%3dWindows%2526Office%26Price%3d%2524200%26Comment%3dIn%2bStock%26sign%3d%252b
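How the double encoding comes about can be sketched as follows. This assumes a .NET 6 or later console project; the outer page “login.aspx” and the parameter name “ReturnUrl” are hypothetical and are not part of the original example.

```csharp
using System;

// Inner URL: each parameter value is single encoded.
string inner = "http://www.ms.com/default.aspx"
    + "?Products=" + Uri.EscapeDataString("Windows&Office")
    + "&Price=" + Uri.EscapeDataString("$200")
    + "&Comment=" + Uri.EscapeDataString("In Stock")
    + "&sign=" + Uri.EscapeDataString("+");

// Passing the inner URL as one parameter value encodes it a second time.
// "login.aspx" and "ReturnUrl" are hypothetical names for illustration.
string outer = "http://www.ms.com/login.aspx?ReturnUrl=" + Uri.EscapeDataString(inner);

Console.WriteLine(inner);
Console.WriteLine(outer);
```

The “%” of every existing escape is itself escaped to “%25”, which is why “%26” turns into “%2526” and “%20” into “%2520”.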


 


 
































Characters    Single Encoded    Double Encoded
&             %26               %2526
$             %24               %2524
+             %2b               %252b
Space         %20, +            %2520, %2b
%             %25               %2525
<             %3c               %253c


 


Notice that a single-encoded Space can be “+”, whose double encoding is “%2b”, and that “%2b” is also the single encoding of a literal + sign.

If an encoding or decoding function doesn’t handle this properly, the original meaning of the character can be lost in transit.

Correct encoding and decoding methods should do what the table above defines.


 


.NET encoding methods


 







































Characters    HttpUtility.UrlEncode    System.Uri.EscapeDataString    System.Uri.EscapeUriString
&             %26                      %26                            &
$             %24                      %24                            $
+             %2b                      %2B                            +
Space         +                        %20                            %20
%             %25                      %25                            %25
<             %3c                      %3C                            %3C


 


Notice:

1. System.Uri.EscapeUriString doesn’t encode RFC 2396 reserved characters such as &, $, and +.

2. UrlEncode encodes Space as “+”, while EscapeDataString encodes Space as “%20”.

3. To encode a whole URL for use as a return URL, EscapeUriString should not be used.
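The differences between the first two methods can be checked with a short loop. This is a sketch, assuming a .NET 6 or later console project with System.Web.HttpUtility available. Uri.EscapeUriString is left out on purpose because it is marked obsolete in recent .NET versions.

```csharp
using System;
using System.Web;

// The characters compared in this section.
string[] samples = { "&", "$", "+", " ", "%", "<" };

foreach (string s in samples)
{
    string urlEncoded = HttpUtility.UrlEncode(s);   // Space becomes "+"
    string dataEscaped = Uri.EscapeDataString(s);   // Space becomes "%20"
    Console.WriteLine($"'{s}'  UrlEncode={urlEncoded}  EscapeDataString={dataEscaped}");
}
```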


 
















Input URL:
http://www.ms.com/default.aspx?Products=Windows%26Office&Price=%24200&Comment=In+Stock&sign=%2b

UrlEncode:
http%3a%2f%2fwww.ms.com%2fdefault.aspx%3fProducts%3dWindows%2526Office%26Price%3d%2524200%26Comment%3dIn%2bStock%26sign%3d%252b

EscapeDataString:
http%3A%2F%2Fwww.ms.com%2Fdefault.aspx%3FProducts%3DWindows%2526Office%26Price%3D%2524200%26Comment%3DIn%2BStock%26sign%3D%252b

EscapeUriString (not right):
http://www.ms.com/default.aspx?Products=Windows%2526Office&Price=%2524200&Comment=In+Stock&sign=%252b

Or

Input URL:
http://www.ms.com/default.aspx?Products=Windows%26Office&Price=%24200&Comment=In%20Stock&sign=%2b

UrlEncode:
http%3a%2f%2fwww.ms.com%2fdefault.aspx%3fProducts%3dWindows%2526Office%26Price%3d%2524200%26Comment%3dIn%2520Stock%26sign%3d%252b

EscapeDataString:
http%3A%2F%2Fwww.ms.com%2Fdefault.aspx%3FProducts%3DWindows%2526Office%26Price%3D%2524200%26Comment%3DIn%2520Stock%26sign%3D%252b

EscapeUriString (not right):
http://www.ms.com/default.aspx?Products=Windows%2526Office&Price=%2524200&Comment=In%2520Stock&sign=%252b


 


There are two decoding methods in .NET:




































Encoded Characters    HttpUtility.UrlDecode    System.Uri.UnescapeDataString
%26                   &                        &
%24                   $                        $
%2b                   +                        +
%20                   Space                    Space
+                     Space                    +
%25                   %                        %
%3c                   <                        <


 


Notice that UrlDecode and UnescapeDataString decode “+” differently. This causes a problem when decoding a return URL that contains a double-encoded Space as “%2b”.


 


For example, “Comment%3dIn%2bStock” in an encoded return URL should be double decoded into:

Variable: “Comment”, Value: “In Stock”


 


Call UrlDecode twice on it:

“Comment%3dIn%2bStock” → “Comment=In+Stock” → “Comment=In Stock”

Call UnescapeDataString twice on it:

“Comment%3dIn%2bStock” → “Comment=In+Stock” → “Comment=In+Stock”


 


The original string “In Stock” is broken by UnescapeDataString.
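The two decoding paths can be compared directly. This is a minimal sketch, assuming a .NET 6 or later console project with System.Web.HttpUtility available; the variable names are illustrative.

```csharp
using System;
using System.Web;

string encoded = "Comment%3dIn%2bStock";

// Two passes of UrlDecode: "%2b" becomes "+", then "+" becomes a space.
string viaUrlDecode = HttpUtility.UrlDecode(HttpUtility.UrlDecode(encoded));

// Two passes of UnescapeDataString: "%2b" becomes "+", and the "+" stays.
string viaUnescape = Uri.UnescapeDataString(Uri.UnescapeDataString(encoded));

Console.WriteLine(viaUrlDecode);  // Comment=In Stock
Console.WriteLine(viaUnescape);   // Comment=In+Stock
```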


 


If the downstream application assumes the URL string has been restored to its unencoded form “In Stock” and uses it as input to encode again, the single encoding becomes

“Comment=In+Stock” → “Comment%3dIn%2bStock”

instead of

“Comment=In Stock” → “Comment=In+Stock”


 


 


Conclusion:


 


Since an application has no control over its upstream (user input or config), it can only assume that the URL query string is correctly encoded: special characters in query string parameter values are single encoded, and Space in particular can be “+” or “%20”. When the URL needs to be used as a return URL in a query string, it must be encoded again. Space will then be double encoded as “%2b” or “%2520”.


 


When the receiving application receives the encoded URL and uses a method like UnescapeDataString for decoding, the “%2b” is not decoded into Space; instead, it becomes “+” in the final result.


 


Developers should avoid encoding Space into “+”, or double encoding it into “%2b”. It is recommended to use System.Uri.EscapeDataString when encoding a URL and HttpUtility.UrlDecode when decoding one.
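The recommended pairing can be sketched as a round trip through two layers of encoding. This assumes a .NET 6 or later console project with System.Web.HttpUtility available; the sample value is made up for illustration.

```csharp
using System;
using System.Web;

// Hypothetical value containing both spaces and a literal plus sign.
string original = "In Stock + more";

// EscapeDataString never emits a bare "+": Space becomes %20 and "+" becomes %2B.
string once = Uri.EscapeDataString(original);
string twice = Uri.EscapeDataString(once);

// Decode with UrlDecode, once per encoding layer.
string roundTrip = HttpUtility.UrlDecode(HttpUtility.UrlDecode(twice));

Console.WriteLine(once);       // In%20Stock%20%2B%20more
Console.WriteLine(twice);      // In%2520Stock%2520%252B%2520more
Console.WriteLine(roundTrip);  // In Stock + more
```

Because the encoder side never produces “+” for a Space, the decoder only ever sees “+” where an upstream source used it, and UrlDecode handles that case as well.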


 


Testers should ensure that:

1. Reserved and excluded characters as defined by RFC 2396 are single encoded when used as values in the query string of a URL, as shown in the following table (URLs as links, config values, or test values).

2. If the URL is used as a return URL or as the value of another query string, the reserved and excluded characters are double encoded, as shown in the following table.


 
































Characters    Single Encoded    Double Encoded
&             %26               %2526
$             %24               %2524
+             %2b               %252b
Space         %20, +            %2520, %2b
%             %25               %2525
<             %3c               %253c


 


Two test URLs can be:


 


http://www.ms.com/default.aspx?Products=Windows%26Office&Price=%24200&Comment=In%20Stock&sign=%2b


or


http://www.ms.com/default.aspx?Products=Windows%26Office&Price=%24200&Comment=In+Stock&sign=%2b
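A tester could check that both URLs carry the same values with a sketch like the one below. It assumes a .NET 6 or later console project with System.Web.HttpUtility available; extracting the query by splitting on “?” is a simplification chosen for this example.

```csharp
using System;
using System.Collections.Specialized;
using System.Web;

string[] testUrls =
{
    "http://www.ms.com/default.aspx?Products=Windows%26Office&Price=%24200&Comment=In%20Stock&sign=%2b",
    "http://www.ms.com/default.aspx?Products=Windows%26Office&Price=%24200&Comment=In+Stock&sign=%2b",
};

foreach (string url in testUrls)
{
    // Take everything after "?" and let ParseQueryString split and decode it.
    string query = url.Substring(url.IndexOf('?') + 1);
    NameValueCollection values = HttpUtility.ParseQueryString(query);

    // Both URLs should print: Windows&Office | $200 | In Stock | +
    Console.WriteLine($"{values["Products"]} | {values["Price"]} | {values["Comment"]} | {values["sign"]}");
}
```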


 


Comments

  1. rob says:

    Just a doubt about double-encoded URLs. I think that search engines don’t index URLs encoded twice. All UTF-8 encoded URLs I have seen on the Internet have been encoded only once.

    Thank you

  2. Ph43n0m says:

    2 years later, but still a nice and useful description xD

    thnx for that

  3. JoeDirtDev says:

    Try using HttpUtility.UrlPathEncode vs. UrlEncode or HtmlEncode first.

  4. Johannes says:

    But why there is no Http.Utility.UrlPathDecoder?

  8. Andrew Arnott says:

    In portable class libraries where HttpUtility is not available, use WebUtility.UrlDecode. It's equivalent.

  9. double says:

    When the text contains '%', how should it be encoded?

    Sample:

    var title = "less than 2/a %25 .";

    var param = Uri.EscapeUriString(title);

    In MVC, this causes an error in my route.

  10. Sipke Schoorstra says:

    Very insightful. Learned something new and was able to solve my issue. Thanks!

  12. jozo bubnjar says:

    I forgot this in my previous comment. In web.config, the entry below is required (MVC):

    <security>

        <requestFiltering allowDoubleEscaping="true" />

    </security>

  14. Brandon Paddock says:

    This post is exactly wrong. It is nonsensical to "decode" or "unescape" a URI. What you escape are URI components/parameters, and that's what UnescapeDataString does correctly.

  15. Brandon Paddock says:

    Sorry, the link that brought me here led me to misunderstand your point. After reading more of the post I see you weren't suggesting (Un)EscapeUriString, which is how I initially read this. Sadly can't edit that comment here.