AntiXSS 4.0 Released

AntiXSS 4.0 has been released and is available from https://www.microsoft.com/downloads/en/details.aspx?FamilyID=f4cd231b-7e06-445b-bec7-343e5884e651. The new source will be published to CodePlex within the next few days.

Minimum Requirements

.NET Framework 3.5

Return Values

If you pass null as the value to an encoding function, the function now returns null. The previous behavior was to return String.Empty.

Medium Trust Support

The HTML Sanitization methods, GetSafeHtml() and GetSafeHtmlFragment(), have been moved to a separate assembly. This enables the AntiXssLibrary assembly to run in medium trust environments, a common user request. If you wish to use the HTML Sanitization library you must now include the HtmlSanitizationLibrary assembly. This assembly requires full trust and the ability to run unsafe code.

Adjustable safe-listing for HTML/XML Encoding

The safe list for HTML and XML encoding is now adjustable. The UnicodeCharacterEncoder.MarkAsSafe() method allows you to choose, from the Unicode Code Charts, which languages your web application normally accepts. Safe-listing a language code chart leaves the defined characters in their native form during encoding, which increases readability in the HTML/XML document and speeds up encoding. Certain dangerous characters are still encoded, even when their code chart is safe-listed.

The language code charts are defined in the Microsoft.Security.Application.LowerCodeCharts, Microsoft.Security.Application.LowerMidCodeCharts, Microsoft.Security.Application.MidCodeCharts, Microsoft.Security.Application.UpperMidCodeCharts and Microsoft.Security.Application.UpperCodeCharts enumerations.

We suggest you safe-list your acceptable languages during application initialization.
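
To illustrate the idea behind safe-listing, here is a minimal conceptual sketch in Python. It is not AntiXSS code: the names SAFE_RANGES, ALWAYS_ENCODE, mark_as_safe and html_encode, the starting range, and the set of always-encoded characters are all assumptions chosen for illustration.

```python
# Conceptual sketch of safe-list encoding; this is NOT the AntiXSS implementation.
# All names and the chosen character sets below are hypothetical.

SAFE_RANGES = [(0x20, 0x7E)]       # start with printable Basic Latin only
ALWAYS_ENCODE = set('<>&"\'')      # assumed "dangerous" characters


def mark_as_safe(first, last):
    """Add an inclusive code point range (for example one code chart) to the safe list."""
    SAFE_RANGES.append((first, last))


def html_encode(text):
    """Leave safe-listed characters as-is; emit everything else as a numeric entity."""
    out = []
    for ch in text:
        cp = ord(ch)
        is_safe = any(lo <= cp <= hi for lo, hi in SAFE_RANGES)
        if is_safe and ch not in ALWAYS_ENCODE:
            out.append(ch)             # safe-listed: kept in native form
        else:
            out.append("&#%d;" % cp)   # not safe-listed, or dangerous: encoded
    return "".join(out)
```

In this sketch, html_encode("αβ<") encodes all three characters until the Greek block is marked safe with mark_as_safe(0x0370, 0x03FF). After that the Greek letters pass through untouched while "<" is still encoded.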

Invalid Unicode character detection

If any of the HTML or XML encoding methods encounters a character with a character code of 0xFFFE or 0xFFFF, an InvalidUnicodeValueException is thrown. Both values are Unicode noncharacters; 0xFFFE is what the byte order mark at the beginning of a file reads as when the byte order is reversed.

Surrogate Character Support in HTML and XML encoding

Support for surrogate character pairs for Unicode characters outside the basic multilingual plane has been improved. Such character pairs are now combined and encoded as their &#xxxxx; value. If a high surrogate character is encountered which is not followed by a low surrogate character, or a low surrogate character is encountered which is not preceded by a high surrogate character, an InvalidSurrogatePairException is thrown.
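
The arithmetic for combining a surrogate pair can be sketched as follows. This is an illustrative Python example, not the library's code. It works on a list of UTF-16 code units, because Python strings do not expose surrogates the way .NET strings do. The names InvalidSurrogatePair and encode_utf16_units are hypothetical, and for simplicity the sketch encodes every unit as a decimal entity.

```python
class InvalidSurrogatePair(ValueError):
    """Hypothetical stand-in for the library's InvalidSurrogatePairException."""


def encode_utf16_units(units):
    """Encode UTF-16 code units as decimal numeric entities,
    combining each surrogate pair into a single code point."""
    out = []
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF:  # high surrogate: must be followed by a low one
            if i + 1 >= len(units) or not 0xDC00 <= units[i + 1] <= 0xDFFF:
                raise InvalidSurrogatePair("high surrogate without low surrogate")
            low = units[i + 1]
            code_point = 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00)
            i += 2
        elif 0xDC00 <= unit <= 0xDFFF:  # low surrogate with no preceding high one
            raise InvalidSurrogatePair("low surrogate without high surrogate")
        else:
            code_point = unit
            i += 1
        out.append("&#%d;" % code_point)
    return "".join(out)
```

For example, the pair 0xD834, 0xDD1E combines to the code point 0x1D11E, so encode_utf16_units([0xD834, 0xDD1E]) yields a single entity, &#119070;, rather than two.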

HTML 4.01 Named Entity Support

A new overload of the HtmlEncode method, Encoder.HtmlEncode(string input, bool useNamedEntities), allows you to specify whether the named entities from the HTML 4.01 specification should be used in preference to &#xxxx; encoding when a named entity exists. For example, if the useNamedEntities parameter is set to true, the copyright symbol (©) is encoded as &copy;.
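
The difference between the two encodings can be sketched in Python, whose standard library ships the HTML 4.01 entity table as html.entities.codepoint2name. This is a conceptual illustration only, not the AntiXSS implementation; the function name html_encode and its deliberately small safe list (ASCII letters, digits and space) are assumptions.

```python
from html.entities import codepoint2name  # maps code points to HTML 4.01 entity names


def html_encode(text, use_named_entities=False):
    """Encode everything outside a tiny safe list, preferring named
    entities over numeric ones when use_named_entities is True."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch.isascii() and (ch.isalnum() or ch == " "):
            out.append(ch)                              # assumed safe list
        elif use_named_entities and cp in codepoint2name:
            out.append("&%s;" % codepoint2name[cp])     # for example &copy;
        else:
            out.append("&#%d;" % cp)                    # for example &#169;
    return "".join(out)
```

With this sketch, html_encode("©") gives &#169; and html_encode("©", True) gives &copy;. Characters without a named entity fall back to the numeric form in both modes.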

HtmlFormUrlEncode

A new encoding type suitable for encoding HTML POST form submissions is now available via Encoder.HtmlFormUrlEncode. This encodes according to the W3C specification for the application/x-www-form-urlencoded MIME type.
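
For readers unfamiliar with the application/x-www-form-urlencoded format, the following Python sketch shows what the encoding looks like, using the standard library's urllib.parse. It illustrates the MIME type rather than AntiXSS itself, and the exact set of characters that Encoder.HtmlFormUrlEncode leaves unencoded may differ. The helper names form_encode_value and form_encode_fields are hypothetical.

```python
from urllib.parse import quote_plus, urlencode


def form_encode_value(value):
    """Encode one form value: spaces become '+', reserved characters become %XX."""
    return quote_plus(value)


def form_encode_fields(fields):
    """Encode a mapping of field names to values as a complete POST body."""
    return urlencode(fields)
```

Here form_encode_value("Tom & Jerry=100%") produces Tom+%26+Jerry%3D100%25, so an ampersand or equals sign inside a value cannot be mistaken for a field separator.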

LDAP Encoding changes

The LdapEncode function has been deprecated in favor of two new functions, Encoder.LdapFilterEncode(string) and Encoder.LdapDistinguishedNameEncode(string).

Encoder.LdapFilterEncode encodes input according to RFC 4515, where unsafe values are converted to \XX and XX is the hexadecimal representation of the unsafe character.
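
A minimal Python sketch of RFC 4515 style escaping follows. It is not the AntiXSS implementation: the function name ldap_filter_escape is hypothetical. RFC 4515 requires escaping NUL, the parentheses, the asterisk and the backslash; escaping control and non-ASCII bytes as well is a conservative assumption made here, and the library's own safe list may differ.

```python
# Bytes that RFC 4515 requires to be escaped in a filter value: NUL ( ) * \
FILTER_SPECIAL = {0x00, 0x28, 0x29, 0x2A, 0x5C}


def ldap_filter_escape(value):
    """Escape a value for use inside an LDAP search filter.
    Unsafe bytes of the UTF-8 encoding become \\XX (two hex digits)."""
    out = []
    for byte in value.encode("utf-8"):
        # Assumption: also escape control and non-ASCII bytes, to be conservative.
        if byte in FILTER_SPECIAL or byte < 0x20 or byte >= 0x7F:
            out.append("\\%02x" % byte)
        else:
            out.append(chr(byte))
    return "".join(out)
```

A filter would then be assembled as "(cn=%s)" % ldap_filter_escape(user_input). An input such as *)(uid=* becomes \2a\29\28uid=\2a and can no longer change the structure of the filter.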

Encoder.LdapDistinguishedNameEncode encodes input according to RFC 2253, where unsafe characters are converted to #XX and XX is the hexadecimal representation of the unsafe character, and the comma, plus, quote, backslash, less-than and greater-than signs are escaped using backslash notation (\X). In addition, a space or octothorpe (#) at the beginning of the input string is backslash-escaped, as is a space at the end of the string.

LdapDistinguishedNameEncode(string, bool, bool) is also provided so you may turn off the initial or final character escaping rules, for example if you are concatenating the escaped distinguished name fragment into the midst of a complete distinguished name.
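
The initial and final character rules can be sketched in Python as follows. This is an illustration of the RFC 2253 backslash rules only, not the library's code: the function name ldap_dn_escape and its parameter names are hypothetical, the semicolon is included because RFC 2253 lists it as special, and the #XX conversion of other unsafe characters is omitted.

```python
# Characters that RFC 2253 escapes with a backslash wherever they appear.
DN_SPECIAL = set(',+"\\<>;')


def ldap_dn_escape(value, escape_initial=True, escape_final=True):
    """Backslash-escape a distinguished name fragment.
    The two flags switch off the leading and trailing character rules."""
    out = []
    last = len(value) - 1
    for i, ch in enumerate(value):
        if ch in DN_SPECIAL:
            out.append("\\" + ch)
        elif i == 0 and escape_initial and ch in " #":
            out.append("\\" + ch)      # leading space or '#'
        elif i == last and escape_final and ch == " ":
            out.append("\\" + ch)      # trailing space
        else:
            out.append(ch)
    return "".join(out)
```

With both flags on, " Smith, John " becomes "\ Smith\, John\ ". With both flags off only the comma is escaped, which is what you want when the fragment will sit in the middle of a longer distinguished name.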

MarkOutput

The ability to mark output using an HtmlEncode overload and query string parameter has been removed.

Development of the Security Runtime Engine continues in parallel to my other work, but we’ve now separated the two libraries so you don’t have to wait for AntiXSS updates. The WPL will continue to be available as source only from CodePlex until we’re happy with the code model and quality. Once that happens you can expect to see a binary release, but there are no planned release dates as yet.