WWSAPI to WCF interop 8: invalid XML characters (part 2)

In part 1 of this topic, I explained that WWSAPI's XML reader and writer reject some Unicode characters because the XML specification does not consider them legal. The XML reader and the XML writer each have a property that allows such characters. Unfortunately, setting those properties does not work in all cases, and in this post I'll explain why.

 

Before I do that, I have to introduce the tool WsUtil.exe. It is a managed tool similar to SvcUtil.exe that generates C header files and proxy/stub files from WSDL/XSD files. It's a silver bullet when it comes to WWSAPI to WCF interop, and I will dedicate a later post to it. What's interesting here is how WsUtil handles strings. The tool can generate two string types: WCHAR* and WS_STRING. WCHAR* is the well-known null-terminated string and is the default. WS_STRING is the counted string defined in WWSAPI and is generated when you pass the command line option /string:WS_STRING. For example, if the WCF operation contract is:

        [OperationContract]
        string Echo(string inString);

 

The corresponding proxy prototype code generated by WsUtil will look like:

HRESULT WINAPI BasicHttpBinding_IHelloWorld_Echo(
    __in WS_SERVICE_PROXY* _serviceProxy,
    __in_opt __nullterminated WCHAR* inString,
    __out_opt __deref __nullterminated WCHAR** RetStringResult,
    __in WS_HEAP* _heap,
    __in_ecount_opt(_callPropertyCount) const WS_CALL_PROPERTY* _callProperties,
    __in const ULONG _callPropertyCount,
    __in_opt const WS_ASYNC_CONTEXT* _asyncContext,
    __in_opt WS_ERROR* _error);

 

With the option /string:WS_STRING, the generated prototype will look like:

HRESULT WINAPI BasicHttpBinding_IHelloWorld_Echo(
    __in WS_SERVICE_PROXY* _serviceProxy,
    __in WS_STRING inString,
    __out WS_STRING* RetStringResult,
    __in WS_HEAP* _heap,
    __in_ecount_opt(_callPropertyCount) const WS_CALL_PROPERTY* _callProperties,
    __in const ULONG _callPropertyCount,
    __in_opt const WS_ASYNC_CONTEXT* _asyncContext,
    __in_opt WS_ERROR* _error);

 

By now you have probably guessed what I am getting at: when a string is represented as WCHAR*, it cannot contain an embedded zero character (U+0000), even when the XML reader allows it. So if the WWSAPI client sets the XML reader/writer property to accept invalid character references and uses WCHAR* as the string type, will the string be chopped off when the server returns one with an embedded zero? The answer is no. When deserializing a string from XML into a WCHAR* type, the WWSAPI deserialization code checks for an embedded zero and fails the call if it finds one, rather than silently truncating the string. This is not an issue with WS_STRING, since that string is counted and is not required to be null-terminated.
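The difference between the two representations described above can be sketched in portable C. This is only an illustration and not WWSAPI code: counted_string is a hypothetical stand-in for WS_STRING (a length plus a character pointer), and has_embedded_zero and to_nul_terminated are made-up helper names. The conversion helper mimics the "fail instead of truncate" behavior described above; the real check lives inside the WWSAPI deserializer and may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-in for WS_STRING: a character count plus a pointer.
   No terminator is required, so zero characters may appear anywhere. */
typedef struct {
    unsigned long length;  /* number of characters, embedded zeros included */
    wchar_t *chars;
} counted_string;

/* Returns 1 if the counted string contains a zero character, 0 otherwise. */
static int has_embedded_zero(counted_string s)
{
    assert(s.chars != NULL || s.length == 0);
    for (unsigned long i = 0; i < s.length; i++) {
        if (s.chars[i] == L'\0') {
            return 1;
        }
    }
    return 0;
}

/* Hypothetical conversion to a NUL-terminated buffer of out_count characters.
   Fails (returns -1) instead of truncating when the input has an embedded
   zero or does not fit; returns 0 on success. */
static int to_nul_terminated(counted_string s, wchar_t *out, size_t out_count)
{
    if (has_embedded_zero(s) || s.length >= out_count) {
        return -1;
    }
    if (s.length > 0) {
        wmemcpy(out, s.chars, s.length);
    }
    out[s.length] = L'\0';
    return 0;
}
```

With the five characters 'a', 'b', 0, 'c', 'd', the counted form keeps all five, while any NUL-terminated view of the same buffer ends after 'b'. That is why failing the conversion is safer than returning a silently shortened string.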

 

To summarize, if you receive strings off the wire and expect them to contain arbitrary characters, you need to do two things: set the XML reader/writer property to allow invalid character references, and generate your code with WsUtil.exe using the WS_STRING type instead of WCHAR*.