This post is part of a series about WCF extensibility points. For a list of all previous posts and planned future ones, go to the index page.
Now that we’re out of the metadata realm, we’re back to the components which are executed while the service is running. This post will focus on the message encoder, which is in theory part of the channel model of WCF, although it doesn’t sit as nicely in the channel stack as other “normal” channels do (see the diagram in the first post about channels). The encoder is the component responsible for converting between the Message object (used throughout the WCF stack) and the actual bytes which need to be transmitted over the wire. In the majority of scenarios, using one of the out-of-the-box encoders is enough (since they can handle XML and JSON, the most common formats), but there are cases where we need to define a new format, or tweak the output of one of the existing encoders, and there a custom encoder is a good alternative.
Implementing a message encoder is not as simple as implementing an interface or defining one subclass of a WCF class. Because encoders are (somewhat) part of the channel stack, in order to add a custom encoder to the WCF pipeline we need to implement three classes: a binding element, an encoder factory, and finally the encoder itself. The binding element class must derive from MessageEncodingBindingElement. Similarly to the channels, the binding element’s BuildChannelFactory<TChannel> on the client side (and BuildChannelListener<TChannel> on the server side) is invoked when the channel stack is being created. Unlike the “normal” channels, the message encoding binding element doesn’t create a factory / listener of its own at that point. Instead, the binding element is expected to add itself to the BindingParameters property of the BindingContext parameter of the method. Later the transport will query the binding parameters (if it decides to do so) and ask the encoding binding element to create the encoder factory.
The differences continue. While for a “normal” channel the binding element creates a channel listener at the server side and a channel factory at the client side, an encoding binding element only knows how to create a MessageEncoderFactory. The factory is then responsible for creating the MessageEncoder object (in two different ways; see the discussion of session-full encoders in the section about the message encoder factory class below). Finally, the MessageEncoder class implements the conversion methods between Message and bytes (again in two different ways; see the discussion of streaming in the section about the message encoder class below). The transport channel is the component which will invoke the encoder methods (although, as I already pointed out in the post about channels, the transport doesn’t really need to use the encoder from the binding context; it can completely ignore it if it so wishes, but all out-of-the-box WCF transports do use the encoders if they’re available).
But why would one implement a custom encoder anyway? There are actually quite a few scenarios I’ve seen in forums where a custom message encoder was the solution for the issue: enforcing a maximum sent message size (WCF has a quota for received messages, not for sent ones, and the user wanted to limit their bandwidth usage); supporting a larger set of XML than that supported by the default WCF XML readers / writers (which don’t support things such as DTDs, processing instructions and entity references); updating the shape of the outgoing XML to comply with third-party stacks which don’t like the way WCF produces output (for example, setting specific prefixes for some XML elements); making composite encoders (the MTOM encoder accepts both text and MTOM input, but always writes MTOM; we had an internal customer who wanted an encoder to understand both MTOM and text, but always write text); supporting a new format which suits some need (such as the GZip encoder sample). And the list goes on. Encoders are quite powerful components, and hopefully this post will give you a better understanding of them in case the need arises in the future.
Public implementations in WCF
There are a few public implementations of the abstract MessageEncodingBindingElement class in WCF, but none of the MessageEncoderFactory or the MessageEncoder (the public binding element classes return internal implementations of them).
- TextMessageEncodingBindingElement: the encoding which uses “traditional” XML (i.e., angle brackets) to encode a message. It’s the preferred encoding for interoperability (most web service stacks can talk XML), but it comes at the expense of performance: it’s the one where the messages are encoded with the largest number of bytes.
- BinaryMessageEncodingBindingElement: an encoding which encodes the message in a binary version of XML. The binary format protocol is publicly defined, so in theory it’s interoperable, but in practice I haven’t heard of any stack which implements this protocol, so it’s limited to WCF-WCF communication only. On the other hand, this is the encoding which gives the best performance regarding message size (besides the reduced XML payload, it also has a dictionary feature which reduces the size of well-known string values).
- MtomMessageEncodingBindingElement: an encoding at the mid-point between the text and binary encodings, for messages with large binary (xs:base64Binary) data. It encodes the messages using MTOM (a really interoperable specification, defined by W3C, and implemented by many service stacks), and it optimizes binary data, which would normally be encoded in base64 (which increases the payload size by roughly 33%), by sending it as MIME parts which don’t have the base64 size penalty.
- WebMessageEncodingBindingElement (new in 3.5): a composite encoder which is capable of converting between message objects and three types of encodings: plain old XML (POX), JSON and a raw mode (which maps to a certain XML schema, described in the example in the post about message inspectors). It’s used mostly in “web” scenarios, such as serving AJAX calls and REST services.
Another encoder which is worth noting is the “Custom Text Encoder” sample, which overcomes some of the limitations of the TextMessageEncodingBindingElement – namely, support for only three character encodings (UTF-8, UTF-16LE and UTF-16BE) and limited support for the full XML vocabulary (as I mentioned before).
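As a quick illustration of how the public binding elements above are consumed, here is a minimal sketch which pairs an encoding binding element with the HTTP transport in a CustomBinding. The `EncodingBindings` class and its `Create` method are hypothetical names used only for this example.

```csharp
using System.ServiceModel.Channels;

public static class EncodingBindings
{
    // Builds a two-element binding: the given encoding plus the HTTP transport.
    public static CustomBinding Create(MessageEncodingBindingElement encoding)
    {
        // The encoding element is listed before the transport element,
        // which must be the last element in the binding.
        return new CustomBinding(encoding, new HttpTransportBindingElement());
    }
}
```

For example, `EncodingBindings.Create(new BinaryMessageEncodingBindingElement())` would give a binary-over-HTTP binding, and swapping in a TextMessageEncodingBindingElement or MtomMessageEncodingBindingElement changes only the wire format.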
The encoding binding element class is large, but it’s usually a lot of boilerplate code. There is the abstract MessageVersion property, which identifies to the runtime which message versions it will accept (the WCF runtime will only give to the message encoders messages which match the version); there’s the abstract CreateMessageEncoderFactory, which returns the appropriate factory capable of creating the message encoder objects. And it can also override GetProperty<T> if it wants to return any dynamic data during the runtime, but this isn’t too common. Finally, BuildChannelFactory<TChannel> (on the client side) and BuildChannelListener<TChannel> (on the server side) need to be overridden to add the encoder to the binding context (see an example in the next section, how to add a message encoder).
The message encoder factory is usually fairly simple: once more return the message version (somehow the check needs to be done throughout the stack), and return the encoder via its Encoder property. The encoder returned by this property should be thread-safe, since there’s no guarantee that the transport won’t reuse the encoder implementation for multiple connections. Also, it’s possible that the property will be called multiple times, so the encoder implementation can be cached and a single instance returned by that property (that’s what the WCF encoders do).
There’s another type of encoder, however, which is guaranteed not to be shared by multiple connections in the transport. The session encoder, returned by the CreateSessionEncoder method, is used only in transports which can send / receive multiple messages in a single connection. In WCF, those are the TCP and named pipes transports (also called session-full transports), but not HTTP or UDP (new in 4.5), since those protocols are stateless (thus session-less) by definition.
On a session-full transport, the encoder can take advantage of this fact to optimize when multiple messages are sent over the same connection. The binary encoder does exactly that – it maintains a dynamic dictionary with strings which have been sent on the same connection, so that in subsequent messages those strings can be encoded very efficiently (usually in only 1-3 bytes regardless of the string size). I’ve posted before about the dynamic dictionary in this blog, so I won’t go into many details here. If the encoder doesn’t need to take advantage of the session-full feature of the transport, it can simply skip overriding the CreateSessionEncoder method, and the implementation of the base class (to simply return the value of the Encoder property) will be inherited.
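The difference between the shared encoder and the session encoder can be sketched with a small factory which wraps another factory. This is only an illustration under my own assumptions: `CachingEncoderFactory` is a hypothetical name, and the inner factory is expected to come from one of the built-in binding elements.

```csharp
using System.ServiceModel.Channels;

public class CachingEncoderFactory : MessageEncoderFactory
{
    private readonly MessageEncoderFactory inner;
    private readonly MessageEncoder sharedEncoder;

    public CachingEncoderFactory(MessageEncoderFactory inner)
    {
        this.inner = inner;
        // Resolved once and cached: this instance may be used by many
        // connections at the same time, so it must be thread-safe.
        this.sharedEncoder = inner.Encoder;
    }

    public override MessageEncoder Encoder
    {
        get { return this.sharedEncoder; }
    }

    public override MessageVersion MessageVersion
    {
        get { return this.inner.MessageVersion; }
    }

    // Called by session-full transports (TCP, named pipes) for each connection.
    // Leaving this override out would inherit the base behavior of returning Encoder.
    public override MessageEncoder CreateSessionEncoder()
    {
        return this.inner.CreateSessionEncoder();
    }
}
```

Wrapping the factory from a BinaryMessageEncodingBindingElement is the interesting case, since that is the built-in encoder which keeps per-connection state.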
Finally, we need to implement the encoder itself. First, there are some properties which need to be overridden: MediaType (used to identify if the encoder supports incoming requests based on the Content-Type); ContentType (used, along with MediaType, to identify whether the encoder supports the incoming request, and also to determine the Content-Type of outgoing messages written by this encoder) and MessageVersion (again, message version validation happens in all layers). Still on the determination of whether the encoder supports incoming requests, overriding IsContentTypeSupported is often necessary, especially on composite encoders (encoders which delegate requests to multiple inner encoders).
Now comes the actual encoding and decoding of the messages. The overloads of WriteMessage are used to convert the Message object into the bytes to be sent over the wire, while the overloads of ReadMessage are used in the opposite direction. Which overload is used depends on whether the encoder is being used in buffered or streamed mode, and this is determined by the transport channel.
When a transport is operating in buffered mode, for incoming messages it first receives the whole message and stores it in memory, then it gives it to the encoder (using the overload of ReadMessage which takes an ArraySegment<byte> which contains the whole message). This is useful in the sense that the XML reader can operate over an array of bytes and perform better than one operating over a stream, but it limits the size of messages which can be processed (buffering very large messages would quickly exhaust the resources of the component). One thing which should be noted is that the encoder is responsible for releasing the buffer passed to it (by calling ReturnBuffer on the buffer manager object passed to ReadMessage).
For outgoing messages in buffered mode, the overload which takes a BufferManager parameter and returns an ArraySegment<byte> is invoked. It’s the encoder’s responsibility to return a buffer taken from that buffer manager (using the BufferManager.TakeBuffer method) which will later be released by the transport.
When the transport is operating in streamed mode, it doesn’t read the whole message before passing it to the application. Instead, the encoder is responsible for reading only the message headers, and returning a Message object which holds a cursor to the transport body, and that can later be consumed when the message reaches the application layer. There are no buffer managers in this scenario.
Finally, for outgoing messages in streamed mode, WriteMessage(Message, Stream) is called and the encoder will pump from the message and write it to the transport stream as the message is being consumed. Notice that this is a synchronous call which blocks the calling thread until the message is completed, so in the .NET Framework 4.5 WCF is adding an asynchronous version of this method (BeginWriteMessage and EndWriteMessage) to enable better performance in high-throughput scenarios.
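To make the buffered and streamed overloads more concrete, the sketch below plays the role of the transport and drives an existing encoder in both modes. `EncoderModesDemo` and its two methods are hypothetical names, the buffer sizes are arbitrary, and the encoder passed in is assumed to be one of the built-in ones (for example, from `new TextMessageEncodingBindingElement().CreateMessageEncoderFactory().Encoder`).

```csharp
using System;
using System.IO;
using System.ServiceModel.Channels;

public static class EncoderModesDemo
{
    // Buffered mode: whole messages travel as byte arrays owned by a BufferManager.
    public static string RoundTripBuffered(MessageEncoder encoder)
    {
        BufferManager bufferManager = BufferManager.CreateBufferManager(1024 * 1024, 64 * 1024);
        Message outgoing = Message.CreateMessage(encoder.MessageVersion, "urn:demo", "hello");

        // The encoder returns a segment of a buffer it took from the buffer manager.
        ArraySegment<byte> encoded = encoder.WriteMessage(outgoing, int.MaxValue, bufferManager);

        // Handing the segment to ReadMessage makes the encoder (or the message
        // it creates) responsible for returning the buffer to the manager.
        using (Message incoming = encoder.ReadMessage(encoded, bufferManager))
        {
            return incoming.GetBody<string>();
        }
    }

    // Streamed mode: no buffer manager, the encoder works directly on a stream.
    public static string RoundTripStreamed(MessageEncoder encoder)
    {
        Message outgoing = Message.CreateMessage(encoder.MessageVersion, "urn:demo", "hello");
        using (MemoryStream stream = new MemoryStream())
        {
            // Synchronous: returns only after the whole message has been written.
            encoder.WriteMessage(outgoing, stream);
            stream.Position = 0;

            // Only the headers are limited in size; the body is read on demand.
            using (Message incoming = encoder.ReadMessage(stream, 64 * 1024))
            {
                return incoming.GetBody<string>();
            }
        }
    }
}
```

A real transport would of course put the network between the write and the read; the in-memory round trip is only there to show which overload belongs to which mode.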
How to add a message encoder
As I mentioned before, adding a message encoder to the WCF pipeline is done in a way similar to protocol channels, but instead of adding itself directly in the pipeline the encoder simply adds itself to the binding context, so that some other channel (in this case, the transport) will pick it up and use it when necessary. The binding element needs to derive from MessageEncodingBindingElement. To add an encoder to the server side only, this needs to be done at BuildChannelListener<TChannel>, and to add an encoder at the client side only, this needs to be done at BuildChannelFactory<TChannel> (to add the encoder on both sides, both methods need to be overridden). The binding element must also override CreateMessageEncoderFactory (otherwise the code wouldn’t compile), and there return an instance of a class derived from MessageEncoderFactory. Finally, the factory must at least return in its Encoder property an instance of a class derived from MessageEncoder. The code below shows an encoder which can be used on both server and client.
Notice that the encoder shown above supports a single message version (SOAP 1.1, without addressing), but there are many encoders which support multiple message versions, such as the TextMessageEncodingBindingElement and the MtomMessageEncodingBindingElement (the binary encoder only supports SOAP 1.2 with WS-Addressing version 1.0, and the web encoder doesn’t support SOAP, i.e., it requires MessageVersion.None).
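A minimal sketch of such an encoder follows. The class names (`MyEncodingBindingElement`, `MyEncoderFactory`, `MyEncoder`) are placeholders, the choice of plain UTF-8 XML over SOAP 1.1 is an assumption made for the example, and quota checks (such as maxMessageSize) are left out to keep it short.

```csharp
using System;
using System.IO;
using System.ServiceModel.Channels;
using System.Text;
using System.Xml;

public class MyEncodingBindingElement : MessageEncodingBindingElement
{
    public override MessageVersion MessageVersion
    {
        get { return MessageVersion.Soap11; }
        set
        {
            if (!MessageVersion.Soap11.Equals(value))
                throw new ArgumentException("Only SOAP 1.1 without addressing is supported.");
        }
    }

    public override MessageEncoderFactory CreateMessageEncoderFactory()
    {
        return new MyEncoderFactory();
    }

    public override BindingElement Clone()
    {
        return new MyEncodingBindingElement();
    }

    // Client side: register with the context, then let the rest of the stack build.
    public override IChannelFactory<TChannel> BuildChannelFactory<TChannel>(BindingContext context)
    {
        context.BindingParameters.Add(this);
        return context.BuildInnerChannelFactory<TChannel>();
    }

    // Server side: same idea, the transport picks the element up later.
    public override IChannelListener<TChannel> BuildChannelListener<TChannel>(BindingContext context)
    {
        context.BindingParameters.Add(this);
        return context.BuildInnerChannelListener<TChannel>();
    }
}

public class MyEncoderFactory : MessageEncoderFactory
{
    private readonly MessageEncoder encoder = new MyEncoder(); // cached, stateless
    public override MessageEncoder Encoder { get { return this.encoder; } }
    public override MessageVersion MessageVersion { get { return MessageVersion.Soap11; } }
}

public class MyEncoder : MessageEncoder
{
    public override string ContentType { get { return "text/xml; charset=utf-8"; } }
    public override string MediaType { get { return "text/xml"; } }
    public override MessageVersion MessageVersion { get { return MessageVersion.Soap11; } }

    public override Message ReadMessage(ArraySegment<byte> buffer, BufferManager bufferManager, string contentType)
    {
        // Copy the payload so the transport buffer can be returned right away.
        byte[] copy = new byte[buffer.Count];
        Buffer.BlockCopy(buffer.Array, buffer.Offset, copy, 0, copy.Length);
        bufferManager.ReturnBuffer(buffer.Array);
        return ReadMessage(new MemoryStream(copy), int.MaxValue, contentType);
    }

    public override Message ReadMessage(Stream stream, int maxSizeOfHeaders, string contentType)
    {
        XmlReader reader = XmlReader.Create(stream);
        return Message.CreateMessage(reader, maxSizeOfHeaders, this.MessageVersion);
    }

    public override ArraySegment<byte> WriteMessage(Message message, int maxMessageSize, BufferManager bufferManager, int messageOffset)
    {
        MemoryStream stream = new MemoryStream();
        WriteMessage(message, stream);
        int length = (int)stream.Length;

        // The result must live in a buffer taken from the manager; the
        // transport releases it after sending.
        byte[] result = bufferManager.TakeBuffer(messageOffset + length);
        Buffer.BlockCopy(stream.GetBuffer(), 0, result, messageOffset, length);
        return new ArraySegment<byte>(result, messageOffset, length);
    }

    public override void WriteMessage(Message message, Stream stream)
    {
        XmlWriterSettings settings = new XmlWriterSettings { Encoding = new UTF8Encoding(false) };
        using (XmlWriter writer = XmlWriter.Create(stream, settings))
        {
            message.WriteMessage(writer);
        }
    }
}
```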
Real world scenario – a composite encoder
There were quite a few good examples which I’ve found in the past, but a composite encoder is a good one because it’s fairly simple and because I’ve seen it on quite a few occasions. This scenario happened for an internal team in Microsoft, and they needed a server endpoint which could receive multiple formats, but always respond with “normal”, text-based XML. A custom encoder is a fairly simple way of implementing it (at least as simple as a channel-level extensibility can be in WCF).
To start off, a simple contract we’ll use in this example. Since MTOM is one of the accepted encodings, let’s use a data contract with a byte array member, so it can be optimized by MTOM for large messages.
And on to the encoding binding element. Like most composite encoders, it stores references to the nested binding elements. And since we have message version validation throughout the whole stack, we need to validate that we don’t get two encoders with different versions.
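A contract along those lines could look like the sketch below. The type and member names (`MyDC`, `ITest`, `Service`, `Echo`) are placeholders chosen for this example.

```csharp
using System.Runtime.Serialization;
using System.ServiceModel;

[DataContract]
public class MyDC
{
    [DataMember]
    public string Name { get; set; }

    // byte[] maps to xs:base64Binary, which is what MTOM can optimize.
    [DataMember]
    public byte[] Contents { get; set; }
}

[ServiceContract]
public interface ITest
{
    [OperationContract]
    MyDC Echo(MyDC input);
}

public class Service : ITest
{
    // Returns the input unchanged so the response carries the same binary payload.
    public MyDC Echo(MyDC input)
    {
        return input;
    }
}
```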
And before going further, the usual disclaimer: this is a sample for illustrating the topic of this post, this is not production-ready code. I tested it for a few contracts and it worked, but I cannot guarantee that it will work for all scenarios (please let me know if you find a bug or something missing). The error checking is kept to a minimum to make the code focus on the topic of this post.
Now for the rest of the binding element class. Since we’re only using this encoder at the server, we don’t need to override BuildChannelFactory<TChannel>, only BuildChannelListener<TChannel>. And the abstract class also requires us to provide a Clone method, so we have it as well. And, in a usual pattern for composite encoders, we pass to the factory constructor a factory for each of the encoder types which are being wrapped.
The factory class is fairly simple. Since this encoder doesn’t deal with sessions, we don’t need to override the CreateSessionEncoder method. We’re also returning a new instance of the encoder class every time the Encoder property is called, but it could also be cached in a field for a simple performance improvement.
Now to the encoder. Both the content and media type properties are coming from the text encoder (since we want it to encode outgoing messages using the text encoder). The message version property could come from either encoder, since we enforced at the binding element level that they should match. And for the IsContentTypeSupported method, a typical implementation would be to return true if either inner encoder supports the given content type. However, the MTOM message encoder can correctly decode messages encoded using the text XML, so it covers all the cases by itself.
The implementations of WriteMessage and ReadMessage are trivial in this scenario: since the MTOM encoder can correctly parse both text and MTOM requests, we simply delegate reading the message to it. And since we always want to write the message using the text encoder, we delegate writing the message to that encoder. There are other composite scenarios, where for example we want to respond using the same content-type as the request, and those require more work. Since the ContentType property doesn’t vary based on the outgoing message, we need to use another component (such as a message inspector) to retrieve the content type of incoming messages (on AfterReceiveRequest) and return it as the correlation state. Then, on BeforeSendReply, the inspector takes the content-type passed as the correlation state and stores it in the message properties. And finally, on the encoder we’d look at the property of the message being encoded, and then decide which nested encoder to use to encode that message.
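Putting the binding element, factory and encoder described above together, a sketch of the composite encoder could look like this. The class names are hypothetical, error checking is minimal, and the code follows the description in this section (read with MTOM, write with text) rather than any particular shipped implementation.

```csharp
using System;
using System.IO;
using System.ServiceModel.Channels;

public class ReadMtomWriteTextBindingElement : MessageEncodingBindingElement
{
    private readonly MtomMessageEncodingBindingElement mtom;
    private readonly TextMessageEncodingBindingElement text;

    public ReadMtomWriteTextBindingElement(MtomMessageEncodingBindingElement mtom, TextMessageEncodingBindingElement text)
    {
        // Version validation happens throughout the stack, so the two must agree.
        if (!mtom.MessageVersion.Equals(text.MessageVersion))
            throw new ArgumentException("Both encodings must use the same message version.");
        this.mtom = mtom;
        this.text = text;
    }

    public override MessageVersion MessageVersion
    {
        get { return this.text.MessageVersion; }
        set { this.text.MessageVersion = value; this.mtom.MessageVersion = value; }
    }

    public override MessageEncoderFactory CreateMessageEncoderFactory()
    {
        return new ReadMtomWriteTextEncoderFactory(
            this.mtom.CreateMessageEncoderFactory(), this.text.CreateMessageEncoderFactory());
    }

    public override BindingElement Clone()
    {
        return new ReadMtomWriteTextBindingElement(
            (MtomMessageEncodingBindingElement)this.mtom.Clone(),
            (TextMessageEncodingBindingElement)this.text.Clone());
    }

    // Server side only, so BuildChannelFactory is not overridden.
    public override IChannelListener<TChannel> BuildChannelListener<TChannel>(BindingContext context)
    {
        context.BindingParameters.Add(this);
        return context.BuildInnerChannelListener<TChannel>();
    }
}

public class ReadMtomWriteTextEncoderFactory : MessageEncoderFactory
{
    private readonly MessageEncoderFactory mtom;
    private readonly MessageEncoderFactory text;

    public ReadMtomWriteTextEncoderFactory(MessageEncoderFactory mtom, MessageEncoderFactory text)
    {
        this.mtom = mtom;
        this.text = text;
    }

    // A new wrapper per call; it could be cached in a field instead.
    public override MessageEncoder Encoder
    {
        get { return new ReadMtomWriteTextEncoder(this.mtom.Encoder, this.text.Encoder); }
    }

    public override MessageVersion MessageVersion { get { return this.text.MessageVersion; } }
}

public class ReadMtomWriteTextEncoder : MessageEncoder
{
    private readonly MessageEncoder mtom;
    private readonly MessageEncoder text;

    public ReadMtomWriteTextEncoder(MessageEncoder mtom, MessageEncoder text)
    {
        this.mtom = mtom;
        this.text = text;
    }

    // Outgoing messages are always text, so advertise the text encoder's types.
    public override string ContentType { get { return this.text.ContentType; } }
    public override string MediaType { get { return this.text.MediaType; } }
    public override MessageVersion MessageVersion { get { return this.text.MessageVersion; } }

    // The MTOM encoder also accepts text XML, so it decides for both formats.
    public override bool IsContentTypeSupported(string contentType)
    {
        return this.mtom.IsContentTypeSupported(contentType);
    }

    public override Message ReadMessage(ArraySegment<byte> buffer, BufferManager bufferManager, string contentType)
    {
        return this.mtom.ReadMessage(buffer, bufferManager, contentType);
    }

    public override Message ReadMessage(Stream stream, int maxSizeOfHeaders, string contentType)
    {
        return this.mtom.ReadMessage(stream, maxSizeOfHeaders, contentType);
    }

    public override ArraySegment<byte> WriteMessage(Message message, int maxMessageSize, BufferManager bufferManager, int messageOffset)
    {
        return this.text.WriteMessage(message, maxMessageSize, bufferManager, messageOffset);
    }

    public override void WriteMessage(Message message, Stream stream)
    {
        this.text.WriteMessage(message, stream);
    }
}
```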
Now for testing the encoder. We’re setting up a simple service, with the composite encoder, and sending two requests to it, one using the MTOM encoding, and one using the text encoding. We’re printing the content-type of the response, and it shows that the server can accept both encodings, but is always responding with the one we want.
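The client half of such a test can be sketched as a small helper which calls the service through a given encoding and reports the Content-Type header of the HTTP response. Everything here is an assumption made for illustration: the `IEcho` contract, the `ContentTypeProbe` class, and the idea that a service with a matching contract is already listening at the address passed in.

```csharp
using System.Net;
using System.ServiceModel;
using System.ServiceModel.Channels;

[ServiceContract]
public interface IEcho
{
    [OperationContract]
    byte[] Echo(byte[] input);
}

public static class ContentTypeProbe
{
    // Builds a client binding which uses the given encoding over HTTP.
    public static Binding CreateBinding(MessageEncodingBindingElement encoding)
    {
        return new CustomBinding(encoding, new HttpTransportBindingElement());
    }

    // Sends one request and returns the Content-Type of the HTTP response.
    public static string GetResponseContentType(Binding binding, string address)
    {
        ChannelFactory<IEcho> factory = new ChannelFactory<IEcho>(binding, new EndpointAddress(address));
        IEcho proxy = factory.CreateChannel();
        try
        {
            // The scope makes the response properties visible via OperationContext.Current.
            using (new OperationContextScope((IContextChannel)proxy))
            {
                proxy.Echo(new byte[] { 1, 2, 3 });
                HttpResponseMessageProperty response = (HttpResponseMessageProperty)
                    OperationContext.Current.IncomingMessageProperties[HttpResponseMessageProperty.Name];
                return response.Headers[HttpResponseHeader.ContentType];
            }
        }
        finally
        {
            // Abort rather than Close: a sketch-level cleanup which never throws.
            ((IClientChannel)proxy).Abort();
            factory.Abort();
        }
    }
}
```

Calling `GetResponseContentType` once with a binding created from an MtomMessageEncodingBindingElement and once with one created from a TextMessageEncodingBindingElement gives the two values to compare; the message versions of the client encodings need to match the one configured on the server.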
And that’s it about this sample.