Compressing messages in WCF part two - Expanding the GZipMessageEncoder and fixing another bug

The GZipMessageEncoder is a great sample for learning about MessageEncoders in general. In this post, I will expand the GZipMessageEncoder to do both GZip and Deflate compression. This will accomplish two things: (1) explain the various pieces of a MessageEncoder and (2) lay the groundwork for future posts.

Since the MessageEncoder will now support both GZip and Deflate, it would be a good idea to change the name. To do this, I ran a replace across all files, changing "GZipMessageEncod" to "MyCompressionMessageEncod". The truncated search string matches both "Encoder" and "Encoding", so one pass renames every class.

I did the same with "GZipEncoder", replacing it with "CompressionEncoder", to change the namespace. I also changed the assembly name in the solution explorer and in the project properties. Then I changed the filenames in the project. You can try to repeat all these steps or just download the finished code.

The first thing that is needed is a switch to flip between GZip and Deflate. Since my plan is to enable more than just GZip and Deflate in later posts, this should be an enum. 

namespace Microsoft.Samples.CompressionEncoder
{
    public enum CompressionAlgorithm
    {
        GZip,
        Deflate,
    }
}

In order to enable this switch there are a lot of places that have to change. The first thing I'll change is the binding element. This is what is used by WCF to configure the binding.

//This is the binding element that, when plugged into a custom binding, will enable the compression encoder
public sealed class MyCompressionMessageEncodingBindingElement 
                    : MessageEncodingBindingElement //BindingElement
                    , IPolicyExportExtension
{

    //We will use an inner binding element to store information required for the inner encoder
    MessageEncodingBindingElement innerBindingElement;

    CompressionAlgorithm compressionAlgorithm;

    //By default, use the default text encoder as the inner encoder
    public MyCompressionMessageEncodingBindingElement()
        : this(new TextMessageEncodingBindingElement(), CompressionAlgorithm.GZip) { }

    public MyCompressionMessageEncodingBindingElement(
        MessageEncodingBindingElement messageEncoderBindingElement, 
        CompressionAlgorithm compressionAlgorithm)
    {
        this.innerBindingElement = messageEncoderBindingElement;
        this.compressionAlgorithm = compressionAlgorithm;
    }

    public MessageEncodingBindingElement InnerMessageEncodingBindingElement
    {
        get { return innerBindingElement; }
        set { innerBindingElement = value; }
    }

    public CompressionAlgorithm CompressionAlgorithm
    {
        get { return this.compressionAlgorithm; }
        set { this.compressionAlgorithm = value; }
    }

    //Main entry point into the encoder binding element. Called by WCF to get the factory that will create the
    //message encoder
    public override MessageEncoderFactory CreateMessageEncoderFactory()
    {
        return new MyCompressionMessageEncoderFactory(
            innerBindingElement.CreateMessageEncoderFactory(), 
            this.compressionAlgorithm);
    }
   
    public override MessageVersion MessageVersion
    {
        get { return innerBindingElement.MessageVersion; }
        set { innerBindingElement.MessageVersion = value; }
    }

    public override BindingElement Clone()
    {
        return new MyCompressionMessageEncodingBindingElement(this.innerBindingElement, 
            this.compressionAlgorithm);
    }

    ...
}

The CreateMessageEncoderFactory method now passes the compression setting on to the factory. As the name suggests, MyCompressionMessageEncoderFactory follows the typical factory pattern: it is used to create all the MyCompressionMessageEncoder objects.

If you look at the MessageEncoderFactory base class in Reflector, you'll see something like this:

public abstract class MessageEncoderFactory
{
    protected MessageEncoderFactory()
    {
    }

    public abstract MessageEncoder Encoder
    {
        get;
    }

    public abstract MessageVersion MessageVersion
    {
        get;
    }

    public virtual MessageEncoder CreateSessionEncoder()
    {
        return Encoder;
    }
}

By default, CreateSessionEncoder returns the value of the Encoder property. A binding that doesn't use sessions, such as the HTTP binding, uses the Encoder property to get the message encoder. Bindings that use sessions, like TCP or named pipes, call CreateSessionEncoder instead, because an implementation may require a MessageEncoder created specifically for a session. This is where we hit the second bug in the GZipMessageEncoder: its factory never overrides CreateSessionEncoder, so a session always gets the shared encoder and the inner factory's CreateSessionEncoder is never called. The sample uses the HTTP binding, so the problem never shows up there. The TCP binding will still work in some cases, but there is supposed to be session information contained in the messages. To avoid any really weird errors in the future, it is best to implement CreateSessionEncoder.

namespace Microsoft.Samples.CompressionEncoder
{
    //This class is used to create the custom encoder (MyCompressionMessageEncoder)
    internal class MyCompressionMessageEncoderFactory : MessageEncoderFactory
    {
        MessageEncoderFactory innerFactory;
        MessageEncoder encoder;
        CompressionAlgorithm compressionAlgorithm;

        //The compression encoder wraps an inner encoder
        //We require a factory to be passed in that will create this inner encoder
        public MyCompressionMessageEncoderFactory(MessageEncoderFactory messageEncoderFactory, 
            CompressionAlgorithm compressionAlgorithm)
        {
            if (messageEncoderFactory == null)
                throw new ArgumentNullException("messageEncoderFactory", "A valid message encoder factory must be passed to the CompressionEncoder");
            encoder = new MyCompressionMessageEncoder(messageEncoderFactory.Encoder, 
                compressionAlgorithm);
            this.compressionAlgorithm = compressionAlgorithm;
            this.innerFactory = messageEncoderFactory;
        }

        //The service framework uses this property to obtain an encoder from this encoder factory
        public override MessageEncoder Encoder
        {
            get { return encoder; }
        }

        public override MessageVersion MessageVersion
        {
            get { return encoder.MessageVersion; }
        }

        public override MessageEncoder CreateSessionEncoder()
        {
            return new MyCompressionMessageEncoder(this.innerFactory.CreateSessionEncoder(), 
                this.compressionAlgorithm);
        }

        //This is the actual compression encoder
        class MyCompressionMessageEncoder : MessageEncoder { ... }
    }
}
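
To see in isolation why the override matters, here is a minimal sketch that uses stand-in types rather than the real System.ServiceModel.Channels classes. Every type name in it (EncoderStandIn, FactoryStandIn, InnerFactoryStandIn, SharedOnlyWrapperFactory, SessionAwareWrapperFactory) is made up for illustration, and it assumes an inner factory that hands out a distinct encoder per session.

```csharp
using System;

// Stand-in types for illustration only; these are NOT the WCF classes.
public class EncoderStandIn
{
    public readonly EncoderStandIn Inner;   // null for an innermost encoder
    public EncoderStandIn(EncoderStandIn inner) { Inner = inner; }
}

public abstract class FactoryStandIn
{
    public abstract EncoderStandIn Encoder { get; }

    // Same default as the MessageEncoderFactory shown earlier: return the shared encoder.
    public virtual EncoderStandIn CreateSessionEncoder() { return Encoder; }
}

// Plays the role of an inner factory that creates one encoder per session (an assumption).
public class InnerFactoryStandIn : FactoryStandIn
{
    readonly EncoderStandIn shared = new EncoderStandIn(null);
    public override EncoderStandIn Encoder { get { return shared; } }
    public override EncoderStandIn CreateSessionEncoder() { return new EncoderStandIn(null); }
}

// Wrapper without the override: every session receives the one shared wrapper.
public class SharedOnlyWrapperFactory : FactoryStandIn
{
    protected readonly FactoryStandIn inner;
    readonly EncoderStandIn shared;

    public SharedOnlyWrapperFactory(FactoryStandIn inner)
    {
        this.inner = inner;
        shared = new EncoderStandIn(inner.Encoder);
    }

    public override EncoderStandIn Encoder { get { return shared; } }
}

// Wrapper with the override: each session wraps a fresh inner session encoder.
public class SessionAwareWrapperFactory : SharedOnlyWrapperFactory
{
    public SessionAwareWrapperFactory(FactoryStandIn inner) : base(inner) { }

    public override EncoderStandIn CreateSessionEncoder()
    {
        return new EncoderStandIn(inner.CreateSessionEncoder());
    }
}
```

With SharedOnlyWrapperFactory, two calls to CreateSessionEncoder return the same object, and the inner factory's CreateSessionEncoder is never reached. With SessionAwareWrapperFactory, each call returns a new wrapper around a new inner encoder, which is the shape the real factory above follows.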

The next step is to work on the MessageEncoder itself.

class MyCompressionMessageEncoder : MessageEncoder
{
    const string GZipContentType = "application/x-gzip";
    const string DeflateContentType = "application/x-deflate";

    //This implementation wraps an inner encoder that actually converts a WCF Message
    //into textual XML, binary XML or some other format. This implementation then compresses the results.
    //The opposite happens when reading messages.
    //This member stores this inner encoder.
    MessageEncoder innerEncoder;

    CompressionAlgorithm compressionAlgorithm;

    //We require an inner encoder to be supplied (see comment above)
    internal MyCompressionMessageEncoder(MessageEncoder messageEncoder, 
        CompressionAlgorithm compressionAlgorithm)
        : base()
    {
        if (messageEncoder == null)
            throw new ArgumentNullException("messageEncoder", "A valid message encoder must be passed to the CompressionEncoder");
        innerEncoder = messageEncoder;
        this.compressionAlgorithm = compressionAlgorithm;
    }

    public override string ContentType
    {
        get 
        {
            return this.compressionAlgorithm == CompressionAlgorithm.GZip ?
                GZipContentType : DeflateContentType;
        }
    }

    public override string MediaType
    {
        get { return this.ContentType; }
    }

Notice that there are now two different content types defined. When the client first contacts the service, it sends a content type and the service must approve it. We can use this to prevent people from mixing compression algorithms. If you use GZip on the service and Deflate on the client, you will get a ProtocolException with the following message:

Content Type application/x-deflate was not supported by service net.tcp://localhost:9009/samples/CompressionEncoder. The client and service bindings may be mismatched.

This is a much better error message than what you would get otherwise. Without distinct content types, the client would get an exception indicating a closed channel or something else equally vague.

The static methods that compress and decompress buffered messages are the next target.

static ArraySegment<byte> CompressBuffer(ArraySegment<byte> buffer, BufferManager bufferManager, 
    int messageOffset, CompressionAlgorithm compressionAlgorithm)
{
    MemoryStream memoryStream = new MemoryStream();
    
    using (Stream compressedStream = compressionAlgorithm == CompressionAlgorithm.GZip ? 
        (Stream)new GZipStream(memoryStream, CompressionMode.Compress, true) :
        (Stream)new DeflateStream(memoryStream, CompressionMode.Compress, true))
    {
        compressedStream.Write(buffer.Array, buffer.Offset, buffer.Count);
    }
    ...
}

static ArraySegment<byte> DecompressBuffer(ArraySegment<byte> buffer, BufferManager bufferManager, 
    CompressionAlgorithm compressionAlgorithm)
{
    MemoryStream memoryStream = new MemoryStream(buffer.Array, buffer.Offset, buffer.Count);
    MemoryStream decompressedStream = new MemoryStream();
    int totalRead = 0;
    int blockSize = 1024;
    byte[] tempBuffer = bufferManager.TakeBuffer(blockSize);
    using (Stream compressedStream = compressionAlgorithm == CompressionAlgorithm.GZip ?
        (Stream)new GZipStream(memoryStream, CompressionMode.Decompress) :
        (Stream)new DeflateStream(memoryStream, CompressionMode.Decompress))
    {
        while (true)
        {
            int bytesRead = compressedStream.Read(tempBuffer, 0, blockSize);
            ...
        }
    }
    ...
}

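Here I'm using the ternary operator a lot just to make things short and sweet. Notice that the GZipStream and DeflateStream have to be explicitly cast to Stream in order for the ternary operator to work: neither type converts implicitly to the other, so without the casts the compiler cannot work out the type of the expression. (C# 9 and later can take the type from the declared Stream variable, but earlier versions need the casts.)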
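
Since more algorithms are planned, one alternative to repeating the conditional expression is to put the choice in a single helper. The following is only a sketch under that assumption: CompressionStreams, Create, Compress and Decompress are hypothetical names that are not part of the sample, and the enum is redeclared so the snippet stands alone.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public enum CompressionAlgorithm { GZip, Deflate }

// Hypothetical helper, not part of the sample code.
public static class CompressionStreams
{
    // Returns the wrapper stream for the chosen algorithm.
    // The return type is Stream, so no casts are needed.
    public static Stream Create(Stream inner, CompressionMode mode,
        CompressionAlgorithm algorithm, bool leaveOpen)
    {
        switch (algorithm)
        {
            case CompressionAlgorithm.GZip:
                return new GZipStream(inner, mode, leaveOpen);
            case CompressionAlgorithm.Deflate:
                return new DeflateStream(inner, mode, leaveOpen);
            default:
                throw new ArgumentOutOfRangeException("algorithm");
        }
    }

    public static byte[] Compress(byte[] data, CompressionAlgorithm algorithm)
    {
        MemoryStream output = new MemoryStream();
        // leaveOpen: true keeps the MemoryStream usable after the compressor is disposed.
        using (Stream compressor = Create(output, CompressionMode.Compress, algorithm, true))
        {
            compressor.Write(data, 0, data.Length);
        }
        return output.ToArray();
    }

    public static byte[] Decompress(byte[] data, CompressionAlgorithm algorithm)
    {
        MemoryStream output = new MemoryStream();
        using (Stream decompressor = Create(new MemoryStream(data),
            CompressionMode.Decompress, algorithm, false))
        {
            decompressor.CopyTo(output);
        }
        return output.ToArray();
    }
}
```

Calling Compress and then Decompress with the same algorithm should return the original bytes. Adding another algorithm later would then mean one new case in Create rather than a longer conditional at every call site.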

The next bits to edit are the ReadMessage and WriteMessage methods. These should be self-explanatory. They'll need the CompressionAlgorithm passed in to the buffered message overloads. The overloads that use Stream objects could get that same ternary operator treatment as used in the static methods above.

At this point, there is only one thing left to do and that is to enable the CompressionAlgorithm switch in the WCF configuration. This is done in the BindingElementExtensionElement implementation. Find the MyCompressionMessageEncodingElement class. Add the property and edit the ApplyConfiguration method as shown below.

[ConfigurationProperty("compressionAlgorithm", DefaultValue = CompressionAlgorithm.GZip)]
public CompressionAlgorithm CompressionAlgorithm
{
    get { return (CompressionAlgorithm)base["compressionAlgorithm"]; }
    set { base["compressionAlgorithm"] = value; }
}

//Called by WCF to apply the configuration settings (the property above) to the binding element
public override void ApplyConfiguration(BindingElement bindingElement)
{
    MyCompressionMessageEncodingBindingElement binding = 
        (MyCompressionMessageEncodingBindingElement)bindingElement;
    PropertyInformationCollection propertyInfo = this.ElementInformation.Properties;
    if (propertyInfo["innerMessageEncoding"].ValueOrigin != PropertyValueOrigin.Default)
    {
        switch (this.InnerMessageEncoding)
        {
            case "textMessageEncoding":
                binding.InnerMessageEncodingBindingElement = 
                    new TextMessageEncodingBindingElement();
                break;
            case "binaryMessageEncoding":
                binding.InnerMessageEncodingBindingElement = 
                    new BinaryMessageEncodingBindingElement();
                break;
        }
    }
    if (propertyInfo["compressionAlgorithm"].ValueOrigin != PropertyValueOrigin.Default)
    {
        binding.CompressionAlgorithm = 
            (CompressionAlgorithm)propertyInfo["compressionAlgorithm"].Value;
    }
}

Now everything is finally ready to make adjustments to app.config files and try out the new Deflate option. Just look for the binding element and add the compressionAlgorithm attribute:

<MyCompressionMessageEncoding innerMessageEncoding="textMessageEncoding" compressionAlgorithm="GZip"/>

Set the attribute to Deflate on both the client and the service, run the sample code, and you'll see diagnostics output similar to the following:

Original size: 758
Compressed size: 573
Original size: 426
Compressed size: 357
Original size: 2721
Compressed size: 855
Original size: 2382
Compressed size: 652

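If you compare this to the numbers from GZip, the Deflate results should be only a few bytes smaller (18 bytes, if both streams produce the same compressed data). This makes perfect sense: GZip is Deflate data wrapped in a header (10 bytes at minimum) and an 8-byte trailer that holds a CRC-32 and the uncompressed length.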
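
If you want to check the overhead without running the whole sample, a small stand-alone comparison is enough. This is a sketch that is independent of the encoder code; SizeComparison and its methods are names I made up for it.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper for comparing output sizes; not part of the sample code.
public static class SizeComparison
{
    // Compresses data through whichever stream the delegate wraps around the output
    // and returns the number of compressed bytes written.
    static int CompressedSize(byte[] data, Func<Stream, Stream> wrap)
    {
        MemoryStream output = new MemoryStream();
        using (Stream compressor = wrap(output))
        {
            compressor.Write(data, 0, data.Length);
        }
        // The wrappers below pass leaveOpen: true, so output is still open here.
        return (int)output.Length;
    }

    public static int GZipSize(byte[] data)
    {
        return CompressedSize(data, s => new GZipStream(s, CompressionMode.Compress, true));
    }

    public static int DeflateSize(byte[] data)
    {
        return CompressedSize(data, s => new DeflateStream(s, CompressionMode.Compress, true));
    }
}
```

Feeding the same byte array to GZipSize and DeflateSize and subtracting the results shows the fixed cost of the GZip wrapper. Because that cost does not grow with the payload, it matters less and less as messages get larger.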

This was a long post with a lot of code. The result, though, is a better platform for building a more generic compression encoder, and hopefully some insight into what all this code does. Future posts in this series will expand on this platform.

CompressionEncoder.zip