Compressing messages in WCF part one - Fixing the GZipMessageEncoder bug

The out-of-the-box compression options for WCF are limited in .NET 4.0. However, a sample is provided for GZip compression that shows you how to write your own MessageEncoder that wraps the output of another encoder and applies GZip to the messages. If your environment has a network bandwidth limitation, compressing the messages going across the wire could be very helpful. In this series, we will take a look at how to use the GZip message encoder and what effect it has on performance.

Download the WCF/WF Samples from here: https://www.microsoft.com/downloads/en/details.aspx?FamilyID=35ec8682-d5fd-4bc3-a51a-d8ad115a8792&displaylang=en

The first thing to do is examine the code for GZipMessageEncoder itself. Download and install the WCF/WF samples to the directory of your choice, then navigate to the WCF/Extensibility/MessageEncoder/Compression/CS directory and open the solution. Right-click on the solution in the Solution Explorer pane and choose "Set Startup Projects". Choose the "Multiple startup projects" radio button and use the dropdowns to change the actions of the client and service projects to "Start". Then you should be able to hit F5. The service and client windows should come up and execute, exchanging a couple of messages back and forth.

The GZipMessageEncoder works by using another encoder underneath. In the sample, buffered messages are used. This means that the entire message is stored in a single contiguous byte[]. We can examine the effect of compression on the buffered message by altering the code a bit to write the sizes before and after compression. To do this, open the GZipMessageEncoderFactory.cs file. Navigate to the GZipMessageEncoder class and find the WriteMessage overload that returns an ArraySegment<byte>. Alter the code as shown below:

//One of the two main entry points into the encoder. Called by WCF to encode a Message into a buffered byte array.
public override ArraySegment<byte> WriteMessage(Message message, int maxMessageSize, 
    BufferManager bufferManager, int messageOffset)
{
    //Use the inner encoder to encode a Message into a buffered byte array
    ArraySegment<byte> buffer = innerEncoder.WriteMessage(message, maxMessageSize, 
        bufferManager, 0);
    //Compress the resulting byte array
    System.Diagnostics.Debug.WriteLine("Original size: {0}", buffer.Count);
    buffer = CompressBuffer(buffer, bufferManager, messageOffset);
    System.Diagnostics.Debug.WriteLine("Compressed size: {0}", buffer.Count);
    return buffer;
}

This just writes the size of the buffer to the debug output before and after compression, so we can see how well our messages are being compressed. Hit F5 again to run and then bring up the Output window in Visual Studio. You should see something like this:

Original size: 751
Compressed size: 1024
Original size: 426
Compressed size: 512
Original size: 2714
Compressed size: 1024
Original size: 2382
Compressed size: 1024

There are a couple of problems here. First, it looks like small messages actually get bigger. Second, the compressed sizes are all exact powers of two.

The first problem could be explained somewhat by the second problem. Let's examine the CompressBuffer code to see if we can find out what's wrong.

//Helper method to compress an array of bytes
static ArraySegment<byte> CompressBuffer(ArraySegment<byte> buffer, BufferManager bufferManager, 
    int messageOffset)
{
    MemoryStream memoryStream = new MemoryStream();
    
    using (GZipStream gzStream = new GZipStream(memoryStream, CompressionMode.Compress, true))
    {
        gzStream.Write(buffer.Array, buffer.Offset, buffer.Count);
    }

    byte[] compressedBytes = memoryStream.ToArray();
    int totalLength = messageOffset + compressedBytes.Length;
    byte[] bufferedBytes = bufferManager.TakeBuffer(totalLength);

    Array.Copy(compressedBytes, 0, bufferedBytes, messageOffset, compressedBytes.Length);

    bufferManager.ReturnBuffer(buffer.Array);
    ArraySegment<byte> byteArray = new ArraySegment<byte>(bufferedBytes, messageOffset, 
        bufferedBytes.Length - messageOffset);

    return byteArray;
}

The problem is in the ArraySegment constructed at the end of the method, where bufferedBytes.Length - messageOffset is passed as the count. The bufferedBytes variable is a buffer taken from the BufferManager, and the BufferManager will give you a buffer that is at least as large as what you asked for, usually rounding up to the next power of two. This means that when we use bufferedBytes.Length to compute the number of bytes in the ArraySegment, we get the size of the pooled buffer rather than the size of the compressed data, so the segment also covers the unused tail of the buffer. To fix it, replace bufferedBytes.Length - messageOffset with compressedBytes.Length. Run the test again to see the improvements:

Original size: 751
Compressed size: 592
Original size: 426
Compressed size: 377
Original size: 2714
Compressed size: 874
Original size: 2382
Compressed size: 670
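
To put the numbers above in perspective, the small sketch below computes how much of each original message remains after compression. The class and method names (CompressionRatios, PercentOfOriginal) are made up for illustration and are not part of the sample; the sizes are copied from the listing above. In this run the two larger messages shrink to about a third of their original size or less, while the two small ones only save somewhere between 10 and 20 percent.

```csharp
using System;

static class CompressionRatios
{
    // Percentage of the original size that remains after compression
    public static double PercentOfOriginal(int originalSize, int compressedSize)
    {
        return 100.0 * compressedSize / originalSize;
    }

    static void Main()
    {
        // Original/compressed size pairs copied from the Output window listing
        int[,] sizes = { { 751, 592 }, { 426, 377 }, { 2714, 874 }, { 2382, 670 } };

        for (int i = 0; i < sizes.GetLength(0); i++)
        {
            Console.WriteLine("{0} -> {1} bytes: {2:F1}% of original",
                sizes[i, 0], sizes[i, 1],
                PercentOfOriginal(sizes[i, 0], sizes[i, 1]));
        }
    }
}
```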

This looks much better! For those of you who are curious, I've already reported this bug to the samples team and it should be cleared up in the next release.
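
For reference, here is a rough sketch of the corrected helper wrapped in a small standalone console program, so the difference between the pooled array length and the segment count can be observed outside the sample. It assumes a reference to System.ServiceModel. The class name CompressBufferSketch, the pool sizes, the message offset of 16, and the payload are all made up for illustration. The MemoryStream is also wrapped in a using block here, which the sample does not do.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.ServiceModel.Channels;
using System.Text;

static class CompressBufferSketch
{
    // Same shape as the sample's helper, but the segment count is the number of
    // compressed bytes rather than the length of the pooled array.
    public static ArraySegment<byte> CompressBuffer(ArraySegment<byte> buffer,
        BufferManager bufferManager, int messageOffset)
    {
        byte[] compressedBytes;
        using (MemoryStream memoryStream = new MemoryStream())
        {
            using (GZipStream gzStream = new GZipStream(memoryStream, CompressionMode.Compress, true))
            {
                gzStream.Write(buffer.Array, buffer.Offset, buffer.Count);
            }
            // The GZipStream has been disposed, so all compressed data is flushed
            compressedBytes = memoryStream.ToArray();
        }

        int totalLength = messageOffset + compressedBytes.Length;
        // May return an array larger than totalLength
        byte[] bufferedBytes = bufferManager.TakeBuffer(totalLength);
        Array.Copy(compressedBytes, 0, bufferedBytes, messageOffset, compressedBytes.Length);

        bufferManager.ReturnBuffer(buffer.Array);

        // The fix: count only the bytes that were actually written
        return new ArraySegment<byte>(bufferedBytes, messageOffset, compressedBytes.Length);
    }

    static void Main()
    {
        // Pool sizes chosen arbitrarily for this demo
        BufferManager manager = BufferManager.CreateBufferManager(1024 * 1024, 64 * 1024);

        // Hypothetical payload standing in for an encoded message
        byte[] payload = Encoding.UTF8.GetBytes(new string('a', 2000));
        byte[] pooled = manager.TakeBuffer(payload.Length);
        Array.Copy(payload, pooled, payload.Length);

        ArraySegment<byte> result = CompressBuffer(
            new ArraySegment<byte>(pooled, 0, payload.Length), manager, 16);

        Console.WriteLine("Pooled array length: {0}", result.Array.Length);
        Console.WriteLine("Segment offset:      {0}", result.Offset);
        Console.WriteLine("Segment count:       {0}", result.Count);
    }
}
```

The input buffer is taken from the same BufferManager that it is later returned to, mirroring how the encoder uses the manager that WCF passes in.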