The Compare Contract [Kim Hamilton]

A breaking change?

We recently heard from a customer who observed different sorting behavior in .NET FX 3.5 SP1 compared to 3.5 RTM.

The different behavior was demonstrated with the following code. The class StringWrapper provided a custom sort in which nulls (null StringWrapper references) were moved to the end of the array. To achieve this, StringWrapper implemented IComparable<StringWrapper> and in its CompareTo method, nulls were always greater than non-nulls.

 public class MyClass {
    public static void Main() {
        StringWrapper a = new StringWrapper();
        a.Value = "a";
        StringWrapper b = new StringWrapper();
        b.Value = "b";
        StringWrapper c = new StringWrapper();
        c.Value = "c";
        StringWrapper d = null;

        Console.WriteLine("Sort 1:");
        StringWrapper[] src1 = new StringWrapper[] { a, c, d, b };
        Array.Sort(src1);
        PrintStringWrappers(src1); // print elements, method included at end
    }
}

public class StringWrapper : IComparable<StringWrapper> {
    private string _value;
    public string Value {
        get { return _value; }
        set { _value = value; }
    }

    // Recall that CompareTo returns:
    // <0 if this object is less than other
    //  0 if this object is equal to other
    // >0 if this object is greater than other
    public int CompareTo(StringWrapper other) {
        if (other == null) return -1; // nulls are greater than any non-null
        return Value.CompareTo(other.Value);
    }

    public override string ToString() {
        return Value;
    }
}

This custom comparison apparently worked in .NET FX 3.5 RTM, but not .NET FX 3.5 SP1.

3.5 RTM output:

 a
b
c
null

3.5 SP1 output:

 a
b
null
c

Did the custom comparer really work?

The custom comparer worked in that example, in which only one of the array elements was null. Let’s throw in another null and see what happens:

 Console.WriteLine("Sort 2:");
StringWrapper[] src2 = new StringWrapper[] { a, d, d, c, b };
Array.Sort(src2);
PrintStringWrappers(src2);

3.5 RTM output:

 a
b
null
null
c

The problem is in the CompareTo method, but it isn’t obvious. The first line in CompareTo actually violates the IComparable<T>.CompareTo contract that any object compares greater than a null reference.

 public int CompareTo(StringWrapper other) {
    if (other == null) return -1; // violates CompareTo contract!
    return Value.CompareTo(other.Value);
}

The Compare Contract

IComparable<T>.CompareTo has the following requirements (from the MSDN docs)

For objects A, B, and C, the following must be true:

  • A.CompareTo(A) is required to return zero.
  • If A.CompareTo(B) returns zero, then B.CompareTo(A) is required to return zero.
  • If A.CompareTo(B) returns zero and B.CompareTo(C) returns zero, then A.CompareTo(C) is required to return zero.
  • If A.CompareTo(B) returns a value other than zero, then B.CompareTo(A) is required to return a value of the opposite sign.
  • If A.CompareTo(B) returns a value x that is not equal to zero, and B.CompareTo(C) returns a value y of the same sign as x, then A.CompareTo(C) is required to return a value of the same sign as x and y.

By definition, any object compares greater than a null reference (Nothing in Visual Basic), and two null references compare equal to each other.

The requirement that any object compares greater than null is a bit of a footnote at the end, so it makes sense that this may not be well known (we should highlight this more).

Still, why did the behavior change? (Gory details and a 3.5 RTM performance bug)

The high-level problem is that .NET’s sorting makes assumptions based on the Compare contract, so in some cases sorting will special case null, because it “knows” your comparer will adhere to the contract and return values consistent with that assumption. If you don’t adhere to the contract, you’ll get bitten at some point.

The details are messier. Over the years, a variety of changes were made to improve the performance of Sort. There was a brief window (released in 3.5 RTM) where, during QuickSort, the swap and pivot steps were broken, and in intermediate steps, it would actually unsort certain already sorted arrays (the array had to contain  null). The end result would be correct, but because the elements were incorrectly swapped, sorting took much longer than it should.

We fixed this bug in SP1, and this fix caused the different behavior between RTM and SP1.

Comply with Compare contract

The only way to ensure stability across versions is to comply with the Compare contract. Otherwise you fall prey to implementation quirks (or even bugs) in the runtime sort implementation. We’d like to be free to change it between releases, because we’d like to keep improving the performance!

Enough preaching, what are my options?

The interesting thing is that the StringWrapper implementation gets you mostly there. The apparent goal of the wrapper is to implement a special sort of strings that pushes nulls to the end. You can do what you want if you create a StringWrapper with null values, as follows:

 public int CompareTo(StringWrapper other) {
    if (other == null) return 1;
    if (Value == null && other.Value == null) return 0;
    if (Value == null) return 1;
    if (other.Value == null) return -1;
    return Value.CompareTo(other.Value);
}

This actually obeys the compare contract and won’t be brittle to changes in our sort implementation.

Full Source

 using System;
using System.Collections.Generic;

public class MyClass {
    public static void Main() {

        StringWrapper a = new StringWrapper();
        a.Value = "a";
        StringWrapper b = new StringWrapper();
        b.Value = "b";
        StringWrapper c = new StringWrapper();
        c.Value = "c";
        StringWrapper d = null;
        StringWrapper e = new StringWrapper();
        e.Value = "e";
        StringWrapper f = new StringWrapper();
        f.Value = "f";


        Console.WriteLine("Sort 1:");
        StringWrapper[] src1 = new StringWrapper[] { a, c, d, b };
        Array.Sort(src1);
        PrintStringWrappers(src1);
        // .NET FX 3.5 RTM output: a b c null
        // .NET FX 3.5 SP1 output: a b null c

        Console.WriteLine("-----");
        Console.WriteLine("Sort 2:");
        StringWrapper[] src2 = new StringWrapper[] { a, d, d, c, b };
        Array.Sort(src2);
        PrintStringWrappers(src2);
        // .NET FX 3.5 RTM output: a b null null c
    }

    private static void PrintStringWrappers(StringWrapper[] swArray) {
        foreach (StringWrapper sw in swArray) {
            if (sw == null)
                Console.WriteLine("null");
            else
                Console.WriteLine(sw);
        }

    }
}

public class StringWrapper : IComparable<StringWrapper> {
    private string _value;
    public string Value {
        get { return _value; }
        set { _value = value; }
    }

    // original CompareTo -- violates Compare contract!
    public int CompareTo(StringWrapper other) {
        if (other == null) return -1;
        return Value.CompareTo(other.Value);
    }

    // alternate CompareTo, which obeys Compare contract and moves null Values to the end
    /*
    public int CompareTo(StringWrapper other)
    {
        if (other == null) return 1;
        if (Value == null && other.Value == null) return 0;
        if (Value == null) return 1;
        if (other.Value == null) return -1;
        return Value.CompareTo(other.Value);
    }
    */

    public override string ToString() {
        return Value;
    }
}