How to use LINQ methods to compare objects of custom types

LINQ provides a convenient syntax and many useful methods for working with collections of objects. However, to be correctly processed by LINQ comparison methods such as Distinct or Intersect, a type must satisfy certain requirements.

Let’s take a look at the Distinct method, which returns all distinct objects from a collection.

List<int> numbers = new List<int> { 1, 1, 2, 3 };

var distinctNumbers = numbers.Distinct();

foreach (var number in distinctNumbers)
    Console.WriteLine(number);

The output is:

1
2
3

But what if you want to use the Distinct method for a collection of objects of your own type? For example, like this:

using System;
using System.Collections.Generic;
using System.Linq;

class Number
{
    public int Digital { get; set; }
    public String Textual { get; set; }
}

class Program
{
    static void Main(string[] args)
    {
        List<Number> numbers = new List<Number> {
            new Number { Digital = 1, Textual = "one" },
            new Number { Digital = 1, Textual = "one" },
            new Number { Digital = 2, Textual = "two" },
            new Number { Digital = 3, Textual = "three" },
        };

        var distinctNumbers = numbers.Distinct();

        foreach (var number in distinctNumbers)
            Console.WriteLine(number.Digital);
    }
}

The code compiles, but the output is different:

1
1
2
3

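Why did that happen? The answer is in the LINQ implementation details. When no comparer is supplied, Distinct compares elements with the default equality comparer, EqualityComparer<T>.Default. For a class that defines no equality logic of its own, that means reference equality, so two separate Number objects are never considered equal, even when all their properties match. To be correctly processed by the Distinct method, a type must provide its own Equals and GetHashCode methods. Implementing the IEquatable<T> interface is the recommended way to do that: the default comparer calls IEquatable<T>.Equals when the interface is implemented and falls back to the overridden Object.Equals otherwise. GetHashCode matters just as much, because objects that produce different hash codes are never compared with Equals at all.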
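
The default behavior can be checked directly. The following is a minimal sketch that reuses the unmodified Number class and asks both Object.Equals and EqualityComparer<T>.Default whether two objects with identical property values are equal. The variable names first and second are illustrative only.

```csharp
using System;
using System.Collections.Generic;

// The original Number class: no custom equality logic.
class Number
{
    public int Digital { get; set; }
    public string Textual { get; set; }
}

class Program
{
    static void Main()
    {
        // Hypothetical sample objects with identical property values.
        var first = new Number { Digital = 1, Textual = "one" };
        var second = new Number { Digital = 1, Textual = "one" };

        // A class inherits reference equality from Object, so two separate
        // instances are not equal, whatever their property values are.
        Console.WriteLine(first.Equals(second)); // False

        // The default comparer falls back to Object.Equals for this type.
        Console.WriteLine(EqualityComparer<Number>.Default.Equals(first, second)); // False

        // An object is always equal to itself.
        Console.WriteLine(first.Equals(first)); // True
    }
}
```

Because both objects are distinct instances, Distinct has no reason to treat the second one as a duplicate.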

So, the Number class from the previous example should actually look like this:

class Number : IEquatable<Number>
{
    public int Digital { get; set; }
    public String Textual { get; set; }

    public bool Equals(Number other)
    {
        // Check whether the compared object is null.
        if (Object.ReferenceEquals(other, null)) return false;

        // Check whether the compared object references the same data.
        if (Object.ReferenceEquals(this, other)) return true;

        // Check whether the objects' properties are equal.
        // The static String.Equals method also works when Textual is null.
        return Digital.Equals(other.Digital) &&
               String.Equals(Textual, other.Textual);
    }

    // Keep Object.Equals consistent with IEquatable<Number>.Equals.
    public override bool Equals(object obj)
    {
        return Equals(obj as Number);
    }

    // If Equals returns true for a pair of objects,
    // GetHashCode must return the same value for these objects.
    public override int GetHashCode()
    {
        // Get the hash code for the Textual property if it is not null.
        int hashTextual = Textual == null ? 0 : Textual.GetHashCode();

        // Get the hash code for the Digital property.
        int hashDigital = Digital.GetHashCode();

        // Calculate the hash code for the object.
        return hashDigital ^ hashTextual;
    }
}
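
To see the effect, here is a self-contained sketch that runs the earlier Distinct example against a Number class implementing IEquatable<Number>. It is one possible arrangement, not the only one: the Equals(object) override and the null-safe String.Equals call are additions made for this sketch, and the Contains call at the end is included only to show that the same equality logic is reused elsewhere.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Number : IEquatable<Number>
{
    public int Digital { get; set; }
    public string Textual { get; set; }

    public bool Equals(Number other)
    {
        if (Object.ReferenceEquals(other, null)) return false;
        if (Object.ReferenceEquals(this, other)) return true;

        // String.Equals(a, b) is safe when either string is null.
        return Digital == other.Digital && String.Equals(Textual, other.Textual);
    }

    // Keeps Object.Equals consistent with IEquatable<Number>.Equals.
    public override bool Equals(object obj)
    {
        return Equals(obj as Number);
    }

    public override int GetHashCode()
    {
        int hashTextual = Textual == null ? 0 : Textual.GetHashCode();
        return Digital.GetHashCode() ^ hashTextual;
    }
}

class Program
{
    static void Main()
    {
        var numbers = new List<Number>
        {
            new Number { Digital = 1, Textual = "one" },
            new Number { Digital = 1, Textual = "one" },
            new Number { Digital = 2, Textual = "two" },
            new Number { Digital = 3, Textual = "three" }
        };

        // The duplicate { 1, "one" } entry is now dropped: prints 1, 2, 3.
        foreach (var number in numbers.Distinct())
            Console.WriteLine(number.Digital);

        // Contains relies on the same equality logic: prints True.
        Console.WriteLine(numbers.Contains(new Number { Digital = 2, Textual = "two" }));
    }
}
```

The XOR-based hash is kept from the listing above for continuity. Any hash works as long as equal objects produce equal hash codes.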

But what if you cannot modify the type? What if it was provided by a library and you have no way of implementing the IEquatable<T> interface in this type? The answer is to create your own equality comparer and pass it as a parameter to the Distinct method.

The equality comparer must implement the IEqualityComparer<T> interface and, again, provide GetHashCode and Equals methods.

Here is how the equality comparer for the original Number class might look:

class NumberComparer : IEqualityComparer<Number>
{
    public bool Equals(Number x, Number y)
    {
        if (Object.ReferenceEquals(x, y)) return true;

        if (Object.ReferenceEquals(x, null) ||
            Object.ReferenceEquals(y, null))
            return false;

        return x.Digital == y.Digital && x.Textual == y.Textual;
    }

    public int GetHashCode(Number number)
    {
        if (Object.ReferenceEquals(number, null)) return 0;

        int hashTextual = number.Textual == null
            ? 0 : number.Textual.GetHashCode();

        int hashDigital = number.Digital.GetHashCode();

        return hashTextual ^ hashDigital;
    }
}

And don’t forget to pass the comparer to the Distinct method:

var distinctNumbers = numbers.Distinct(new NumberComparer());

Of course, these rules don't just apply to the Distinct method. For example, the same is true for the Contains, Except, Intersect, and Union methods. In general, if a LINQ method has an overload that accepts an IEqualityComparer<T> parameter, then to use it with your own data types you most likely need to either define equality in the type itself (provide Equals and GetHashCode, preferably by implementing IEquatable<T>) or create your own equality comparer.
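
As an illustration, the sketch below passes the same kind of comparer to Intersect, Except, Union, and Contains. The Number class is left without any equality logic, as if it came from a library. The Print helper and the sample lists first and second are hypothetical names introduced only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// A type that cannot be modified: no custom equality logic.
class Number
{
    public int Digital { get; set; }
    public string Textual { get; set; }
}

class NumberComparer : IEqualityComparer<Number>
{
    public bool Equals(Number x, Number y)
    {
        if (Object.ReferenceEquals(x, y)) return true;
        if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
            return false;

        return x.Digital == y.Digital && x.Textual == y.Textual;
    }

    public int GetHashCode(Number number)
    {
        if (Object.ReferenceEquals(number, null)) return 0;

        int hashTextual = number.Textual == null ? 0 : number.Textual.GetHashCode();
        return hashTextual ^ number.Digital.GetHashCode();
    }
}

class Program
{
    // Hypothetical helper: prints the Digital values of a sequence on one line.
    static void Print(string label, IEnumerable<Number> items)
    {
        Console.WriteLine(label + ": " + String.Join(", ", items.Select(n => n.Digital)));
    }

    static void Main()
    {
        var comparer = new NumberComparer();

        var first = new List<Number>
        {
            new Number { Digital = 1, Textual = "one" },
            new Number { Digital = 2, Textual = "two" }
        };
        var second = new List<Number>
        {
            new Number { Digital = 2, Textual = "two" },
            new Number { Digital = 3, Textual = "three" }
        };

        Print("Intersect", first.Intersect(second, comparer)); // Intersect: 2
        Print("Except", first.Except(second, comparer));       // Except: 1
        Print("Union", first.Union(second, comparer));         // Union: 1, 2, 3

        // The two-argument Contains is the LINQ extension method.
        Console.WriteLine(first.Contains(new Number { Digital = 3, Textual = "three" }, comparer)); // False
    }
}
```

Without the comparer argument, Intersect would return nothing here and Union would return all four objects, because every element is a separate instance.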

[author: Alexandra Rusina, Programming Writer]

Comments

  • Anonymous
    March 26, 2009
    Another great article! Thanks Alexandra!

  • Anonymous
    November 02, 2009
    After having a go it seems that implementing Object.Equals and GetHashCode is enough and implementing IEquatable<T> is not required.

  • Anonymous
    December 02, 2009
    I'm using Microsoft Visual C# 2008 Express Edition and I'm trying to access a database where the table owner is NOT dbo.  All the LINQ code wants to use .dbo.  Can this be changed?

  • Anonymous
    December 03, 2009
    @ Steve Please, use one of the MSDN forums for Express editions to ask this question: http://social.msdn.microsoft.com/Forums/en-US/category/vsexpress You are much more likely to get an answer on the forum than on the blog.

  • Anonymous
    December 08, 2009
    similarly, you can implement IEqualityComparer<T> for any type.

  • Anonymous
    March 05, 2010
    I have translated this article into Chinese: http://www.cnblogs.com/tianfan/archive/2010/03/06/how-to-use-linq-methods-to-compare-objects-of-custom-types.html

  • Anonymous
    December 23, 2010
    This is really nice one. once I also faced problem for the same, I implemented it through loops. but this is great idea. Thank you.

  • Anonymous
    July 26, 2011
    In the class NumberComparer, at the end of the article, there's a bug in the implementation of Equals: when the Textual fields are checked for equality, it should be String.Equals(x.Textual, y.Textual). Otherwise they might fail the reference check, but still be equal strings.

  • Anonymous
    March 27, 2014
    I was running the LINQ Contains() method over my classes, and implementing IEquatable<T> didn't help, but overriding the old-school Equals(object) method worked great. Hence, I have no idea why we need that IEquatable<T> implementation.

  • Anonymous
    August 19, 2015
    I knew I should implement IEquatable but I forgot about the GetHashCode. Thank you!