How to use LINQ methods to compare objects of custom types
LINQ provides a convenient syntax and many useful methods for working with collections of objects. However, for LINQ comparison methods such as Distinct or Intersect to process a type correctly, the type must satisfy certain requirements.
Let’s take a look at the Distinct method, which returns all distinct objects from a collection.
List<int> numbers = new List<int> { 1, 1, 2, 3 };
var distinctNumbers = numbers.Distinct();
foreach (var number in distinctNumbers)
    Console.WriteLine(number);
The output is:
1
2
3
But what if you want to use the Distinct method for a collection of objects of your own type? For example, like this:
class Number
{
    public int Digital { get; set; }
    public String Textual { get; set; }
}
class Program
{
    static void Main(string[] args)
    {
        List<Number> numbers = new List<Number> {
            new Number { Digital = 1, Textual = "one" },
            new Number { Digital = 1, Textual = "one" },
            new Number { Digital = 2, Textual = "two" },
            new Number { Digital = 3, Textual = "three" },
        };
        var distinctNumbers = numbers.Distinct();
        foreach (var number in distinctNumbers)
            Console.WriteLine(number.Digital);
    }
}
The code compiles, but this time the duplicate is not removed:
1
1
2
3
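Why did that happen? The answer is in how LINQ compares elements. Distinct uses the default equality comparer, EqualityComparer<T>.Default, which calls IEquatable<T>.Equals if the type implements that interface and otherwise falls back to Object.Equals; in both cases it also uses GetHashCode. The Number class defines none of these, so it inherits the reference-based implementations from Object, and two separate instances with identical property values are treated as different objects. To be processed correctly by Distinct, a type must provide its own Equals and GetHashCode methods. Implementing the IEquatable<T> interface is the recommended way to do this, although overriding Object.Equals and GetHashCode is technically sufficient.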
So, the Number class from the previous example should actually look like this:
class Number : IEquatable<Number>
{
    public int Digital { get; set; }
    public String Textual { get; set; }
    public bool Equals(Number other)
    {
        // Check whether the compared object is null.
        if (Object.ReferenceEquals(other, null)) return false;
        // Check whether the compared object references the same data.
        if (Object.ReferenceEquals(this, other)) return true;
        // Check whether the objects' properties are equal.
        // The static String.Equals is safe when Textual is null.
        return Digital.Equals(other.Digital) &&
               String.Equals(Textual, other.Textual);
    }
    // Keep Object.Equals consistent with IEquatable<Number>.Equals.
    public override bool Equals(object obj)
    {
        return Equals(obj as Number);
    }
    // If Equals returns true for a pair of objects,
    // GetHashCode must return the same value for these objects.
    public override int GetHashCode()
    {
        // Get the hash code for the Textual property if it is not null.
        int hashTextual = Textual == null ? 0 : Textual.GetHashCode();
        // Get the hash code for the Digital property.
        int hashDigital = Digital.GetHashCode();
        // Calculate the hash code for the object.
        return hashDigital ^ hashTextual;
    }
}
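Because the default comparer falls back to Object.Equals when IEquatable<T> is not implemented, overriding only the Object members also makes Distinct work, which matches what some readers report in the comments. The following is a minimal sketch of that variant, not a recommended replacement for the class above. NumberWithOverrides and OverrideDemo are hypothetical names used only for illustration, and the 17/31 hash combination is one common convention rather than a requirement.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical variant of Number that overrides only the Object members.
class NumberWithOverrides
{
    public int Digital { get; set; }
    public string Textual { get; set; }

    public override bool Equals(object obj)
    {
        // A null or differently typed argument is never equal.
        var other = obj as NumberWithOverrides;
        if (ReferenceEquals(other, null)) return false;
        return Digital == other.Digital
            && string.Equals(Textual, other.Textual);
    }

    public override int GetHashCode()
    {
        // Equal objects produce equal hash codes because the hash
        // is built from the same properties that Equals compares.
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + Digital.GetHashCode();
            hash = hash * 31 + (Textual == null ? 0 : Textual.GetHashCode());
            return hash;
        }
    }
}

static class OverrideDemo
{
    // Returns the Digital values that remain after Distinct.
    public static List<int> DistinctDigits()
    {
        var numbers = new List<NumberWithOverrides>
        {
            new NumberWithOverrides { Digital = 1, Textual = "one" },
            new NumberWithOverrides { Digital = 1, Textual = "one" },
            new NumberWithOverrides { Digital = 2, Textual = "two" },
            new NumberWithOverrides { Digital = 3, Textual = "three" },
        };
        return numbers.Distinct().Select(n => n.Digital).ToList();
    }

    static void Main()
    {
        // Prints 1, 2 and 3 on separate lines.
        foreach (int digit in DistinctDigits())
            Console.WriteLine(digit);
    }
}
```

Implementing IEquatable<T> as well remains worthwhile: it gives callers a strongly typed Equals and avoids the cast on every comparison.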
But what if you cannot modify the type? What if it was provided by a library and you have no way of implementing the IEquatable<T> interface in this type? The answer is to create your own equality comparer and pass it as a parameter to the Distinct method.
The equality comparer must implement the IEqualityComparer<T> interface and, again, provide GetHashCode and Equals methods.
Here is how the equality comparer for the original Number class might look:
class NumberComparer : IEqualityComparer<Number>
{
    public bool Equals(Number x, Number y)
    {
        if (Object.ReferenceEquals(x, y)) return true;
        if (Object.ReferenceEquals(x, null) ||
            Object.ReferenceEquals(y, null))
            return false;
        return x.Digital == y.Digital && x.Textual == y.Textual;
    }
    public int GetHashCode(Number number)
    {
        if (Object.ReferenceEquals(number, null)) return 0;
        int hashTextual = number.Textual == null
            ? 0 : number.Textual.GetHashCode();
        int hashDigital = number.Digital.GetHashCode();
        return hashTextual ^ hashDigital;
    }
}
And don’t forget to pass the comparer to the Distinct method:
var distinctNumbers = numbers.Distinct(new NumberComparer());
Of course, these rules don't apply only to the Distinct method. For example, the same is true for the Contains, Except, Intersect, and Union methods. In general, if you see that a LINQ method has an overload that accepts an IEqualityComparer<T> parameter, it probably means that to use it with your own data types you need to either implement IEquatable<T> in your class (or at least override Equals and GetHashCode) or create your own equality comparer.
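To illustrate, the same comparer can be passed to the other methods mentioned above. This is a sketch that repeats the original Number class and the NumberComparer so that it compiles on its own; SetOperationsDemo, Make, Describe, Left and Right are hypothetical helpers introduced only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Number
{
    public int Digital { get; set; }
    public string Textual { get; set; }
}

class NumberComparer : IEqualityComparer<Number>
{
    public bool Equals(Number x, Number y)
    {
        if (Object.ReferenceEquals(x, y)) return true;
        if (Object.ReferenceEquals(x, null) ||
            Object.ReferenceEquals(y, null))
            return false;
        return x.Digital == y.Digital && x.Textual == y.Textual;
    }

    public int GetHashCode(Number number)
    {
        if (Object.ReferenceEquals(number, null)) return 0;
        int hashTextual = number.Textual == null
            ? 0 : number.Textual.GetHashCode();
        return hashTextual ^ number.Digital.GetHashCode();
    }
}

static class SetOperationsDemo
{
    // Hypothetical helper that keeps the sample data short.
    public static Number Make(int digital, string textual)
    {
        return new Number { Digital = digital, Textual = textual };
    }

    // Joins the Digital values of a sequence, for example "1,2".
    public static string Describe(IEnumerable<Number> items)
    {
        return string.Join(",", items.Select(n => n.Digital));
    }

    public static readonly List<Number> Left = new List<Number>
    {
        Make(1, "one"), Make(2, "two"), Make(3, "three")
    };

    public static readonly List<Number> Right = new List<Number>
    {
        Make(3, "three"), Make(4, "four")
    };

    static void Main()
    {
        var comparer = new NumberComparer();
        // Each call compares by property values instead of by reference.
        Console.WriteLine(Left.Contains(Make(2, "two"), comparer));
        Console.WriteLine(Describe(Left.Intersect(Right, comparer)));
        Console.WriteLine(Describe(Left.Except(Right, comparer)));
        Console.WriteLine(Describe(Left.Union(Right, comparer)));
    }
}
```

Without the comparer argument, Contains would return false here and Intersect would return an empty sequence, because every element is a separate instance.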
[author: Alexandra Rusina, Programming Writer]
Anonymous
March 26, 2009
Another great article! Thanks Alexandra!
Anonymous
November 02, 2009
After having a go it seems that implementing Object.Equals and GetHashCode is enough and implementing IEquatable<T> is not required.
Anonymous
December 02, 2009
I'm using Microsoft Visual C# 2008 Express Edition and I'm trying to access a database where the table owner is NOT dbo. All the LINQ code wants to use .dbo. Can this be changed?
Anonymous
December 03, 2009
@ Steve Please, use one of the MSDN forums for Express editions to ask this question: http://social.msdn.microsoft.com/Forums/en-US/category/vsexpress You are much more likely to get an answer on the forum than on the blog.
Anonymous
December 08, 2009
Similarly, you can implement IEqualityComparer<T> for any type.
Anonymous
March 05, 2010
I have translated this article into Chinese: http://www.cnblogs.com/tianfan/archive/2010/03/06/how-to-use-linq-methods-to-compare-objects-of-custom-types.html
Anonymous
December 23, 2010
This is a really nice one. I once faced the same problem and solved it with loops, but this is a great idea. Thank you.
Anonymous
July 26, 2011
In the class NumberComparer, at the end of the article, there's a bug in the implementation of Equals: when the Textual fields are checked for equality, it should be String.Equals(x.Textual, y.Textual). Otherwise they might fail the reference check, but still be equal strings.
Anonymous
March 27, 2014
I've got running LINQ Contains() method over my classes and implementing of IEquatable<T> didn't help, but overwriting old school Equal(object) method worked great. Hence, have no idea why we need that IEquatable<T> implementation.
Anonymous
August 19, 2015
I knew I should implement IEquatable but I forgot about the GetHashCode. Thank you!