LINQ Farm: More on Set Operators

This is a second post on the LINQ set operators; the first was published while LINQ was still in beta. As mentioned in the previous post, there are four LINQ set operators: Union, Intersect, Distinct and Except. Like the other 49 LINQ operators, these methods are designed to allow you to query data that supports the IEnumerable<T> interface. Since all LINQ query expressions, and most LINQ queries, return IEnumerable<T>, you can use these operators to perform set operations on the results of a LINQ query.

In this post I give a highly simplified example of each of the four operators, and then end with a more complex example that shows how the operators might be used in a real world setting.

Download the source code.

Union

The Union operator returns the unique items from two lists, as shown in Listing 1.

Listing 1: The ShowUnion method displays the numbers 1, 2, 3, 4, 5 and 6.

public void ShowUnion()
{
    var listA = Enumerable.Range(1, 3);          // 1, 2, 3
    var listB = new List<int> { 3, 4, 5, 6 };

    // The 3 appears in both lists but only once in the result.
    var listC = listA.Union(listB);

    foreach (var item in listC)
    {
        Console.WriteLine(item);
    }
}

Here two collections are joined together, and any value that occurs more than once, whether in one list or across both, appears only once in the result.
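What counts as a duplicate depends on how items are compared. Union has an overload that accepts an IEqualityComparer<T>, and the sketch below uses it to merge two lists of names without regard to case. The class name UnionSketch, the method MergeNames and the sample names are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UnionSketch
{
    // Hypothetical helper: merges two lists of names, treating
    // "Bob" and "bob" as the same item.
    public static List<string> MergeNames(IEnumerable<string> first,
                                          IEnumerable<string> second)
    {
        return first.Union(second, StringComparer.OrdinalIgnoreCase).ToList();
    }

    public static void Show()
    {
        var listA = new List<string> { "Ann", "Bob" };
        var listB = new List<string> { "bob", "Cid" };

        // Displays Ann, Bob and Cid. The first spelling encountered wins.
        foreach (var name in MergeNames(listA, listB))
        {
            Console.WriteLine(name);
        }
    }
}
```

Without the comparer argument, the default equality for strings is case sensitive, so "Bob" and "bob" would both appear in the result.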

Intersect

The Intersect operator shows the items that two lists have in common.

Listing 2: The ShowIntersect method displays the numbers 3 and 4.

public void ShowIntersect()
{
    var listA = Enumerable.Range(1, 4);          // 1, 2, 3, 4
    var listB = new List<int> { 3, 4, 5, 6 };

    // Only 3 and 4 appear in both lists.
    var listC = listA.Intersect(listB);

    foreach (var item in listC)
    {
        Console.WriteLine(item);
    }
}

Here two collections are compared, and only the unique members that appear in both lists are retained.
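The word "unique" matters here: even if a shared value is repeated in the first list, Intersect yields it once, in the order in which it first appears in the first list. The following is a small sketch of that behavior; IntersectSketch and Shared are names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IntersectSketch
{
    // Hypothetical helper: returns the values found in both sequences.
    public static List<int> Shared(IEnumerable<int> first,
                                   IEnumerable<int> second)
    {
        return first.Intersect(second).ToList();
    }

    public static void Show()
    {
        var listA = new List<int> { 4, 3, 3, 2, 4 };   // contains repeats
        var listB = new List<int> { 3, 4, 5, 6 };

        // Displays 4 and then 3: each shared value once,
        // in the order of the first list.
        foreach (var item in Shared(listA, listB))
        {
            Console.WriteLine(item);
        }
    }
}
```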

Distinct

The Distinct operator finds all the unique items in a list.

Listing 3: The ShowDistinct method displays the numbers 1, 2 and 3.

public void ShowDistinct()
{
    var listA = new List<int> { 1, 2, 3, 3, 2, 1 };

    // Each value is kept the first time it is seen.
    var listB = listA.Distinct();

    foreach (var item in listB)
    {
        Console.WriteLine(item);
    }
}
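Like most LINQ operators, Distinct uses deferred execution: the call only describes the query, and the source list is read when the result is enumerated. The sketch below, with the made-up names DistinctSketch and Snapshot, adds items to the list after calling Distinct and shows that they still take part in the result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctSketch
{
    // Hypothetical helper: builds the query first, changes the
    // source list afterwards, and only then enumerates the query.
    public static List<int> Snapshot()
    {
        var listA = new List<int> { 1, 2, 2 };
        var listB = listA.Distinct();   // nothing is evaluated yet

        listA.Add(3);                   // a new value
        listA.Add(1);                   // a repeat of an existing value

        return listB.ToList();          // the list is read here: 1, 2, 3
    }

    public static void Show()
    {
        foreach (var item in Snapshot())
        {
            Console.WriteLine(item);
        }
    }
}
```

If you need a result that does not change when the source does, call ToList or ToArray on the query right away.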

Except

The Except operator shows all the items in one list minus the items in a second list.

Listing 4: The ShowExcept method prints out the numbers 1, 2, 5, and 6.

public void ShowExcept()
{
    var listA = Enumerable.Range(1, 6);          // 1 through 6
    var listB = new List<int> { 3, 4 };

    // Everything in listA that is not also in listB.
    var listC = listA.Except(listB);

    foreach (var item in listC)
    {
        Console.WriteLine(item);
    }
}
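Unlike Union and Intersect, Except gives a different result when the two lists swap places. The sketch below shows both directions and combines them with Union to get the items that appear in exactly one of the lists. The names ExceptSketch, OnlyInFirst and SymmetricDifference are invented for this example; LINQ has no built-in operator by that name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExceptSketch
{
    // Hypothetical helper: the items in first that are not in second.
    public static List<int> OnlyInFirst(IEnumerable<int> first,
                                        IEnumerable<int> second)
    {
        return first.Except(second).ToList();
    }

    // Hypothetical helper: the items found in exactly one of the sequences.
    public static List<int> SymmetricDifference(IEnumerable<int> first,
                                                IEnumerable<int> second)
    {
        return first.Except(second).Union(second.Except(first)).ToList();
    }

    public static void Show()
    {
        var listA = Enumerable.Range(1, 6);            // 1 through 6
        var listB = new List<int> { 3, 4, 7 };

        Console.WriteLine(string.Join(", ", OnlyInFirst(listA, listB)));          // 1, 2, 5, 6
        Console.WriteLine(string.Join(", ", OnlyInFirst(listB, listA)));          // 7
        Console.WriteLine(string.Join(", ", SymmetricDifference(listA, listB)));  // 1, 2, 5, 6, 7
    }
}
```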

In the Context of LINQ

The type of code listed above is useful, but it might be helpful to see these same operators used in the context of a LINQ query expression. You can then see how they can be used to analyze the results of queries to better understand the data that is returned.

You probably know that there are two similar collections used to create lists. One is the generic List<T> collection and the other is the old-style collection called ArrayList. We can use set operators to help us better understand the difference between these two classes.

Here are two queries that retrieve the names of the public methods declared by the List<int> class and the ArrayList class:

var queryList = from m in typeof(List<int>).GetMethods()
                where m.DeclaringType == typeof(List<int>)
                group m by m.Name into g
                select g.Key;

var queryArray = from m in typeof(ArrayList).GetMethods()
                 where m.DeclaringType == typeof(ArrayList)
                 group m by m.Name into g
                 select g.Key;
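Because each query keeps only g.Key, the grouping serves to collapse overloaded methods into a single name. As a sketch of the same idea, the hypothetical helper below (DeclaredMethodNames, not part of the original source) uses method syntax and the Distinct operator in place of the group clause, and works for any type.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class MethodNameSketch
{
    // Hypothetical helper: the names of the public methods declared
    // directly by the given type, with overloads collapsed to one name.
    public static IEnumerable<string> DeclaredMethodNames(Type type)
    {
        return type.GetMethods()
                   .Where(m => m.DeclaringType == type)
                   .Select(m => m.Name)
                   .Distinct();
    }

    public static void Show()
    {
        var queryList = DeclaredMethodNames(typeof(List<int>));
        var queryArray = DeclaredMethodNames(typeof(ArrayList));

        Console.WriteLine("List<int>: {0} names", queryList.Count());
        Console.WriteLine("ArrayList: {0} names", queryArray.Count());
    }
}
```

Whether you prefer the group clause or Distinct is largely a matter of taste; the helper mainly saves you from writing the same query twice.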

Here is code to retrieve the intersection of these two lists:

var listIntersect = queryList.Intersect(queryArray);

And here is code that displays the resulting sequence:

Console.WriteLine("Count: {0}", listIntersect.Count());
foreach (var item in listIntersect)
{
    Console.WriteLine(item);
}

Alternatively, you could write the query like this:

var listIntersect = (from m in typeof(List<int>).GetMethods()
                     where m.DeclaringType == typeof(List<int>)
                     group m by m.Name into g
                     select g.Key).Intersect(from m in typeof(ArrayList).GetMethods()
                                             where m.DeclaringType == typeof(ArrayList)
                                             group m by m.Name into g
                                             select g.Key);

In either case, the following list would be displayed:

get_Capacity
set_Capacity
get_Count
get_Item
set_Item
Add
AddRange
BinarySearch
Clear
Contains
CopyTo
GetEnumerator
GetRange
IndexOf
Insert
InsertRange
LastIndexOf
Remove
RemoveAt
RemoveRange
Reverse
Sort
ToArray

And here is how to see the items that the generic list supports that are not part of the old-style collection:

var listDifference = queryList.Except(listIntersect);

And here is the result of this query:

ConvertAll
AsReadOnly
Exists
Find
FindAll
FindIndex
FindLast
FindLastIndex
ForEach
RemoveAll
TrimExcess
TrueForAll

Now you have a list of the methods the two classes have in common, and a list showing what the new generic class has that is not part of the older collection. The LINQ set operators made it easy for you to discover this information.
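The same technique works in the other direction. The sketch below is one way to list the methods that ArrayList declares and List<int> does not; ReverseDifferenceSketch, Names and OnlyInArrayList are names made up for this example, and the exact output depends on the version of the framework you run it against.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class ReverseDifferenceSketch
{
    // Hypothetical helper: the unique names of the public methods
    // declared directly by the given type.
    private static IEnumerable<string> Names(Type type)
    {
        return from m in type.GetMethods()
               where m.DeclaringType == type
               group m by m.Name into g
               select g.Key;
    }

    // Method names declared by ArrayList that List<int> does not declare.
    public static List<string> OnlyInArrayList()
    {
        return Names(typeof(ArrayList))
            .Except(Names(typeof(List<int>)))
            .ToList();
    }

    public static void Show()
    {
        foreach (var item in OnlyInArrayList())
        {
            Console.WriteLine(item);
        }
    }
}
```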
