Fun with LINQ and Distinct()

I ran into a weird scenario in some code today.  The code used an odd structure, a list of lists, and we wanted to query it for the distinct items across all of the inner lists.  Lemme explain.

Imagine a list of items represented as an IEnumerable<SomeType>.  Now, imagine a bunch of those contained within a generic List.  The result is a list of lists.

[Image: a blue outer list containing several green inner lists]

The signature of this list is:

 List<IEnumerable<KeyValuePair>>

Given this weird structure, find all the distinct values.  In the old days, I would have created a temporary list and iterated through each item in the outer list (the blue list), then each item in its contained list (the green list).  If the value was not in my temporary list, I would add it.  At the end of iterating all items, I would have a list of distinct items.  With LINQ and the Distinct() operator, this is incredibly easy to do.
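For comparison, here is a rough sketch of that old-school approach.  The ManualDistinct class, its DistinctByValue method, and the sample data are made up for illustration; they are not from the code I was looking at.  The KeyValuePair class is the same one defined in this post.

```csharp
using System;
using System.Collections.Generic;

class KeyValuePair
{
    public string Key { get; set; }
    public int Value { get; set; }
}

// Hypothetical helper, for illustration only.
static class ManualDistinct
{
    // Keeps the first item seen for each distinct Value, using nested loops
    // and a temporary list.
    public static List<KeyValuePair> DistinctByValue(List<IEnumerable<KeyValuePair>> blueList)
    {
        List<KeyValuePair> distinct = new List<KeyValuePair>();

        foreach (IEnumerable<KeyValuePair> greenList in blueList)
        {
            foreach (KeyValuePair greenItem in greenList)
            {
                // Linear scan of the temporary list for an item with the same Value.
                bool alreadySeen = false;
                foreach (KeyValuePair seen in distinct)
                {
                    if (seen.Value == greenItem.Value)
                    {
                        alreadySeen = true;
                        break;
                    }
                }

                if (!alreadySeen)
                {
                    distinct.Add(greenItem);
                }
            }
        }

        return distinct;
    }
}

class Program
{
    static void Main()
    {
        // Made-up sample data: two green lists inside one blue list.
        var blueList = new List<IEnumerable<KeyValuePair>>
        {
            new List<KeyValuePair>
            {
                new KeyValuePair { Key = "Prop0", Value = 1 },
                new KeyValuePair { Key = "Prop1", Value = 2 }
            },
            new List<KeyValuePair>
            {
                new KeyValuePair { Key = "Prop0", Value = 2 },
                new KeyValuePair { Key = "Prop1", Value = 3 }
            }
        };

        // Prints Prop0=1, Prop1=2, Prop1=3 (one line each).
        foreach (KeyValuePair item in ManualDistinct.DistinctByValue(blueList))
        {
            Console.WriteLine(item.Key + "=" + item.Value);
        }
    }
}
```

It works, but the bookkeeping (the flag, the inner scan, the temporary list) is exactly the kind of noise that Distinct() hides.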

First, let’s define the class that holds one of the green squares.

    class KeyValuePair
    {
        public string Key { get; set; }
        public int Value { get; set; }
    }

Next, create a class that defines how to compare two of our objects.  This class implements the IEqualityComparer<YourTypeNameHere> interface.  Here, two items count as equal when their Value matches, regardless of Key.

    class MyComparer : IEqualityComparer<KeyValuePair>
    {
        public bool Equals(KeyValuePair x, KeyValuePair y)
        {
            return x.Value == y.Value;
        }

        public int GetHashCode(KeyValuePair obj)
        {
            return obj.Value.GetHashCode();
        }
    }
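To see what the comparer actually changes, here is a small sketch with made-up sample items (a, b, and c are illustrative only).  Without a comparer, Distinct() falls back to the default equality for the type, which for a plain class like this one is reference equality.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class KeyValuePair
{
    public string Key { get; set; }
    public int Value { get; set; }
}

// Same comparer as in the post: equality is based on Value only.
class MyComparer : IEqualityComparer<KeyValuePair>
{
    public bool Equals(KeyValuePair x, KeyValuePair y)
    {
        return x.Value == y.Value;
    }

    public int GetHashCode(KeyValuePair obj)
    {
        return obj.Value.GetHashCode();
    }
}

class Program
{
    static void Main()
    {
        // Made-up sample items: a and b share a Value but have different Keys.
        var a = new KeyValuePair { Key = "Prop0", Value = 3 };
        var b = new KeyValuePair { Key = "Prop7", Value = 3 };
        var c = new KeyValuePair { Key = "Prop0", Value = 4 };
        var comparer = new MyComparer();

        Console.WriteLine(comparer.Equals(a, b));   // True: same Value
        Console.WriteLine(comparer.Equals(a, c));   // False: different Value

        // With the comparer, a and b collapse into one item.
        Console.WriteLine(new[] { a, b, c }.Distinct(comparer).Count());   // 2

        // Without it, all three are distinct object references.
        Console.WriteLine(new[] { a, b, c }.Distinct().Count());           // 3
    }
}
```

Note that GetHashCode has to agree with Equals: items that compare equal must return the same hash code, which is why both methods look only at Value.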

Now, let’s define and fill our weird structure, which is (from our picture above) the blue list of green lists.

static List<IEnumerable<KeyValuePair>> FillList()
{
    List<IEnumerable<KeyValuePair>> weirdStructure = new List<IEnumerable<KeyValuePair>>();

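    // Create the generator once, outside the loop.  Re-creating it with the
    // same seed on every pass would fill all five lists with identical values.
    Random r = new Random(3);

    for (int i = 0; i < 5; i++)
    {
        List<KeyValuePair> dictionary = new List<KeyValuePair>();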

        for (int j = 0; j < 20; j++)
        {
            dictionary.Add(new KeyValuePair
            {
                Key = "Prop" + j.ToString(),
                Value = r.Next(0, 5)
            });
        }
        weirdStructure.Add(dictionary);
    }

    return weirdStructure;
}

Now, here’s the really cool part (and the point of this whole post).  How in the world can you query across all the items in all the green lists contained in the blue list?  Further, what if we want to get a list of distinct values across all of the lists?  It’s easy using the Distinct operator in LINQ.

static void Main(string[] args)
{
    List<IEnumerable<KeyValuePair>> blueList = FillList();

    IEnumerable<KeyValuePair> items = (from greenList in blueList
                from greenItem in greenList
                select greenItem).Distinct(new MyComparer());

    foreach (KeyValuePair item in items)
    {
        Console.WriteLine(item.Key + "=" + item.Value);
    }
}
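The query expression above, with its two from clauses, is translated by the compiler into a call to SelectMany, so you can also write it in method syntax.  Here is a minimal sketch; the FlattenExamples class and the sample data are hypothetical names I made up for illustration.  It also shows a variation: if all you need is the distinct integers rather than the items, project to Value first and you don't need a comparer at all.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class KeyValuePair
{
    public string Key { get; set; }
    public int Value { get; set; }
}

// Same comparer as in the post: equality is based on Value only.
class MyComparer : IEqualityComparer<KeyValuePair>
{
    public bool Equals(KeyValuePair x, KeyValuePair y)
    {
        return x.Value == y.Value;
    }

    public int GetHashCode(KeyValuePair obj)
    {
        return obj.Value.GetHashCode();
    }
}

// Hypothetical helper class, for illustration only.
static class FlattenExamples
{
    // Method-syntax version of the query: flatten the green lists,
    // then keep one item per distinct Value.
    public static IEnumerable<KeyValuePair> DistinctItems(List<IEnumerable<KeyValuePair>> blueList)
    {
        return blueList.SelectMany(greenList => greenList)
                       .Distinct(new MyComparer());
    }

    // Variation: project to the int Value first, so the default
    // int equality does the work and no custom comparer is needed.
    public static IEnumerable<int> DistinctValues(List<IEnumerable<KeyValuePair>> blueList)
    {
        return blueList.SelectMany(greenList => greenList)
                       .Select(item => item.Value)
                       .Distinct();
    }
}

class Program
{
    static void Main()
    {
        // Made-up sample data: two green lists inside one blue list.
        var blueList = new List<IEnumerable<KeyValuePair>>
        {
            new List<KeyValuePair>
            {
                new KeyValuePair { Key = "Prop0", Value = 1 },
                new KeyValuePair { Key = "Prop1", Value = 2 }
            },
            new List<KeyValuePair>
            {
                new KeyValuePair { Key = "Prop0", Value = 2 },
                new KeyValuePair { Key = "Prop1", Value = 3 }
            }
        };

        Console.WriteLine(FlattenExamples.DistinctItems(blueList).Count());   // 3

        // Sorted so the output order is explicit: prints 1, 2, 3 (one per line).
        foreach (int value in FlattenExamples.DistinctValues(blueList).OrderBy(v => v))
        {
            Console.WriteLine(value);
        }
    }
}
```

Which form you pick is mostly taste.  The query syntax reads nicely for nested collections, while the method chain makes the flattening step explicit.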

What would have taken at least 10 lines of looping code to build a unique list of items turned into a single LINQ query (plus the small comparer class).

For More Information

SelectManyOperator – Hooked on LINQ

101 LINQ Samples

Use LINQ’s SelectMany Method to “Flatten” Collections