We just shipped an update to our experimental implementation of a multi value dictionary. In this post, our software developer intern Ian Hays talks about the changes. — Immo
Goodbye MultiDictionary
In my last post I went over MultiDictionary, officially available on NuGet as the prerelease package Microsoft.Experimental.Collections. We received great feedback, questions and commentary in the comments, and it was clear that this was something that a lot of you felt passionately about (70 comments? Awesome!). We’ve read all of your comments and taken them into consideration for this next iteration of Microsoft.Experimental.Collections.
You should also check out our interview on Channel 9:
Hello MultiValueDictionary
First off, let’s talk about the name. It was a bit ambiguous what the “Multi” in “MultiDictionary” referred to: at first glance, “multi” could mean there were multiple keys per value, or a dictionary of dictionaries, or that it was a bi-directional dictionary. To make it explicit and leave room for other variants in the future, we’ve renamed the type to MultiValueDictionary to clarify that the type allows multiple values for a single key.
Let’s get right to the meat of the post: what’s changed? We’ll go into some of the major design decisions and changes that make up the new MultiValueDictionary in the next sections.
IEnumerable of…?
MultiDictionary could be thought of as Dictionary<TKey, TValue> where we could have multiple elements with the same TKey. MultiValueDictionary is more akin to a Dictionary<TKey, IReadOnlyCollection<TValue>> with a number of methods to enable easy modification of the internal IReadOnlyCollections. This distinction may seem subtle, but it affects how you consume the data structure.
For example, let’s look at the Count and Values properties. MultiDictionary would return the number of values and a collection of values, while MultiValueDictionary returns the number of keys and a collection of IReadOnlyCollections of values.
// MultiDictionary
var multiDictionary = new MultiDictionary<string, int>();
multiDictionary.Add("key", 1);
multiDictionary.Add("key", 2);
//multiDictionary.Count == 2
//multiDictionary.Values contains elements [1,2]
// MultiValueDictionary
var multiValueDictionary = new MultiValueDictionary<string, int>();
multiValueDictionary.Add("key", 1);
multiValueDictionary.Add("key", 2);
//multiValueDictionary.Count == 1
//multiValueDictionary.Values contains elements [[1,2]]
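If you still need the old flattened numbers, LINQ can rebuild them. The sketch below is illustrative only: it uses a plain Dictionary<string, IReadOnlyCollection<int>> as a stand-in for MultiValueDictionary, and the helper names TotalValueCount and FlattenValues are made up for this example rather than part of the package.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class FlattenSketch
{
    // Total number of values across all keys (what MultiDictionary.Count reported).
    public static int TotalValueCount<TKey, TValue>(
        IEnumerable<KeyValuePair<TKey, IReadOnlyCollection<TValue>>> source)
    {
        return source.Sum(pair => pair.Value.Count);
    }

    // All values as one flat sequence (what MultiDictionary.Values contained).
    public static IEnumerable<TValue> FlattenValues<TKey, TValue>(
        IEnumerable<KeyValuePair<TKey, IReadOnlyCollection<TValue>>> source)
    {
        return source.SelectMany(pair => pair.Value);
    }

    static void Main()
    {
        // Stand-in (assumption) for a MultiValueDictionary holding "key" -> [1, 2].
        var standIn = new Dictionary<string, IReadOnlyCollection<int>>
        {
            { "key", new List<int> { 1, 2 } }
        };

        Console.WriteLine(standIn.Count);                            // 1 (keys)
        Console.WriteLine(TotalValueCount(standIn));                 // 2 (values)
        Console.WriteLine(string.Join(",", FlattenValues(standIn))); // 1,2
    }
}
```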
This behavioral change also affects the enumerator in the same way that it affects the Values property. Previously the dictionary was flattened when enumerating, as it implemented IEnumerable<KeyValuePair<TKey, TValue>>. MultiValueDictionary now implements IEnumerable<KeyValuePair<TKey, IReadOnlyCollection<TValue>>>.
var multiValueDictionary = new MultiValueDictionary<string, int>();
multiValueDictionary.Add("key", 1);
multiValueDictionary.Add("key", 2);
multiValueDictionary.Add("anotherKey", 3);
foreach (KeyValuePair<string, IReadOnlyCollection<int>> key in multiValueDictionary)
{
    foreach (int value in key.Value)
    {
        Console.WriteLine("{0}, {1}", key.Key, value);
    }
}
// key, 1
// key, 2
// anotherKey, 3
As Sinix pointed out in the previous blog post comments, this is very similar to another type in the .NET Framework, ILookup<TKey, TValue>. MultiValueDictionary shouldn’t implement both the dictionary and lookup interfaces, because that would cause it, through interface inheritance, to implement two different versions of IEnumerable: IEnumerable<KeyValuePair<TKey, IReadOnlyCollection<TValue>>> and IEnumerable<IGrouping<TKey, TValue>>. It wouldn’t be clear which version you would get when using foreach. But since MultiValueDictionary logically implements the concept, we’ve added a method AsLookup() to MultiValueDictionary which returns an implementation of the ILookup interface.
var multiValueDictionary = new MultiValueDictionary<string, int>();
multiValueDictionary.Add("key", 1);
multiValueDictionary.Add("key", 2);
multiValueDictionary.Add("anotherKey", 3);
var lookup = multiValueDictionary.AsLookup();
foreach (IGrouping<string, int> group in lookup)
{
    foreach (int value in group)
    {
        Console.WriteLine("{0}, {1}", group.Key, value);
    }
}
// key, 1
// key, 2
// anotherKey, 3
Indexing and TryGetValue
In the first iteration of the MultiDictionary
we followed the precedent from Linq’s AsLookup()
with regards to the way the indexation into the MultiDictionary
worked. In a regular Dictionary
, if you attempt to index into a key that isn’t present you’ll get a KeyNotFoundException
, but like AsLookup()
, the MultiDictionary
returned an empty list instead. This was mostly to match the functionality of the Lookup
class that is conceptually similar to the MultiDictionary
, but also because this behavior was more practically applicable to the kinds of things you’d be using the MultiDictionary
.
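To make the two precedents concrete, here is a small sketch that uses only standard .NET types (Dictionary and a lookup built with LINQ’s ToLookup), not the experimental package. The class name MissingKeySketch is made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class MissingKeySketch
{
    static void Main()
    {
        var pairs = new[]
        {
            new KeyValuePair<string, int>("key", 1),
            new KeyValuePair<string, int>("key", 2),
        };

        // ILookup: indexing a missing key yields an empty sequence.
        ILookup<string, int> lookup = pairs.ToLookup(p => p.Key, p => p.Value);
        Console.WriteLine(lookup["key"].Count());    // 2
        Console.WriteLine(lookup["notkey"].Count()); // 0

        // Dictionary: indexing a missing key throws.
        var dictionary = new Dictionary<string, int> { { "key", 1 } };
        try
        {
            Console.WriteLine(dictionary["notkey"]);
        }
        catch (KeyNotFoundException)
        {
            Console.WriteLine("KeyNotFoundException");
        }
    }
}
```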
With the behavior changes brought on by the MultiValueDictionary and the addition of the AsLookup() method, this old functionality doesn’t quite fit anymore. We heard feedback that this inconsistency between MultiDictionary and Dictionary was confusing, so the MultiValueDictionary will now throw a KeyNotFoundException when indexing on a key that isn’t present. We’ve also added a TryGetValue method to accommodate the new behavior.
var multiValueDictionary = new MultiValueDictionary<string, int>();
multiValueDictionary.Add("key", 1);
//multiValueDictionary["notkey"] throws a KeyNotFoundException
IReadOnlyCollection<int> collection = multiValueDictionary["key"];
multiValueDictionary.Add("key", 2);
//collection contains values [1,2]
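The snippet above does not show TryGetValue itself, so here is a hedged sketch of the pattern. It is written against IReadOnlyDictionary<TKey, IReadOnlyCollection<TValue>> and exercised with a plain Dictionary as a stand-in; whether MultiValueDictionary can be passed in directly is an assumption. GetValuesOrEmpty is a hypothetical helper, not something the package ships.

```csharp
using System;
using System.Collections.Generic;

static class TryGetValueSketch
{
    // Hypothetical helper: the values for a key, or an empty collection
    // when the key is absent (instead of a KeyNotFoundException).
    public static IReadOnlyCollection<TValue> GetValuesOrEmpty<TKey, TValue>(
        IReadOnlyDictionary<TKey, IReadOnlyCollection<TValue>> source, TKey key)
    {
        IReadOnlyCollection<TValue> values;
        if (source.TryGetValue(key, out values))
        {
            return values;
        }
        return new TValue[0];
    }

    static void Main()
    {
        // Stand-in (assumption) for a MultiValueDictionary holding "key" -> [1].
        var standIn = new Dictionary<string, IReadOnlyCollection<int>>
        {
            { "key", new List<int> { 1 } }
        };

        // Plain TryGetValue: no exception for a missing key.
        IReadOnlyCollection<int> found;
        if (!standIn.TryGetValue("notkey", out found))
        {
            Console.WriteLine("notkey is not present");
        }

        // The helper lets callers iterate without checking first.
        Console.WriteLine(GetValuesOrEmpty(standIn, "key").Count);    // 1
        Console.WriteLine(GetValuesOrEmpty(standIn, "notkey").Count); // 0
    }
}
```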
Another related change with the MultiValueDictionary on the topic of the indexer is the return value. Previously we returned a mutable ICollection<TValue>. Adding and removing values from the returned ICollection<TValue> updated the MultiDictionary. While there are uses for this functionality, it can be unexpected and create unintentional coupling between parts of an application. To address this we’ve changed the return type to IReadOnlyCollection<TValue>. The read-only collection will still update with changes to the MultiValueDictionary.
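A read-only view that still reflects later changes is a common .NET pattern. The sketch below shows the general idea with List<T>.AsReadOnly(); it is not a claim about how MultiValueDictionary implements its view internally.

```csharp
using System;
using System.Collections.Generic;

static class LiveViewSketch
{
    static void Main()
    {
        var inner = new List<int> { 1 };

        // Callers only see a read-only wrapper around the list.
        IReadOnlyCollection<int> view = inner.AsReadOnly();
        Console.WriteLine(view.Count); // 1

        // The owner mutates the list; the wrapper reflects the change.
        inner.Add(2);
        Console.WriteLine(view.Count); // 2
    }
}
```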
When a List just doesn’t cut it
One limitation of the MultiDictionary
was that internally, it used a Dictionary<TKey, List<TValue>>
and there was no way to change the inner collection type. With the MultiValueDictionary
we’ve added the ability to specify your own inner collection.
Showing a simple example of how they work is probably easier than trying to describe them first, so let’s do that.
var multiValueDictionary = MultiValueDictionary<string, int>.Create<HashSet<int>>();
multiValueDictionary.Add("key", 1);
multiValueDictionary.Add("key", 1);
//multiValueDictionary["key"].Count == 1
Above, we replace the default List<TValue> with a HashSet<TValue>. As the example shows, HashSet combines duplicate TValues.
For every constructor there is a parallel generic static Create method that takes the same parameters but allows specification of the interior collection type. It’s important to point out that this doesn’t affect the return value of the indexer/TryGetValue though (they return very limited IReadOnlyCollections regardless of the inner collection type).
If you want a little bit more control over how your custom collection is instantiated, there are also the more specific Create methods that allow you to pass a delegate to specify the inner collection type:
var multiValueDictionary = MultiValueDictionary<string, int>.Create<HashSet<int>>(myHashSetFactory);
multiValueDictionary.Add("key", 1);
multiValueDictionary.Add("key", 1);
//multiValueDictionary["key"].Count == 1
In either case, the specified collection type must implement ICollection<TValue> and must not have IsReadOnly set to true by default.
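To illustrate how a factory delegate can drive the inner collection, here is a heavily simplified sketch. TinyMultiMap is a hypothetical stand-in written for this example, not the real MultiValueDictionary, and the factory is typed as Func<ICollection<int>> here, which may differ from the delegate type the package expects.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, minimal stand-in; not the real MultiValueDictionary.
class TinyMultiMap<TKey, TValue>
{
    private readonly Dictionary<TKey, ICollection<TValue>> _inner =
        new Dictionary<TKey, ICollection<TValue>>();
    private readonly Func<ICollection<TValue>> _factory;

    public TinyMultiMap(Func<ICollection<TValue>> factory)
    {
        // Mirror the stated rule: the inner collection must be writable.
        if (factory().IsReadOnly)
        {
            throw new InvalidOperationException("Inner collection must not be read-only.");
        }
        _factory = factory;
    }

    public void Add(TKey key, TValue value)
    {
        ICollection<TValue> values;
        if (!_inner.TryGetValue(key, out values))
        {
            // First value for this key: ask the factory for a new inner collection.
            values = _factory();
            _inner.Add(key, values);
        }
        values.Add(value);
    }

    public int CountFor(TKey key)
    {
        ICollection<TValue> values;
        return _inner.TryGetValue(key, out values) ? values.Count : 0;
    }
}

static class FactorySketch
{
    static void Main()
    {
        Func<ICollection<int>> myHashSetFactory = () => new HashSet<int>();

        var withSet = new TinyMultiMap<string, int>(myHashSetFactory);
        withSet.Add("key", 1);
        withSet.Add("key", 1);
        Console.WriteLine(withSet.CountFor("key")); // 1 (set drops the duplicate)

        var withList = new TinyMultiMap<string, int>(() => new List<int>());
        withList.Add("key", 1);
        withList.Add("key", 1);
        Console.WriteLine(withList.CountFor("key")); // 2 (list keeps both)
    }
}
```

The constructor check rejects factories that produce read-only collections, such as arrays, which matches the IsReadOnly requirement described above.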
And that’s all!
You can download the new MultiValueDictionary from NuGet and try it out for yourself! If you have any questions or if you just want to give feedback, please leave a comment or contact us.
Thanks! It's surprisingly a lot of work to be done to make the collection usable as every 'simple' feature turns out to be not so easy:)
The new name is better, less confusing; but I'd still rather call that a Lookup than a Dictionary. It's a shame that the name Lookup<TKey, TValue> is already taken, because that would have been a good choice.
I don't like the new semantics. The goal of this class is to associate multiple values to a key. But in this new version, the public interface doesn't really reflect that intent; instead, it mostly reflects the internal implementation, i.e. the fact that it's basically a dictionary where the values are collections. IMO, this is wrong: when you ask "give me all the values associated with that key", you expect to receive a collection; if no values are associated with that key, you should receive a collection with 0 element, not an exception. The exception makes sense for Dictionary, because you expect exactly 1 value, and there is no other way to convey the fact that there is no such value; but when you expect multiple values, there is no reason to treat 0 as a special case: an empty collection is much more convenient to the caller than an exception. I could understand the will to make it consistent with Dictionary, but IMO it's more important to make it easy to use than to make it consistent with something that is only vaguely similar. And this class would definitely be easier to use as a Lookup than as a Dictionary…
(forgot that part in my previous post…)
The possibility to choose the type of collection for the values is a nice touch. However I think the factory delegate should take the key as a parameter; it doesn't cost anything, and it's more flexible.
I also do not like the KeyNotFoundException. When I read these comments on the MultiDictionary, I thought the guys did not really understand the purpose of this class. This is NOT a traditional dictionary so of course its semantics are different. I have implemented a very similar class like this myself so I know what I could use it for. Having to use TryGet is always a bit cumbersome with the current C# semantics (looking forward to C# vNext). It is much more convenient to just check the size of the returned collection.
I agree with Thomas KeyNotFoundException is stupid. Exceptions are for… exceptional situations. A key not being present in a dictionary is a pretty standard scenario.
Awesome stuff!
Looks good, except for the KeyNotFoundException, which will cause unwieldy code only for the sake of consistency with an only slightly related type. It would be a lot simpler to just return an empty collection if the key isn't found
I like the changes, even the 'KeyNotFoundException'. To me trying to retrieve a value for a key that was never stored is an exception/unanticipated situation in code. This behavior is also inline with my expectations of a Dictionary (be it regular or MultiValueDictionary). I assume we can use the ContainsKey and TryGetValue methods like we did with a Dictionary where key not present is a check.
I welcome the new name, I had always felt that MultiDictionary was ambiguous when it was introduced.
Overall good changes, Cheers.
Not a fan of the KeyNotFoundException from a productivity stand point. In most cases you will be doing a for each on the values for a given key when consuming the MultiDictionary. Doing so will now require you to first do a TryGetValue, then check that it returned True, store the returned value and then iterate. I've been using PowerCollections' MultiDictionary for 10 years so maybe I'm biased but I think simplicity of use is more important than "being like a dictionary".
The create method that allows you to specify the type/factory for the ValueCollection is neat though.
I agree that it should not throw a KeyNotFoundException. 100% agree with Thomas Levesque's arguments. Like Guy Godin, I'm a user of PowerCollection's MultiDictionary which doesn't throw an exception and this behavior is exactly what I needed in the past.
@Chris Marisic, a value in a dictionary without a key is common? This IS an exception! Think about a real dictionary, where you have a definition of something without its name.
KeyNotFoundException should be thrown for sure.
I miss a copy constructor. It would be nice to copy a MultiValueDictionary like this:
var asd = new MultiValueDictionary<string, int>(oldMultiValueDictionary);
It also works for List<> and Dictionary<,>.
Keep it simple => vote for removal of KeyNotFoundException
I really like the new datatype.
However I am missing an easy way to initialize the MultiValueDictionary using object initializers.
With a dictionary it is possible to use the following syntax:
private Dictionary<int, int[]> graph = new Dictionary<int, int[]>()
{
{1, new []{2,3,4}}
};
private MultiValueDictionary<int, int> graph = new MultiValueDictionary<int, int>()
{
//not possible?
};
Any suggestions?
With the latest update, it's not worth to use MultiValueDictionary, you still need to add validations on keys not found, why not use Dictionary<T, List<Y>> instead? no extra dependencies needed. And it's easier to create Dictionary<T, Stack<Y>> without MultiValueDictionary.
If the KeyNotFoundException is necessary to keep consistency on IDictionary, why not remove the IDictionary interface declaration? I thought the reason behind MultiValueDictionary was to have a simple to use way to manage multiple values associated with one key, the code to add and read values should be clean and easy to use. I don't know, OneToMany<string, int>?
any one from ms can comment on the latest feedback/questions that are posted?
Is the source code of this class available somewhere? I could need it right now in my project, but it's .NET 4.0 and I wouldn't want to use Nuget or even add another DLL just for this class. Otherwise I'd need to implement it myself (maybe inspired by your announcement) and repeat your work already done.
Hi, when using this in a helper in asp.net mvc like this
@helper fn(MultiValueDictionary<t,t>p1,int idx)
I get the same error as described here http://www.lyalin.com/2014/04/25/the-type-system-object-is-defined-in-an-assembly-that-is-not-reference-mvc-pcl-issue/ although the fix breaks multivaluedictionary
Will it be a part of the next .NET Framework release? Because it's needed.
What I currently have to do (in C#) to retrieve values from a MultiValueDictionary:
public ObservableCollection<Word> SelectWords(string languageName)
{
    ObservableCollection<Word> words = new ObservableCollection<Word>();
    // if none on database, check Globals >> code excluded for brevity<<
    if (!(words.Count() > 0))
    {
        // get this language's words from the globals multidictionary
        try
        {
            var myWords = Globals.AvailableWords[languageName];
            // consume readonly collection if key found
            foreach (var item in myWords)
            {
                words.Add(item);
            }
        }
        catch (Exception e)
        {
            ; // no need to do anything as exception thrown if languageName (key) not in multivalueDictionary
        }
    }
    return words;
}
What I'd like to do, replace the above method with just one line 🙂
return words = Globals.AvailableWords[languageName] as ObservableCollection<Word>;
There were no updates since 10 months now, and the version on nuget.org is still "alpha". Has MultiValueDictionary been cancelled? Will there be any updates?
@Martin
Sorry for the long silence. Since last year, the primary focus of our team has been to bring up the open source and cross-plat .NET Core stack. We've prioritized existing and widely used APIs over adding new features. I'm happy to say that Ian is now back as a full-time employee, so you can expect some more movement in this area.
I don't want to promise any time lines as the priorities are still the same. There is a lot of work left to finish porting functionality into an open engineering system as well as providing a fully cross-platform stack.
@Immo Thank you!
Will this be added to .Net Core?
Ah,
I see it’s now part of CoreFX labs: https://github.com/dotnet/corefxlab