Building Tuple [Matt Ellis]


For readers who are interested in the work that goes into designing a feature, I wrote an article for MSDN Magazine that appears in this month’s issue.  Check out CLR Inside Out: Building Tuple, which introduces the new Tuple type and discusses the design work behind it.

I’d love to hear feedback on what you think about the article and if you’d like to see more behind the scenes design articles in the future.  I’d also love to answer any questions about the design or why we made the decisions we did.

Also, there is one change we are thinking about making to Tuple between Beta 1 and Beta 2.  In Beta 1 we have a factory method, Tuple.Create, which builds tuples and has some nice type inference properties.  In Beta 1 the overload of this method that takes eight arguments requires the last argument to be a tuple and builds an extended tuple.  For example:

Tuple.Create(1, 2, 3, 4, 5, 6, 7, Tuple.Create(8));

Will build an eight element tuple that looks like [1, 2, 3, 4, 5, 6, 7, 8].

Tuple.Create(1, 2, 3, 4, 5, 6, 7, Tuple.Create(8, 9));

Will build a nine element tuple that looks like [1, 2, 3, 4, 5, 6, 7, 8, 9]

For Beta 2 we hope to change this so that the eight-argument version of Tuple.Create always builds an eight element tuple.  In this case:

Tuple.Create(1, 2, 3, 4, 5, 6, 7, 8);

Will build an eight element tuple that looks like [1, 2, 3, 4, 5, 6, 7, 8]

Tuple.Create(1, 2, 3, 4, 5, 6, 7, Tuple.Create(8));

Will build an eight element tuple that looks like [1, 2, 3, 4, 5, 6, 7, [8]].

Tuple.Create(1, 2, 3, 4, 5, 6, 7, Tuple.Create(8, 9));

Will build an eight element tuple that looks like [1, 2, 3, 4, 5, 6, 7, [8, 9]]

If you want to build tuples with more than eight elements and your language doesn’t have special tuple syntax, you’ll have to use the Tuple constructors directly.  If we see lots of people doing this we’ll add more overloads to Tuple.Create.
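As a rough illustration of "using the Tuple constructors directly", here is a minimal sketch of a nine element tuple. It assumes the eight-parameter generic Tuple type takes a nested tuple as its last type argument and exposes it through a Rest property, as the released framework does; the class and method names (LongTupleDemo, BuildNine) are made up for this example.

```csharp
using System;

public static class LongTupleDemo
{
    // Nine elements: seven direct items plus a nested two-element tuple
    // carried in the eighth (Rest) slot.
    public static Tuple<int, int, int, int, int, int, int, Tuple<int, int>> BuildNine()
    {
        return new Tuple<int, int, int, int, int, int, int, Tuple<int, int>>(
            1, 2, 3, 4, 5, 6, 7, Tuple.Create(8, 9));
    }

    public static void Main()
    {
        var nine = BuildNine();
        Console.WriteLine(nine.Item7);      // seventh element
        Console.WriteLine(nine.Rest.Item1); // eighth element
        Console.WriteLine(nine.Rest.Item2); // ninth element
    }
}
```

The type arguments have to be spelled out because C# does not infer them on constructors, which is the gap the Tuple.Create factory method fills for shorter tuples.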

Thanks for reading; I hope everyone enjoys the article, and I look forward to your questions and comments.  Cheers!

Comments (28)

  1. MichaelGG says:

    Yes! I liked this article, and love these types of articles. Understanding why something is the way it is really helps in many cases. I’m not too thrilled about having 2- and 3-tuples be reference types, but if the F# team said it didn’t bother them, I guess that’s good enough for now…

    On the C# side, it’s a bit conflicting how we use them now, since we count on them being reference types for memory impact. But I guess we’ll just make specific structs or a StructTuple for those cases when needed.

    What I didn’t understand is why tuples don’t act like structs as far as comparison goes.

    Tuple.Create(1, 2) != Tuple.Create(1, 2) // this is totally crazy.

    Really, that needs to be looked at again. It seems like a source of bugs and confusion for == not to work, but .Equals to work. It severely limits how we’d want to use Tuples in C# in general.

    Thanks!

  2. Steven says:

    This is a good change you made. From a usability perspective the old behavior is really confusing.

    It’s actually a pity that we need this factory method. It would be much nicer if C# and VB supported inference on constructors, but I understand the trouble we’d get into with such a language feature.

  3. zproxy says:

    I’d really like to build tuples like this:

    var tuple = new { "element1", "element2" };

    var e1 = tuple.Item1;

    var e2 = tuple.Item2;

    The factory API could then be called behind the scenes.

  4. Alex O. says:

    In my opinion, the Beta 2 version of Tuple.Create() is better, since the outcome of the first one would really leave me dumbstruck the first time I used it.

    I am also really glad that you rejected the idea of naming the tuple properties with English numeral names – there are a lot of non-English folks out there working with .NET and expecting them to really understand the numeral word permutations is a risky bet.

    Why not use an indexed or a collection property instead of the individual ItemX properties?

  5. pminaev says:

    I’ve read the article.

    I still don’t like the idea of Tuple being a reference type. It means that there’s one more type for which I will have to do recurrent pointless null checks for no good reason. In my opinion, Tuple should be a canonical example of a type that is absolutely, clearly a value type and nothing else.

    In any case, overloading == and != for Tuple is a must regardless of whether it’s a value or a reference type. It’s a general rule of thumb when overriding Object.Equals (doesn’t the C# compiler warn you about this?), it is a clear indication to the user that the type has value semantics, and other BCL and FCL reference-but-really-value types do it (e.g. System.Uri, or System.Xml.Linq.XName).

  6. commongenius says:

    I was very surprised to read in the article that Tuple will be a reference type. The article focused on the performance aspects, and concluded that there was no significant performance loss to making it a reference type. My question is, why would you want it to be a reference type in the first place? As the Framework Design Guidelines point out, and MS DevDiv members such as Eric Lippert have blogged about repeatedly, the choice between value type and reference type should be about semantics, not implementation, since implementations change. Surely Tuple fits the semantics of a value type better than those of a reference type? MichaelGG’s example illustrates that perfectly. I am forced to ask: if Tuple is not a value type, what is? Why do value types exist at all?

    "…we were unable to find compelling reasons for Tuple to implement interfaces like IEquatable<T> and IComparable<T>, even though it overrides Equals and implements IComparable."

    I would think that the compelling reason is the one that was just stated: the type already implements IComparable, so for consistency it should implement IComparable<T>. Again, semantics and usability should be the primary design goal, not performance. Implementing the generic interface should not be rejected unless it can be proven that doing so will cause hard performance goals to not be met. This is what Microsoft designers have been preaching for years. The article gives the impression that the decision to not implement the generic interfaces was made out of fear, not after testing the actual implementation against pre-defined metrics. Perhaps the article gave the wrong impression?

  7. @Alex O

    The reason we didn’t use an indexed or collection property is that the only sensible return type for it would be Object, so you’d lose the nice type-safe properties of Tuple when you pulled items back out.

  8. @MichaelGG,

    Regarding the Value vs. Reference type decision, as I pointed out in the article, we did consider a split design where two and three element tuples would be value types, but the rest would be reference types, but there was strong pushback from the language teams about that due to the confusing semantic issues.

    @MichaelGG, @pminaev,

    With respect to overloading == and !=, there’s a comment in the design guidelines that addresses this:

    Section 8.10.2, which deals with equality operators on reference types.  In my copy of the book it says to consider not overloading equality operators on reference types, even if you override Equals or implement IEquatable<T>, and to avoid overloading the operators if the implementation would be significantly slower than that of reference equality.

    On the relevant MSDN page[1] it says: "Most languages do provide a default implementation of the equality operator (==) for reference types. Therefore, you should use care when implementing the equality operator (==) on reference types. Most reference types, even those that implement the Equals method, should not override the equality operator (==)."

    Now perhaps it makes sense to break this guideline if we want the type to feel more like a value type.  I’ll discuss this issue with the team and see if we want to make that change.

    [1]: http://msdn.microsoft.com/en-us/library/7h9bszxx.aspx

  9. MichaelGG says:

    Re: Null — yea, it sucks. But null sucks in general, and it’s something we just have to deal with. I don’t care as much since I’m mostly using F#, so a lot of these issues don’t bother me.

    I agree that having a split where 2- and 3-tuples are structs and larger ones are reference types would probably be more hassle and confusion than it’s worth.

    Don Syme wrote a bit about why F# tuples are reference types. Part of the explanation was that passing around large tuples could be costly when making function calls.

    (Supposedly in the future, F# will have a quick way to specify value-type records/tuples.)

    Honestly, I think it’d just be fantastic if the JIT knew about tuples and could selectively do smart things with them (like the F# compiler appears to always decompose arguments that are tuples).

    From a C# perspective, the non-null and equality semantics seem really, really weird. I’m unaware of anyone who thinks of a tuple as a reference type semantically — it’s a sequence of values! Please, please reconsider this part of the design. That design guideline page also says:

    "Consider implementing operator overloading for the equality (==), not equal (!=), less than (<), and greater than (>) operators when you implement IComparable."

    As well as:

    "Override the equality operator (==) if your type is a base type such as a Point, String, BigNumber, and so on" — I’d say Tuple is about as basic a type you get…

    I also didn’t really understand why only the non-generic interfaces would be implemented.

    Anyways, thanks for discussing this and publishing details!

  10. Matt Ellis says:

    My feedback: I had to check that a) I didn’t actually work for Microsoft, b) I hadn’t designed and built a Tuple class and c) I hadn’t written an article for msdn.

    It took a moment, but I got there.

    Matt "not that one" Ellis

  11. pminaev says:

    I understand the reasoning behind making it a reference type now, thank you.

    The reason why Tuple should get an overloaded operator== regardless is simply because the default reference-equality version is nonsensical for a Tuple (as it is for any other immutable reference type that represents a value). There’s absolutely no benefit in being able to determine that two Tuple variables reference the same instance – there’s nothing useful you can derive from that information. I would again like to point out classes such as Uri, XName and XNamespace that do that correctly. If FDG does not cover this case, it is a fault in FDG.

    As a side note, anonymous classes in C# do not redefine operator== to mean structural equality, even though they do redefine Object.Equals. However, when I created a Connect ticket about it (https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=349014), Mads Torgersen replied that, in retrospect, they should indeed have made it so, even though they cannot change it now for back-compat reasons. Please don’t fall into the same trap! 😉

  12. Good article, I like this kind of in-depth article.

    I agree with the other commenters – Tuple should be a struct, and Tuple.Create(1, 42) == Tuple.Create(1, 42) should be true.

  13. Alex O. says:

    C# 3.0 added anonymous types that have a pretty straightforward way of defining class member names and use type inference, plus it does not have a built-in limitation on the number of members that can be specified.

    So, instead of creating a tuple as yet another language abstraction, why not just enhance the anonymous type mechanism by allowing the anonymous types to expose the required interfaces (e.g. IStructuralComparer etc.) and add ability to pass instances of such types around as strongly typed entities?

    Cheers,

    Alex

  14. pminaev says:

    Alex, that (passing around anonymous types) would require structural type equivalence if you want to be able to do that seamlessly between assemblies. And CLR is very much centered around the notion of nominal typing, though NoPIA is a very restricted form of structural typing (which anonymous classes won’t be able to reuse). So this would require a fairly major change to the CLR to implement.

  15. commongenius says:

    "So this would require a fairly major change to the CLR to implement."

    Unless anonymous types in C# used Tuple underneath, which is essentially what F# already does, and would make a lot of sense. Unfortunately it would probably be a breaking change at this point.

  16. Robert Bullen says:

    Question 1

    ———-

    If Tuples are reference types, do they take advantage of inheritance? For example, does the 3-Tuple class derive from 2-Tuple, thereby inheriting properties Item1 and Item2 and introducing only Item3? In other words, is the following true: 3-Tuple is-a 2-Tuple?

    This feature would allow the following usage:

    public void DoSomething(Tuple<int, int> point2d) { … }

    var point3d = new Tuple<int, int, int>(1, 2, 3);

    DoSomething(point3d);

    Question 2

    ———-

    If Tuples are immutable, do they decorate their type arguments as covariant (out)?

    This feature would allow the following usage:

    public class Base { … }

    public class Derived : Base { … }

    public void DoSomething(Tuple<Base, Base> baseTuple) { … }

    var derivedTuple = new Tuple<Derived, Derived>(new Derived(), new Derived());

    DoSomething(derivedTuple);

    Not that there would be high demand for either feature. Although the first feature could reduce memory pressure by avoiding the need to create subset tuple instances. And the second feature could be worked around by using generic methods with constraints like this:

    public void DoSomething<T1, T2>(Tuple<T1, T2> tuple) where T1 : Base where T2 : Base { … }

    At any rate, I’m still curious: were these ideas considered, and why were they accepted/rejected?

  17. @Robert Bullen

    Regarding your first point: I don’t think we thought much about that case, but it is interesting feedback.  I wonder, however, whether it would make more sense to enclose your data in a first-class type instead of using Tuple if you find you need to do this a lot.

    Regarding the second point: we actually did think a lot about doing this.  However, we currently only support co- and contravariance on interfaces and delegates, and we ultimately felt it wasn’t worth it to include a corresponding set of ITuple interfaces right now.  We are thinking about variance when we design new data structures; it just didn’t seem worth the extra level of plumbing right now.
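To make the "extra level of plumbing" concrete, here is a minimal sketch of what a covariant tuple interface could look like. IPair and Pair are hypothetical names invented for this example and are not part of the BCL; the sketch only shows that the `out` annotation has to live on an interface rather than on the class itself.

```csharp
using System;

// Hypothetical covariant view of a two-element tuple (not a BCL type).
public interface IPair<out T1, out T2>
{
    T1 Item1 { get; }
    T2 Item2 { get; }
}

// Hypothetical immutable implementation; a class cannot declare variance itself.
public sealed class Pair<T1, T2> : IPair<T1, T2>
{
    public Pair(T1 item1, T2 item2) { Item1 = item1; Item2 = item2; }
    public T1 Item1 { get; private set; }
    public T2 Item2 { get; private set; }
}

public class Base { }
public class Derived : Base { }

public static class VarianceDemo
{
    // Accepts any pair whose items can be viewed as Base.
    public static string Describe(IPair<Base, Base> pair)
    {
        return pair.Item1.GetType().Name + ", " + pair.Item2.GetType().Name;
    }

    public static void Main()
    {
        IPair<Derived, Derived> derivedPair =
            new Pair<Derived, Derived>(new Derived(), new Derived());

        // Covariance allows the IPair<Derived, Derived> to be passed
        // where an IPair<Base, Base> is expected.
        Console.WriteLine(Describe(derivedPair)); // prints "Derived, Derived"
    }
}
```

Passing a Tuple<Derived, Derived> to a method that takes Tuple<Base, Base> does not compile, because Tuple is a class and so cannot carry variance annotations.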

  18. pminaev says:

    @commongenius:

    > Unless anonymous types in C# used Tuple underneath, which is essentially what F# already does, and would make a lot of sense. Unfortunately it would probably be a breaking change at this point.

    But that would make two anonymous objects with different field names assignment-compatible and comparable so long as they have the same field count and types. I.e.:

    new { a = 1, b = 2 } Equals new { c = 1, d = 2 }

    Which is probably more relaxed than is desired.

    Furthermore, as I understand, even today the compiler has to do some field reordering sometimes to ensure things like:

     new { a = 1, b = 2} Equals new { b = 2, a = 1 }

    Obviously it has to generate a single class for both instances (because it is required by the spec so long as both instances are in source code that’s being compiled together for a single output module/assembly), so it will have to pick either "a,b" or "b,a" as a definite ordering. And this may actually happen in different places; i.e.:

      // file1.cs

      object foo() { return new { a = 1, b = 2 }; }

      // file2.cs

      object bar() { return new { b = 1, a = 2}; }

      // file3.cs

      foo().Equals(bar())

    If it will instead try to map them onto tuples itself, it will have the same issue (of deciding on some definite ordering), which means that comparisons between two anonymous instances with different field names will become completely unpredictable:

     new { a = 1, b = 2 } ? new { c = 1, d = 2 }

    The above now depends on the ordering of a/b and c/d, which may be defined by some other code elsewhere.

  19. Hello BCL Team!

    Could you discuss the interaction between features like this one and the new collectible dynamic assemblies feature (AssemblyBuilderAccess.RunAndCollect)?

    Specifically, if I dynamically emit a type definition T which would now be eligible for garbage collection, and then make and use instances of, say, Tuple<int, string, T, string, T>, will T still be eligible for garbage collection when all instances of T and all other references to T (including generic types using T, and instances of those generic types, …) are gone?

    Does the same answer apply if T is a value type? I know this changes the generic specialization process under the hood, or at least used to.

    Thanks!

  20. J says:

    Matt,

    Please reconsider the Equals/== decision. Is it because you don’t know what == will do for each element of the Tuple?

  21. An update on the equals operator and value equality:

    First, thanks everyone for your feedback on this issue.  It’s always great to get this level of feedback from all of you.

    We’ve spent the past week talking with architects from the C#, VB and F# teams around adding an equals operator to Tuple and after a lot of discussion on both sides of the issue, we’ve decided not to do this.  I’ll do my best to explain the reasoning here and answer any questions you might have.

    In general, the .Equals() method is intrinsic to the type, while the equals operator is very much tied to the language.  For most brand new types, the distinction isn’t necessary to make.  But for a tuple, which can contain other types that already have special equality semantics in a language, the story gets much more complicated.

    In the end, we decided that we can’t enforce a semantics on the equals operator unless the semantics is one that behaves as expected from any language.  

    Originally we thought that it would make sense for the equals operator to just call the Equals method, but it turns out this leads to slightly bizarre semantics (at least in C#) where you have something like this:

    Double.NaN == Double.NaN -> False

    Tuple.Create(Double.NaN).Equals(Tuple.Create(Double.NaN)) -> True (Since Double.NaN.Equals(Double.NaN) is true)

    Some languages also have different operator semantics for = with Strings.  For example, in VB the empty string and null compare as equal when compared with =, and VB can sometimes coerce strings representing numbers (like “5”) into numbers themselves, so you can have “5” = 5.  What happens when you wrap these things in Tuples?  Do you still get the correct behavior?

    We’ve decided that languages and not the BCL team should decide what operators do unless we have a very good reason to think that we can get the correct semantics across all languages.  In this case, we don’t think we can, so we won’t be adding operators.
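The mismatch described above can be reproduced with a few lines of C#. This is only a sketch of the behavior under discussion, assuming Tuple ships without an == overload as decided here; the class and helper names (TupleEqualityDemo, SameReference, SameValue, DoubleOperatorEquals) are made up for the example.

```csharp
using System;

public static class TupleEqualityDemo
{
    // With no == overload on Tuple, C# falls back to reference comparison.
    public static bool SameReference(Tuple<int, int> a, Tuple<int, int> b)
    {
        return a == b;
    }

    // Equals on a Tuple compares the elements, using each element's own Equals.
    public static bool SameValue(object a, object b)
    {
        return a.Equals(b);
    }

    // The C# == operator on doubles follows IEEE rules, so NaN never equals NaN.
    public static bool DoubleOperatorEquals(double x, double y)
    {
        return x == y;
    }

    public static void Main()
    {
        Console.WriteLine(SameReference(Tuple.Create(1, 2), Tuple.Create(1, 2))); // False
        Console.WriteLine(SameValue(Tuple.Create(1, 2), Tuple.Create(1, 2)));     // True

        Console.WriteLine(DoubleOperatorEquals(double.NaN, double.NaN));          // False
        Console.WriteLine(SameValue(Tuple.Create(double.NaN),
                                    Tuple.Create(double.NaN)));                   // True
    }
}
```

An == operator that simply forwarded to Equals would return True for the last case, which disagrees with what == reports for the bare doubles.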

  22. MichaelGG says:

    Thanks for explaining it all, Matt. It makes a bit more sense now, from the BCL’s point of view.

    So, the bigger question is, will C# get fixed to do the right thing? Or will people using tuples in C# get hit with the completely illogical tuple(1,2) != tuple(1,2)?

  23. @MichaelGG,

    I can’t comment on the plans of the C# team.  During our internal discussions about this issue the point was raised that C# could do something here in their compiler to give semantics that C# developers would want, but I’m not sure what their release plans are and don’t want to commit them to anything.

    My recommendation would be to file an issue on Connect asking them to do this.

  24. commongenius says:

    I can understand the difficulty, but I am still very concerned that the semantics of Tuple equality will be completely unintuitive. I simply can’t imagine a case where Tuple.Create(1, 2) == Tuple.Create(1, 2) should return false! This is just setting up developers for failure, and I urge you to reconsider what you can do to mitigate this problem before .NET 4.0 is released.

    Of course, if Tuples were value types instead of reference types, this wouldn’t be an issue. After all, the entire point of having value types is that semantically they have value equality, which is exactly what we want for Tuples. I still have not seen an explanation of why Tuples are reference types to begin with.

  25. ovidiupl says:

    @Douglas McClean, RE: Collectible dynamic assemblies (AssemblyBuilderAccess.RunAndCollect)

    Tuple is a generic type. Collectible assemblies interact with Tuple in the same way they interact with generics in general: You’re free to mix and match.

    You can have generic types and methods in collectible and regular (non-collectible) assemblies, and you can instantiate them with types defined in either collectible or non-collectible assemblies.

    The lifetime rules for these instantiations are the same as for the collectible types involved, and they keep alive the assembly involved where applicable. (By "collectible type" I mean "type defined in a collectible assembly"). The assembly will be collected when nothing refers to it anymore (essentially object instances, Reflection.Emit objects referring to that assembly, method frames on thread stacks). This works for value types too.

    Please let us know via Connect if you run into bugs, or even things that contradict your intuition. I’ll try to put together a few blog posts for the CLR team blog in the near future.

    Ovidiu Platon, developer, CLR type system team

  26. @Ovidiu Platon,

    Please see feedback item 476776 at https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=476776 for detailed repro steps of the question I am trying to ask.

    I have repro code available that I was unable to post to Connect for whatever reason. If you’d like, I’d be happy to share it by email.

  27. ovidiupl says:

    Thank you for taking the time to report this, Douglas. I was able to reproduce the problem you mention with a fairly simple program. I’ll post updates on this bug on Connect. Please feel free to post your repro code directly in the bug description if attaching doesn’t work.

    Ovidiu Platon, developer, CLR type system team

  28. pminaev says:

    If operator== is not well-defined for Tuple, then at the very least I’d expect the compiler to complain (at least a warning), and the implementation to be there but unconditionally throw. Anything else is a recipe for submarine bugs.

    As a side note, if Tuple was a value type to begin with, it wouldn’t have any default operator== (just like KeyValuePair does not), and thus this wouldn’t be an issue.