What Are The Semantics Of Multiple Implicitly Typed Declarations? Part Two

Many thanks for all your input in my informal poll yesterday. The results were similar to other “straw polls” we’ve done over the last couple of months. In this particular poll the results were:

var a=A, b=B; where the expressions are of different types should:

  • have the same semantics as var a=A; var b=B;: 12
  • replace the var with some type for both: 3
  • give an error: 6

There were 18 comments; a few people voted twice, which is fine with me.

The way the feature is specified is that the var is to be replaced with the best type compatible with all the expressions, to maintain the invariant that parallel declarations like this always give the same type to each variable. Many people that we’ve polled believe that this is the “intuitively obvious” choice, including much of the language design team. A larger group of language users believes that “infer each variable type separately” is the “intuitively obvious” choice.
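
To make the disagreement concrete, here is a minimal sketch of the two readings of var a=1, b=2.5; written out by hand. The class and method names are made up for illustration, and the second method spells out double explicitly on the assumption that double is the "best type" for one int initializer and one double initializer.

```csharp
using System;

static class MultipleVarDemo
{
    // Reading 1: infer each variable separately,
    // as if the code were "var a = 1; var b = 2.5;".
    public static string InferSeparately()
    {
        var a = 1;    // inferred as int
        var b = 2.5;  // inferred as double
        return a.GetType().Name + ", " + b.GetType().Name;
    }

    // Reading 2: replace "var" with one type compatible with both
    // initializers. Assumption: that type is double here, because
    // int converts implicitly to double.
    public static string InferOneSharedType()
    {
        double a = 1, b = 2.5;
        return a.GetType().Name + ", " + b.GetType().Name;
    }

    static void Main()
    {
        Console.WriteLine(InferSeparately());     // Int32, Double
        Console.WriteLine(InferOneSharedType());  // Double, Double
    }
}
```

The difference is easy to miss in real code: under the first reading a / 2 is integer division and yields 0, while under the second it is floating-point division and yields 0.5.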

So what to do? We have a relatively unimportant edge-case feature where customers strongly disagree as to what the code “obviously” means, and the difference can lead to subtle bugs. That’s clearly badness. Given this feedback, amply confirmed by you all, we are probably going to simply remove multiple implicitly typed declarations from the C# 3.0 language.

Thanks for your feedback!

Comments (26)

  1. Gabe says:

    Does this mean that you’re going with the "give an error" option or are you not allowing "var a=1,b=1"?

  2. EricLippert says:

    The plan right now is to disallow the whole thing.  That is, if var is being used as a contextual keyword, then you get one declaration per var, not a list of declarations per var.

  3. MarkP says:

    so even though

      int a=1, b;

    is semantically equivalent to

      int a=1; int b;

    the exact same form with var instead of int will be illegal? Is there any other case where replacing a fixed type name with "var" would cause a declaration error?

  4. EricLippert says:

    Since “var” is _only_ legal in a local variable declaration with an assignment, the answer to your question is “yes — all other cases are such cases”.

    Also, any local variable context in which the type of the expression cannot be determined will also fail.  For example, Func<int, int> f = c=>c+1; succeeds, but var f = c=>c+1; fails because we have no idea what the desired type of the lambda is.


  5. Even though I voted for "equivalent to var a=A; var b=B;" I agree with your decision. I actually never use multiple declarations in a single line anyway so it won’t affect me in the slightest, and considering that there are clearly a large number of (weird) people whose intuition is backwards ;) it’s definitely better to disallow code that could be read ambiguously. This is similar I think to requiring break or goto at the end of each case in a switch statement – leaving it out would mean that the "obvious" meaning to a C/C++ programmer would be the exact opposite of the "obvious" meaning to everyone else :)

  6. Jonathan says:

    Why even allow the var type at all then?  Granted, I do not work with C# or any other C-derived, strongly-typed language, but it seems to me that the only purpose of it is to allow the declaration of variables without the programmer actually deciding what type of information they will hold…which is incredibly lazy and probably dangerous to some degree.

  7. EricLippert says:

    Two reasons. The not particularly good reason is that

    Dictionary<string, List<int>> mydict = new Dictionary<string, List<int>>();

    is somewhat redundant and gross looking.

    The really good reason is "because C# 3.0 will have _anonymous_ types".  Obviously if a type cannot be named then there is no way to declare a variable of that type without some kind of type inference.

  8. Jonathan says:

    So, then, what is the advantage of anonymous types?  I’m not trying to be a pain, by the way, I’m just curious.  As I said, I don’t work with C-derived languages. I only do scripting, where the declaration of variables is almost always optional anyway; so I don’t really understand the advantages/disadvantages of being required to define a type for a variable.

  9. EricLippert says:

    I will leave the enumeration of the advantages of static typing vs dynamic typing for another day.

    There are two main advantages of anonymous types.

    First, anonymous is generally goodness.  We already have "anonymous variables" in C#.  That is, you can write:

    a = b + c * d;

    See the anonymous variable in there?  Of course you don’t.  We are so used to anonymous variables that we don’t even see them anymore.  The C# compiler of course is actually generating the equivalent of

    temp = c * d;

    a = b + temp;

    C# 2.0, JScript, etc., have anonymous methods, which are also handy.

    Anonymous types are just one more step in this direction.  You ought to be able to say "I want a name, age, phone number triplet" and have that be a statically typed entity without having to give that thing a name.

    Second, having anonymous types makes query comprehensions much easier to write:

    var results = from c in customers where c.City == "London" select new {c.Name, c.Age};

    Now suppose that we didn’t have anonymous types or type inferencing.  You’d have to write:

    internal class NameAndAge { private string name; private int age; internal string Name { get … blah blah blah, and then

    IEnumerable<NameAndAge> results = from c in customers where c.City == "London" select new NameAndAge(c.Name, c.Age);

    Now you decide that you want phone number in there as well and you have to define ANOTHER new type!  What a pain!  And then you have to update the type of results too.  

    The point of all of these new features is to make query comprehensions work _painlessly_ without giving up static typing.

  10. What bothers me about anonymous types is that there’s no way to pass them between methods. So you end up with entire chunks of code that are impossible to perform "extract method" refactoring on, because a variable that would need to be passed to or returned from the new method can’t be.

    I like anonymous types, but I’d like them much better if they were first-class types that could be used in any context. So suppose I have code like:

    foreach (var info in from p in people select new {p.Name, p.Age}) {

     // do lots of very long and complex processing here using info

    }

    I could refactor this:

    foreach (var info in from p in people select new {p.Name, p.Age}) {

     doProcessing(info);

    }

    void doProcessing(@{string Name, int Age} info) {

     // long and complex processing here

    }

    The @{…} syntax is the new bit I’m proposing, of course. I think without something like this there’s a danger that anonymous types could really hurt long-term maintainability of code due to the inability to refactor.

    Also is my foreach line actually legal?

  11. EricLippert says:

    I would also prefer anonymous types to be first-class. However, doing it right requires changing the CLR type system, not just the C# type system.  (The versioning issues are considerable.)

    Given that we’re not going to change the CLR type system, I am hoping that there are things we can do to make this work.  Suppose, for example, that refactoring to extract a method upon code that references an anonymous type caused a new nominal type to be emitted into your source code.  Would that make you feel better?

    I’m not _saying_ that we’re going to do that, of course.  But it’s definitely an idea that’s been kicking around here for a while. :-)

    (And sure, that foreach looks good to me.)

  12. I agree that doing anonymous types right requires changing the CLR type system (heh, which is why I keep asking which version that’s going to happen in ;) ).

    The problem of course is cross-assembly calls.

    I’m wondering if a sufficiently close approximation could be done by naming conventions along with a little help from the compiler.

    Suppose that @{string Name, int Age} compiled under the hood into a class called something like __anon__mscorlib__System_Int32__Age__mscorlib__System_String__Name (or with some otherwise illegal characters in there to make sure it was impossible to clash with any real name, but I don’t know what the CLR allows here. Also note I sorted the names into alphabetical order because @{int Age, string Name} should mean the same). Now make these types public as far as the CLR is concerned (they have no code that could be used as an attack surface so this is safe). And make them inherit from, I dunno, System.AnonType, and have the compiler disallow inheriting from that manually (much like ValueType).

    So suppose that assembly1 contains a method foo(@{string Name, int Age} val) and you want to call this from assembly2.

    The problem is that assembly1’s version of __anon__mscorlib__System…blahblahblah is different from assembly2’s, so you’re passing the wrong thing. BUT the compiler knows this at compile time, and also knows that the types in question are anonymous because of the AnonType inheritance. So when it encounters an attempt to call a method with the wrong parameter types, but the only thing wrong is that they’re both anonymous types with the exact same name from two different assemblies, instead of emitting a compile error, it emits a call to a method instead (perhaps inside AnonType), and that method is declared as "T2 Convert<T1, T2>(T1 val) where T1 : AnonType where T2 : AnonType". This doesn’t of course only apply in the context of method calls, the same thing could be done as what amounts to an implicit cast operator in both directions, to cover all cases.

    That method could be implemented using reflection – and without even any privileged code, since the properties in question are public.

    And furthermore with a suitable enhancement to the CLR’s type system in the future I’m *fairly* sure that all this under-the-hood crap could be done away with without any programmer-visible backward-incompatible changes.

    Am I missing anything fundamental?

  13. EricLippert says:

    There are length restrictions on how long a type name can be, so we’d have to use some kind of cryptographically strong hash to shorten that down.

    But modulo that, yes, that’s the kind of crazy magic that we could do.

    An alternate approach would be to have some kind of standard "loosely typed property bag" type for import and export that could be converted to/from the appropriate strongly typed tuple.

    There are any number of ways to skin this particular problem.  However, we want this release to have as many new features as necessary to make query comprehensions work, but no more.  We the C# team also do not want to take dependencies on changes in the CLR if we can possibly avoid it.

  14. Gabe says:

    What sort of changes would be required for making anonymous types first class? What’s the difference between a local variable and one that can be returned from or passed to a function?

  15. EricLippert says:

    Adding new stuff to the CLR type system has a major impact on all languages. For example, all languages are now required to be able to talk to generic types if they want to be CLI-compliant languages.  That’s a major burden on language implementors and we do not take imposing it lightly.  

    The difference between a local and something returned, in this case, is that a local never escapes into any context in which its type can be part of a publicly visible contract.  If a method could be ‘var’ then we’d either have to say "only private/internal methods can be var", which is gross, or come up with some standardized, versionable, secure, safe way to represent public methods that return anonymous types.

    By keeping anonymous types restricted to being used only inside contexts in which they cannot "leak out" we don’t have to solve any of those hard problems.  They can be solved in future versions of the CLR.  We’ve got to ship this thing! If we wait for the type system to be perfect, we’ll wait forever.

  16. I think that’s the best thing to do. If nobody can agree on what the compiler should do, most developers will make many mistakes and the same question will be asked a thousand times in the forums.

    And this is not a big deal (to not have it either way) anyway.

  17. Fabrice says:

    I’m with the "disallow the whole thing" or "error" option. When no one really knows what a piece of code will do, remove the ambiguity and disallow the case altogether. This can save a lot of time and trouble.

  18. dimkaz says:

    Eric, would you allow var declarations for out parameters?

  19. EricLippert says:

    Nope.  Just local variable declarations.  Out parameters could leak information about anonymous types to the outside world.

  20. dimkaz says:

    How are they going to leak? The function is declared with a given type; it is only at the call site that I am asking.

    For example given

    int GetSomething(out bool alsoReturn)

    To be able to call

    var a = GetSomething(var out also);

    would declare also to be bool.

  21. Ryan Phelps says:

    I finally purchased The Design and Evolution of C++, and it appears that Dr. Stroustrup would agree with you.  In section he says, "I would probably also ban uninitialized variables and abandon the idea of declaring more than one name in a declaration."

    I suspect this may have more to do with declarations of the sort:

    int *a, b, *c[10];

    The declaration semantics of C# are simpler than C/C++ due to the lack of pointers, so it may not be as much of a concern, but it is nonetheless an interesting point.

  22. EricLippert says:

    First off, hey buddy, C# does too have pointers!  It’s perfectly legal to declare a pointer to char, int, etc in C#.  That’s what the "unsafe" keyword is for, so that you can create incredibly dangerous programs that crash in the same horrible ways that your unmanaged C++ programs used to.  (Or, because they’ll likely corrupt the garbage collector, new horrible ways.)

    Second, the declaration semantics of C# are simpler because of two things:

    * the lack of const

    * the more consistent syntax for describing a type

    The latter is the one you’re getting at.  In C#, all the modifiers to make a base type into an array, a pointer, etc, are actually part of the type. A careful reading of the C++ standard shows that C++ can’t even get it self-consistent.  As you note, in a local variable declaration, the * is part of the variable, not part of the type.  But in a formal parameter list, the * is part of the type.

  23. Ryan Phelps says:

    Ahem…  A thousand apologies.  I’ve read some C# books, but never delved into it too very deeply.  I was applying my Java knowledge to C#, with results as above.  I do remember pointers in unsafe code, but foolishly chose to forget them.

    I see the point of the more consistent declaration syntax now; I had mentally glossed over the difference between:

    int[] a;

    int a[];

    I like const.  What makes it difficult?  Or is it things like:

    char const * const a; //same as (?): const char * const a;

    char const * b;  //same as (?): const char * b;

    char * const c;

    that make it difficult because you can have non-const pointers to const data, const pointers to non-const data, and const pointers to const data and you have to keep track of all that?  To say nothing of pointers to pointers to etc. and the subsequent explosion of const combinations.

    Lastly, I enjoy your blog immensely.  It makes me hope that one day I get to work on cool stuff like this, although it’ll obviously take me quite a while to get up to speed on my declaration knowledge :-)

  24. Tanveer Badar says:

    Eric, there is one mistake. Do forgive me for only pointing out your mistakes. :)

    ‘Adding new stuff to the CLR type system has a major impact on all languages. For example, all languages are now required to be able to talk to generic types if they want to be CLI-compliant languages.  That’s a major burden on language implementors and we do not take imposing it lightly.’

    Try putting a CLSCompliant( true ) on assembly level. Defining a public generic type/method in C# emits a not CLS-compliant warning.

  25. Raymond has an interesting post today about two subtle aspects of C#: how order of evaluation in an expression

  26. Greg says:

    Tanveer, CLS & CLI are two different things.