Covariance and Contravariance in C#, Part Nine: Breaking Changes


Today, in the last entry in my ongoing saga of covariance and contravariance, I’ll discuss what breaking changes adding this feature might cause.

Simply adding variance awareness to the conversion rules should never cause any breaking change. However, the combination of adding variance to the conversion rules and making some types have variant parameters causes potential breaking changes.

People are generally smart enough to not write:

if (x is Animal)
  DoSomething();
else if (x is Giraffe)
  DoSomethingElse(); // never runs

because the second condition is entirely subsumed by the first. But today in C# 3.0 it is entirely sensible to write

if (x is IEnumerable<Animal>)
  DoSomething();
else if (x is IEnumerable<Giraffe>)
  DoSomethingElse();

because there is no conversion today between IEnumerable<Animal> and IEnumerable<Giraffe>. If we turn on covariance in IEnumerable<T> and the compiled program containing the fragment uses the new library, then its behaviour when given an IEnumerable<Giraffe> will change. The object will be assignable to IEnumerable<Animal>, and therefore the “is” will report “true”.
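To make the difference concrete, here is a minimal sketch, assuming the hypothetical covariant IEnumerable<T>:

object x = new List<Giraffe>();
bool b = x is IEnumerable<Animal>;
// Running against today's libraries: false, since no conversion exists.
// Against a library with covariant IEnumerable<T>: true. In the fragment
// above, the first branch would then run for a List<Giraffe>, and
// DoSomethingElse would never execute.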

There is also the issue of existing source code changing semantics, or of programs that compile today becoming erroneous. For example, overload resolution may now fail where it used to succeed. If we have:

interface IBar<T>{} // From some other assembly

void M(IBar<Tiger> x){}
void M(IBar<Giraffe> x){}
void M(object x) {}

IBar<Animal> y = whatever;
M(y);

Then overload resolution picks the object version today because it is the sole applicable choice. If we change the definition of IBar to

interface IBar<-T>{}

and recompile, then we get an ambiguity error: under contravariance, IBar<Animal> is convertible to both IBar<Tiger> and IBar<Giraffe>, so all three overloads are now applicable and there is no unique best choice.

We always want to avoid breaking changes if possible, but sometimes new features are sufficiently compelling and the breaks are sufficiently rare that it’s worth it. My intuition is that by turning on interface and delegate variance we would enable many more interesting scenarios than we would break.

What are your thoughts? Keep in mind that we expect that the vast majority of developers will never have to define the variance of a given type argument, but they may take advantage of variance frequently. Is it worth our while to invest time and energy in this sort of thing for a hypothetical future version of the language?

Comments (34)

  1. Peter Ritchie says:

    I think breaking scenarios would be rare. And the scenarios that do break probably make a bug visible at compile time, and arguably the code could have been written differently so it wouldn’t be affected by these changes.

    I agree that a feature like this is sufficiently compelling to be worth the breaking changes.

    …otherwise, you could never introduce co-/contra-variance…

  2. I’m really conflicted. I think the feature is worthwhile to have, but I’m not sure that by itself it meets the -100 bar in terms of value versus added conceptual complexity. In combination with other variance-related features it becomes more of a no-brainer, but I’ve played that broken record enough by now ;)

    Having said that, I don’t think the breaking changes by themselves are enough of a reason to not include this feature.

  3. onovotny says:

    I think that covariance of interfaces is one of the biggest missing features currently in C# — when trying to design elegant type-safe libraries, we currently have to fall back to having our interfaces derive from a non-generic version simply to put them in a collection…  This leads to lots of kludgy code that could be eliminated with variance.
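    For instance, the kludge looks roughly like this (a sketch; the names are illustrative):

    interface INode { }                             // non-generic base, added only so
    interface INode<T> : INode { T Value { get; } } // that instances can share a list

    List<INode> nodes = new List<INode>();          // the element type is lost here

    With a covariant INode<T>, a List<INode<Animal>> could hold an INode<Giraffe> directly, and the non-generic base could go away.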

  4. Daniel Grunwald says:

    I can imagine several cases where overload resolution might pick a different method or become ambiguous.

    E.g.

    class BaseClass {}
    class DerivedClass : BaseClass {}

    void Test(IEnumerable<BaseClass> a) {}
    void Test(IEnumerable b) {} // non-generic "fallback" method

    // call with:
    Test(new List<DerivedClass>());

    However, in the places where I’ve seen such things, it was always done to work around C# not supporting variance. The implementation of the non-generic method would often be

    void Test(IEnumerable b) { Test(b.Cast<BaseClass>()); }

  5. Alex Morris says:

    This is a fascinating look into language design, but I’m having a hard time coming up with real-world uses for variance.  Does anyone have a brief, real-world example of code that would be improved by variance?

  6. bleroy says:

    Do it. Most people won’t even see it’s there because it will just work as they consume classes and interfaces that used the feature. But the point is, today people see it’s *not* there when they hit problems.

    I think it’s the mark of a truly great feature when it simplifies your life without you even noticing.

    And if somehow you can spot the broken old code at compile-time, bonus points. Of course, if existing assemblies suddenly start breaking that could be quite puzzling to debug.

  7. Could the IDE perhaps spot this code during migration of the project to C#4? "This code will behave differently from prior versions because an IEnumerable<Giraffe> is now an IEnumerable<Animal>".

    Catching all the conceivable outcomes of an "is" check is probably halting-problem complete (and worse, because you may not even have access to the callers of a public API to know what types might be passed in) but it should certainly be possible to catch cases where overload resolution will have different results – just evaluate the overload resolution with and without taking variance into account, and if the results differ, flag the code for the user to examine.

  8. Tanveer Badar says:

    I think the feature should make it into a future version of the language. C++ has had covariant return types for ages, and that feature is useful for library designers most of the time.

    Your plan is much more ambitious. There will be huge hurdles even if it does make it past the -100 point mark. But on the other side, I have seen code written by inexperienced people who have never heard of virtual functions and love ‘as’ and ‘is’; in the end their code is littered with type casts.

  9. I, personally, would love to see variance implemented for generic delegates. It just seems wrong that I can’t assign a Func<object,string> (taking object and returning string) instance to a Func<string,object> (taking string and returning object) variable when I’d clearly be allowed to make that call were the delegates not involved. To get round this problem I’ve written some delegate casting extension methods which wrap one delegate inside another, but this is both clunky to use and has the overhead of one additional and seemingly unnecessary delegate invocation.
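    The workaround looks something like this (a rough sketch; the names are mine):

    static class DelegateConverter
    {
        // Wraps one delegate inside another whose type varies safely:
        // the new input type may be more derived, and the new output
        // type less derived. The extra lambda is the seemingly
        // unnecessary invocation mentioned above.
        public static Func<TIn, TOut> Convert<TOrigIn, TOrigOut, TIn, TOut>(
            this Func<TOrigIn, TOrigOut> func)
            where TIn : TOrigIn
            where TOrigOut : TOut
        {
            return input => func(input);
        }
    }

    // Usage: all four type arguments must be spelled out, hence "clunky":
    Func<object, string> f = o => o.ToString();
    Func<string, object> g = f.Convert<object, string, string, object>();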

    Conversely, when trying to create a "once and for all" implementation of the VISITOR design pattern based around a dictionary of delegates (each returning void and taking a T, indexed by the Type of T), the easiest way for me to capture these was to wrap each delegate as a Delegate<object> so they all had the same type. If I could have created my dictionary to allow any Delegate<T> where T:object then this step wouldn’t have been necessary. I’d be happy to take the risk of an invalid parameter at runtime, as I would have carefully controlled which delegate would be called under which circumstances in my code.
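    In code, that wrapping step looks something like this (a rough sketch, using Action<T> as the delegate type):

    class Visitor
    {
        private readonly Dictionary<Type, Action<object>> handlers =
            new Dictionary<Type, Action<object>>();

        // Each strongly typed handler is wrapped as an Action<object> so
        // that all the entries share one delegate type. This is the step
        // I'd like variance to remove, even at the risk of an invalid
        // parameter at runtime.
        public void Register<T>(Action<T> handler)
        {
            handlers[typeof(T)] = obj => handler((T)obj);
        }

        public void Visit(object node)
        {
            handlers[node.GetType()](node);
        }
    }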

    Incidentally, my syntax preference was Luke’s delegate R Func<* is A, R is *> (A a). That just seems more intuitive to me and makes my brain hurt less! In the instance above that’d give me a Dictionary<Type,Proc<T is object>>. The question is, given that T will always be object, would I need to explicitly specify it – possibly I would, in order to ‘turn on’ the variance?

  10. Stefan Wenig says:

    I’d like to see it done. Imagine we did not have variance in arrays: how much casting and copying would we have to do? Now if I get the chance to have the same for IEnumerable (only without array covariance’s problems), that alone would be worth the trouble in the context of LINQ. I’ve missed variance before. (And I’ll probably keep missing it until you have it implemented for generic classes too, instead of just interfaces.)

    Breaking changes might occur, I’d take the risk. Flagging the places in the source code for a one-time review, as Stuart suggested, is a good idea.

    Now here’s another suggestion: I thought about how you could handle the problem of array covariance and came to think about the "const" keyword (which C# doesn’t have, and which is probably too hard to add now, but stay with me for a moment).

    Another option I was thinking about is an IReadOnlyList<T> interface. I’d have liked that before, because I’d find it more elegant that nesting some IsReadOnly property. But with covariance it would make so much more sense: I could get the (hypothetical) covariance of IEnumerable and the simpler/faster list access using an indexer, Count etc.

    How could those work together? Most functions that take arrays as parameters treat them as read-only constructs. Now changing their parameters from T[] to IEnumerable<T> would make everything slower, plus you’d have to rewrite your code. But changing them to IReadOnlyList<T> probably would do nothing bad to performance. We could even have IArray<T>, which could provide a "Length" property instead of Count. (Additional thinking required for n-dimensional arrays, but maybe interfaces for ranks 2 and 3 would be sufficient.)
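    In the hypothetical syntax from this series, I imagine something like the following sketch (the member lists are mine):

    interface IReadOnlyList<+T> : IEnumerable<T>
    {
        int Count { get; }
        T this[int index] { get; }
    }

    interface IArray<+T> : IEnumerable<T>
    {
        int Length { get; }
        T this[int index] { get; }
    }

    Both use T only in output positions, which is what makes covariance safe here.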

    OK, I change all my "constant" T[] input parameters to IArray<T>. I might even use a tool that supports this. What next?

    I could turn on a compiler warning that jumps in my face every time I assign a Giraffe[] to an Animal[] variable or parameter! Assuming that arrays are either used as read-only references (in which case they should be declared as IArray<T>) or as modifiable arrays (in which case they should be considered invariant), this might fly.

    Admittedly, I haven’t thought that through. Anyway, this would be a nice opt-in for people who care about the problems of array covariance, pretty easy to implement in the compiler, and it should not affect anybody who just wants to compile their old code. On the cost side, you’d probably have to modify a lot of BCL method signatures to make this useful.

    When you promote those warnings to errors and mark fully checked assemblies, the JIT might be able to skip type verification on assignments too. Although I worry less about this.

    (I’d also consider an invariant syntax for declaring array parameters and variables, like "invariant Animal[]", or alternatively, a modifiable interface like IModifiableArray; or call the covariant, constant version IReadOnlyArray and the invariant modifiable version IArray. Whatever.)

  11. Stefan Wenig says:

    4th paragraph, "that nesting some IsReadOnly property" should be "than testing".

  12. Stefan Wenig says:

    No, wait, the part about performance is wrong. Arrays are treated natively by the JIT, which should be much faster than calling an indexer via a vtable. That probably costs more than the type checking on assignments…

    So we’d need "real" C++-style "const" support in order to make this usable in a general sense. Which is unlikely.

    Have there been any discussions about introducing const in the CLR/C# lately? I know that many people think that const doesn’t pull its own weight in C++, and I tend to agree. But the CLR could enforce it, prevent casting-away of const in sandboxed scenarios via CAS policies, so this might be a really interesting security feature. Also, considering how functional programming favors immutable objects, this might make things easier for parallel processing (PLINQ).

  13. I’m tentatively for, though I’m worried that if the barrier to implementing your own reusable types grows too high, it’ll actually reduce reuse as an unfortunate side-effect.  Extremely successful languages such as C don’t have such a huge difference in learning curve between consumers and authors of "modules" (whether those are libraries, assemblies, interfaces…).  Still, co/contravariance seems to promote reuse, and that’s a good thing, right?

  14. Timothy Bussmann says:

    I just got through reading this 9-part article series.  I haven’t read many MSDN articles or blogs, so I have to ask: what is the -100 test?  Aside from that, I like that each article is short – as this can be a very complex subject.

    I feel that the syntax would be far simpler, and far more familiar, if it simply used IFoo<+R, -A>.  It’s already like that in the CLR, and everyone who knows what variance is knows the +/- syntax.  However, I’d bet that everyone has to puzzle over the alternatives, as they are all new.

    For anyone who’s been annoyed by C#’s lack of variance, adding it will be a major improvement.  For those who have no idea what variance is, I suspect they won’t be the least bit bothered by it.  I always thought an IAction<Animal> could be assigned to an IAction<Mammal> when the type parameter is an argument.  I was very confused the first time I tried compiling such code and C# stated the types were incompatible.  I had to stare at the error for a long time, then go ask someone why my code wouldn’t compile!  In short – the answer was: "While obvious, C# does not support variance.  You probably have no idea what variance is, but all you have to know is you can’t do that even though it seems like you should be able to.  I’ve been harping on Microsoft for years to fix this problem."

    Meaning, including variance should be the logical *default*, and not supporting variance seems more like the exception.  Experts hate the lack of variance, beginners are confused by the lack of variance (they have to "learn" that it is unsupported, as opposed to "learning" about how to use variance).

    All this discussion, and the current design of C#, seems to imply that variance is some "fancy new feature".  I argue that the lack of variance support is an anti-feature — as though the language designers went out of their way to annoy you!

    And finally, I think the +/- syntax is simple, intuitive for those who will write such code, and won’t get in the way.  It seems like something that was supposed to be there all along (as proven by the CLR support), and C# just left it out to "baby" you.  Although really it’s just plain annoying, and this discussion wouldn’t even be happening if it had been included to begin with.

  15. EricLippert says:

    "-100 points" refers to this article by former C# team member Eric Gunnerson:

    http://blogs.msdn.com/ericgu/archive/2004/01/12/57985.aspx

  16. EricLippert says:

    Re: Your plan is much more ambitious. There will be huge hurdles even if it does make it past the -100 point mark

    Actually, it’s just the opposite.  Variance on interfaces and delegates will be easy to implement because the CLR already supports it natively, and has since generics were introduced. C# is just not taking advantage of it yet.

    The CLR does NOT support variance on virtual overrides natively, so implementing that would require a lot more work on both the design and implementation side.

  17. Brian says:

    The lack of variance support is one of my top two complaints of C#.

    It is so lacking that I regularly need to drop down to the IL level to do work.

    Whidbey’s introduction of generics was great; however, it only went part way. Variance is absolutely needed, even if it may break some code in very rare cases.

    PS My number two complaint is the inability to declare generic overloaded operators, because there is no way to express a constraint that says what is addable, subtractable, etc.
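    What I mean, roughly, is that this does not compile today:

    static T Sum<T>(T a, T b)
    {
        return a + b; // error: no constraint can express "T supports +"
    }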

  18. Welcome to the thirty-fifth edition of Community Convergence. This week we have an interview with C#

  19. Pop Catalin says:

    How about creating a set of new assemblies, version 3.0 (for CLR 3.0), that run side by side with the CLR 2.0 ones?

    Or another option (this might seem crazy): couldn’t CLR 3.0 just ignore variance info for ver. 2.0 assemblies and enable it for 3.0 assemblies and above?

    Also, variance has got to come sometime, and I say better early than late. Variance is a feature that brings closer my other long-desired feature for .NET: parametric polymorphism using generics ( class Foo<T> : T {…} // :D ).

  20. Richard says:

    Separation of concerns. You have two changes you’d like to make:

    1) A language change, to allow variance to be defined for generics. Correct me if I’m wrong, but no current C# code is broken by this (if you’re using an assembly from another language which takes advantage of variance, I’m assuming you already get the variance behaviour in C#).

    2) A library change. In this post, you talk about a breaking change to IEnumerable, and a breaking change to a hypothetical IBar. Obviously this is badness. Breaking changes are badness almost by definition.

    Now, as far as I (a non-C#-using guy) am concerned, (1) is a good thing. As far as I can see, the only downsides are a) the -100 points, and b) the added cost to C# developers of understanding variance (and usually they won’t have to).

    But (2) is nowhere near as clear-cut. Contrary to what others have said, assuming my assumption above is valid, you can’t just say "this problem is trivially solvable by ruling that variance is off if it’s a CLR2 assembly" since this may break C# code which already uses generics with variance from other languages (including hand-coded MSIL, I guess). (2) has -100 points of its own, and I don’t see it getting the necessary +100 to justify it.

  21. Stefan Wenig says:

    Richard,

    IEnumerable is the foundation of LINQ. If it does not get covariance, we might as well have no variance for interfaces at all. Now, that’s an exaggeration, but just a small one. Introducing a new interface would conflict with C# features that support IEnumerable -> confusion.

    Having to cast between IEnumerable<T>’s of related types is a major PITA. We will start to feel the pain when we really use LINQ. Making it covariant would get my +100 and then some!

    BTW, C# currently ignores the CLR’s variance bits, says Eric.

    Stefan

  22. 1) Make the change. The effects of the breaking change will be a one-time cost for code that’s migrating. Without the change we’ll be forced to continue to fight the type system in some of the examples illustrated in this series and more.

    2) Go with the most intention-revealing syntax possible. +/- is nice and it works well for me, but I suspect, as you do, that it would cause confusion among many. The more verbose options seem more universally clear.

  23. mbuzina says:

    I notice that you seem to mix two things when discussing the breaking change:

    1. Adding co- & contravariance itself does *not* break anything.

    2. Changing existing classes / interfaces to include variance *is* a breaking change.

    So what to do? Add variance and refrain from changing existing interfaces. Just as the introduction of generics delivered a generic version of IEnumerable, we would now additionally get a variant version of IEnumerable…

  24. Stefan Wenig says:

    mbuzina,

    what would you call this new interface? IEnumerable2<T>? (recalling the horror of COM…)

    IEnumerable and IEnumerable<T> are easily separable (in fact, they have different names under the hood). IEnumerable<T> and IEnumerable<+T> are not. Creating two interfaces would also mean that you have to understand the difference going forward (i.e., everyone must understand covariance). It would make the language a great deal uglier.

    All this just to prevent a few breaking lines of code that can easily be fixed by putting the tests in the order that would have been more logical in the first place? I don’t know…

    I say let’s have those bugs and fix them. Hopefully, they are very rare anyway!

  25. Stefan Wenig says:

    Now here’s the killer:

    if (x is IEnumerable<Animal>)
      DoSomething();
    else if (x is IEnumerable<Giraffe>)
      DoSomethingElse();

    This code is broken already!

    Contrary to what Eric said, it is NOT entirely sensible to write something like this in C# now. This will work as expected on generic collection classes, but not on arrays. So, if you write code like that, and test it only using collection classes, it will break the minute you feed it arrays. Because a Giraffe[] IS an Animal[], and therefore an IEnumerable<Animal> too, the Giraffe branch will never see execution for arrays.

    This way of testing IEnumerable is broken already. Any code depending on it is broken, unless it explicitly tests for arrays first! It might not result in visible errors in a certain application if it only gets collection classes. But this is a time bomb waiting to explode.

    Let’s get rid of this fast. In fact, I’d vote for a switch from IEnumerable<T> to IEnumerable<+T> with C# 3.0. (Although, unfortunately, I assume it’s probably too late for this. Even if the change would take just minutes to implement, there’s always the thing about the light bulb…)

  26. Peter Ritchie says:

    Yeah, way too late considering they’ve committed to releasing C# 3.0 this month.

  27. Stefan Wenig says:

    Hey, so what, _I_ would do it ;-)

    Let’s go find that build server, ildasm, hack, ilasm, and sneak out of here. I’m sure Stuart Scott would have let me in :-D

  28. EricLippert says:

    * I actually do not know what our exact release schedule is. Our lead time for releases is so long that I stopped thinking about C# 3.0 weeks ago and have been thinking only about hypothetical future service packs and hypothetical future releases. C# 3.0 is done as far as I am concerned; I understand that, you know, _customers_ may not see it that way yet. :-)

    * We strongly considered flipping the variance bits on for IEnumerable/IEnumerator/IComparer/etc for the upcoming base class library release but ultimately decided that it would be better to update the libraries and compiler in lockstep to implement this feature. Were we to do so. Hypothetically.

    *  "if you’re using an assembly from another language which takes advantage of variance, I’m assuming you already get the variance behaviour in C#"

    That assumption is incorrect. In that case you will get variance behaviour everywhere that the C# compiler defers to the CLR, but never in situations where the C# compiler itself needs to make a decision about convertibility.

    Let me characterize the difference. If you have a class Foo which implements a covariant IFoo:

    object x = new Foo<Giraffe>();
    bool b = x is IFoo<Animal>;

    here the C# compiler simply generates a runtime check, and the CLR says sure, that thing is an IFoo<Animal>.

    But if you have

    void M(object f) {}
    void M(IFoo<Animal> f) {}

    M(new Foo<Giraffe>());

    then the C# compiler needs to decide at compile time which overload to call. Since the C# compiler does not know about variance, it will pick the object version today, even if Foo implements covariant IFoo.

  29. Stefan Wenig says:

    Eric, I never seriously considered it possible to do this with .NET 3.5. But what do you think about the variance problem we already have when we mix arrays with IEnumerable<T>? (As opposed to generic collections.) Would a breaking change really make this any worse? There are probably fewer than 5 people in the world who have produced such code _and_ are aware that it behaves completely differently for List<T> and T[]…

  30. Jon Skeet says:

    I think there’s a bigger reason not to include variance than the fact that it might, in very rare cases, break some existing code:

    It makes the language more complicated.

    Even if I only *consume* variant code (e.g. things which publish IEnumerable<something which extends Foo>), I still need to understand that. Now, at the moment we do already get people confused about why they can’t return a List<string> from a method returning IEnumerable<object>, but that’s going to be the case even when variance becomes an option.

    I agree it would be useful – but frankly I think even C# 3 is a complicated language to learn from scratch. I’d really like to see a post from Eric dedicated to this topic: "when do we stop?". I’d really welcome a gap of at least 3 years before C# 4 is released, just so everyone can get their heads round C# 3. It’s worth understanding that many, many developers don’t understand C# 2 yet, let alone C# 3.

    There are features I’d like to see in C# 4, certainly. Yeah, I’d love to have them *right now* in many ways. But we need to understand that developers already have a vast array of libraries to learn about (WPF, WCF, Silverlight, AJAX, LINQ etc). The balance between innovating and just overwhelming developers is a fine one, and my gut feeling is that MS have been on the "overwhelming" side for the last year or so.

    What’s the best way of being part of the discussion of such concerns, beyond comments in blogs?

  31. Stefan Wenig says:

    Jon

    I’d rather learn a few new language features than a heap of libraries and their individual ways of dealing with what the language leaves out. The language is at the core of what we’re doing, and a lot of people dedicate way too little time to learning it (as opposed to learning IDEs and designers, frameworks and libraries, VS guidance automation stuff and VSTS, …)

    C# is the language for code-based productivity. (I’m sure I read that somewhere.) People who prefer the complexity in the stuff orbiting around the language are better served with VB.NET; the languages are finally beginning to follow different roadmaps instead of looking like the same language with two different syntax skins (Sun liked to say that, and I’m glad it’s no longer true).

    For code-based productivity, you need powerful features. Sooner or later I hope we’ll even see some meta-programming or AOP mechanisms in C#. You really don’t have to understand them at every level just to benefit; most of the stuff is just for sophisticated library developers anyway.

    Which is true for variance too, btw. You complain that it was hard to explain why a List<string> could not be returned as an IEnumerable<object>. Now what makes you think that it’s harder to explain why this is now possible? People who don’t like to think about stuff like that are not going to complain that they don’t get compiler errors anymore. Just like they don’t complain that covariance works for arrays now.

    How hard do you think it is to explain that while it won’t work for List<string>, you _can_ return a string[] as an IEnumerable<object> today, and make people not only understand the difference, but also be aware that they could write code that works differently for arrays and for collection classes? Just bugs waiting to happen.
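    In code, the inconsistency is easy to show:

    IEnumerable<object> FromList()
    {
        return new List<string>(); // compile error today
    }

    IEnumerable<object> FromArray()
    {
        return new string[0];      // fine today, because arrays are covariant
    }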

    And then I imagine this in the context of LINQ, where most of the people will have no idea of what a from/select statement is transformed into, have no real idea of how IEnumerable/IQueryable, extension methods and Lambdas work together to actually compile this, but will find it to "just work" most of the time. Except when they run across the missing covariance of IEnumerable, that is.

  32. While it seems to me all the focus is on List<T> (the so-called generic co/contra-variance), please keep in mind that some of us also want the simpler co/contra-variance for sub-methods (that is, virtual method overrides).

    public class SubC : SuperC { }

    public class A
    {
        public virtual SuperC Method1(SubC subc) { /* ... */ }
    }

    public class B : A
    {
        public override SubC Method1(object subc) { /* ... */ }
    }

    Which, if I understand the concept correctly, does not require any additional syntax.  Certainly B has always been allowed to return a more restrictive subset of values.  It’s just a matter of letting us express it so that the type system is made aware of that fact.   E.g.

    B b = GetB();
    b.Method1(someSubC).SomeMethodOnlyAvailableOnSubC(); // someSubC is any SubC

    As for letting B handle wider parameters, I’m not sure if that would require any extra syntax.  It seems like it wouldn’t; it should just be allowed.  B has to meet the contract that A defines.  If B does anything above and beyond that, there is no harm; it has not violated A’s contract by doing more.

    I was hoping that VS2008 would have sub-method co/contra-variance, but it’s not there in beta 2.  Any chance that the final release will?  Or is that logic pretty much all tied in with the generic co/contra-variance (i.e. C# 4.0)?

  33. EricLippert says:

    There will be no features in C# in the final release that were not in the final beta. Adding features after final beta means shipping features that have never been beta tested, and we try very hard not to do that.

    In this series I explicitly did NOT discuss "override variance". I am well aware that a lot of people want this feature, and I may do another series on it in the future, but that’s not what I’ve been talking about here.

    Override variance is completely orthogonal to interface/delegate variance. They have nothing to do with each other (except insofar as interface variance might make more scenarios eligible for override variance.)

    And there is no such thing as C# 4.0.  Remember, this is all hypothetical discussion at this point. We have not announced any such product, so it is premature to be discussing specifics of its feature set!

  34. So nicely step by step blogged by Eric Lippert for “Covariance and Contravariance” as “Fabulous