Generic type parameter variance in the CLR

When people start using C# generics for the first time, they are sometimes surprised that they can’t convert between related generic instances. For example, since you can convert a string to an object, shouldn’t you also be able to convert a List<string> to a List<object>? After all, you can convert a string[] to an object[], so why should List<T> be any different?

More formally, in C# v2.0 if T is a subtype of U, then T[] is a subtype of U[], but G<T> is not a subtype of G<U> (where G is any generic type). In type-theory terminology, we describe this behavior by saying that C# array types are “covariant” and generic types are “invariant”.

There is actually a reason why you might consider generic type invariance to be a good thing. Consider the following code:

      List<string> ls = new List<string>();
      ls.Add("test");
      List<object> lo = ls; // Can't do this in C#
      object o1 = lo[0]; // ok – converting string to object
      lo[0] = new object(); // ERROR – can’t convert object to string

If this were allowed, the last line would have to perform a run-time type check to preserve type safety, and that check could throw an exception (e.g. InvalidCastException). This wouldn’t be the end of the world, but it would be unfortunate. Part of the benefit of generic types is that they help avoid the run-time checking and error-handling overhead involved in converting back and forth from a base type (object in C#). It simplifies things greatly, for both the developer and the runtime, to be able to say that a variable of type List<object> can for sure hold any object.

So what about arrays? Don’t we have the exact same problem there? Yep. This was a hotly debated aspect of the C#/CLR design. Java also has covariant array types, and whether or not this is a language flaw has been debated for ages. Jim Miller has an interesting annotation in his CLI book (p. 59) where he says "The decision to support covariant arrays was primarily to allow Java to run on the VES. The covariant design is not thought to be the best design in general, but it was chosen in the interest of broad reach". [Update: I've heard that Bill Joy, one of the original Java designers, has since said that he tried to remove array covariance in 1995 but wasn't able to do it in time, and has regretted having it in Java ever since]
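To make the array case concrete, here is a minimal sketch of my own (the class and method names are made up for illustration). It shows that the covariant array conversion compiles, but that storing an incompatible element is caught by a run-time check, which surfaces as an ArrayTypeMismatchException:

```csharp
using System;

public static class ArrayCovarianceDemo
{
    // Hypothetical helper: returns true if storing a plain object through
    // the covariant object[] view of a string[] is rejected at run time.
    public static bool StoreIsRejected()
    {
        string[] strings = { "test" };
        object[] objects = strings;      // allowed: array types are covariant

        try
        {
            objects[0] = new object();   // compiles, but is type-checked at run time
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;                 // the underlying array is still a string[]
        }
    }

    public static void Main()
    {
        string[] strings = { "test" };
        object[] objects = strings;
        object first = objects[0];       // reading through the object[] view is always safe
        Console.WriteLine(first);
        Console.WriteLine(StoreIsRejected());
    }
}
```

Every store into an array of reference types has to pay for that check, which is exactly the overhead that invariant generic types avoid.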

Let’s get back to our original question. What if you really did want covariant generic types in a .NET language? Does the CLR really prevent that? Well, you may recall that I alluded to the fact that Eiffel makes use of additional features in the CLR generic support that C# and VB do not. In Eiffel, generic types are always covariant (and so have the same sort of run-time checks as arrays). If you were to look at the draft v2 ECMA specification, you would see the following:

In addition, CLI supports covariant and contravariant generic parameters, with the following characteristics:

· It is type-safe (based on purely static checking)

· Simplicity: in particular, variance is only permitted on generic interfaces and generic delegates (not classes or value-types)

· Languages not wishing to support variance can ignore the feature, and treat all generic types as non-variant.

· Enable implementation of more complex covariance scheme as used in some languages, e.g. Eiffel.

Contravariance is the opposite of covariance: the subtype relationship of the generic type varies inversely with the relationship of the type parameter. More formally, if G is contravariant in its type parameter, then G<T> is a subtype of G<U> if and only if U is a subtype of T.

This is a pretty cool approach in my opinion. It allows a compiler to mark generic type parameters on an interface as being non-variant, covariant or contravariant, and then enforces that those type parameters are used only in places where they are safe (i.e. don’t require a run-time type check). Specifically, it’s always safe to use a covariant type parameter as an output (such as a return type), and a contravariant type parameter as an input. In IL, covariant type parameters are indicated by a ‘+’, and contravariant type parameters are indicated by a ‘-’ (non-variant type parameters are the default, and can be used anywhere). Consider this simple example (using a theoretical extension of C#) from the draft ECMA spec [Update: switched to the newly announced C# 4.0 syntax]:

      // Covariant parameters can be used as result types
      interface IEnumerator<out T> {
            T Current { get; }
            bool MoveNext();
      }

      // Covariant parameters can be used in covariant result types
      interface IEnumerable<out T> {
            IEnumerator<T> GetEnumerator();
      }

      // Contravariant parameters can be used as argument types
      interface IComparer<in T> {
            bool Compare(T x, T y);
      }

This would mean we could write code like the following:

      IEnumerable<string> stringCollection = ...;
      IEnumerable<object> objectCollection = stringCollection;
      foreach( object o in objectCollection ) { ... }

      IComparer<object> objectComparer = ...;
      IComparer<string> stringComparer = objectComparer;
      bool b = stringComparer.Compare( "x", "y" );

But we couldn’t do the opposite: try to convert an IEnumerable<object> to an IEnumerable<string>, or an IComparer<string> to an IComparer<object>. In those cases, we’d get a compile-time error telling us such a conversion wouldn’t be type-safe. Of course, languages (like Eiffel) are free to build their own run-time type checking on top of the strict CLR support to relax the rules where they see fit.
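For anyone who wants to try this with the C# 4.0 syntax, here is a small self-contained sketch of my own. It uses the real BCL interfaces, so note that IComparer<T>.Compare returns an int there rather than the bool used in the spec example above. ISource<T>, ConstantSource, TextLengthComparer and VarianceDemo are hypothetical names I made up for illustration. It also includes a delegate conversion, since the spec permits variance on generic delegates too:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical interface: T appears only as a result, so it can be covariant.
public interface ISource<out T>
{
    T Next();
    // void Put(T item);   // would not compile: a covariant T used as an input
}

public sealed class ConstantSource : ISource<string>
{
    public string Next() { return "hello"; }
}

// Hypothetical comparer over any object: orders values by the length of their text.
public sealed class TextLengthComparer : IComparer<object>
{
    public int Compare(object x, object y)
    {
        return x.ToString().Length.CompareTo(y.ToString().Length);
    }
}

public static class VarianceDemo
{
    public static int CountViaObjectView()
    {
        IEnumerable<string> stringCollection = new List<string> { "x", "y" };
        IEnumerable<object> objectCollection = stringCollection;  // covariance
        int count = 0;
        foreach (object o in objectCollection) { count++; }
        return count;
    }

    public static int CompareViaStringView()
    {
        IComparer<object> objectComparer = new TextLengthComparer();
        IComparer<string> stringComparer = objectComparer;        // contravariance
        return stringComparer.Compare("x", "yy");                 // negative: "x" is shorter
    }

    public static object CallThroughDelegate()
    {
        Func<string> makeString = () => "made";
        Func<object> makeObject = makeString;                     // delegate covariance
        return makeObject();
    }

    public static object ReadFromSource()
    {
        ISource<object> source = new ConstantSource();            // ISource<string> seen as ISource<object>
        return source.Next();
    }

    public static void Main()
    {
        Console.WriteLine(CountViaObjectView());
        Console.WriteLine(CompareViaStringView() < 0);
        Console.WriteLine(CallThroughDelegate());
        Console.WriteLine(ReadFromSource());
    }
}
```

None of these conversions needs a cast or a run-time check. Reversing any of the assignments (for example, assigning an IEnumerable<object> to an IEnumerable<string>) is rejected by the compiler.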

In my opinion this sort of language feature would occasionally be very useful (although the real scenarios would, of course, be much more involved than the simplistic examples above). A couple of times while writing real-world C# code, I knew I could write simpler code if the CLR generic variance support were exposed in C#. Generic variance support in C# has been discussed on the MSDN product feedback center, and I don’t think the C# team has ruled it out for a future addition to the language (but of course I don’t know their plans any better than you do). Although I think this is a cool feature, I don’t think I’m going to start writing any applications in IL just so I can take advantage of it. However, it would probably be a fun project to write a C# compiler, or extend an existing free one (like Mike Stall’s Blue), with support for generic variance. If you're interested in more details (including a thorough description of assignment compatibility in the face of generic variance), see the draft v2 ECMA specification.

 

[Update: Added details about covariant arrays from Jim Miller's CLI book, and pointer to the draft v2 ECMA spec above.]

[Update: See my post "More on generic variance" for more details and discussion]

[Update: Here's an MSDN article that discusses this as well: https://msdn2.microsoft.com/en-us/library/ms228359(vs.80).aspx]

[Update: Anders has finally announced that C# 4.0 and the BCL will support generic variance with "in" and "out" keywords. This has been in the works for a while, and of course I'm super excited about it.]