More on generic variance

In my entry on generic variance in the CLR, I said that you can’t convert a List<String> to a List<Object>, or even an IEnumerable<String> to an IEnumerable<Object>. I should point out, however, that the real-world scenarios where you’d want to do this usually involve passing an object of a more specific type to an API that (for abstraction reasons) takes a less specific type. For example, say you have a class hierarchy like this:

 

    using System;                      // Math
    using System.Collections.Generic;  // List<T>, IEnumerable<T>

    abstract class Shape
    {
        public abstract double ComputeArea();
    }

    class Square : Shape
    {
        public Square(double height)
        {
            m_height = height;
        }

        public override double ComputeArea()
        {
            return m_height * m_height;
        }

        private double m_height;
    }

    class Circle : Shape
    {
        public Circle(double radius)
        {
            m_radius = radius;
        }

        public override double ComputeArea()
        {
            return Math.PI * m_radius * m_radius;
        }

        private double m_radius;
    }

You’d like to be able to use it uniformly like this:

 

    class Program
    {
        static void Main(string[] args)
        {
            List<Square> ls = new List<Square>();
            ls.Add(new Square(2));
            ls.Add(new Square(3));
            double totS = GetTotalArea(ls); // won’t compile

            List<Circle> lc = new List<Circle>();
            lc.Add(new Circle(2));
            lc.Add(new Circle(3));
            double totC = GetTotalArea(lc); // won’t compile
        }

        public static double GetTotalArea(IEnumerable<Shape> l)
        {
            double total = 0;
            foreach (Shape s in l)
            {
                total += s.ComputeArea();
            }
            return total;
        }
    }

 

This would only work if C# supported generic variance and IEnumerable were declared covariant, as IEnumerable<+T>. However, there is another option in this case. You could make GetTotalArea a generic method and rely on a base-type constraint:

        public static double GetTotalArea<T>(IEnumerable<T> l)
            where T : Shape
        {
            double total = 0;
            foreach (Shape s in l)
            {
                total += s.ComputeArea();
            }
            return total;
        }

 

Now the Main method above will work unchanged: you don’t even have to specify the type arguments, because C#’s type inference figures them out automatically. So although you're not really converting an IEnumerable<Square> to an IEnumerable<Shape>, you are able to use it that way.
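To make the inference concrete, here is a minimal self-contained sketch. The class name InferenceDemo and the cut-down Shape and Square types are mine, for illustration only. It shows that the call with inferred type arguments and the call that spells them out are the same call:

```csharp
using System;
using System.Collections.Generic;

abstract class Shape
{
    public abstract double ComputeArea();
}

class Square : Shape
{
    private readonly double m_height;
    public Square(double height) { m_height = height; }
    public override double ComputeArea() { return m_height * m_height; }
}

// InferenceDemo is a hypothetical name used only for this sketch.
static class InferenceDemo
{
    // Same shape as the generic method above: T is bounded from above by Shape.
    public static double GetTotalArea<T>(IEnumerable<T> l) where T : Shape
    {
        double total = 0;
        foreach (T s in l)
        {
            total += s.ComputeArea();
        }
        return total;
    }

    static void Main()
    {
        List<Square> ls = new List<Square>();
        ls.Add(new Square(2));
        ls.Add(new Square(3));

        double inferred = GetTotalArea(ls);            // T is inferred as Square
        double spelledOut = GetTotalArea<Square>(ls);  // the same call, written out

        Console.WriteLine(inferred);    // prints 13
        Console.WriteLine(spelledOut);  // prints 13
    }
}
```

Inside the method, each element is statically a T, and the constraint is what lets you call ComputeArea on it.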

 

This is certainly great. And in practice, most .NET developers will find the support in .NET 2.0 for generics to be more than powerful enough. However, if you’re trying to do some serious “generic programming” [3], or if you’re excited by programming language theory like I am, then this does still leave something to be desired. Most notably, we can specify an upper bound using a constraint like this (rather than require our language to support covariant type parameters), but the CLR and C# don’t have support for lower bounds (“supertype constraints”), so you can’t use this technique in place of contravariant type parameters (e.g. my IComparer<-T> example). See section 4.4 (“Comparison with Parametric Methods”) of [1] for more details.
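To illustrate where the gap is, here is a hedged sketch. ShapeAreaComparer, SortHelper and SortWith are hypothetical names of mine, not something taken from [1]. When both type parameters belong to the same generic method, a constraint of the form "where T : U" can stand in for a lower bound on U. What C# has no syntax for, as far as I know, is the same bound on a method of a generic class such as List<T>, where T is already fixed by the class and the method can only constrain its own new type parameters from above:

```csharp
using System;
using System.Collections.Generic;

abstract class Shape
{
    public abstract double ComputeArea();
}

class Square : Shape
{
    private readonly double m_height;
    public Square(double height) { m_height = height; }
    public override double ComputeArea() { return m_height * m_height; }
}

// A comparer written once against the base type (hypothetical, for this sketch).
class ShapeAreaComparer : IComparer<Shape>
{
    public int Compare(Shape a, Shape b)
    {
        return a.ComputeArea().CompareTo(b.ComputeArea());
    }
}

static class SortHelper
{
    // "U is a supertype of T", expressed as an upper bound on T.
    // This only works because T and U are both parameters of this method.
    public static void SortWith<T, U>(List<T> items, IComparer<U> comparer)
        where T : U
    {
        // Each T converts implicitly to U, so the U-comparer can compare Ts.
        items.Sort(delegate(T a, T b) { return comparer.Compare(a, b); });
    }

    static void Main()
    {
        List<Square> squares = new List<Square>();
        squares.Add(new Square(3));
        squares.Add(new Square(1));
        squares.Add(new Square(2));

        // Sort a List<Square> with an IComparer<Shape>.
        SortWith<Square, Shape>(squares, new ShapeAreaComparer());

        foreach (Square s in squares)
        {
            Console.WriteLine(s.ComputeArea());  // prints 1, then 4, then 9
        }
    }
}
```

The price is an extra type parameter on every such method, and the trick is not available to instance methods that need a supertype of their class's own T.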

 

Java 5 addresses these scenarios with “wildcard types” [2]. For example, you would use a “List<? extends Shape>” in Java to implement my example above, and there is also a syntax “? super T” for lower bounds. It’s interesting to note that the authors of the paper on wildcards indicate that the main advantage of wildcard types over using normal generic methods (as I’ve done above) is that wildcard types don’t require exact type information (see section 4.4 of [2]). The big difference between generics in .NET and Java is that in .NET, the CLR supports generics to the core, and so we have exact type information everywhere (which is why you can get information about generic types using Reflection at run-time). In Java, generics are “erased” by the compiler, and so at run-time the information about generic instantiations is lost. Using erasure has the benefit of better compatibility with old code and a much simpler implementation, but extending the VM (like we did for the CLR) has several important benefits, including avoiding boxing for instantiations at value types and therefore better performance (e.g. a List<int> can be very efficient), and the ability to use exact type information at run-time.
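As a small sketch of what “exact type information at run-time” means in practice, the following uses only standard Reflection calls (Type.GetGenericArguments and Type.GetGenericTypeDefinition). The ExactTypeDemo class and ElementTypeOf helper are names I made up for the example:

```csharp
using System;
using System.Collections.Generic;

// ExactTypeDemo and ElementTypeOf are hypothetical names used only for this sketch.
static class ExactTypeDemo
{
    // Returns the first run-time type argument of a constructed generic type.
    // (Throws if the object's type is not generic; fine for a sketch.)
    public static Type ElementTypeOf(object instance)
    {
        return instance.GetType().GetGenericArguments()[0];
    }

    static void Main()
    {
        // Held as plain objects, so nothing is known statically about T.
        object ints = new List<int>();
        object strings = new List<string>();

        Console.WriteLine(ElementTypeOf(ints));     // prints System.Int32
        Console.WriteLine(ElementTypeOf(strings));  // prints System.String

        // Different instantiations are different run-time types...
        Console.WriteLine(ints.GetType() == strings.GetType());  // prints False

        // ...that share one generic type definition.
        Console.WriteLine(
            ints.GetType().GetGenericTypeDefinition() == typeof(List<>));  // prints True
    }
}
```

Under erasure, the equivalent of the first two lines could not tell the two lists apart, because both would simply be a List at run-time.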

 

The big difference between wildcards and generic variance in the CLR is that wildcard types are an example of “usage-site variance”, whereas MSIL uses “definition-site variance” (meaning it’s the type definition that specifies the variance annotation, not the user of the type). I was reading about a cool academic programming language called Scala recently, and was pleased to see that after some experience with usage-site variance, they decided to switch to definition-site variance because they found it easier to use correctly (see “Comparison with wildcards” in [4]). Scala is a very cool language (my programming languages professor recently mentioned it as a great example of a language on the forefront of modern academic language design), and it can target .NET. Unfortunately they haven’t built support for the V2.0 CLR yet, so they aren’t actually making use of the definition-site variance support in the CLR (for the moment <grin>).

 

Anyway, I think that’s about all I have to say about generic variance. I’ve got a programming languages exam on Tuesday (I’m working on my master’s in computer science), so I think I’d better stop procrastinating and study for it <grin>.

 

References

[1] On variance-based subtyping for parametric types

[2] Adding Wildcards to the Java Programming Language

[3] A Comparative Study of Language Support for Generic Programming

[4] An Overview of the Scala Programming Language (2. Edition)