I’ve written a lot about this already, but I think one particular point bears repeating.

As we’re getting closer to shipping C# 4.0, I’m seeing a lot of documents, blogs, and so on, attempting to explain what “covariant” means. This is a tricky word to define in a way that is actually meaningful to people who haven’t already got degrees in category theory, but it can be done. And I think it’s important to avoid defining a word to mean something other than its actual meaning.

A number of those documents have led with something like:

“Covariance is the ability to assign an expression of a more specific type to a variable of a less specific type. For example, consider a method M that returns a Giraffe. You can assign the result of M to a variable of type Animal, because Animal is a less specific type that is compatible with Giraffe. Methods in C# are ‘covariant’ in their return types, which is why when you create a covariant interface, it is indicated with the keyword ‘out’ — the returned value comes ‘out’ of the method.”

But that’s **not at all** what covariance means. That’s describing “assignment compatibility” — the ability to assign a value of a more specific type to a storage of a compatible, less specific type is called “assignment compatibility” because the two types are compatible for the purposes of verifying legality of assignments.

So what does covariance mean then?

First off, we need to work out precisely what the adjective “covariant” applies to. I’m going to get more formal for a bit here, but try to keep it understandable.

Let’s start by not even considering types. Let’s think about integers. (And here I am speaking of actual mathematical integers, not of the weird behaviour of 32-bit integers in unchecked contexts.) Specifically, we’re going to think about the ≤ relation on integers, the “less than or equal to” relation. (Recall that of course a “relation” is a function which takes two things and returns a bool which indicates whether the given relationship holds or does not hold.)

Now let’s think about a projection on integers. What is a projection? A projection is a function which takes a single integer and returns a new integer. So, for example, z → z + z is a projection; call it D for “double”. So are z → 0 – z, N for “negate” and z → z * z, S for “square”.

Now, here’s an interesting question. Is it always the case that (x ≤ y) = (D(x) ≤ D(y))? Yes, it is. If x is less than y, then twice x is less than twice y. If x is equal to y then twice x is equal to twice y. And if x is greater than y, then twice x is greater than twice y. The projection D preserves the direction of size.

What about N? Is it always the case that (x ≤ y) = (N(x) ≤ N(y))? Clearly not. 1 ≤ 2 is true, but -1 ≤ -2 is false. But we notice that the reverse is always true! (x ≤ y) = (N(y) ≤ N(x)). The projection N reverses the direction of size.

What about S? Is it always the case that (x ≤ y) = (S(x) ≤ S(y))? No. -1 ≤ 0 is true, but S(-1) ≤ S(0) is false. What about the opposite? Is it always the case that (x ≤ y) = (S(y) ≤ S(x)) ? Again, no. 1 ≤ 2 is true, but S(2) ≤ S(1) is false. The projection S does not preserve the direction of size, and nor does it reverse it.

The projection D is “covariant” — it preserves the ordering relationship on integers. The projection N is “contravariant”. It reverses the ordering relationship on integers. The projection S does neither; it is “invariant”.

Now I hope it is more clear exactly what is covariant or contravariant. The integers themselves are not variant, and the “less than” relationship is not variant. It’s **the projection** that is covariant or contravariant — the rule for taking an old integer and making a new one out of it.

So now lets abandon integers and think about reference types. Instead of the ≤ relation on integers, we have the ≤ relation on reference types. A reference type X is smaller than (or equal to) a reference type Y if a value of type X can be stored in a variable of type Y. That is, if X is “assignment compatible” with Y.

Now consider a projection from types to types. Say, the projection “T goes to IEnumerable<T>”. That is, we have a projection that takes a type, say, Giraffe, and gives you back a new type, IEnumerable<Giraffe>. Is that projection covariant in C# 4? Yes. It preserves the direction of ordering. A Giraffe may be assigned to a variable of type Animal, and therefore an sequence of Giraffes may be assigned to a variable that can hold a sequence of Animals.

We can think of generic types as “blueprints” that produce constructed types. Let’s take the projection that takes a type T and produces IEnumerable<T> and simply call that projection “IEnumerable<T>”. We can understand from context when we say “IEnumerable<T> is covariant” what we mean is “the projection which takes a reference type T and produces a reference type IEnumerable<T> is a covariant projection”. And since IEnumerable<T> only has one type parameter, it is clear from the context that we mean that the parameter to the projection is T. After all, it is a lot shorter to say “IEnumerable<T> is covariant” than that other mouthful.

So now we can define covariance, contravariance and invariance. A generic type I<T> is **covariant** (in T) if construction with reference type arguments **preserves** the direction of assignment compatibility. It is **contravariant** (in T) if it **reverses** the direction of assignment compatibility. And it is **invariant** if it does neither. And by that, we simply are saying in a concise way that the **projection** which takes a T and produces I<T> is a covariant/contravariant/invariant projection.

UPDATE: My close personal friend (and fellow computer geek) Jen notes that in the Twilight series of novels, the so-called “werewolves” (who are not transformed by the full moon and therefore not actually werewolves) maintain their rigid social ordering in both wolf and human form; the projection from human to wolf is covariant in the social-ordering relation. She also notes that in high school, programming language geeks are at the bottom of the social order, but the projection to adulthood catapults them to the top of the social order, and therefore, growing up is contravariant. I am somewhat skeptical of the latter claim; the former, I’ll take your word for it. I suppose the question of how social order works amongst teenage werewolves who are computer geeks is a subject for additional research. Thanks Jen!

Great explanation. A small suggestion: when giving the integers example at first, you write

"It’s the projection that is covariant or contravariant." You might want to add "with respect to the <= ordering" just to make that explicit.

Thank you, thank you, thank you! I finally understand the concept – this is the first explanation I have read that *makes sense*!

Thanks a lot Eric, for such neat explanation. I was always confused about these until now.

Nicely done. Wikipedia has an article on it here: http://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science)

On Jen’s note, in Teen Wolf, the wolf catapulted Scott Howard up the social-ordering relation, transforming from a relative nobody into the most popular kid in school (and an excellent basketball player). In this case, would the projection from human to wolf be contravariant? If so, would that render the overall projection invariant, since Twilight apparently goes against those rules?

That was the best explanation of those two terms that I have yet com across. Thanks!

This is another excellently communicated post. If you are looking for more information on covariance/contravariance, I found this link helpful:

http://research.microsoft.com/apps/pubs/default.aspx?id=64042

It is more abstract than this post, but it helps to explain why the in/out keywords were selected, although they use + and – in the paper.

That is an amazingly un-intuitive use of the word “invariant”. “Invariant” means “unchanging” in every other context, but an “invariant generic type” is apparently one which changes subtyping relationships (in a way other than simply reversing them)! That would break my brain if I had to deal with that term on a regular basis.

(For the record, vectors and tensors can have “mixed variance”, and I’m not sure if there’s any term at all in category theory for something which is neither covariant or contravariant; category theory applies those terms to “functors” and a functor has to be either covariant or contravariant by definition.)

It’s only going to get worse. On Thursday I’m going to give a deeply counter-intuitive definition of “invariantly valid”.

A small saving grace is that an invariant generic type simply destroys subtyping relationships. A List<X> is in no way related to List<Y> no matter how (different) X and Y are related. — Eric

I am with @Erik – couldn’t have said it better. Thanks a lot!

As to the teenage werewolves who are computer geeks, if we assume that the New Age Techno-werewolfs are transformed not by the full moon, but by the prospect of better salary rate and/or career advance, then "growing up" and "transition from human to wolf" are pretty much the same ptojection. So one of them cannot be covarian, and the other contravariant! ðŸ˜€

Thanks for (at last) explaining co and contravariance without mentioning arrays and programming language consturucts. I didn’t know the roots of this terms and have troubles with differentiating them when reading in pure programming context (especially when comparing Java to .NET).

Great analogy, I’ll definitely use it sometime..

I have always wondered why those particular words were used to describe that principle (covariance, contravariance, invariance). Does anyone have an idea on their etymology?

If everybody who calls themselves programmers had picked up a copy of Object-Oriented Software Construction (2nd ed) and actually read it, this wouldn’t be an issue

As an "old" Eiffel programmer it is nice to see that most of Eiffel’s main ideas are slowly making their way into the world of curly braces ðŸ˜‰

Btw, nice explanation, Eric.

"A generic type I<T> is covariant (in T)"

It would be nice if you mentioned in footnotes that I<T1, T2, T3> can be covariant (in T1), contravariant (in T2) and invariant (in T3) at the same time, just for completeness.

Thanks. 2 weeks ago I spend 6 hours to get a similar understanding. Next time I will wait for your blog article ðŸ˜‰

Now I understand and can even explain to someone else )

> As an “old” Eiffel programmer it is nice to see that most of Eiffel’s main ideas are slowly making their way into the world of curly braces

IIRC, Eiffel generic types were _always_ covariant on all type parameters, with no possibility to opt-out. I remember been exceedingly surprised when it let me write:

local

xs : ARRAYED_LIST[INTEGER]

ys : ARRAYED_LIST[NUMBER]

do

create xs.make(1)

ys := xs — not an error due to universal covariance

ys.extend(123.456) — great, we’ve just added a REAL to a list of INTEGERs

put_integer(xs.i_th(1)) — prints out garbage

end

It was really scary to see how a supposedly high-level language was not only not typesafe, but not even memory safe.

Now granted, this was a few years ago, and I’ve heard that this kind of things was fixed (or at least being fixed) in 6.x… still, I’m glad that C# rather went for no generic covariance at start, and then added it while doing it right, rather than going for the broken model from the get go, and then trying to fix it in a backwards-compatible way.

Eiffel is a bit of an odd duck in many respects. For example, it allows not just covariance on return types of virtual method overrides (which is typesafe) but also allows covariance on parameter types! Contravariance on parameter types would be type safe, but covariance certainly is not. — Eric

The penny has finally dropped. I have now reread the entire series of posts in a new light.

Thanks.

@Eirik M: OOSC2 is huge, but well worth the read. I too fervently hope that C# is at least occasionally glancing at the Eiffel feature set, although I think the MI-train has unfortunately long since left the station. With the introduction of covariance, though, there’s still hope for anchored types!

I’m programming in C# 2.0 and I’ve run into the problem of no-variant (or "invariant", as Eric calls them) generics. Having done a lot of Eiffel programming, I was dismayed to find that C# 2.0 generics lack this essential feature.

I found this excellent page while googling for a solution. While I’m really pleased to learn that C# 4 will fix this defect, it doesn’t help me right now. How do C# programmers normally work around the lack of covariant generics?

An amazing explanation. And , I think, Gabriel has a valid point here. The "with respect to <= ordering" would have further clarified it.

Just wanted to say "Thank you so much for this explanation".

Thank you so much! As for generic type parameters this seems really clear.

But how does “Java method return types are covariant” fit in here? How is that a projection? Or is this an invalid statement?

Could one say, that a implicit cast behaves covariantly? And that C# method return types and arguments are invariant?

Your question is essentially “in what way is

return type covariancean example ofcovariance?” Let’s take a step back and re-state our definition of covariance. Suppose you have a category S. Suppose you have a Boolean relation on members of that category, which we will write x~y. And suppose you have an function F:S->S — that is, a function from a category onto itself. F is said to be covariant if x~y implies that F(x)~F(y). In the kind of covariance I’ve been talking about so far, S is the category of reference types, “~” means “is assignment compatible”, and F(x)–>IEnumerable(x). (And in fact, F meets the definition of an endofunctor: F is a covariant function from a category onto itself which preserves morphisms under composition.)Now let’s make S a bit broader. Let’s say that the notion of “a virtual function that takes no parameters and returns R” is also a member of S, for all possible return types. We’ll notate these guys as V<R>. This is not a type per se, but it is a member of S. Let’s extend our definition of x~y to mean “if x and y are both types then x~y means that x is assignment compatible with y, and if x and y are both of the form V<RX> and V<RY> then x~y means that V<RX> can be a legal virtual override of V<RY>”.

Now let’s make a function G:S–>S such that G(x)–>V<x> if x is a type, and G(x)–>x if x is a virtual function signature. G meets our definition of a covariant function with respect to ~ in the Java type system: that is, if x~y then G(x)~G(y).

We could then extend this argument to virtual functions with arguments, and so on. But I hope you see how this goes. Essentially “covariant” means “some transformation preserves some aspect of compatibility”. In the kind of covariance I’m talking about, the kind of compatibility is assignment compatibility, but it can be some broader notion of compatibility, including virtual method substitution. — Eric

Got that! Now on to casts.

Is “implicit cast” an example of (safe) covariance? And can a “explicit cast” be one of (unsafe) contravariance?

In the German Wikipedia it says, that “Co- and Contravariances” describes if a aspect in an object-oriented system behaves along the lines of inheritance or against it.

In that definition a implicit cast would behave (safely) along, and a explicit cast (unsafely) against.

Is that nonsense? (I feel it could be :-))

If it is right, his statements could help people to understand co- and contravariance better – without explaining it too detailed.

I’m not following you here. Go back to the definition I’ve given: to figure out if something is “covariant” then (1) state a relation ~:(S, S)–>bool (2) state a mapping F:S–>S and (3) determine whether x~y implies that F(x)~F(y). In your example, what do you propose is the relation and what do you propose is the mapping? — Eric