Why can’t I declare a type that derives from a generic type parameter?


A lot of questions about C# generics come from the starting point that they are just a cutesy C# name for C++ templates. But while the two may look similar in the source code, they are actually quite different.

C++ templates are macros on steroids. No code gets generated when a template is "compiled"; the compiler merely hangs onto the source code, and when you finally instantiate it, the actual type is inserted and code generation takes place.

// C++ template
template<class T>
class Abel
{
public:
 int DoBloober(T t, int i) { return t.Bloober(i); }
};

This is a perfectly legal (if strange) C++ template class. But when the compiler encounters this template, there are a whole bunch of things left unknown. What is the return type of T::Bloober? Can it be converted to an int? Is T::Bloober a static method? An instance method? A virtual instance method? A method on a virtual base class? What is the calling convention? Does T::Bloober take an int argument? Or maybe it's a double? Even stranger, it might accept a Canoe which gets converted from an int by a converting constructor. Or maybe it's a function that takes two parameters, but the second parameter has a default value.
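
To make a few of those unknowns concrete, here is a minimal sketch that feeds the same Abel template three types whose Bloober methods have quite different shapes. Only Abel comes from the example above; the types StaticBloober, DoubleBloober, and DefaultArgBloober and the helper CheckBlooberShapes are hypothetical names made up for this illustration.

```cpp
#include <cassert>

// The template from the article, unchanged.
template<class T>
class Abel
{
public:
    int DoBloober(T t, int i) { return t.Bloober(i); }
};

// Hypothetical: Bloober is a static method.
struct StaticBloober
{
    static int Bloober(int i) { return i + 1; }
};

// Hypothetical: Bloober takes and returns a double, so the int argument
// is widened on the way in and the result is truncated on the way out.
struct DoubleBloober
{
    double Bloober(double d) { return d * 2.5; }
};

// Hypothetical: Bloober takes two parameters, the second one defaulted.
struct DefaultArgBloober
{
    int Bloober(int i, int scale = 10) { return i * scale; }
};

// Each call below makes the compiler answer the questions differently,
// and it can do so only because T is known at that point.
void CheckBlooberShapes()
{
    Abel<StaticBloober> a;
    assert(a.DoBloober(StaticBloober(), 3) == 4);      // static call through an object

    Abel<DoubleBloober> b;
    assert(b.DoBloober(DoubleBloober(), 3) == 7);      // 3 -> 3.0, 7.5 -> 7

    Abel<DefaultArgBloober> c;
    assert(c.DoBloober(DefaultArgBloober(), 3) == 30); // scale defaults to 10
}
```

The single line t.Bloober(i) ends up as three rather different pieces of generated code, which is the sense in which the template is closer to a macro than to a compiled function.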

Nobody knows the answers to these questions, not even the compiler. It's only when you decide to instantiate the template

Abel<Baker> abel;

that these burning questions can be answered, overloaded operators can be resolved, conversion operators can be hunted down, parameters can get pushed on the stack in the correct order, and the correct type of call instruction can be generated.

In fact, the compiler doesn't even care whether or not Baker has a Bloober method, as long as you never call Abel<Baker>::DoBloober!

void f()
{
 Abel<int> a; // no error!
}

void g()
{
 Abel<int> a;
 a.DoBloober(0, 1); // error here
}

Only if you actually call the method does the compiler start looking for how it can generate code for the DoBloober method.
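
The same laziness applies to class types, not just int. The sketch below assumes a hypothetical type Charlie with no Bloober method at all, and adds a hypothetical second member, Twice, to Abel purely so that there is something harmless to call. Neither name appears in the original example.

```cpp
#include <cassert>

template<class T>
class Abel
{
public:
    int DoBloober(T t, int i) { return t.Bloober(i); }
    int Twice(int i) { return 2 * i; }  // hypothetical member that never touches T
};

struct Charlie { };  // hypothetical type, deliberately lacking Bloober

int UseAbelWithoutBloober()
{
    Abel<Charlie> abel;       // fine: only the member declarations are instantiated
    return abel.Twice(21);    // fine: the body of Twice does not mention Bloober
    // abel.DoBloober(Charlie(), 1);  // uncommenting this is what produces the error
}

// An explicit instantiation definition asks for every member at once,
// so this line would fail to compile if it were uncommented:
// template class Abel<Charlie>;
```

In other words, the error belongs to the use of DoBloober, not to the existence of Abel<Charlie>.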

C# generics aren't like that.

Unlike C++, where a non-instantiated template exists only in the imaginary world of potential code that could exist but doesn't, a C# generic results in code being generated, but with placeholders where the type parameter should be inserted.

This is why you can use generics implemented in another assembly, even without the source code to that generic. This is why a generic can be recompiled without having to recompile all the assemblies that use that generic. The code for the generic is generated when the generic is compiled. By comparison, no code is generated for a C++ template until the template is instantiated.

What this means for C# generics is that if you want to do something with your type parameter, it has to be something that the compiler can figure out how to do without knowing what T is. Let's look at the example that generated today's question.

class Foo<T>
{
 class Bar : T
 { ... }
}

This is flagged as an error by the compiler:

error CS0689: Cannot derive from 'T' because it is a type parameter

Deriving from a generic type parameter is explicitly forbidden by section 25.1.1 of the C# language specification. Consider:

class Foo<T>
{
 class Bar : T
 {
   public void FlubberMe()
   {
     Flubber(0);
   }
 }
}

The compiler doesn't have enough information to generate the IL for the FlubberMe method. One possibility is

ldarg.0        // "this"
ldc.i4.0       // integer 0 - is this right?
call T.Flubber // is this the right type of call?

The line ldc.i4.0 is a guess. If the method T.Flubber were actually void Flubber(long l), then the line would have to be ldc.i4.0; conv.i8 to load an 8-byte integer onto the stack instead of a 4-byte integer. Or perhaps it's void Flubber(object o), in which case the zero needs to be boxed.

And what about that call instruction? Should it be a call or callvirt?

And what if the method returned a value, say, string Flubber(int i)? Now the compiler also has to generate code to discard the return value from the top of the stack.

Since the source code for a generic is not included in the assembly, all these questions have to be answered at the time the generic is compiled. Besides, you can write a generic in Managed C++ and use it from VB.NET. Even saving the source code won't be much help if the generic was implemented in a language you don't have the compiler for!
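
For contrast, C++ is perfectly happy to let a template derive from its own parameter, because every one of those questions is postponed until the template is instantiated and T is known. The following is a sketch only; the base types LongFlubber and StringFlubber and the helper FlubberBoth are hypothetical names invented to mirror the two troublesome cases described above.

```cpp
#include <cassert>
#include <string>

// Hypothetical base: Flubber takes a long, so the literal 0 must be widened.
struct LongFlubber
{
    long last = -1;
    void Flubber(long l) { last = l; }
};

// Hypothetical base: Flubber is virtual and returns a value that the
// caller has to discard.
struct StringFlubber
{
    int calls = 0;
    virtual ~StringFlubber() = default;
    virtual std::string Flubber(int i) { ++calls; return std::to_string(i); }
};

template<class T>
class Bar : public T
{
public:
    void FlubberMe()
    {
        // this-> is needed because Flubber lives in a dependent base class;
        // the name is looked up only once T is known.
        this->Flubber(0);
    }
};

int FlubberBoth()
{
    Bar<LongFlubber> a;
    a.FlubberMe();          // compiled as a non-virtual call with a long argument
    assert(a.last == 0);

    Bar<StringFlubber> b;
    b.FlubberMe();          // compiled as a virtual call whose result is dropped
    return b.calls;
}
```

The compiler produces a separate FlubberMe for each base class, each with its own argument conversion and its own kind of call. That per-instantiation freedom is exactly what a C# generic gives up by being compiled once, ahead of time.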

Comments (18)
  1. Adam says:

    Interesting choice of metasyntactic variables.  I can’t say I’ve ever seen the first monkeys successfully sent into space used in code before.  Although I believe Able’s name was spelled Able, not Abel.

  2. Erzengel says:

    This is another one of those "Someone asked this?" But I suppose it makes sense that someone who doesn’t have an understanding of generics might wonder about this.

    For one thing, with generics, if I don’t restrict the generic to a particular interface, I can’t do anything to the generic type at all. I can’t use T.DoFunc because the compiler doesn’t know what DoFunc is, as said in this article. But if I restrict T to an interface (or more accurately, a derivation of the interface), then I can call those functions.

    That makes me wonder in the context of the article, what if the type parameter were restricted to an interface (i.e., "where T: class, IDictionary")? You’d only be able to use functions that your base type overloads from the interface, and you obviously couldn’t use sealed classes.

    But the problem really isn’t with calling functions from the base class. The problem is from DERIVING from the base class. Code needs to be generated for the generic class. It would be very difficult, I think, to create a class that needs to have its base class changed every time the generic is instantiated. Is it even possible?

  3. zahical says:

    "imaginary world of potential code"

    Beautifully said, Mr. Chen!

  4. Anonymous says:

    I don’t code in C#, but it seems pretty useful to be able to say, “instantiation of this generic requires you specify a type that implements this interface.”  That way the generic could call Flubber() and the compiler can still check at compile-time whether you’re using something that conforms to IFlubberable.  Any C# types reading this know if this is possible?

    [You can always read about C# generics yourself and find out. It’s quite readable, really. -Raymond]
  5. Luke says:

    @Adam [waaay off-topic]: "Able" and "Baker" are the first two "letters" of the old US military Phonetic Alphabet:

    http://en.wikipedia.org/wiki/Joint_Army/Navy_Phonetic_Alphabet

    which was developed decades before the space race. The monkeys’ names are therefore a little bit of aerospace geek humor; a more humane way of naming them "monkey A" and "monkey B". The phonetic alphabet is also the reason that people (not just Mr. Chen) use Able and Baker as metasyntactic variables. I imagine "able" often gets changed to "abel" because then they both sound like English names. (like Cain and Abel)

  6. Paul M. Parks says:

    This is why I like writing .NET code with C++/CLI. I can use both templates and generics, which opens up some interesting possibilities.

    That, and I don’t have to fool around with P/Invoke.

  7. Kevin says:

    ldarg.1        // "this"

    This should actually be ldarg.0

    [Fixed, thanks. -Raymond]
  8. Mihailik says:

    Well, you’re more careful, but still referring to IL is quite a miss.

    Deriving from T by no means implies calling a non-existent method.

    It’s the same as saying ‘4-storey buses don’t exist because they would have had square wheels’. There’s just no connexion between those 2 things.

    Another incorrect IL-influenced assumption. Call/callvirt dilemma is an everyday thing in generics. Fixating on IL, how would you invoke ToString method for a variable of type T? Surely for struct and class that’s a bit different :-)

    The correct reason why section 25.1.1 prohibits this is the much more complex machine code generation required for that case. Currently, a generic class is quite certain of what it is derived from, so in most cases the actual machine code may be generated only once for all T types (in the case of non-value types, anyway). With parametrised inheritance each T would require its own machine code generation.

    Still, possible but very annoying.

    Additional grief would be caused by abstract methods and non-default constructors. Though that could be simply outlawed, so that’s not a blocker.

  9. Random832 says:

    "how would you invoke ToString method for a variable of type T?" – ildasm says callvirt   instance string [mscorlib]System.Object::ToString()

    If I actually limit the generic to value types [Class x(Of t as Structure) – don’t know the C# syntax offhand], it calls ValueType::ToString

    ValueType is a subclass of Object, even though it’s clearly a special case.

  10. Random832 says:

    Of course, (I thought of this case after posting) that doesn’t explain the case where it picks up the right version of the function if it Shadows (i.e. "new" function in C#) and therefore doesn’t appear in the ToString vtable slot from Object.

  11. Jason says:

    Anonymous:

    Yes, you specify:

    class Foo<T> where T : IFlubberable

    {

    }

    .. However you still can’t derive an inner class from T since Raymond’s assertions still hold. You can, of course, derive from IFlubberable.

  12. iliyat says:

    > In fact, the compiler doesn’t even care whether or not Baker has a Bloober method, as long as you never call Abel<Baker>::DoBloober

    The Microsoft compiler doesn’t care. gcc would compile the entire class at the instantiation point and fail because of the missing Bloober method. As far as I remember gcc does the right (standard conformant) thing here

    [Section 14.7.1 (Implicit instantiation) paragraph 1 says “… The implicit instantiation of a class template specialization causes the implicit instantiation of the declarations, but not of the definitions or default arguments, of the class member functions… [T]he specialization of the member is implicitly instantiated when the specialization is referenced in a context that requires the member definition to exist.” (Emphasis mine.) Therefore (according to my reading), Abel<int> a; causes the declaration int Abel<int>::DoBloober(int t, int i) to exist, but the function body does not exist because it has not been referenced in a context that requires its definition to exist. Since the body does not exist, there is no error yet. I may be reading it wrong, however. -Raymond]
  13. anonymous says:

    iliyat: No. gcc doesn’t compile the whole template code at instantiation.

  14. steveg says:

    @Paul M. Parks: This is why I like writing .NET code with C++/CLI. I can use both templates and generics, which opens up some interesting possibilities.

    I don’t know whether to be terrified or amazed. No maintenance issues?

  15. Michiel says:

    It’s a bit remarkable to say that C++ doesn’t generate code for templates, but C# does for generics. Both compile to an intermediate form.

    Now, the intermediate form for C# is IL, which is (A) standardized and (B) closer to x86 code, but in either case the conversion from text to runnable code takes two phases. The example C++ code merely shows that C++ has different phases.

    A good example would be { return t..Bloober(i); } which fails to compile, even before providing T t. There’s simply no way it could be correct.

    [The “intermediate form” generated by C# is in fact the final form generated by the compiler. (Besides, you can argue that x86 code still isn’t the “final form” since it has to be “compiled” by the CPU into smaller units for execution.) C#’s “final form” is consumable by other languages; C++’s isn’t. C++’s “processed template form” is not visible to other compilation units. C#’s is. -Raymond]
  16. Mike Dimmick says:

    iliyat, anonymous: The C++ standard says to bind names that do not depend on the template parameters when the template is encountered. It’s kind of a partial compilation. Microsoft’s compiler does not currently do this – it just captures the source code and uses it as a macro. You can get errors or differing behaviour when moving code between Visual C++ and a conforming compiler.

    Microsoft document this nonstandard behaviour at http://msdn.microsoft.com/en-us/library/w98s4hs8.aspx. Unfortunately, still not fixed in Visual Studio 10.

  17. Petr says:

    The example does not explain the problem of deriving from a generic type parameter. Sure, you cannot call this.Flubber(0) in the example, but it is not specific to this case. You cannot do this, either:

    class Foo<T>

    {

    class Bar

    {

      public void FlubberMe(T inst)

      {

        inst.Flubber(0);

      }

    }

    }

    You could fix that with a constraint, but no constraint allows you to derive from T. Why? The article does not explain this.

    [Good point. I answer the question with a puzzle. What does this program print? What are its consequences for deriving from a template class?

    class TBase {
      public void Flubber(int i) { }
    }

    class Foo<T> where T : TBase {
     public void DoFlubber(T t) { t.Flubber(0); }
    }
    class Baker : TBase {
     new public void Flubber(int i)
        { System.Console.WriteLine("yo"); }
    };
    class Program {
        public static void Main()
        {
          var f = new Foo<Baker>();
          f.DoFlubber(new Baker());
        }
    }

    -Raymond
    ]
  18. Miral says:

    Armchair analysis of that code suggests it should print nothing.  In Foo<T>, T is known to be a TBase, and TBase’s Flubber method prints nothing.  The fact that T is a Baker and Baker introduces a new method called Flubber is irrelevant, since it’s using a new slot.  (The old method still exists, you just can’t get to it without a base cast, which is effectively what the generic does.)

    Had the Flubber method been virtual (and Baker had overridden it), then it would have printed something.

    In general (other than being able to use new sometimes) generics behave no differently than standard methods using the assumed base, except that generics don’t need the typecasts.

    I’m not denying that generics are a very useful feature and well worth having, but sometimes I miss templates.  (A recent example was when I was trying to invoke TryParse on either of Int32 or Double depending on the type of a parameter.  Not hard to do with overloads instead, but it gets tedious, particularly if you want to handle any of the primitive types.  It also made me wish there was such a thing as a “static interface”.)

    [Okay, now what does this say about deriving from T? What does Foo<Baker>.Bar.FlubberMe() call? What does this mean for code generation? -Raymond]
