Calling static methods on type parameters is illegal, part one


A developer passed along a question from a customer to our internal mailing list for C# questions the other day. Consider the following:

public class C { public static void M() { /*whatever*/ } }
public class D : C { public new static void M() { /*whatever*/ } }
public class E<T> where T : C { public static void N() { T.M(); } }

That’s illegal. We do not allow you to call a static method on a type parameter. The question is, why? We know that T must be a C, and C has a static method M, so why shouldn’t this be legal?

I hope you agree that in a sensibly designed language exactly one of the following statements has to be true:

1) This is illegal.
2) E<T>.N() calls C.M() no matter what T is.
3) E<C>.N() calls C.M() but E<D>.N() calls D.M().

(If there is a fourth possible sensible behaviour which I have missed, I am happy to consider its merits.)

If we pick (2) then the feature is both potentially misleading and totally pointless. The user will reasonably expect D.M() to be called if T is a D. Why else would they go to the trouble of saying T.M() instead of C.M() if they mean “always call C.M()”?

If we pick (3) then we have violated the core design principle of static methods, the principle that gives them their name. Static methods are called “static” because it can always be determined exactly, at compile time, what method will be called. That is, the method can be resolved solely by static analysis of the code.
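Here is a minimal sketch of what "resolved by static analysis" looks like in practice. It reuses C and D from the example above, but the string return values, the body of N and the Program class are my additions for illustration. Inside E<T> the only legal spelling names the class explicitly, so the call is bound to C.M once, regardless of T.

```csharp
using System;

public class C { public static string M() { return "C.M"; } }
public class D : C { public new static string M() { return "D.M"; } }

// The call below names C explicitly, so the compiler binds it to C.M
// when E<T> is compiled. The type argument plays no part in the choice.
public class E<T> where T : C
{
    public static string N() { return C.M(); }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(C.M());    // C.M
        Console.WriteLine(D.M());    // D.M
        Console.WriteLine(E<C>.N()); // C.M
        Console.WriteLine(E<D>.N()); // C.M
    }
}
```

In every line of Main you can point at the exact method that will run just by reading the source, which is the property that option (3) would give up.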

That leaves (1).

Related questions come up frequently, in various forms. Usually people phrase it by asking me why C# does not support “virtual static” methods. I am always at a loss to understand what they could possibly mean, since “virtual” and “static” are opposites! “virtual” means “determine the method to be called based on run time type information”, and “static” means “determine the method to be called solely based on compile time static analysis”.
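To make the contrast concrete, here is a small sketch using instance methods, where C# already offers both behaviours. Animal, Dog and their members are hypothetical names chosen for the illustration. The virtual method is picked from the run-time type of the object. The hidden ("new") method is picked from the compile-time type of the variable, exactly as the hidden static M is picked from the class name written in the source.

```csharp
using System;

public class Animal
{
    // virtual: the implementation is chosen from the object's run-time type
    public virtual string Speak() { return "Animal.Speak"; }

    // non-virtual: the implementation is chosen from the compile-time type
    public string Name() { return "Animal.Name"; }
}

public class Dog : Animal
{
    public override string Speak() { return "Dog.Speak"; }

    // "new" hides Animal.Name; it does not override it
    public new string Name() { return "Dog.Name"; }
}

public static class Program
{
    public static void Main()
    {
        Animal a = new Dog();         // compile-time type Animal, run-time type Dog
        Console.WriteLine(a.Speak()); // Dog.Speak
        Console.WriteLine(a.Name());  // Animal.Name
    }
}
```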

(Then again, people occasionally accuse me of “dogmatic skepticism”. Since “dogmatic” and “skeptical” are opposites, I am never sure quite what they mean either. My conclusion: people sometimes say strange things.)

What people really want, I think, is yet another kind of method, one that would be none of static, instance or virtual. We could come up with a kind of method which behaved like our option (3) above. That is, a method associated with a type (like a static), which does not take a non-nullable “this” argument (unlike an instance or virtual), but one where the method called would depend on the constructed type of T (unlike a static, which must be determinable at compile time).
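The behaviour of option (3) can be approximated today with ordinary instance methods. The sketch below is an emulation under my own assumptions, not the hypothetical feature: IPolicy, PolicyC and PolicyD are invented names, and a throwaway instance created through the new() constraint stands in for "the type itself".

```csharp
using System;

// Hypothetical policy interface; implementations carry no state.
public interface IPolicy { string M(); }

public sealed class PolicyC : IPolicy { public string M() { return "PolicyC.M"; } }
public sealed class PolicyD : IPolicy { public string M() { return "PolicyD.M"; } }

public class E<T> where T : IPolicy, new()
{
    // Static fields of a generic class exist once per constructed type,
    // so E<PolicyC> and E<PolicyD> each hold their own policy object.
    private static readonly T policy = new T();

    // An interface call on the instance; which M runs depends on T.
    public static string N() { return policy.M(); }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(E<PolicyC>.N()); // PolicyC.M
        Console.WriteLine(E<PolicyD>.N()); // PolicyD.M
    }
}
```

The cost is an extra object per constructed type and an interface constraint that the type argument has to opt into, which is part of why this is only an approximation of a method that belongs to the type.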

I’m not sure whether the CLR generic system supports codegenning such a beast, but other than that, I don’t see any in-principle reason why such a thing would be difficult to do. But we do not add language features just because we can; we add them when the compelling benefit outweighs the costs. No one has yet made the case to me that this kind of method would be really useful.

Next time I’ll answer the follow-up question: why isn’t this determined at compile time? That will take us into the subtle differences between templates and generic types. They are often confused.

Comments (25)

  1. Angstrom says:

    Interestingly, some languages have "class methods" instead of "static methods" which have exactly property 3.  Smalltalk is the most notable language with this feature, but you can do it with CLOS and a few other class-and-object-oriented environments too.  It relies on the class itself also being an object upon which virtual dispatch can be performed, and you don’t get the static compile-time method resolution (since it’s just virtual dispatch on a different object), but it enables some powerful tricks, like eliminating the need for public constructors.

    I’ve always been sad that Java (my poison of choice) doesn’t have class methods.  Any chance we will ever see those in .Net?

  2. Stefan Wenig says:

    I can imagine that this might be useful not only for code generation, although I can’t think of an example right now. But if it’s only a rare case anyway, why not just call that method via reflection?

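    public class E<T> where T : C
    {
        // Needs "using System.Reflection;". Looks up T's own public static M() once per constructed type.
        static readonly MethodInfo s_M = typeof(T).GetMethod(
            "M", BindingFlags.Public | BindingFlags.Static, null, Type.EmptyTypes, null);

        // A MethodInfo is not directly callable; a static method is invoked with a null target.
        public static void N() { s_M.Invoke(null, null); }
    }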

    (If your only problem were that the CLR can’t do that trick, you could have the compiler generate this code just as well, but it seems that’s not the case here.)

    The overhead from reflection is usually negligible, as this is only called once. It’s of course weakly bound, but so is the original code you’ve been given. Since there is no explicit override, a single type parameter mismatch would create the much more subtle bug of silently calling the wrong method. Anyway, it’s probably good enough for something I might use once a year, and definitely good enough for code generators.

    I think you’re right, there are more needed features for C# waiting to be justified against their cost (e.g, crowd pleasers like "infoof", complex attribute arguments, ordered and generic attributes, or boo-ish duck typing/DLR integration? How about those btw.? Ah well, just give us some AST meta programming model in the compiler pipe ;-) )

    May I ask something off-topic? What do you think of the rule that unlike Java, methods are not automatically virtual in C#?

    The answer is usually, because declaring a method virtual is part of a class’s contract, and means that you have to support a certain behavior in future versions. Agreed.

    But shouldn’t generated code be able to override non-virtual methods? Why shouldn’t I be able to create IL code (e.g. via Reflection.Emit) that does override them? It is allowed to access private members using Reflection because if you determine your target at runtime, your code is probably able to adapt to future changes automatically, so you don’t create a contract obligation. The same would be true for generating overrides of methods on the fly. This would make support for AOP-like features (using something like subclass proxies) much better.

    This would of course have some important implications. But as more and more frameworks (including EntLib) are going in this direction, I think supporting this is becoming increasingly important. Would it be sensible to abandon the performance advantage of call instead of callvirt in all other cases? It might be preferable to have this applied only to selected classes, and within C# and comparable languages, overrides should still be impossible. What do you think?

  3. harmony7 says:

    To add to Angstrom above, we can add our favorite language, JavaScript, to that list…

    How would JScript.NET handle such a situation anyways?

  4. Stefan Wenig says:

    angstrom, harmony – does it really make sense to point at features from dynlangs and ask, why is this not in c#? why not ask for completely name-based dynamic dispatching, or double/multi-dispatch too? i think it would be really interesting to make C# interoperate with the DLR (like in boo, where you can define variables of type "duck" and get duck typing on them, but on them only), but should this really influence the normal way in which C# resolves and calls methods? Or should we just accept that C# is static and use other languages if we want completely dynamic dispatching?

  5. Mike Dunn says:

    As a C++ programmer, I expect (3). I don’t see "static" and think "this has to be resolved at compile time." To me, "static" is a highly-overloaded keyword that, in this case, means "the method doesn’t have a this pointer." That’s all. The time where the call is resolved doesn’t enter into it. "static" applied to a class and "static" applied to a local variable have yet other meanings. Also, a static local variable isn’t initialized at compile time – the ctor doesn’t execute until runtime, or might not run at all, so the explanation of "static == compile time" breaks down there.

    Since T.M() is illegal, how do you write generic classes with a policy parameter? (for example, the Copy parameter to ATL::CComEnumImpl<>)

  6. Daniel says:

    I don’t seem to get the logic of allowing "new static" in a derived class but not allowing 3.). If overwriting of static methods using new is allowed, 3.) is the behaviour I would expect, not 1.). Otherwise it seems to be inconsistent to me. I mean, E<C> is a different type than E<D>, isn’t it?

  7. Brian says:

    What is the "compilation" we are talking about? is it running csc.exe or is it running the JIT? Does this make a difference? At the time of jitting, isn’t 3.) properly defined?

  8. Brian says:

    Can you explain to me why 3.) would violate the "because it can always be determined exactly, at compile time, what method will be called" principle? Why can’t that be determined exactly at the time of JIT-ing?

  9. patrick says:

    There are already some CLR languages around that support class methods, also virtual ones. I recall Delphi.NET and Chrome are two examples of such languages. These are both statically typed languages.

  10. Larry Lard says:

    >> Usually people phrase it by asking me why C# does not support “virtual static” methods. I am always at a loss to understand what they could possibly mean, since “virtual” and “static” are opposites!

    I think what’s going on here is that a lot of people (yes, I’m projecting here :)) *think of* static methods (and other members) as ‘class’ members as defined by commenters above. Indeed, the help for static says:

    >> Use the static modifier to declare a static member, which belongs to the type itself rather than to a specific object.

    Given this alone, from a didactic perspective it’s hard to see how developers new to C# are supposed to work out that ‘static’ is more than just a piece of historical syntactic cruft, but actually *means something* – “determine the method to be called solely based on compile time *static* analysis”.

    The explanation in the post makes perfect sense given that static means what I now know it does. However, I now have to change my mental model, which is always interesting…

    ps: I’ve just searched through the C# spec for ‘static’. There’s a deal of stuff about ‘… belong to [the] class …’ and nothing at all about ‘must be determinable by compile time static analysis’. I can’t be alone in my mental model confusion…

  11. Max says:

    I agree with Mike. This is especially true since we have some methods (non-virtual ones) that are resolved at compile time but are not decorated with the static keyword. Intuitively, "static" on class methods means "there is no this pointer", if it meant "resolved at compile time" we should really be adding this keyword to every non-virtual method. This means that "virtual static" really does make sense, and would have the behaviour of 3 above.

  12. Richard says:

    Your definition of ‘static’ seems very strange to me. If ‘static’ truly means ‘determine which function to call based on static analysis’ then why is it applied to a function definition rather than to a function call site? How to determine which function to call is really up to the caller (and even if in some cases it’s not, it should be, since they’re the ones who know what they want to do!) — even though usually they’ll say either ‘call a function with this name and signature’ or ‘call a function with this name and signature which is a member of the class corresponding to the dynamic type of this object’, it’s still the caller’s decision, and they could instead write a big nested if statement dispatching on the type of some objects.

    Also, your definition of ‘static’ does not imply ‘called without an object’ — it sounds very much like how I might define ‘final’ in Java, for instance.

    The meaning of ‘static’ in C++ class member functions is much more sensible in my opinion — there it only really means ‘this class member function can be called without a corresponding object’. Arguably, with this interpretation, ‘virtual static’ does make sense and would in practice occasionally be useful — it would mean ‘this class member function can be called without a corresponding object, but if it’s called with an object, the member which gets called is determined by the dynamic type of that object’. This is a pattern which I’ve seen used several times in real-world C++ when providing metadata for types (see for instance QObject::metaObject()).

  13. Gilles Michard says:

    You can do this:

       public class C { public static void M() { /*whatever*/ } }

       public class D : C { public new static void M() { /*whatever*/ } }

       public delegate void Action();

       public class E<T> where T : C { private static Action A = (Action)Delegate.CreateDelegate(typeof(Action), typeof(T).GetMethod("M")); public static void N() { A(); } }

  14. Aaron G says:

    I’m not completely sure I understand why it can’t be deduced at compile time, through static analysis only, which method will be called in option (3).  Isn’t T known at compile time, and doesn’t it logically follow that if T has a new method M, this would be known at compile time as well?

    Or is there maybe some subtlety in the Generics implementation that I’m missing here?

  15. Greg D says:

    I was going to leave a comment about my apparent confusion of class entities vs static entities, but a lot of folks have left more eloquent responses to that than I would have.  Instead I’ll just say that, from my own C++ background, I’d also have found (3) to be the intuitive resolution.  I’m looking forward to the followup post that will clarify the differences between generics and templates so that I can try to correct my mental model of the language feature.

  16. Bill S says:

    Aaron, Greg:

    I believe the subtlety you’re missing is that C# generics are instantiated at runtime by the CLR.  Only the actual generic class definition itself is written into the assembly, so you cannot implement (3) without violating this ‘static members must be resolved during compilation’ rule.

    C++ templates are instantiated during compilation, and each instance is written to the resulting .o file.  This makes them much more powerful than C# generics, at the cost of bloating the object file.

    For what it’s worth, I agree with most posters in that my interpretation of ‘static’ in a class has always been ‘one per class’ rather than ‘must be resolved at compilation time’.  So I don’t see any issues at all with implementing (3) by resolving the method call when the generic is instantiated by the CLR, or even during runtime if necessary.

  17. Stefan Wenig says:

    Thinking about it, I agree with those who say that the definition of static as "must be resolved at compile time" is neither intuitive nor very useful.

  18. Rob Kennedy says:

    I expect option 2. That’s because I usually use Delphi, which has the "class methods" that Angstrom mentioned in the first comment. In Delphi, you can store a class reference in a variable and call methods on it without needing an instance of the class. Such methods still receive a "this" parameter ("Self" in Delphi), but it is a reference to the class, not to an instance of the class. We Delphi users typically see methods marked as "static" in other languages and translate them into class methods in Delphi. The differences aren’t usually important.

    Delphi class methods can be virtual because when calling them, the run-time value of the class-reference variable can vary. So, since C.M above is not declared virtual, any call to a method named "M" through a variable of type C must resolve to a call to C.M. The compiler doesn’t know anything about D.M at the time it’s compiling class E, so it certainly won’t generate a non-virtual call to D.M. But if C.M were virtual (and I know that in C# it can’t be), then class D could override that method, and then when E.N calls T.M, the run-time behavior would indeed depend on whether T was C or D.

    But that’s with my Delphi glasses on. I said the differences between static methods and Delphi class methods aren’t usually important, but this is a case where they are. Since static methods lack a "this" parameter, there is nothing on which to base a virtual call. I might expect option 2 instead, but with a warning diagnostic explaining that T.M always calls C.M. In that sense, option 1 and option 2 only differ by the severity of the diagnostic message.

  19. Stefan Wenig says:

    Brian: no, it’s not defined at JIT-time. the JIT produces exactly one implementation that is shared by all reference types, so this would only work for value types. however, there seems to be some kind of this pointer for the type, since static methods in generic types do access their own static member fields. this could surely be used to implement a vtable for class methods – even by the compiler if the CLR won’t do it.

    I think there are two possible ways to go ahead. If we think class methods are not so important, we can store delegates in the calling site like I sketched out in the slightly broken code snippet at the top (but you get the idea…)

    If they are important, it should be possible to use virtual/override for static methods, and a vtable lookup should be implemented.

    It doesn’t make sense to use overloading for the code Eric posted. Implicit overriding is just too weak, so we’d be better off with using reflection manually – at least that way it’s clear how weak it is.

    I would be more interested in the usage scenarios for class methods though. Like Eric said, any new feature should be evaluated for cost and usefulness. However, I still believe that the rule that methods declared with the "static" keyword must be resolvable at compile time is neither intuitive nor useful.

  20. Welcome to the XXVIII Community Convergence. In these posts I try to wrap up events that have occurred

  21. John Rusk says:

    Stefan,

    You can use this technique to make your example early bound and compiler checked:  http://dotnet.agilekiwi.com/blog/2007/04/symbols-part-2.html

  22. Stefan Wenig says:

    John, this technique is useful for getting a MethodInfo of a method that the compiler would let you call anyway. All restrictions (like visibility and the one we’re discussing here) apply. So no, you can’t get a delegate for T.M for the same reason that you cannot call T.M. Even if you could, it would ruin the effect, since early binding is exactly what we do not want here. Since there are no static virtual methods, and overriding a static method using the "new" keyword, the same name and the same signature is an arbitrary concept of our code (not known to C# or the CLR), there is no way around dealing with strings.

  23. There were lots of good comments on my previous entries in this series. I want to address some of them,

  24. DiegoV says:

    I have been crushed against the lack of “non-static class members” in .NET a few times. Let me pretend that I can make the case for such a creature to exist. If today wasn’t Sunday I could perhaps produce a good example based on a class hierarchy of my own. But let me try something easier…

    In the BCL there are multiple types that define static Parse and, from 2.0 and up, TryParse methods. The basic prototype for Parse is:

    public static T Parse(string s)

    And for TryParse is:

    public static bool TryParse(string s, out T result)

    My problem with those has always been that they are static (have to be resolved at compile time) and hence they cannot be treated with polymorphism. Also, the way things are in .NET, there is no way to define an IParseable or IParseable<T> interface for types that define those methods. Hence, there is no way you can build generic code over Parse or TryParse without using reflection. Also, IMHO, the way the BCL is built around this shortcoming is somewhat graceless (take for instance, the internal class Number).

    By the way, Visual Basic .NET defines non-instance class members with the keyword “shared”, which has absolutely no “have to be resolved at compile time” connotation.

  25. Last time I pointed out that static methods are always determined exactly at compile time, and used that