Calling constructors in arbitrary places


C# lets you call another constructor from a given constructor, but only before the body of the calling constructor runs:

public C(int x) : this(x, null)
{
  // …
}
public C(int x, string y)
{
  // …
}

Why can you call another constructor at the beginning of a constructor block, but not at the end of the block, or in the middle of the block?

Well, let’s break it down into two cases. (1) You’re calling a “base” constructor, and (2) you’re calling a “this” constructor.

For the “base” scenario, it’s quite straightforward. You almost never want to call a base constructor after a derived constructor. That’s an inversion of the normal dependency rules. Derived code should be able to depend on the base constructor having set up the “base” state of the object; the base constructor should never depend on the derived constructor having set up the derived state.
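
To make the dependency direction concrete, here is a minimal sketch. The types Animal and Dog and the static Log list are hypothetical, invented only for illustration. It shows the derived constructor body relying on state that the base constructor has already set up.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration; not from the original post.
class Animal
{
    // Records the order in which constructor bodies run.
    public static readonly List<string> Log = new List<string>();
    protected readonly string Name;

    public Animal(string name)
    {
        Log.Add("Animal ctor");
        Name = name;   // the "base" state is set up here
    }
}

class Dog : Animal
{
    public readonly string Greeting;

    public Dog(string name) : base(name)
    {
        Log.Add("Dog ctor");
        // Safe: the base constructor has already run, so Name is set.
        Greeting = "Woof, I am " + Name;
    }
}
```

Constructing a Dog runs the Animal constructor body first and the Dog body second, which is why the Dog body can safely read Name.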

Suppose you’re in the second case. The typical usage pattern for this scenario is to have a bunch of constructors that take different arguments and then all “feed” into one master constructor (often private) that does all the real work. Typically the public constructors have no bodies of their own, so there’s no difference between calling the other constructor “before” or “after” the empty block.
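
A sketch of that "feeder" pattern might look like the following. The Connection class, its fields and its default values are assumptions made up for this example.

```csharp
using System;

// Hypothetical class for illustration.
class Connection
{
    public readonly string Host;
    public readonly int Port;
    public readonly TimeSpan Timeout;

    // Public "feeder" constructors: empty bodies, they only supply defaults.
    public Connection(string host) : this(host, 80) { }

    public Connection(string host, int port)
        : this(host, port, TimeSpan.FromSeconds(30)) { }

    // Private master constructor: does all the real work.
    private Connection(string host, int port, TimeSpan timeout)
    {
        if (host == null) throw new ArgumentNullException("host");
        Host = host;
        Port = port;
        Timeout = timeout;
    }
}
```

Because the public constructors have empty bodies, it makes no observable difference whether the master constructor runs "before" or "after" them.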

Suppose you’re in the second case and you are doing work in each constructor, and you want to call other constructors at some point other than the start of the current constructor.

In that scenario you can easily accomplish this by extracting the work done by the different constructors into methods, and then calling the methods in the constructors in whatever order you like. That is superior to inventing a syntax that allows you to call other constructors at arbitrary locations. There are a number of design principles that support this decision. Two are:

1) Having two ways to do the same thing creates confusion; it adds mental cost. We often have two ways of doing the same thing in C#, but in those situations we want the situation to “pay for itself” by having the two different ways of doing the thing each be compelling, interesting and powerful features that have clear pros and cons. (For example, “query comprehensions” vs “fluent queries” are two very different-looking ways of building a query.)  Having a way to call a constructor the way you’d call any other method seems like having two ways of doing something — calling an initialization method — but without a compelling or interesting “payoff”. 

2) We’d have to add new language syntax to do it. New syntax comes at a very high cost; it’s got to be designed, implemented, tested, documented — those are our costs. But it comes at a higher cost to you because you have to learn what the syntax means, otherwise you cannot read or maintain other people’s code.  That’s another cost; again, we only take the huge expense of adding syntax if we feel that there is a clear, compelling, large benefit for our customers.  I don’t see a huge benefit here.

In short, achieving the desired construction control flow is easy to do without adding the feature, and there’s no compelling benefit to adding the feature. No new interesting representational power is added to the language.
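
As a rough illustration of the "extract into methods" approach, consider the sketch below. The Report class and its helper methods are hypothetical names chosen for the example. Note that a field assigned inside a helper method, such as title here, cannot be declared readonly.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class for illustration.
class Report
{
    public readonly List<string> Steps = new List<string>();
    private string title;   // assigned in a helper, so it cannot be readonly

    public Report()
    {
        LoadDefaults();
        Finish();
    }

    public Report(string title)
    {
        Steps.Add("custom prologue");
        // Shared work that a this(...) call could only have done at the
        // very start is an ordinary method call in the middle of the body.
        LoadDefaults();
        this.title = title;
        Finish();
    }

    private void LoadDefaults()
    {
        title = "Untitled";
        Steps.Add("defaults");
    }

    private void Finish()
    {
        Steps.Add("finished: " + title);
    }
}
```

Each constructor decides for itself where in its body the shared initialization runs.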

Comments (28)

  1. The one situation where I’ve found this rule awkward is when:

    – I have to perform processing on my arguments, and

    – I need to pass the result of that processing to my base, and also use it in my ctor

    Example:

    1. I have a class whose ctor takes in a XamlReader, which is a streaming (forward-only) interface.

    2. I need to deserialize the XAML, and pass the result to my base ctor

    3. But I also need to buffer the XAML, so I can deserialize it again multiple times.

    #2 or #3 are each trivially easy to do on their own; but since I can only read the stream once, and there’s no way to pass a buffer from my base invocation into the body of my constructor, I can’t do both.

    It’s not the end of the world; I just changed my constructor to take  a buffer, and I can always add a static factory method that takes a stream. But it makes the API a little more complex, and the factory method can’t be used by derived classes.

  2. Jonathan says:

    "the base constructor should never depend on the derived constructor having set up the derived state."

    As you explained in http://blogs.msdn.com/ericlippert/archive/2008/02/18/why-do-initializers-run-in-the-opposite-order-as-constructors-part-two.aspx

    this is not true because the base constructor can call virtual methods, whose overrides depend on fields initialized by the derived constructor.

    "In that scenario you can easily accomplish this by extracting the work done by the different constructors into methods, and then calling the methods in the constructors in whatever order you like."

    That doesn’t work if the constructor was initializing readonly fields. This solution cripples support for documenting and enforcing immutability.

    "We’d have to add new language syntax to do it."

    Or you could just use the existing syntax for calling a method, allowing that method to be another constructor.
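
Jonathan's first point can be reproduced with a small sketch. Shape and Circle are hypothetical types invented for this example. The base constructor calls a virtual method, and the override reads a field that the derived constructor has not assigned yet.

```csharp
using System;

// Hypothetical types for illustration.
class Shape
{
    public readonly string Description;

    public Shape()
    {
        // Virtual call from a base constructor: the override runs
        // before the derived constructor body has executed.
        Description = Describe();
    }

    protected virtual string Describe() { return "shape"; }
}

class Circle : Shape
{
    private readonly int radius;

    public Circle(int radius)
    {
        this.radius = radius;   // runs only after Shape() has finished
    }

    protected override string Describe()
    {
        // During Shape() the field still holds its default value, 0.
        return "circle r=" + radius;
    }
}
```

Here new Circle(5).Description is "circle r=0", not "circle r=5", because the override ran before the Circle constructor body assigned radius.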

  3. Robert says:

    Initializing read-only fields has always been the biggest problem with having initialization methods, especially since C# doesn’t support parallel assignment and multiple return values. Maybe an InitMethodAttribute attribute that relaxes the readonly restriction but can only be called in a constructor?

  4. David Nelson says:

    As Robert said, extracting initialization logic into a commonly called method requires you to remove all of the readonly modifiers from your member variables. So it’s not really accurate to say that "no new interesting representational power is added to the language." On the other hand, I think a better solution to that problem would be to allow an "initonly" modifier on methods. Such methods can write to readonly variables as if they were in a constructor, but can only be called from constructors or other initonly methods.

  5. Robert says:

    I was just trying to avoid introducing new keywords into the language. An attribute would affect verification, not parsing.

  6. configurator says:

    I like to think of it this way:

    Every constructor calls a base constructor – be it on the base class or on the same class. Eventually, for every constructor you choose, there is a constant array of constructors that will actually be called, starting with object.ctor(). The constructor for object is what actually creates the type and allocates memory, etc. (I know it probably isn’t really, but it’s a nice way to think of it). So calling another base constructor after a base constructor has finished is an error – there’d have to be two types for that to happen. And running code before calling the base constructor would be a worse case – the object is uninitialized. In my view of this model, this would mean there’s no ‘this’, and there’s no allocation for fields yet. In any model, this would mean I can call methods on the base class which would still be in an uninitialized state.

    I don’t think calling other constructors makes sense, except before your own constructor code. But I’m fully aware of the problem. Consider this:

    class Base {

    public Base(int a, int b) { … }

    }

    class Derived : Base {

    // both a and b in this case should get the string parameter’s hash code

    public Derived(string s) : base(s.GetHashCode(), s.GetHashCode()) { … }

    }

    Remember that GetHashCode is not cached, and could be quite a lengthy operation for long strings. Now it has to happen twice, just because I wanted to call my base constructor with the same argument twice. What if I had a long calculation that returns a Point but my base constructor only accepts an x and a y?

  7. Etienne says:

    Could it also be that you are required to call this(…) at the beginning just in case the chain of local constructors invoked ends with one constructor that calls base(…)?

  8. Gabe says:

    I don’t understand why new syntax would have to be created. They already created special syntax in the form of base() and this(), so why not just allow that in the body of a ctor?

  9. Grico says:

    I really don’t understand why new syntax would be needed. Why not allow the same syntax we use now but anywhere in the constructor body? base(…) or this(…)

    The biggest pain of not being able to do this is how to set up certain readonly variables. Sometimes you just have to give up on them because of this limitation. There are more cases where this design decision gets in the way but you can usually get around it with API modifications.

    An InitOnly attribute sounds pretty good but the learning curve is IMO steeper than just reusing existing syntax.  How many of us when learning C# have actually tried this(…) or base(…) somewhere inside the constructor body and frowned when it didn’t work (I did at least :p). I actually think the learning curve would be pretty small in this case. Of course, I have no idea about the implementation process, and it is probably pretty expensive.

  11. Pavel Minaev says:

    It seems perfectly reasonable to me, from a correctness perspective, to allow base/delegating constructor calls in the middle of a constructor, so long as any preceding code does not reference "this" or "base" in any way, either explicitly or implicitly. This can even be made to use the existing language for local variable initialization, and the associated reachability analysis, by saying that "this" is treated as an uninitialized variable, and the base/delegating constructor call initializes it.

    That said, Eric doesn’t seem to be making an argument that it is impossible, or even unreasonable; merely that it is not a cost-effective investment of the team’s resources. Which is interesting; on one hand, I’d very much prefer to see "readonly class" (or something similar) in C# 5.0. On the other hand, as others have rightly noted above, readonly fields are precisely the case which is unnecessarily complicated with the existing, callable-only-at-method-entry constructor invocation syntax, so it could be treated as part of the same problem.

  12. Pavel Minaev says:

    > I don’t understand why new syntax would have to be created. They already created special syntax in the form of base() and this(), so why not just allow that in the body of a ctor?

    Just because it looks familiar doesn’t mean it’s not a new syntax. Currently, this(…) and base(…) are not expressions where they appear, for example, but they’d have to become that if they are to be put in the body of the constructor. Then you have to come up with various new rules, such as what happens if someone tries to use "this" before calling base(…) – or, if you want to statically check against this, what are the rules for said static checking. Similarly, what happens if the code calls base(…) twice  – or what are the rules for static analysis guaranteeing that this won’t happen.

    It may look like "the exact same thing" at first glance, but I can see how it could easily blow up into a rather large language spec change, with all that entails (don’t forget that aside from compiler implementation, there’s also IDE support – intellisense etc; testing of it all; writing documentation; and translating all that to all supported languages).

  13. Mike Greger says:

    configurator,

    I think it would make more sense to have a private constructor that takes an int and have it called by the constructor which takes a string:

    private Derived(int a) : base(a,a){ … }

    public Derived(string s) : this(s.GetHashCode()){ … }

    For the second situation I am imagining you mean something like this:

    Derived(Thingie t) : base(t.GetPoint().X, t.GetPoint().Y){ … }

    The same solution applies:

    private Derived(Point p) : base(p.X, p.Y){ … }

    public Derived(Thingie t) : this(t.GetPoint()){ … }

    Honestly I am having a difficult time imagining a situation where a series of constructors like this don’t solve most of the problems mentioned in the comments. Perhaps more tortured examples can be found, but I would begin to suspect the validity of the inheritance relationship in those cases.
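
Filled out into a compilable form, Mike's first suggestion could look roughly like this. The ExpensiveHash helper and the HashCalls counter are hypothetical additions, standing in for a costly calculation and used only to show that it runs once.

```csharp
using System;

// Hypothetical stand-ins for configurator's Base and Derived.
class Base
{
    public readonly int A;
    public readonly int B;
    public Base(int a, int b) { A = a; B = b; }
}

class Derived : Base
{
    // Counts how often the "expensive" computation runs.
    public static int HashCalls;

    private static int ExpensiveHash(string s)
    {
        HashCalls++;
        return s.Length * 31;   // placeholder for a costly calculation
    }

    // Public constructor computes the value once and forwards it.
    public Derived(string s) : this(ExpensiveHash(s)) { }

    // Private constructor passes the same value to both base parameters.
    private Derived(int hash) : base(hash, hash) { }
}
```

The value is computed once in the public constructor's initializer and then reused for both base arguments by the private constructor.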

  14. How do we validate arguments before calling the base class constructor? That has always bothered me in both C# and VB.

  15. Nathan Tuggy says:

    Jonathan,

    Create a validator function, and pass its return value (True if it validated successfully, False otherwise) to the constructor, along with the parameters. Obviously, you have to pass each parameter twice, but it could be worse.
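
A common variant of this idea, which is not exactly what Nathan describes, has a static helper return the argument itself after validating it, so each parameter is passed only once. The sketch below uses hypothetical names (Account, SavingsAccount, RequireNonEmpty).

```csharp
using System;

// Hypothetical types for illustration.
class Account
{
    public readonly string Owner;
    public Account(string owner) { Owner = owner; }
}

class SavingsAccount : Account
{
    // The helper call is evaluated before the base constructor body runs.
    public SavingsAccount(string owner) : base(RequireNonEmpty(owner)) { }

    // Static helper: returns the argument unchanged, or throws.
    private static string RequireNonEmpty(string value)
    {
        if (string.IsNullOrEmpty(value))
            throw new ArgumentException("Owner must not be empty.", "value");
        return value;
    }
}
```

Only static members can be used this way, since the instance is not yet available in a constructor initializer.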

  16. Mike Greger says:

    Jonathan,

    I would ask why the derived constructor needs to do additional validation on parameters before passing them to the base class constructor.

    The base class should be validating its own parameters and throwing on bad values. If it is not, then it is broken.

    If the derived class has additional restrictions then it should be acceptable to validate these after the base constructor has run and throw from there if the need arises.

    If that is not acceptable, for example: if the base constructor is time and/or resource intensive, then a factory method might be appropriate, although re-writing the base class to use lazier construction is probably a better idea in that case.

  17. Grico says:

    @Pavel Minaev

    You missed my point. Eric points out that introducing the new feature would imply a learning curve for the user because he would have to learn the new syntax in order to understand, debug or write new code. I don’t agree; anyone who knows how to code in C# nowadays would understand base(…) or this(…) when encountered midway through a constructor body. So IMO Eric’s reasoning does not stand in this particular issue.

    I do mention, however, that the implementation of such a feature is surely expensive, and I am fully aware that it’s not by any means trivial.

  18. David Nelson says:

    @Grico,

    The new syntax still carries a learning curve for the developer, because he has to learn at least some of the rules that would have to go along with it, as Pavel explained.

  19. Tom says:

    I don’t know, I always thought it was for consistency:  base calls the base constructor *before* the current constructor; this calls another constructor *before* the current constructor.  It’s always the other one *before* the current one, and this enforces a sort of logical consistency, as well as programmatic practicality (as this is exactly how I would expect both keywords to behave).

    The ‘readonly’ keyword I’ve always interpreted as a limited form of a public get/private set property, except you may only set it once, either in the initializer or in the current class constructor — though, I have not tested this, can you set a protected readonly field in a derived constructor?  Either way, readonly is simply a way to protect variables from being changed outside the current class while allowing them to be set in the constructor — in other words, a constant with limited mutability.  The fact we *can* assign a readonly variable in a constructor rather than *only* the initializer seems to me like a bonus.

    I know programmers in general are pretty lazy and expect a great deal more from their tools than the tool developers are willing or able to provide, but to me all this nitpicking about new features is rather silly because, as a programmer, half my job is figuring out how to work around tool and language limitations, and, call me crazy, but I consider that the *fun* part of my job.

    Not that I’m asking for more limitations, mind you.

  20. Tom says:

    >   Gabe said:

    I don’t understand why new syntax would have to be created. They already created special syntax in the form of base() and this(), so why not just allow that in the body of a ctor?

    My impression is that calling a constructor is the same as creating a new instance, so if you call this.ctor within a constructor, you’re effectively creating a new instance, and ‘this’ becomes meaningless because the constructor you just called no longer refers to ‘this’.  It works the same with base:  you’re creating a new base instance.

    Besides, modularizing functionality is kind of the whole point of object-oriented programming.  Why on earth would you cram all your functionality into a bunch of constructors when you can use methods?  Save the ‘readonly’ argument, but again, I consider that a privilege, not a right.  If you want a *readonly* value that can be assigned *outside* the constructor, it isn’t readonly.  Your logic is broken.

    This was probably not the team’s line of thinking when they implemented this limitation — or rather, failed to implement a workaround for this limitation — but I think it’s a solid argument.

  21. Mike Greger says:

    Tom,

    You mentioned: "…I have not tested this, can you set a protected readonly field in a derived constructor?"

    No, you cannot. The readonly field must be initialized in the base constructor or an initializer.

    Could someone post an example of how not being able to call a constructor in an arbitrary location affects readonly variables?

    I just can’t see it.

  22. Gabe says:

    Tom: The way I see it, the new operator creates an instance, and the ctor just initializes it. Remember, it’s not functionality that’s being crammed in, it’s initialization code. If initialization isn’t supposed to be a special operation, why are constructors so special in the first place?

  23. Of course, someone pointed out off-thread that instead of using a static factory, I could just have the public ctor call a private overload. So scratch my comment above.
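
A sketch of that private-overload approach, using a plain Stream and a byte count in place of XamlReader and real deserialization, might look like this. All type and method names here are hypothetical.

```csharp
using System;
using System.IO;

// Hypothetical types; stand-ins for the XAML scenario in comment 1.
class DocumentBase
{
    public readonly int Length;
    public DocumentBase(int parsedLength) { Length = parsedLength; }
}

class BufferedDocument : DocumentBase
{
    private readonly byte[] buffer;

    // Public constructor: drains the forward-only stream exactly once.
    public BufferedDocument(Stream source) : this(ReadAll(source)) { }

    // Private overload: the buffer is now a parameter, so it can be
    // used for the base call and also kept in a field.
    private BufferedDocument(byte[] buffer) : base(Parse(buffer))
    {
        this.buffer = buffer;
    }

    // Re-runs the placeholder "deserialization" from the saved buffer.
    public int Reparse() { return Parse(buffer); }

    private static byte[] ReadAll(Stream source)
    {
        using (var copy = new MemoryStream())
        {
            source.CopyTo(copy);
            return copy.ToArray();
        }
    }

    // Placeholder "deserialization": just reports the byte count.
    private static int Parse(byte[] data) { return data.Length; }
}
```

Unlike a static factory method, the public constructor stays available to derived classes.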

  24. Mark Knell says:

    @Mike

    > Could someone post an example of how not being able to call a constructor in an arbitrary location affects readonly variables?  I just can’t see it.

    AFAICT, the issues here are readability and learning curves. Assuming you want to preserve the convention that Eric mentions, of base classes not depending on derived classes, you can do everything you want with the current tools; it’s merely a question of how ugly (difficult to maintain or explain) you are willing to make your code.

    Suppose you have a block B of code you want to execute inside your constructor, after another constructor in the same class.  Instead of invoking the second ctor partway through the first, you can extend your master constructor with enough additional parameters that it can conditionally execute block B, based on the parameters each "feeder" ctor sends.

    I’m not saying I’d recommend this.  As others have noted, when things get this ugly, it’s often a sign that it’s time to step back, reconsider, and refactor.

    Even the problem of readonly fields is tractable with current tools. Eric recommends "extracting the work done by the different constructors into methods, and then calling the methods in the constructors in whatever order you like."  If you’re like me, you first interpreted this to mean that each block of code would be moved wholesale, into a method that encapsulates the entire block–but there’s a middle ground.  You can extract methods that do the "work" of calculations, decisions, etc., but that do not do the final step of assigning state to fields.  Instead, you can return values from these methods and perform the readonly assignments back in the ctors.  Ta da–DRY methods and readonly access.

    Again, I’m not saying I’d enjoy this style, either.  But, there’s nothing mentioned so far that’s impossible with C# 3.0, from a mechanical perspective.
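
The middle ground Mark describes could be sketched as follows. The Temperature class and its helpers are hypothetical. The helpers do the calculations and return values, while the assignments to readonly fields stay in the constructors.

```csharp
using System;

// Hypothetical class for illustration.
class Temperature
{
    private readonly double kelvin;
    private readonly string label;

    public Temperature(double celsius)
    {
        // The helpers calculate; the assignments stay here,
        // so the fields can remain readonly.
        kelvin = ToKelvin(celsius);
        label = MakeLabel(kelvin);
    }

    public Temperature(double value, bool isFahrenheit)
    {
        double celsius = isFahrenheit ? (value - 32.0) * 5.0 / 9.0 : value;
        kelvin = ToKelvin(celsius);
        label = MakeLabel(kelvin);
    }

    public double Kelvin { get { return kelvin; } }
    public string Label { get { return label; } }

    private static double ToKelvin(double celsius) { return celsius + 273.15; }

    private static string MakeLabel(double k)
    {
        return k < 273.15 ? "freezing" : "not freezing";
    }
}
```

The repeated assignment lines in each constructor are the price of this style, which is the cost David raises later in the thread.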

  25. Mike Greger says:

    Mark,

    The whole notion of doing the work in methods and returning values to be assigned to readonly fields seemed obvious to me. I guess that’s why I don’t see any problem. I certainly would consider a base class depending on a derived implementation to be a serious problem and refactor it immediately.

    I think some limitations on the way we code are actually helpful and should not be circumvented. They encourage better code by forcing the author to put more thought into the dependencies between different parts. Allowing code to work in every way imaginable leads to spaghetti…

  26. David Nelson says:

    @Mark, Mike,

    Of course you can extract the algorithm into a method and assign the results to a readonly variable. But what if you have multiple readonly variables, and you add a constructor? I hope you remembered to assign all of the variables correctly. Or what if the initialization algorithm normally generates the values for all of the variables in a single pass? Now you have to run the algorithm multiple times instead of just once, or create a structure for the sole purpose of returning the results to the constructor so they can be assigned.

    The point is that there are numerous cases today where you have to sacrifice immutability or readability because of the restrictions of the language on constructors. I am NOT advocating allowing constructors to be called within a method body, even another constructor body, as I believe that would cause more problems than it would solve. However, I do think initialization methods such as I described earlier would be very useful. But I am not holding my breath.

  27. Mike Greger says:

    David,

    I still disagree.  I’m not sure I understand your comment about remembering to assign all the variables correctly. If you are adding a new constructor…why not just call the existing constructor from yours?

    As for an initialization algorithm that generates values in a single pass: Either the algorithm should be split into independent methods that return individual values or, if the values and the algorithm are so intertwined as to preclude this, then I would make the argument the result should indeed be a struct as the values are obviously tightly related.

    I realize you are not advocating the current restriction be changed, but I don’t see that anyone has made a compelling case that the existing restrictions are even an impediment…providing the design is carefully considered.

    Can I come up with a class that is difficult to initialize properly given the current language restrictions? Sure. Can I make a good case for designing such a class? Probably not.
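
The struct-based approach that David mentions and Mike defends might look like the sketch below. Stats, Summary and Summarize are hypothetical names. A single pass computes several values, and a private constructor assigns them all to readonly fields.

```csharp
using System;

// Hypothetical class for illustration.
class Stats
{
    public readonly int Min;
    public readonly int Max;
    public readonly int Sum;

    // Small private type whose only job is to carry the results
    // of the single pass back to a constructor.
    private struct Summary
    {
        public int Min, Max, Sum;
    }

    public Stats(int[] values) : this(Summarize(values)) { }

    private Stats(Summary s)
    {
        Min = s.Min;
        Max = s.Max;
        Sum = s.Sum;
    }

    // One pass over the data computes all three values.
    private static Summary Summarize(int[] values)
    {
        if (values == null || values.Length == 0)
            throw new ArgumentException("values must be non-empty", "values");
        var s = new Summary { Min = values[0], Max = values[0], Sum = 0 };
        foreach (int v in values)
        {
            if (v < s.Min) s.Min = v;
            if (v > s.Max) s.Max = v;
            s.Sum += v;
        }
        return s;
    }
}
```

Any further public constructor can chain to the same private one, so the readonly assignments live in exactly one place.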

  28. Greg says:

    A derived constructor calling the base constructor fails our code review.

    It’s clever code but hard to debug when combined with similar techniques throughout a large system.  Such a system gets reputation of being difficult to maintain / debug.