Difficulties with non-nullable types (part 2)

04/25/2005

The fundamental difficulty in implementing non-nullable types is actually enforcing that the value is never null. Value types get this guarantee from a strict rule: every value type has a public no-arg constructor that does nothing beyond zeroing its fields, so a valid “default” instance always exists. This restriction is fine for simple value types (core primitives such as integer, boolean, etc.), but it is often quite aggravating when dealing with more complex value types. In those cases you often have to code to this pattern:

public struct ComplicatedValue {
    bool initialized;

    public ComplicatedValue(SomeArguments args) {
        //initialize this struct
        initialized = true;
    }

    public void DoSomething() {
        if (!initialized) {
            throw new InvalidOperationException("You can only call this method once you have initialized this struct");
        }

        //Do Something
    }
}

With that pattern you’ve moved all the checking right back to runtime. The pattern does have the nice benefit that all the checking is contained within the struct itself (as opposed to falling on whoever consumes ComplicatedValue), but it’s still not very pretty. If we were to require this for reference types, people would go nuts. So that means we need to coexist with the current ways that people implement reference types.
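To make the runtime nature of that check concrete, here is a minimal sketch in today's C#. It assumes a plain string in place of SomeArguments (the "label" field is made up for illustration). The point it shows is that array elements of a struct type are zero-initialized without ever running the one-argument constructor, so the flag stays false and the failure only shows up when DoSomething is called.

```csharp
using System;

public struct ComplicatedValue
{
    private readonly bool initialized;
    private readonly string label; // hypothetical payload standing in for SomeArguments

    public ComplicatedValue(string label)
    {
        this.label = label;
        initialized = true;
    }

    public string DoSomething()
    {
        if (!initialized)
        {
            throw new InvalidOperationException(
                "You can only call this method once you have initialized this struct");
        }
        return "Did something with " + label;
    }
}

public static class Program
{
    // Returns true when a zero-initialized struct is rejected at runtime.
    public static bool DefaultValueIsRejected()
    {
        // Array elements are zeroed; the one-argument constructor never runs.
        ComplicatedValue[] values = new ComplicatedValue[1];
        try
        {
            values[0].DoSomething();
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(new ComplicatedValue("abc").DoSomething());
        Console.WriteLine(DefaultValueIsRejected());
    }
}
```

Nothing in the type system stops the zeroed element from being created; the compiler is happy and the guard inside the struct is the only line of defense.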

So, now let’s look at a scenario involving non-nullable types:

public class Person {
    string! name;

    public Person(string! name) {
        //note: this.name has not been assigned at this point
        this.PrintSomeDebugStuff();
    }

    public void PrintSomeDebugStuff() {
        Console.WriteLine("Debugging Info: " + name.ToString());
    }
}

Well, that’s not good! We haven’t assigned a value to “name” yet, so we’re going to throw an exception because “name” is null, even though we’ve declared that it can never be null. The problem here is that the “this” instance was allowed to be used for general execution before all the variables in “this” were set up according to the constraints that were listed. If a class lists constraints on its members, then it’s imperative that we ensure they are fulfilled before allowing general execution to continue. Note: we could ask for a looser restriction. Specifically, we could say that a constraint only needs to be satisfied before executing any code that depends on that constraint. Unfortunately, determining which code depends on a given constraint is extremely difficult, so we settle for the more restrictive system.

So, the compiler would flag the above code as illegal because not all non-null fields had been initialized before other code was executed that used the “this” reference. Instead, you would have to write something like:

    public Person(string! name) {
        this.name = name;  //or
        this.name = "foo"; //or
        this.name = SomeExpressionThatReturnsANonNullStringButDoesntUseThis;
        this.PrintSomeDebugStuff();
    }
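The “string!” syntax is hypothetical, but the ordering problem itself is easy to reproduce in C# as it exists today. The following is only a sketch using a plain string field; the Describe helper and the two “Seen...” fields are made up so the effect can be observed without throwing.

```csharp
using System;

public class Person
{
    private string name;

    // Hypothetical fields that record what the helper observed.
    public string SeenBeforeAssignment;
    public string SeenAfterAssignment;

    public Person(string name)
    {
        SeenBeforeAssignment = Describe(); // "this" is used before the field is set
        this.name = name;
        SeenAfterAssignment = Describe();  // same call, now safe
    }

    private string Describe()
    {
        return name == null ? "<null>" : name;
    }
}

public static class Program
{
    public static void Main()
    {
        Person p = new Person("Ada");
        Console.WriteLine(p.SeenBeforeAssignment); // <null>
        Console.WriteLine(p.SeenAfterAssignment);  // Ada
    }
}
```

The rule described above would reject the first Describe call at compile time, because it uses “this” before every non-null field has been assigned.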

Ok. Seems pretty simple, right? Well, there are a couple of little “gotchas” to be aware of. Consider the following code:

public abstract class Base {
    public Base() {
        this.Setup();
    }

    public abstract void Setup();
}

public class Derived : Base {
    string! name;

    public Derived(string! name) {
        this.name = name;
    }

    public override void Setup() {
        Console.WriteLine("Debugging Info: " + name.ToString());
    }
}

If you just look at “Derived” it all looks good. The constructor ensures that all fields are initialized before any other code is executed. Right? Nope. In C# the body of a constructor does not run until the supertype’s constructor has run. I.e. what you really have is:

    public Derived(string! name) : base() {
        this.name = name;
    }

and Base’s constructor is executed before Derived’s is. So in the above example “Setup” will be called and will try to access “name” before it is actually initialized.
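This ordering can be checked with current C#. The sketch below drops the hypothetical “!” annotation and records what each step sees in a static Log list, which is an addition of mine purely for tracing. Setup uses “??” instead of calling ToString() so that it reports the null rather than throwing.

```csharp
using System;
using System.Collections.Generic;

public abstract class Base
{
    protected Base()
    {
        // Virtual dispatch reaches Derived.Setup before Derived's constructor body runs.
        Setup();
    }

    public abstract void Setup();
}

public class Derived : Base
{
    // Hypothetical trace of the order in which things happen.
    public static readonly List<string> Log = new List<string>();

    private string name;

    public Derived(string name)
    {
        this.name = name;
        Log.Add("Derived constructor body: name = " + name);
    }

    public override void Setup()
    {
        Log.Add("Setup: name = " + (name ?? "<null>"));
    }
}

public static class Program
{
    public static void Main()
    {
        new Derived("Ada");
        foreach (string line in Derived.Log)
        {
            Console.WriteLine(line);
        }
        // Setup: name = <null>
        // Derived constructor body: name = Ada
    }
}
```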

Now, in order to prevent this we would have to add an extension to C# constructors. Basically, we would need to give you a way to establish all of your class invariants before being allowed to call the supertype’s constructor. Perhaps something like this:

    public Derived(string! name) {
        this.name = name;
        base(); //base constructor call has moved to after the initialization of fields
    }

There would be special restrictions in place in the code region before “base()” is called: no access to the “this” pointer except to assign into fields, and no access to the non-null fields of the supertype (since they haven’t been initialized yet).
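For comparison, today’s C# already has one region that behaves a bit like this: instance field initializers run before the base constructor, and they are not allowed to reference “this”. They can’t see constructor arguments either, which is why they don’t solve the problem above. A small sketch of the difference, with all names made up for illustration:

```csharp
using System;

public abstract class Base
{
    // Hypothetical field recording what the derived type looked like
    // while the base constructor was still running.
    public string SeenDuringBaseConstructor;

    protected Base()
    {
        SeenDuringBaseConstructor = Describe();
    }

    protected abstract string Describe();
}

public class Derived : Base
{
    // Assigned by a field initializer, which runs before Base's constructor.
    private string fromInitializer = "set by initializer";

    // Assigned in the constructor body, which runs after Base's constructor.
    private string fromConstructor;

    public Derived(string value)
    {
        fromConstructor = value;
    }

    protected override string Describe()
    {
        return fromInitializer + " / " + (fromConstructor ?? "<null>");
    }
}

public static class Program
{
    public static void Main()
    {
        Derived d = new Derived("Ada");
        Console.WriteLine(d.SeenDuringBaseConstructor); // set by initializer / <null>
    }
}
```

The proposed “fields first, then base()” form would in effect let constructor arguments take part in that early phase.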

Now we’re all set. We can enforce our class constraints and ensure that if anyone has access to the “this” pointer, our invariants have been met.

Ok. So that’s one problem with non-null types addressed. A few more yet to come!

---

Edit: I forgot to mention this in my post (which is usually what happens, since I don't plan these and just write in a free-flow manner), and DrPizza astutely noticed it: these modifications would bring C#'s initialization model more in line with C++'s. Like him, I find that model to be far more sane. One thing that makes a whole lot of sense to me is that while initializing a base type, you do not have access to the members of the derived type. And, if I'm not mistaken, there are guidelines out there that a constructor should not call virtual methods in .NET (because of some security concern, if I'm not mistaken). So, if we're recommending against using that capability, I'm not sure why it's there in the first place. One thing I don't like is how C++ initialization lists look. I'd like to come up with a nicer-looking syntax for that.
