Difficulties with non-nullable types (part 3)


In the last post on this topic I talked about how it was critical that,
by the time a non-null type was used, it was provably assigned some
non-null value.  The cases that came up were specifically instance
fields in classes.  In that post I showed how we could change
how types were initialized to ensure that these non-null
constraints could be met.



However, there is another type of field out there: static
fields.  And it turns out that these fields are much
harder to deal with than instance fields.  Why?  Well, let’s
take a look at a simple example:

class Person {
    public static string! name;
}

How do we initialize that field?  Well, we could start with a static constructor like so:

class Person {
    public static string! name;

    static Person() {
        name = new Cyrus().name;
    }
}

class Cyrus {
    public static string! name = "cyrus";
}



Ok.  Seems like this would work out and you’d end up with every
non-null “name” field being assigned a value.  However, let’s extend
that a little further:

class Cyrus {
    public static string! name = "cyrus";

    public Cyrus() {
        Cyrus.name = Person.name;
    }
}



Uh oh.  Now we’ve got a problem.  Everything looks typesafe,
but when the Person class loads, it will call the ‘Cyrus’ constructor
which will then access the “Person.name” field before it has been
initialized!  Not good!
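
To make the failure concrete, here is a minimal sketch in ordinary C#. It makes a few assumptions: the hypothetical `string!` annotation is dropped (plain `string` stands in for it), the field is renamed `Name`, the static constructor is split into two statements, and the `Demo` class is a helper invented for this illustration. It shows that both fields can end up null even though every assignment looks fine on its own.

```csharp
using System;

class Person
{
    // Stands in for the post's `string!` field.
    public static string Name;

    static Person()
    {
        // Constructing a Cyrus runs its instance constructor while
        // Person is still in the middle of its own initialization.
        new Cyrus();
        Name = Cyrus.Name;
    }
}

class Cyrus
{
    public static string Name = "cyrus";

    public Cyrus()
    {
        // Person's static constructor is already running on this thread,
        // so it is not re-entered; Person.Name is still null at this point.
        Name = Person.Name;
    }
}

// Hypothetical helper for this sketch only.
static class Demo
{
    public static string Run()
    {
        string person = Person.Name;   // first use of Person triggers its static constructor
        string cyrus = Cyrus.Name;
        return "Person.Name = " + (person ?? "null") + ", Cyrus.Name = " + (cyrus ?? "null");
    }
}
```

Calling `Demo.Run()` returns "Person.Name = null, Cyrus.Name = null": the "cyrus" value is overwritten with the not-yet-initialized Person.Name, and that null is then copied back into Person.Name.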



As it turns out, this is the same problem as the instance-field issue
from the last post.  As I mentioned there, determining the set of
code that is dependent on a constraint is extremely difficult, and so we
need a more restrictive set of rules that we can work with in order to
be able to reason about this properly.



In order to support the instance-fields case we placed the restriction
that before the “this” pointer can be used, all the non-nullable
fields must have been assigned.  Can we
extend that idea out to static fields?  Since there is no “this”
pointer we would have to say something like “before you can use the
actual type containing the static field, all fields must be assigned
to”.



This is a much more limiting set of circumstances, and far harder to
enforce.  With instance fields you at least have access to a rich
set of inputs to initialize them with.  You can use static data in
the system, or you can use parameters passed into a constructor. 
With static fields things are not so clear cut.  As you can see
above, arbitrary code can’t be run as that code might statically access
the type being constructed.  Also, as static constructors don’t
take any parameters it’s hard to find inputs to assign to these fields.




So what can be done about this?  It’s not clear to me what the
best solution is.  The easiest solution is to simply disallow
static non-nullable types.  But that would suck since I think that
people really want them.  The next easiest would be to have
non-nullable static fields, but guard all reads/writes to them to check
for null.  That would put the onus on the programmer to ensure
that each field’s value is set before use, but it means that accessing
such a field could suddenly fail at runtime.  Ugh*2.  The final thing
that I can think of would be a solution like the following inside the
runtime:


  1. When loading a type that has non-nullable static fields, start executing the static constructor.
  2. When the static constructor enters, set a bit saying that the type
    is currently being constructed.  Then, if a static non-nullable
    field in that type is ever accessed by anyone else, throw an exception
    saying that this type wasn’t fully initialized and it’s invalid to
    access non-null static state yet.
  3. If the code executed in the static constructor then causes something
    like the above situation to happen, then you’ll immediately fail at
    typeload time.

This has the unfortunate outcome that the checking is still done at
runtime.  However, it will most likely occur early on, when all
your types are being loaded, rather than much later, on the first access of
one of these fields.
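
The runtime check itself can't be written in user code, but a rough approximation in today's C# gives a feel for how it would behave. In this sketch the `initialized` flag plays the role of the runtime's "being constructed" bit, the `Name` property plays the role of the guarded field access, and the `Demo` helper is invented for the illustration. None of this is an actual runtime feature.

```csharp
using System;

class Person
{
    // Models the runtime's bit: false while the type is still being constructed.
    private static bool initialized;
    private static string name;

    // Everyone outside the static constructor reads the field through this guard.
    public static string Name
    {
        get
        {
            if (!initialized)
                throw new InvalidOperationException(
                    "Person is not fully initialized; its non-null static state cannot be read yet.");
            return name;
        }
    }

    static Person()
    {
        new Cyrus();            // runs code that reads Person.Name too early
        name = Cyrus.Name;
        initialized = true;     // only now may other code read Person.Name
    }
}

class Cyrus
{
    public static string Name = "cyrus";

    public Cyrus()
    {
        Name = Person.Name;     // throws: Person is still being constructed
    }
}

// Hypothetical helper for this sketch only.
static class Demo
{
    public static string Run()
    {
        try
        {
            return Person.Name;
        }
        catch (TypeInitializationException e)
        {
            // The guard's exception surfaces as a failed type load.
            return e.InnerException?.GetType().Name ?? "none";
        }
    }
}
```

Here `Demo.Run()` returns "InvalidOperationException": the early read is rejected inside Person's static constructor, and the CLR reports that as a TypeInitializationException on the first use of Person, which is roughly the "fail at typeload time" behavior described above.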



In essence, what we’ve done is enforce a quasi-ordering on how types
are loaded in the runtime.  As a type loads it can access static
non-nullable fields in any fully loaded type, but not in a type that is
still being loaded.



It’s ugly, but I can’t think of anything better.  Do you guys have any ideas?


Comments (21)

  1. Gatsby says:

    You could make it illegal to reference a non-nullable static within a static constructor, thus breaking the dependency cycle.

  2. CyrusN says:

    Gatsby: If we didn’t allow access to the static fields in the static constructor, then how would we initialize the fields?

    Note, if you have:

    public static string! name = new Cyrus().name;

    then that code in the initializer just goes into the static constructor.

    So, without the static constructor, there is no way to initialize the static non-nullable fields. And if you don’t initialize them, then they certainly won’t be "not" null :)

  3. CyrusN says:

    Gatsby: If we go down that path, that seems like my first option which is to just not allow non-nullable statics.

  4. Gatsby says:

    Sorry, let me make that more precise: "In a static constructor, you cannot have anything non-nullable in the RHS of an expression, which cannot be proved to be non-null." This doesn’t stop you from assigning to A.staticNonNullableVar in the static constructor of A.

    However one variable that can’t be proved to be non-null in your example is Person.name, namely because we can’t know at compile time that Person hasn’t been initialized yet. So in general, referencing non-nullable statics of other classes would not fly.

    Make sense?

  5. Gatsby says:

    And by ‘RHS of an expression’ I meant right hand side of an assignment operation. :)

  6. CyrusN says:

    Gatsby: So how do you initialize the value then? :)

    If you can’t have a non-null value on the rhs, and that’s *exactly* what you need to assign into the static field… then it seems like you’re S.O.L. :)

  7. Well, I suppose that assigning an external non-nullable or literal isn’t the only way to initialize the field. What about

    string! a;

    static Class () {

    string x = Foo ();

    if (x != null)

    a = x;

    else

    a = "error?";

    }

    I guess this kind of assigning to non-nullable types is a *must*, otherwise the only values inside the non-nullable types would have to be literal… or values obtained from external function, which would break static verifyability.

    I’m a *great* fan of non-nullable types, because they allow to natually use the framework to implement ML-like option[‘a] type. Simply None() becomes null and Some(x) becomes x – no need to create a wrapper. Of course we can do it now, but obviously we cannot expect compiler to forbid/warn about usage of A as A! (that is fetching element from option value).

    If entire BCL were decorated with non-nullable types, then we could utilize nullable types as ‘unsure’ values, which compiler would force/suggest to check before usage.

    StringBuilder! x = StringBuilder ("");

    string! s = x.ToString ();



    string line = buffer.ReadLine ();

    // warning, line might be null

    Console.WriteLine (line.Substring (1));

    This is one of the those changes, which must be made *globally* in order to be usable, but then they will rule :-) (just like generics)

  8. Ron Buckton says:

    You could require that all non-null reference types are assigned a default value at declaration. It would incur a penalty of the overhead of instantiating a possibly unneeded reference type but all of these other issues would be moot.

    Possibly not the best suggestion but its worth considering.

  9. CyrusN says:

    Kamil: "Well, I suppose that assigning an external non-nullable or literal isn’t the only way to initialize the field. What about

    static string! a;

    static Class () {

    …string x = Foo ();

    …if (x != null)

    ……a = x;

    …else

    ……a = "error?";

    }

    "

    The problem with this is that Foo() could execute arbitrary code. That code could then end up trying to use "a" before it had been initialized. Not good!

    "I guess this kind of assigning to non-nullable types is a *must*, otherwise the only values inside the non-nullable types would have to be literal… or values obtained from external function, which would break static verifyability. "

    Yup. And that limits the usefulness if we only allow non-null literals. That means that all you can pretty much have are non-null strings :(

  10. Joe Duffy says:

    This is something Haskell got right and C[++|#|blah] got wrong. Nullability should be completely opt-in. Maybe<T> = Just T | nil works for me.

    Nullable<T> is close. But not quite.

    With that said, I love the idea of pre-/post-/invariant-condition annotations to the C# syntax. This is a great use of that style of programming.

    Of course, flippant, off topic crap like this probably isn’t helpful. Nonetheless, great set of posts. Keep it coming.

  11. Joe Duffy says:

    By the way, if it wasn’t obvious: the "flippant, off-topic crap" comment above was in reference to my comment–not your post. :)

  12. CyrusN says:

    Joe: "This is something Haskell got right and C[++|#|blah] got wrong. Nullability should be completely opt-in. Maybe<T> = Just T | nil works for me."

    I have to disagree with that, Joe. The Haskell style of Maybe (also known as "Optional", or in other functional languages) is a distinctly different concept than Nullable vs. Nonnullable in OO languages. In OO Nullable follows the "is a" relationship. So Nullable<T> is-a T. In Haskell/Scheme/Ocaml, this is not the case with Maybe. A Maybe<int> is not an int.

    Note: in C# i have my own "Maybe" class (which i call "Optional"). I blogged about this before on this blog. However, it serves a very different purpose than nullable.

    "With that said, I love the idea of pre-/post-/invariant-condition annotations to the C# syntax. This is a great use of that style of programming. "

    Cool :)

  13. So far i’ve discussed two interesting problems that arise when you try to add non-nullable types to C#…

  14. joc says:

    In static constructors you could disallow access to static non-nullable fields (as well as static non-nullable properties, functions with static non-nullable return types or non-nullable out and ref parameters) of other types: e.g. inside the static constructor of Cyrus, the static fields of Person would not be accessible.

  15. Joe Duffy says:

    Cyrus, I know nullability in OO differs from optionality in most functional languages. My point is that optional makes a helluva lot more sense than nullable baked into the type system.

    Null of T should never be considered to be a real T. That’s ludicrous–it isn’t true, first of all… I think type systems should not be in the business of lying–and gives way to crap like NullReferenceExceptions.

    I prefer to fix these styles of errors in the static type system than let dynamism run a good program at runtime. But nowadays, that seems to be an old skool point of view.

  16. CyrusN says:

    Joc: "In static constructors you could disallow access to static non-nullable fields (as well as static non-nullable properties, functions with static non-nullable return types or non-nullable out and ref parameters) of other types: e.g. inside the static constructor of Cyrus, the static fields of Person would not be accessible."

    If you disallow all that… then how do you initialize your static fields? :)

    You’re basically disallowing me from touching any type that might transitively touch a type with a static initializer. This will be very constraining.

  17. CyrusN says:

    Joe: "Null of T should never be considered to be a real T. That’s ludicrous–it isn’t true, first of all…"

    Well… i completely agree :)

    Sorry if i ever let you think otherwise. However, non-null of T should be considered to be a real T.

    "I think type systems should not be in the business of lying–and gives way to crap like NullReferenceExceptions."

    And Some of T and nil give way to "MissingMatchException" in functional languages :)

    "I prefer to fix these styles of errors in the static type system"

    So do i. That’s why i’d like to bring good old non-null types to C# and let people use Optional/Maybe to allow for null values. However, i’m running into many problems trying to do that, and i thought i’d share it with you. It sounds like we feel very similarly about this stuff, and i’m trying to get a better solution so that if you *have* to use C# it doesn’t end up totally pissing you off :)

    "than let dynamism run a good program at runtime. But nowadays, that seems to be an old skool point of view"

    Well… i’m trying to remove dynamism. but it’s hard when the entire system is based around it :(

  18. Joe Duffy says:

    Cyrus, cool… think we’re on the same page. Although we may differ in that I think nullability should be opt-in rather than opt-out.

    In Haskell for example, forcing people to unbox a "Maybe of a" makes them realize they have to deal with the null case, or pushes it onto the caller if it understands only an "a"… (i.e. not nullable) And with pattern matching it becomes so much simpler than doing null branch testing to avoid sporadic NullRefExceptions.

  19. joc says:

    Joc: "In static constructors you could disallow access to static non-nullable fields (as well as static non-nullable properties, functions with static non-nullable return types or non-nullable out and ref parameters) of other types: e.g. inside the static constructor of Cyrus, the static fields of Person would not be accessible."

    CyrusN: "If you disallow all that… then how do you initialize your static fields? :)"

    With your static constructors! Person’s static constructor is allowed to initialize Person’s static non-nullable fields and Cyrus’ static constructor is allowed to initialize Cyrus’ static non-nullable fields but you may not use the former to initialize the latter and conversely.

  20. There is no-doubt that the C#2 nullable-types is a cool feature. However I regret that C# don't support