JScript .NET Type Coercion Semantics

Stan Lippman has started an interesting series of blog entries on how Managed C++ determines which function foo to call when you call foo(bar) given that there may be several visible functions of that name in the current scope. That's quite coincidental, as just the other day I was going through some old mail when I found a long document I wrote up on how JScript .NET does this.

In the unlikely event that anyone out there is interested, I think I'll post bits of it over the next few days. The compare-and-contrast with C++ might be edifying. (Yes, this is going to be pretty geeky, even for me.)

To begin with, I want to tackle a different question -- how does JScript .NET handle the situation where a value of one type is assigned to a field that has a type restriction? For instance, how do we handle this situation?

var blah : String = 123;

In particular, we are concerned with implicit type coercions. Explicit type coercions like

var blah : String = String(foo);

are obviously less interesting, as the compiler does not have any particular work to do -- it just lets the run-time coercion function do its work.

Recall that a type is defined as a set of values, plus a set of rules for converting any values not in that set to a value in the set. (That rule may include "throw a type mismatch exception" of course!) For a quick refresher on the JScript and JScript .NET type systems in general, I wrote a whole series of articles starting here.

Under what circumstances are coercions performed?

The JScript .NET compiler can perform some type coercions at compile time. For instance, compile-time constants such as the above case can be converted by the compiler. But that's a pretty rare case. The more common case is that the types are known at compile time but the values are not. (In the worst possible late-bound case the types are not known until runtime and special late-bound helper functions must be invoked. In general, the late-bound functions should have the same behaviour as if the types and values were known at compile time.)

There are many situations in JScript .NET where the compiler must generate a type coercion. The most common cases are:

  • An explicit conversion (see above)
  • When assigning a value to a typed variable
  • When passing arguments to a function with typed arguments
  • When indexing an array
  • When setting a typed global value, local value or member value using an assignment operator (=, +=, etc.).
  • When a function with a type annotation on its return type returns a value
  • Some operators cause arguments to be coerced, e.g., (foo + "") coerces foo to string. Bitshift operators coerce arguments to 32-bit integers, etc.
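
To make the bitshift case concrete, here is a minimal Python sketch of the ToInt32 conversion that ECMAScript specifies for shift operands. That JScript .NET applies exactly this rule is an assumption on my part, and to_int32 is a name invented for this sketch.

```python
import math

def to_int32(x: float) -> int:
    """Model of ECMAScript ToInt32 (assumed, not confirmed, for JScript .NET)."""
    if math.isnan(x) or math.isinf(x):
        return 0                      # NaN and infinities map to zero
    n = math.trunc(x)                 # drop the fraction, rounding toward zero
    n %= 2 ** 32                      # wrap into the range [0, 2**32)
    # Reinterpret the top half of the range as negative numbers.
    return n - 2 ** 32 if n >= 2 ** 31 else n
```

The point is that the conversion never fails: values outside the 32-bit range wrap around rather than raising an error.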

There are also a few more obscure scenarios:

  • When passing arguments to a custom attribute.
  • If a JScript array is coerced to a System.Array then each member of the JScript array is coerced to the element type of the System.Array.
  • If a System.Array is coerced to a JScript array then we wrap the System.Array with a JScript array wrapper. Setting values on the JScript array wrapper automatically coerces the value to the element type of the underlying System.Array. (See here, here, here, here and here for more information about the relationship between the two kinds of array.)
  • When subtracting a number from a char the result is coerced to char -- I'll do another blog entry sometime on all the interesting semantics of the char type in JScript .NET.
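
The array wrapper case can be pictured with a small Python model. This is only a sketch of the idea of coercing on every write; TypedArrayWrapper and its coerce callback are hypothetical names, not JScript .NET internals.

```python
from typing import Any, Callable, List

class TypedArrayWrapper:
    """Toy stand-in for a wrapper around a typed array (hypothetical)."""

    def __init__(self, backing: List[Any], coerce: Callable[[Any], Any]):
        self._backing = backing   # plays the role of the underlying typed array
        self._coerce = coerce     # converts a value to the element type

    def __getitem__(self, index: int) -> Any:
        return self._backing[index]

    def __setitem__(self, index: int, value: Any) -> None:
        # Every write goes through the element-type coercion first.
        self._backing[index] = self._coerce(value)

ints = [0, 0, 0]
wrapper = TypedArrayWrapper(ints, int)
wrapper[0] = "42"   # the string is coerced to an int before it is stored
```

Because the wrapper shares the backing storage, the coerced value is visible through both the wrapper and the original list.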

How do we determine at compile time when an implicit coercion is legal?

I need some definitions.

By "VALUE is coercible to TYPE" I mean that the specific VALUE may be coerced to TYPE without data loss or error.

For example, the value 0.5 is coercible to the type "32 bit floating point number" but not to the type "32 bit signed integer". (There are a few exceptions, in that some lossy coercions are still considered coercible -- numbers are considered coercible to Boolean, for instance.)

By "TYPE_1 is promotable to TYPE_2" I mean that every member of TYPE_1 is coercible to TYPE_2.

For example: the type "32 bit signed integer" is promotable to "64 bit float" but not promotable to "32 bit float". Note that this implies that any derived class is promotable to one of its base classes, and value types are always promotable to their boxed types.

By "TYPE_1 is assignable to TYPE_2" I mean that there exists at least one member of TYPE_1 which is coercible to TYPE_2.

Clearly, if TYPE_1 is assignable to TYPE_2 but not promotable then there must be some member of TYPE_1 which is lossy or produces an error when coerced to TYPE_2. The JScript .NET compiler will often issue a warning if the program contains an assignment entailing an implicit coercion between types known to be assignable but not promotable. For example: since zero is common to all numeric types, all numeric types are assignable to each other. A base class is assignable to a derived class and vice-versa. But Array is not assignable to Number.
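
The three definitions can be modelled in a few lines of Python. This is a sketch under stated assumptions: each toy type is just a predicate saying which values it holds without loss, and since we cannot enumerate every member of a type, promotable and assignable are checked against hand-picked SAMPLES. All names here are invented for illustration, and the lossy-but-allowed exceptions (such as number to Boolean) are ignored.

```python
import struct

def fits_float32(x) -> bool:
    """True if x survives a round trip through IEEE 754 single precision."""
    try:
        return struct.unpack("f", struct.pack("f", x))[0] == x
    except OverflowError:
        return False   # magnitude too large for a 32-bit float

# Toy types: "can this value be represented without loss or error?"
TYPES = {
    "int32": lambda v: float(v).is_integer() and -2**31 <= v < 2**31,
    "float32": fits_float32,
    "float64": lambda v: True,
}

# Hand-picked members of each type (an assumption; real types are far larger).
SAMPLES = {
    "int32": [0, 1, -1, 2**24 + 1, 2**31 - 1],
    "float32": [0.0, 0.5, 1.5],
    "float64": [0.0, 0.5, 0.1, 1e300],
}

def coercible(value, target: str) -> bool:
    return TYPES[target](value)

def promotable(src: str, dst: str) -> bool:
    # "Every member is coercible", approximated over the samples.
    return all(coercible(v, dst) for v in SAMPLES[src])

def assignable(src: str, dst: str) -> bool:
    # "At least one member is coercible"; zero is the usual witness.
    return any(coercible(v, dst) for v in SAMPLES[src])
```

A failing sample genuinely disproves promotability (2**24 + 1 is an int32 with no exact float32 representation), whereas passing on all samples only suggests it.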

Finally, a definition that will come in handy tomorrow: consider two typed array types, TYPE_1 and TYPE_2, where their elements are of type ETYPE_1 and ETYPE_2, respectively. By "TYPE_1 is element-type-compatible (ETC) to TYPE_2" I mean:

  • If either ETYPE_1 or ETYPE_2 is a value type then TYPE_1 is ETC with TYPE_2 if and only if ETYPE_1 = ETYPE_2. For example, int[] is not ETC with long[].
  • If neither ETYPE_1 nor ETYPE_2 is a value type then TYPE_1 is ETC with TYPE_2 if and only if ETYPE_1 is promotable to ETYPE_2.
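
The two ETC rules translate almost directly into code. Here is a Python sketch, assuming a toy universe of types: VALUE_TYPES and the BASES hierarchy are made up for illustration, and promotability between reference types is modelled only as derived-to-base conversion.

```python
from typing import Optional

# Toy universe (hypothetical): a few value types and a small class hierarchy.
VALUE_TYPES = {"int", "long", "double"}
BASES = {"Object": None, "String": "Object", "Control": "Object", "Button": "Control"}

def ref_promotable(src: Optional[str], dst: str) -> bool:
    """A reference type is promotable to itself and to any of its base classes."""
    while src is not None:
        if src == dst:
            return True
        src = BASES.get(src)   # walk up the inheritance chain
    return False

def element_type_compatible(etype_1: str, etype_2: str) -> bool:
    if etype_1 in VALUE_TYPES or etype_2 in VALUE_TYPES:
        # Rule 1: with a value type involved, element types must be identical.
        return etype_1 == etype_2
    # Rule 2: otherwise the source element type must be promotable.
    return ref_promotable(etype_1, etype_2)
```

Note that under rule 1 an array of int is not ETC with an array of Object, even though int is promotable to its boxed type.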

OK, enough definitions. Why "assignable"? Because generally speaking, we can consider the pattern

Left Hand Side Expression = Right Hand Side Expression;

as the canonical situation in which a type coercion is performed. Passing arguments to a function call is basically a kind of assignment. From now on we'll just talk about assignment coercions.

The JScript .NET compiler determines the types of the LHS and RHS expressions at compile time as best it can and then checks to see if the RHS expression's type is assignable to the LHS expression's type. If it is, then it generates an implicit coercion. If it's assignable but not promotable, that's usually a warning, and if it is not assignable at all then that's usually a compile-time error.
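
That decision procedure is roughly the following, sketched in Python. The relation tables are hypothetical stand-ins for whatever the compiler actually computes, and the real compiler only "usually" warns or errors, so treat this as the general shape rather than the exact rules.

```python
from enum import Enum

class Verdict(Enum):
    OK = "ok"             # emit the implicit coercion silently
    WARNING = "warning"   # emit the coercion, but it may fail or lose data
    ERROR = "error"       # no value of the RHS type could ever be coerced

# Hypothetical relation tables of (rhs_type, lhs_type) pairs.
PROMOTABLE = {("int", "double"), ("Derived", "Base")}
ASSIGNABLE = PROMOTABLE | {("double", "int"), ("Base", "Derived")}

def check_assignment(rhs_type: str, lhs_type: str) -> Verdict:
    if rhs_type == lhs_type or (rhs_type, lhs_type) in PROMOTABLE:
        return Verdict.OK
    if (rhs_type, lhs_type) in ASSIGNABLE:
        return Verdict.WARNING
    return Verdict.ERROR
```

For instance, check_assignment("double", "int") yields a warning, while check_assignment("Array", "Number") is an error because the pair appears in neither table.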

Next time I'll go into much more detail as to how the compiler determines promotability, assignability and coercibility.