The JScript Type System, Part Three: If It Walks Like A Duck…


A reader asks “can you explain the logic that a string is not always a String but a regexp is always a RegExp? What is the recommended way of determining if a value is a string?”

 

Indeed, you are correct:

 

print(/foo/ instanceof RegExp);              // true
print(new RegExp("foo") instanceof RegExp);  // true
print("bar" instanceof String);              // false
print(new String("bar") instanceof String);  // true
print(typeof("bar"));                        // string
print(typeof(new String("bar")));            // object

 

Why’s that?

 

First off, the question about strings. In JScript there is this bizarre feature where primitive values (Booleans, strings, numbers) can be “wrapped up” into objects. Doing so leads to some bizarre situations. For one thing, as you note, the type of a wrapped primitive is always an object type, not a primitive type. For another, two wrapped primitives are compared using object equality, not value equality:

 

print(new String("bar") == new String("bar")); // false

 

I highly recommend against using wrapped primitives. Why do they exist? Well, the reasoning has kind of been lost in the mists of time, but one good reason is to make the prototype inheritance system consistent. If "bar" is not an object then how is it possible to say

 

print("bar".toUpperCase());

 

? Well, actually, from the point of view of the specification, this is just syntactic sugar for

 

print((new String("bar")).toUpperCase());

 

Now, of course as an implementation detail we do not actually cons up a new object every time you access a property or call a method on a value type! That would be a performance nightmare. The runtime engine is smart enough to realize that it has a value type and that it ought to pass it as the “this” object to the appropriate method on String.prototype, and everything just kind of works out.
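
To make that concrete, here is a small sketch (not part of the original post) showing that the three spellings give the same answer. It assumes an engine that supports Function.prototype.call, and the variable names are made up for illustration.

```javascript
// Ordinary method call on a primitive string.
var viaDot = "bar".toUpperCase();

// What the specification describes: wrap the primitive, then call the method.
var viaWrapper = (new String("bar")).toUpperCase();

// Roughly what the engine can do instead: look the method up on
// String.prototype and hand it the primitive as "this".
var viaPrototype = String.prototype.toUpperCase.call("bar");

// All three are the primitive string "BAR".
```

Whichever route is taken, the result is a primitive string again, not a wrapper object.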

 

This also explains why it is possible to stick properties onto value types that magically disappear. When you say

 

var bar = "bar";
bar.hello = "hello";
print(bar.hello); // nada!

 

of course what is happening is logically equivalent to:

 

var bar = "bar";
(new String(bar)).hello = "hello";
print((new String(bar)).hello); // nada!

 

See, the magical temporary object is just that: magical and temporary. Once you’ve used it, poof, it disappears.

 

But this magical temporary object does not appear when the typeof or instanceof operators are involved. The instanceof operator says “hey, this thing isn’t even an object, so it can’t possibly be an instance of anything”. For both consistency and usability, it would have been nice if "bar" instanceof String created a temporary object and hence said yes, it is an instance of String. But for whatever reason, that’s not the specification that the committee came up with.
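
If you want to see the difference side by side, here is a minimal sketch (again, not from the original post). It wraps the primitive explicitly with Object(...), which does the same job as new String(...) for a string argument. The names primitive, wrapped and results are hypothetical.

```javascript
var primitive = "bar";
var wrapped = Object(primitive); // explicit wrap; no magical temporary involved

var results = {
  primitiveTypeof: typeof primitive,                 // "string"
  primitiveInstanceof: primitive instanceof String,  // false: not an object at all
  wrappedTypeof: typeof wrapped,                     // "object"
  wrappedInstanceof: wrapped instanceof String,      // true
  sameValue: wrapped.valueOf() === primitive         // true: valueOf unwraps it
};
```

The operators only ever see what you hand them: wrap the value yourself and instanceof says yes, leave it as a primitive and it says no.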

 

Second, your question about regular expressions is easily answered now that we know what is going on with strings. The difference between regular expressions and strings is that regular expressions are not primitives. Just because you have the ability to express a regular expression as a literal does not mean that it is a primitive! That thing is always an object, so there is no behaviour difference between the compile-time-literal syntax and the runtime syntax.

 

Third, your question about how to determine whether something is a string is surprisingly tricky. If typeof returns “string” then obviously it is a string, end of story. But what if typeof returns “object”? How can you tell if that thing is a wrapped string?

 

It’s not easy. instanceof String doesn’t tell you whether that thing is a string; it tells you whether String.prototype is on the prototype chain. There’s nothing stopping you from saying

 

function MyString() {}
MyString.prototype = String.prototype;
var s = new MyString();
print(s.constructor == String);            // true
print(s instanceof String);                // true
print(String.prototype.isPrototypeOf(s));  // true

 

So now what are you going to do? JScript is excessively dynamic! Basically you can’t rely on any object being what it says it is. JScript forces people to be operationalists. (Operationalism is the philosophical belief that if it walks like a duck and quacks like a duck, it is a duck.) In the face of the kind of weirdness described above, all you can do is try to use the thing like a string, and if it acts like a string, it’s a string.
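
For what it’s worth, here is one possible sketch of such a check. It is not from the original post and it is not bulletproof. The helper name isStringValue is hypothetical. It relies on Object.prototype.toString reporting the internal class of an object, and it repeats the MyString impostor from above to show that the impostor stops acting like a string the moment you try to use it.

```javascript
// Hypothetical helper: true for primitive strings and for wrapped strings.
function isStringValue(value) {
  if (typeof value === "string") {
    return true; // primitive string, end of story
  }
  // Wrapped strings report "[object String]"; ordinary objects do not,
  // even when String.prototype is on their prototype chain.
  return Object.prototype.toString.call(value) === "[object String]";
}

// The impostor: passes instanceof, but is not really a string.
function MyString() {}
MyString.prototype = String.prototype;
var fake = new MyString();

// Duck test: try to use it like a string and see whether it quacks.
var fakeThrows = false;
try {
  fake.toUpperCase();
} catch (e) {
  fakeThrows = e instanceof TypeError;
}

var checks = {
  primitive: isStringValue("bar"),            // true
  wrapped: isStringValue(new String("bar")),  // true
  impostor: isStringValue(fake),              // false
  number: isStringValue(42),                  // false
  impostorInstanceof: fake instanceof String, // true, which is the problem
  impostorThrows: fakeThrows                  // true
};
```

Treat this as a sketch under those assumptions rather than a guarantee: newer ECMAScript engines let an object change what Object.prototype.toString reports through Symbol.toStringTag, so a determined impostor can still fool it.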

 

Comments (8)

  1. Jay Hugard says:

    Very interesting. I had most of that already but was thrown off by the "magic temporary" construction.

    Thanks!

  2. Blake says:

    Eric writes, "…cons up a new object…"

    Busted. We knew there has to be a Lisp geek hiding in there.

  3. Eric Lippert says:

    Yeah, I knew someone would bust me.

    In truth, I have never written any nontrivial programs in Lisp, though I did a fair amount of Scheme programming when I was at UW. I just use the expression "cons up" to get geek cred from hard core lisp freaks.

    Whoops, did I say that out loud?

  4. Anonymous says:

    > all you can do is try to use the thing like a string, and if it acts like a string, it’s a string

    Only problem here is that most anything can come out looking like a string, given the toString() method. Maybe ECMAScript should just go the way of TCL – declare everything a string and be done with it 😉

    Oh, and also thanks for the "magic temporary". Just when you think you know everything about JavaScript, it throws you a curveball. BTW, this "magic temporary" sort of brings C++ anonymous temporaries to mind.

  5. Dan Shappir says:

    My comment above – again Remember Me doesn’t seem to work.

    After some more thought, I should have realized the "magic temporary" behavior, if for nothing else then from the example I gave for part one:

    Number.prototype.showType = function() { alert(typeof(this)); }
    (3).showType();

    I even made a comment about auto-boxing.

    The reason I did not think through the full implications of this is that I was caught up in the concept that in ECMAScript everything is an object. Must say I’m somewhat disappointed that this isn’t really the case (even though in most cases you really can’t tell the difference).

    BTW anybody who has used Netscape’s LiveConnect (at least version 4.x) knows the situation there is even worse. Strings returned by applets are of the type java.lang.String, which JavaScript identifies only as objects, and not even string objects. You need to apply the String function explicitly to these objects if you want to manipulate them using JavaScript. Rhino OTOH handles this scenario very well.

  6. Dan Shappir says:

    As I’ve pointed out in a comment to a previous post, JavaScript uses duck typing, that is, it matches functionality by name (of property). In this respect the "real" type of an object doesn’t matter very much, only the functionality it exposes. Consider the following example:

    function foo() { return { hello : "world" , bye : "everybody" }; }
    var a = foo();
    var b = foo();
    var c = new Object();

    Do the objects referenced by a, b and c have the same type? If your answer is "yes, they all have the type Object" you miss the fact that a and b share a common structure. However, by any other criteria mentioned in the posts so far, a and b have nothing in common.

    To take it farther, what if I tack another property on to b:

    b.x = "y";

    The fact that JavaScript objects are so malleable, and can be modified both internally and externally after they have been created, makes much of the type related info irrelevant.

    It is a shame, however, that ECMAScript does not provide a means to do operator overloading or create special properties like String’s length. This would make it possible to create ADTs that are functionally equivalent to the internal types.

  7. Anonymous says:

    > duck typing
    Maybe we should read this as "ducks out of typing" 😉

    > much of the type related info irrelevant
    So why don’t we just call JScript an untyped (not-typed) language?
    Languages without data types are called "typeless" (BCPL, MCPL)

    JScript seems to have types but is not typed.

  8. Centaur says:

    > if it acts like a string, it’s a string.
    And if it acts almost, but not entirely, unlike a string, it’s something that is almost, but not entirely, unlike a string, but will be mistaken for a string if the feature that is tested happens to fall in that “not entirely” exception category 🙂
