Language Inferencing

02/12/2004

I've been involved in a lot of debates over the last few years about the merits of strongly-typed languages versus loosely-typed ones. For example, VBScript and JavaScript are generally loosely-typed languages. Local variables defined within a body of code generally obtain their 'type' at runtime as a side effect of an assignment, and it's possible that an assignment two statements later re-assigns the variable to an instance of another type.

a = 10 'Here 'a' is an integer

a = "ten" 'Now 'a' is a string

Lots of programs are written this way. There are tons of ASP applications written using VBScript and tons more using languages such as JScript. There are even languages where there are no types at all. Instance values are just bags of named properties.

C# is strongly-typed. You must declare the type of a variable up front. The compiler will keep you to your word. The idea is that with strong typing the compiler will catch numerous bugs for you, such as when you accidentally put the wrong bit of data in the wrong variable. There are numerous other benefits of strong typing, such as static analysis of the program, early binding, and many optimizations that can be made.

But still, many people prefer loosely-typed systems. They are just easier to get started with. You don't have to keep repeating yourself by declaring type names all over the place. You just start writing code.

Recently, there has been a lot of talk about using a technique called "Type Inferencing" to lessen the burden of always writing type names in strongly-typed languages. While we have been discussing C# here, there have been suggestions made for C++ and even Java, and so on. (And yes, I am aware of many languages that already do this.)

Personally, I love the idea of type inferencing. I've been pushing for it for a while. Still, there's so much more that can be done to improve the programming experience; type inferencing is just a start. I would like to tackle some of these.

If we alleviate much of the burden imposed by overly declarative languages and still maintain code correctness we will have done a HUGE thing. Everyone will benefit.

What I really find burdensome about programming is getting all the formalities right. You've got to set up your constraints and other structural bits before getting in and writing the meat of your code. If you could just skip this step it would save a lot of time.

First of all, get rid of all the using and namespace declarations in C#. The compiler can figure those out later. If you reference a type, let the compiler figure out where it came from.

Next, don't bother defining a class to put your code in. Just write the code. If you don't write a class, let the compiler invent one for you. It can infer a silly class and Main method much quicker than it takes to type one up yourself.

So now you can just start writing code.

a = 10;

That's a good start. But why do I need to keep writing semicolons? The language syntax ought to be smart enough to determine the end of an expression without me slamming these silly 'tweeners' in there.

a = 10

Much better.

Now that I'm on a roll, let's get into some really qualitative improvements. Why waste all the effort setting up flow structures like if/else? All we need to do is introduce the concept of 'FAILURE' into the language. If a statement or expression fails, it just doesn't have its intended effect.

For example.

a = f(xxx)

a = g(yyy)

Given both these statements, either 'a' is assigned by the first statement or it is assigned by the second statement. Whichever doesn't fail, succeeds.
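For the curious, here is roughly what the 'FAILURE' idea looks like when you have to spell it out by hand in today's Python. This is only a sketch under my own assumptions: the helper name assign is made up, f and g are hypothetical stand-ins, "failure" is taken to mean "raises an exception", and when both statements would succeed the first one wins (the idea as stated doesn't say which should).

```python
def assign(previous, *attempts):
    """Return the value of the first attempt that does not fail.

    If every attempt fails, nothing has any effect and the previous
    value is kept.
    """
    for attempt in attempts:
        try:
            return attempt()
        except Exception:
            continue  # this statement failed, so it has no effect
    return previous


def f(x):
    # Hypothetical stand-in for f: fails on negative input.
    if x < 0:
        raise ValueError("f cannot handle negatives")
    return x * 2


def g(y):
    # Hypothetical stand-in for g: always succeeds.
    return y + 1


a = None
a = assign(a, lambda: f(-5), lambda: g(7))  # f fails, so g supplies the value
```

Notice how much ceremony it takes (lambdas, try/except, a helper) to say what the two bare assignments say on their own.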

The next thing we want to get rid of is looping constructs and especially statement sequencing. I spend so much effort getting these things right. The compiler should be able to figure this stuff out with a simple dependency graph.

a = b + 10

b = 20

Now, it's easy to see that unless b is assigned first, the assignment to a will not succeed. Therefore, the assignment to b should execute first.
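The dependency graph part is the one piece of this you can actually try today. Here is a minimal sketch in Python (3.9 or later, for the standard graphlib module). The function name order_assignments is my own invention, and it assumes every line is a plain 'name = expression' assignment with no name assigned twice.

```python
import ast
from graphlib import TopologicalSorter


def order_assignments(lines):
    """Order simple 'name = expression' lines so each name is assigned before use."""
    sources = {}  # assigned name -> source line
    reads = {}    # assigned name -> names read on the right-hand side
    for line in lines:
        node = ast.parse(line).body[0]  # assumed to be a single assignment
        name = node.targets[0].id
        sources[name] = line
        reads[name] = {n.id for n in ast.walk(node.value) if isinstance(n, ast.Name)}
    # Keep only the dependencies that some line in the program assigns.
    graph = {name: deps & set(sources) for name, deps in reads.items()}
    # static_order() yields each name after everything it depends on.
    return [sources[name] for name in TopologicalSorter(graph).static_order()]


program = ["a = b + 10", "b = 20"]
ordered = order_assignments(program)

# Run the reordered statements to check that 'a' now succeeds.
scope = {}
for line in ordered:
    exec(line, scope)
```

Here ordered puts the assignment to b ahead of the assignment to a, and a circular pair of assignments makes TopologicalSorter raise a CycleError, which is about as close to 'FAILURE' as the standard library gets.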

For loops, we just adopt the mathematical notation of subscripting.

a(n) = a(n-1) + 10

Now it's obvious that a loop must be written to perform this calculation. Let the compiler do it!
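For comparison, this is the loop a compiler would have to come up with, written out in Python. It is a sketch with one big assumption of mine baked in: the recurrence never says where it starts, so I've supplied a(0) = 0 as a default base case.

```python
def a(n, base=0):
    """Evaluate a(n) = a(n-1) + 10 with a loop, assuming a(0) = base."""
    value = base
    for _ in range(n):
        value = value + 10  # one application of the recurrence per step
    return value


third = a(3)  # applies the recurrence three times, starting from a(0) = 0
```

The missing base case is exactly the sort of formality that still has to come from somewhere.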

You can see where I'm going here. Type inferencing leads you to the logical conclusion of full language inferencing, which is exactly where I think the industry should be moving.

And while we're at it, let's get rid of variable names. I think I must spend at least 25% of my time thinking up good variable names. If we didn't have to name them, I'd be done so much faster.

I know, by now you are thinking, this guy is NUTS. What an incredibly bad bunch of thinking, I can see all sorts of holes here. This would never work. That may be so, but it might just happen anyway. Check out the XML Conspiracy.

THIS JUST IN: It's already happening! Check this out!

But I digress

Matt
