Fast Switching with LINQ


Jomo Fisher—I’ve run into a performance problem several times in my career for which I never found a truly satisfying answer, until now. The problem can be distilled down to this:


Look up a value based on a string key. The set of strings is fixed and no unknown key values can occur.


Consider my personal rankings of the Seven Dwarves on a scale of 1-10:


                happy=9, sneezy=2, doc=7, sleepy=8, dopey=9, grumpy=2, bashful=6


I need a function that takes a name and returns a ranking. The default solution for this problem should be to use a Dictionary or Hashtable. The implementations that ship with the .NET Framework are extremely fast.


However, these container classes offer more features than I need for this problem. New keys can be added, but my set is fixed. They can tell you whether a key is valid, but I know a priori that the keys are limited to a fixed set. They can enumerate the set of keys or the set of values, but I don’t need this capability.


Any time I see a data structure with a capability I’m not using, I wonder whether I can trade that capability for something I do need—in this case, a speed boost.


If I were really desperate for a performance gain, I could write a function like:


        static int Lookup(string s) {


            if (s.Length < 6) {


                if (s.Length < 5) return 7;


                else return 9;


            } else {


                if (s[0] < 's') {


                    if (s[0] < 'g') return 6;


                    else return 2;


                } else {


                    if (s[1] < 'n') return 8;


                    else return 2;


                }


            }


        }


Notice that we only have to look at the length and the first two characters of the string to make the decision. This function is fast but nearly impossible to maintain or debug. Also, I have to know the set of strings at compile time. All of these drawbacks make this approach significantly less useful.


Enter the LINQ expression compiler that will be available in .NET Framework 3.5. With the new expression support it’s possible to build an expression at runtime and then compile it into straight IL (which will be jitted into machine code). It turned out to be pretty easy to write a switch compiler that generates an appropriate lookup expression on the fly. I wrote a class called SwitchCompiler to do the heavy lifting. Here’s how it’s called:


            var dict = new Dictionary<string, int> {


                    {"happy", 9}, {"sneezy", 2},


                    {"doc", 7}, {"sleepy", 8},


                    {"dopey", 9}, {"grumpy", 2},


                    {"bashful", 6}


                };


            var Lookup = SwitchCompiler.CreateSwitch(dict);


            var v = Lookup("doc");


            Debug.Assert(v == 7);


SwitchCompiler first builds up an expression that looks like this:


s.Length < 6 ? (s.Length < 5 ? 7 : 9) :


                (s[0] < 's' ? (s[0] < 'g' ? 6 : 2) : (s[1] < 'n' ? 8 : 2));


And then compiles it into a delegate, which is returned as Lookup. (It should also be possible to generate a minimal perfect hash in the Compile method. I haven’t tried this.)


The actual implementation of SwitchCompiler is about 65 lines of code. You can see it in the attachment.


On my machine, looking up each value in a tight loop 1 million times takes 70ms while the corresponding loop using Dictionary takes 697ms. That’s close to a 900% performance improvement.


This posting is provided “AS IS” with no warranties, and confers no rights.


SwitchCompiler.cs

Comments (33)

  1. Jeff says:

    That’s pretty impressive performance for a VM-based language.  The C++ equivalent, 1 million iterations, runs in about 55 ms on a Pentium 4, 3.4 GHz.  Here’s the code:

    struct KeyValue {

     const char* key;

     int value;

     inline bool operator < (const KeyValue& that) const {

       return strcmp(this->key, that.key) < 0;

     }

    };

    static const KeyValue kvs[] = {

     {"bashful", 6},

     {"doc", 7},

     {"dopey", 9},

     {"grumpy", 2},

     {"happy", 9},

     {"sleepy", 8},

     {"sneezy", 2},

    }, * const kvs_end = kvs + (sizeof(kvs) / sizeof(0[kvs]));

    inline int Lookup(const char* s) {

     KeyValue key = { s };

     return std::lower_bound(kvs, kvs_end, key)->value;

    }

    int _tmain(int argc, _TCHAR* argv[])

    {

     LARGE_INTEGER begin, end, frequency;

     QueryPerformanceCounter(&begin);

     const KeyValue* k = kvs;

     int total = 0;  // Just so the compiler doesn’t completely optimize out the for loop.

     for (int i = 0; i < 1000000; ++i) {

       total  += Lookup(k->key);

       if (++k == kvs_end)

         k = kvs;

     }

     QueryPerformanceCounter(&end);

     QueryPerformanceFrequency(&frequency);

     __int64 ms = (end.QuadPart - begin.QuadPart) * 1000 / frequency.QuadPart;

     cout << "Total:\t" << total << endl;

     cout << "Milliseconds:\t" << ms << endl;

    }

    One minor benefit of the C++ code is that there is no overhead of creating a lookup table before the first call to Lookup().

  2. Jomo Fisher says:

    Hi Jeff,

    I tried your code on my machine and amazingly got exactly 55ms. That makes your orange look a little more like my apple.

    Your comment made me realize my implementation has one more feature that I could possibly throw away: there’s an array bounds check on the string index operations. I think I could get rid of this by emitting 'unsafe' code. Unfortunately, I think I’d have to bypass the expression compiler to do this.

    I seem to be within spitting distance of optimized C++, which is pretty cool considering how amazing the VS C++ optimizer is.

    Jomo

  3. Jomo Fisher says:

    Here’s my test code in case anyone wants to reproduce my result:

    /*

    Use of included script samples are subject to the terms specified at http://www.microsoft.com/resources/sharedsource/licensingbasics/permissivelicense.mspx

    Written by Jomo Fisher

    */

    using System;

    using System.Collections.Generic;

    using System.Linq;

    using System.Text;

    using System.Reflection;

    using FastSwitch;

    namespace ConsoleApplication1 {

       class Program {

           static void Main(string[] args) {

               var dict = new Dictionary<string, int> {

                       {"happy", 9}, {"sneezy", 2},

                       {"doc", 7}, {"sleepy", 8},

                       {"dopey", 9}, {"grumpy", 2},

                       {"bashful", 6}

                   };

               const int count = 1000000;

               DateTime start = DateTime.Now;

               for (int i = 0; i < count; ++i) {

                   var v = dict["happy"];

                   v = dict["sneezy"];

                   v = dict["doc"];

                   v = dict["sleepy"];

                   v = dict["dopey"];

                   v = dict["grumpy"];

                   v = dict["bashful"];

               }

                TimeSpan elapsed = DateTime.Now - start;

               Console.WriteLine("Dictionary: {0} ms", elapsed.TotalMilliseconds);

               var compiled = SwitchCompiler.CreateSwitch(dict);

               start = DateTime.Now;

               for (int i = 0; i < count; ++i) {

                   var v = compiled("doc");

                   System.Diagnostics.Debug.Assert(v == 7);

                   v = compiled("happy");

                   System.Diagnostics.Debug.Assert(v == 9);

                   v = compiled("sneezy");

                   System.Diagnostics.Debug.Assert(v == 2);

                   v = compiled("sleepy");

                   System.Diagnostics.Debug.Assert(v == 8);

                   v = compiled("dopey");

                   System.Diagnostics.Debug.Assert(v == 9);

                   v = compiled("grumpy");

                   System.Diagnostics.Debug.Assert(v == 2);

                   v = compiled("bashful");

                   System.Diagnostics.Debug.Assert(v == 6);

               }

                elapsed = DateTime.Now - start;

               Console.WriteLine("Compiled: {0} ms", elapsed.TotalMilliseconds);

           }

       }

    }

  4. You’ve been kicked (a good thing) – Trackback from DotNetKicks.com

  5. tom says:

    Looks like you want a perfect hash or a minimal perfect hash? See e.g. http://burtleburtle.net/bob/hash/perfect.html

    I don’t know about a c# implementation though.

  6. Justin says:

    The C++ code and C# code are timing two different things.

    The C++ code cycles through each key for a total of 1,000,000 lookups; the C# code looks each key up 1,000,000 times for a total of 7,000,000 lookups.

  7. Jomo Fisher says:

    Justin,

    You’re right. I should have noticed the C++ code was doing 1/7th the lookups of the C# code. When I adjust the C++ code, it dutifully reports 385 ms, which is pretty close to what the plain old .NET dictionary was doing.

    So I guess the LINQ algorithm is in the catbird seat, at least for the time being.

    I also noticed the C++ answer requires that the set of keys be sorted alphabetically before lookups can commence, so there *is* a little preparation time.

  8. Jomo Fisher says:

    Tom,

    I had considered a minimal perfect hash (and I had even looked at the exact page you sent me). I still think the LINQ algorithm should perform better in many cases because that perfect hashing algorithm is O(n) in the number of bytes in the input key, while the LINQ algorithm is O(log N) in the number of keys. In particular, I think the LINQ algorithm will outpace the other when the keys are long and their lengths are non-uniform.

    Of course, big-O notation can hide a potentially large constant factor, but I don’t really see why it would in this case.

    Also, keep in mind that that particular minimal perfect hash implementation is throwing away a feature that I wanted to keep: it can’t easily accommodate key sets that are available only at run time.

  9. Jeff says:

    The C++ compiler optimized out the entire loop in my first version.  I added the total variable, and it still optimized out the whole loop.  I added the cout statement at the bottom and it finally executed the code.

    Any sign the C# compiler is optimizing out the whole loop in the C# code?

  10. Jomo Fisher says:

    The C# compiler is definitely not removing the loop–for one, it has no way to determine whether the delegate has side-effects that need to be preserved.

    This can be seen by increasing the number of iterations–2M loops run in twice the time as 1M.

  11. Jeff says:

    Wow, your switch compiler and the .NET VM are very impressive indeed.  My fastest run of the following C++ code was 88 ms.  Note also that I kind of cheated and used std::find instead of std::lower_bound, because when there are only 1 or 2 elements in the vector, std::find runs much faster.

    #include "stdafx.h"

    struct KeyValue {

     const char* key;

     int value;

    };

    struct CharValue {

     int char_;

     bool terminal_;

     union {

       int value_;

       size_t next_index_;

       vector<CharValue>* next_;

     };

     ptrdiff_t char_step_;

    };

    inline bool operator < (const CharValue& value, int c) {

     return value.char_ < c;

    }

    inline bool operator < (int c, const CharValue& value) {

     return c < value.char_;

    }

    inline bool operator == (const CharValue& value, int c) {

     return value.char_ == c;

    }

    static const KeyValue kvs[] = {

     {"bashful", 6},

     {"doc", 7},

     {"dopey", 9},

     {"grumpy", 2},

     {"happy", 9},

     {"sleepy", 8},

     {"sneezy", 2},

    }, * const kvs_end = kvs + (sizeof(kvs) / sizeof(0[kvs]));

    class KnownSetTrie {

    public:

     typedef vector<CharValue> CharMap;

     vector<CharMap> tree_;

     ptrdiff_t first_step_;

     template <typename Iter>

     KnownSetTrie(Iter begin, Iter end) {

        first_step_ = Compile(begin, end, 0) - 1;

       // Now that the tree_ is stable, convert indices to node pointers.

       for(vector<CharMap>::iterator i = tree_.begin(); i != tree_.end(); ++i) {

         for (CharMap::iterator j = i->begin(); j != i->end(); ++j)

           if (!j->terminal_)

             j->next_ = &tree_[j->next_index_];

       }

     }

     template <typename Iter>

     ptrdiff_t Compile(Iter begin, Iter end, ptrdiff_t char_index) {

       CharMap map;

       Iter start = begin, i = begin + 1;

       while (true) {

         CharValue value;

         value.char_ = start->key[char_index];

         if (i != end && i->key[char_index] == start->key[char_index]) {

           do {

             ++i;

           } while (i != end && i->key[char_index] == start->key[char_index]);

           value.terminal_ = false;

           value.char_step_ = Compile(start, i, char_index + 1);

            value.next_index_ = tree_.size() - 1;

         } else {

           value.terminal_ = true;

           value.value_ = start->value;

         }

         map.push_back(value);

         if (i == end)

           break;

         start = i++;

       }

       if (1 == map.size() && !map.front().terminal_) {

         // Pass through node:

         return map.front().char_step_ + 1;

       }

       tree_.push_back(map);

       return 1;

     }

     inline int Lookup(const char* s) const {

       const CharMap* node = &tree_.back();

       s += first_step_;

       while (true) {

         const CharMap::const_iterator value = find(node->begin(), node->end(), *s);

         if (value->terminal_)

           return value->value_;

         s += value->char_step_;

         node = value->next_;

       }

     }

    };

    int _tmain(int argc, _TCHAR* argv[])

    {

     KnownSetTrie trie(kvs, kvs_end);

     LARGE_INTEGER begin, end, frequency;

     QueryPerformanceCounter(&begin);

     int total = 0;  // Just so the compiler doesn’t completely optimize out the for loop.

     for (int i = 0; i < 1000000; ++i) {

       for (const KeyValue* k = kvs; k != kvs_end; ++k)

         total  += trie.Lookup(k->key);

     }

     QueryPerformanceCounter(&end);

     QueryPerformanceFrequency(&frequency);

     __int64 ms = (end.QuadPart - begin.QuadPart) * 1000 / frequency.QuadPart;

     cout << "Total:\t" << total << endl;

     cout << "Milliseconds:\t" << ms << endl;

    }

  12. It seems the LINQ guys have done a great job of optimization inside of the LINQ namespace (System.Linq)….

  13. Jeff says:

    I got the C++ code down to 70 ms.  It took a __fastcall and some fun struct packing, though.

    I can’t wait to use C# .Net!

    #include "stdafx.h"

    struct KeyValue {

     const char* key;

     int value;

    };

    struct CharValue {

     int char_ : 8;

     int terminal_ : 1;

     int char_step_ : 23;

     union {

       int value_;

       size_t next_index_;

       vector<CharValue>* next_;

     };

    };

    inline bool operator < (const CharValue& value, int c) {

     return value.char_ < c;

    }

    inline bool operator < (int c, const CharValue& value) {

     return c < value.char_;

    }

    inline bool operator == (const CharValue& value, int c) {

     return value.char_ == c;

    }

    static const KeyValue kvs[] = {

     {"bashful", 6},

     {"doc", 7},

     {"dopey", 9},

     {"grumpy", 2},

     {"happy", 9},

     {"sleepy", 8},

     {"sneezy", 2},

    }, * const kvs_end = kvs + (sizeof(kvs) / sizeof(0[kvs]));

    class KnownSetTrie {

    public:

     typedef vector<CharValue> CharMap;

     vector<CharMap> tree_;

     ptrdiff_t first_step_;

     CharMap* root_;

     template <typename Iter>

     KnownSetTrie(Iter begin, Iter end) {

        first_step_ = Compile(begin, end, 0) - 1;

       // Now that the tree_ is stable, convert indices to node pointers.

       for(vector<CharMap>::iterator i = tree_.begin(); i != tree_.end(); ++i) {

         for (CharMap::iterator j = i->begin(); j != i->end(); ++j)

           if (!j->terminal_)

             j->next_ = &tree_[j->next_index_];

         root_ = &tree_.back();

       }

     }

     template <typename Iter>

     ptrdiff_t Compile(Iter begin, Iter end, ptrdiff_t char_index) {

       CharMap map;

       Iter start = begin, i = begin + 1;

       while (true) {

         CharValue value;

         value.char_ = start->key[char_index];

         if (i != end && i->key[char_index] == start->key[char_index]) {

           do {

             ++i;

           } while (i != end && i->key[char_index] == start->key[char_index]);

           value.terminal_ = false;

           value.char_step_ = Compile(start, i, char_index + 1);

            value.next_index_ = tree_.size() - 1;

         } else {

           value.terminal_ = true;

           value.value_ = start->value;

         }

         map.push_back(value);

         if (i == end)

           break;

         start = i++;

       }

       if (1 == map.size() && !map.front().terminal_) {

         // Pass through node:

         return map.front().char_step_ + 1;

       }

       tree_.push_back(map);

       return 1;

     }

     int __fastcall Lookup(const char* s) const {

       const CharMap* node = root_;

       s += first_step_;

       while (true) {

         const CharMap::const_iterator value = find(node->begin(), node->end(), *s);

         if (value->terminal_)

           return value->value_;

         s += value->char_step_;

         node = value->next_;

       }

     }

    };

    int _tmain(int argc, _TCHAR* argv[])

    {

     KnownSetTrie trie(kvs, kvs_end);

     LARGE_INTEGER begin, end, frequency;

     QueryPerformanceCounter(&begin);

     int total = 0;  // Just so the compiler doesn’t completely optimize out the for loop.

     for (int i = 0; i < 1000000; ++i) {

       for (const KeyValue* k = kvs; k != kvs_end; ++k)

         total += trie.Lookup(k->key);

     }

     QueryPerformanceCounter(&end);

     QueryPerformanceFrequency(&frequency);

     __int64 ms = (end.QuadPart - begin.QuadPart) * 1000 / frequency.QuadPart;

     cout << "Total:\t" << total << endl;

     cout << "Milliseconds:\t" << ms << endl;

    }


  15. Jomo Fisher says:

    Hey Jeff,

    That’s a really nice solution. I was wondering whether a trie could help here.

  16. lb says:

    Okay, this is insane. How could you rate grumpy and sneezy as a 2? Unless of course 1 is the best and 10 is the worst.

    This "SwitchCompiler.CreateSwitch" is absolutely amazing. It’s a best of both world’s sort of solution — we can focus on the high level problem, and have Linq provide optimisations that would otherwise be utterly unmaintainable and difficult to perform manually.

    Brilliant stuff. Thank you.

    lb

  17. Recently my time has been taken up with a series of internal issues involving Beta1, C# Samples and various

  18. Jeff says:

    A faster solution than a trie is a hand-built binary search tree, which is what I imagine the case-statement compiler does.  This code runs in around 55 ms, and doesn’t cheat like my earlier code with __fastcall or std::find() in place of std::lower_bound():

    Sorry for turning this into a language shoot-out, but this looked like such an interesting case to play with.  And again the .NET runtime performs astoundingly well.

    #include "stdafx.h"

    struct KeyValue {

     const char* key;

     int value;

    };

    static const KeyValue kvs[] = {

     {"bashful", 6},

     {"doc", 7},

     {"dopey", 9},

     {"grumpy", 2},

     {"happy", 9},

     {"sleepy", 8},

     {"sneezy", 2},

    }, * const kvs_end = kvs + (sizeof(kvs) / sizeof(0[kvs]));

    struct DTreeNode;

    struct DTreeStep {

     bool terminal;

     union {

       int retval;

       int char_step;

     };

     union {

       DTreeNode* next;

       size_t next_index;

     };

    };

    struct DTreeNode {

     char c;

     DTreeStep less_step, ge_step;

    };

    class DTree {

    public:

     vector<DTreeNode> nodes_;

     DTreeStep root_step_;

     template <typename Iter>

     DTree(Iter begin, Iter end) {

       root_step_ = Compile(begin, end, 0);

       // Convert indices into pointers.

       for (vector<DTreeNode>::iterator i = nodes_.begin(); i != nodes_.end(); ++i) {

         if (!i->less_step.terminal)

           i->less_step.next = &*(nodes_.begin() + i->less_step.next_index);

         if (!i->ge_step.terminal)

           i->ge_step.next = &*(nodes_.begin() + i->ge_step.next_index);

       }

       root_step_.next = &*(nodes_.begin() + root_step_.next_index);

     }

     template <typename Iter>

     DTreeStep Compile(Iter begin, Iter end, ptrdiff_t char_index) {

        ptrdiff_t count = end - begin;

       DTreeStep step;

       if (1 == count) {

         step.terminal = true;

         step.retval = begin->value;

         return step;

       }

        Iter middle = begin + (end - begin) / 2,

         middle_begin = middle, middle_end = middle + 1;

       while (true) {

         if (middle_begin == begin)

           break;

          Iter prior = middle_begin - 1;

         if (prior->key[char_index] == middle->key[char_index])

           middle_begin = prior;

         else

           break;

       }

       while (true) {

         if (middle_end == end)

           break;

         if (middle_end->key[char_index] == middle->key[char_index])

           ++middle_end;

         else

           break;

       }

        const ptrdiff_t less_count = middle_begin - begin,

          equal_count = middle_end - middle_begin,

          greater_count = end - middle_end;

       if (0 == less_count) {

         if (0 == greater_count) {

           // All middles, pass through

           step = Compile(begin, end, char_index + 1);

           step.char_step += 1;

           return step;

         } else {

           // middles and greaters

           middle_begin = middle_end;

         }

       }

       DTreeNode node;

       node.c = middle_end->key[char_index];

       node.less_step = Compile(begin, middle_begin, char_index);

       node.ge_step = Compile(middle_begin, end, char_index);

       step.terminal = false;

       step.char_step = 0;

       step.next_index = nodes_.size();

       nodes_.push_back(node);

       return step;

     }

     int Lookup(const char* s) const {

       const DTreeStep* step = &root_step_;

       while (true) {

         const DTreeNode* node = step->next;

         step = *s < node->c ? &node->ge_step : &node->less_step;

         if (step->terminal)

           return step->retval;

         s += step->char_step;

       }

     }

    };

    int _tmain(int argc, _TCHAR* argv[])

    {

     DTree dtree(kvs, kvs_end);

     LARGE_INTEGER begin, end, frequency;

     QueryPerformanceCounter(&begin);

     int total = 0;  // Just so the compiler doesn’t completely optimize out the for loop.

     for (int i = 0; i < 1000000; ++i) {

       for (const KeyValue* k = kvs; k != kvs_end; ++k)

         total += dtree.Lookup(k->key);

     }

     QueryPerformanceCounter(&end);

     QueryPerformanceFrequency(&frequency);

     __int64 ms = (end.QuadPart - begin.QuadPart) * 1000 / frequency.QuadPart;

     cout << "Total:t" << total << endl;

     cout << "Milliseconds:t" << ms << endl;

    }

  19. Zan says:

    Ahh, I have no idea how you lot managed to get such results on 3.4 GHz machines for the standard dictionary tests.

    Here’s my code, using the standard generic dictionary, and I get without fail 92-93 ms lookup times for 1 million loops.

    I haven’t tried the LINQ version on my machine as I’m only running .NET 2.0, but even comparing against the LINQ results here (even though this is distorted by the difference in machines used), the difference is a max of 20% — nothing like the 900% claimed.

    By the way, my tests were all run on a P4 2.4 GHz machine; I’m sure using a CPU that clocks 30% faster will also reduce the performance gap between my results and those posted in the original statement.

    PS. Don’t use DateTime for monitoring performance, as DateTime is not updated every tick; my experience is that DateTime has a +/- difference of 30-40 ms on a good day vs the Stopwatch class.

    Anyway, here’s the code I ran my tests with.

    Dictionary<string, int> index = new Dictionary<string, int>();

                index.Add("happy", 9);

                index.Add("sneezy", 2);

                index.Add("doc", 7);

                index.Add("sleepy", 8);

                index.Add("dopey", 9);

                index.Add("grumpy", 2);

               int found = 0;            

               Stopwatch sw = new Stopwatch();

               sw.Start();

               for (int i = 0; i < 1000000; i++)

                    found = index["doc"];

               long elapsed = sw.ElapsedMilliseconds;

               sw.Stop();

               sw = null;

  20. Jomo Fisher says:

    Zan,

    Your test is only looking up one value in a loop. Our tests are looking up seven values in a loop. You can see the actual test code in the comments above. If you adjust your test to match you’ll get ~650ms.

  21. Gilles Michard says:

    I found that using KeyValuePair<string,TR> instead of Case<TR> allows the CreateSwitch function to be exposed with an IEnumerable<KeyValuePair<string,TR>> parameter.

    this allows these uses:

           public static Func<string, TR> CreateSwitch<TR>(this IEnumerable<string> strings, Func<string, TR> func)

           {

               return CreateSwitch(from s in strings select new KeyValuePair<string, TR>(s, func(s)));

           }

           public static Func<string, IEnumerable<TR>> EnumerableInvert<TR>(this IEnumerable<TR> values, Func<TR, string> name)

           {

               return CreateSwitch(from v in values group v by name(v) into byname select new KeyValuePair<string, IEnumerable<TR>>(byname.Key, byname));

           }

    I added the following check in the function:

           public static Func<string, TR> CreateSwitch<TR>(this IEnumerable<KeyValuePair<string, TR>> caseDict)

           {

               if (caseDict.Count() != caseDict.Distinct().Count())

                   throw new ArgumentException("duplicate names");

               var cases = caseDict.ToArray();

               var p = Expression.Parameter(typeof(string), "p");

               var expr = Expression.Lambda<Func<string, TR>>(

                   BuildStringLength(p, cases.ToOrderedArray(s => s.Key.Length), 0, cases.Length - 1),

                   new ParameterExpression[] { p }

               );

               var del = expr.Compile();

               return del;

           }

           public static Func<string,TR> Invert<TR>(this IEnumerable<TR> values, Func<TR, string> name)

           {

               return CreateSwitch(from v in values group v by name(v) into byname select new KeyValuePair<string, TR>(byname.Key, byname.First()));

           }

           public static Func<string, TR> ParseEnum<TR>()

               //where TR: Enum

           {

               return CreateSwitch(from v in (TR[])Enum.GetValues(typeof(TR)).OfType<TR>() select new KeyValuePair<string, TR>(Enum.GetName(typeof(TR), v), v));

           }

  22. Rico Mariani has been posting about LINQ to SQL perfomance and has finally posted the performance numbers


  24. Jomo Fisher—A lot of people (myself included) have written about LINQ in the next version of C#. LINQ

  25. Raptor-75 says:

    Great stuff Jomo!

    Unfortunately it doesn’t work with this dictionary:

    {"Ado", i++},

    {"A2o", i++},

    {"A2i", i++},

    {"B2o", i++},

    {"AdoX", i++},

    {"AdPX", i++}

    It crashes with an IndexOutOfRangeException in BuildStringChar() because it gets a set of strings that are of different lengths. It seems to me that both your if statements that deal with middle are incorrect. If I replace middle with upper in those and in both calls to FirstDifferentDown() it almost works.

    Another problem I had to fix is that ToOrderedArray() orders the strings in a way that is not consistent with the Char comparisons in the resulting delegate. In this case it breaks down because 'o' < 'P' is false while "AdoX".CompareTo("AdPX") is -1. My solution was to make a new ToOrderedArray() that takes an IComparer<string> and passes it on to OrderBy(). I then pass it the Instance member from this class:

    internal class StringCharComparer : IComparer<string>

    {

       public static StringCharComparer Instance = new StringCharComparer();

       public int Compare(string x, string y)

       {

           for (int i = 0; i < Math.Min(x.Length, y.Length); i++)

           {

               if (x[i] > y[i])

                   return 1;

               else if (x[i] < y[i])

                   return -1;

           }

           if (x.Length == y.Length)

               return 0;

           else if (x.Length > y.Length)

               return 1;

           else

               return -1;

       }

    }

    By the way, it was pure luck that the dictionaries I tried exposed these bugs.

    Thoughts?

  26. Jomo Fisher says:

    Raptor, I think you’re right on both counts. Thanks for pointing this out.

  27. Some One says:

    Nice. One thing I did was add the optimized Lookup into the testing for comparison. This is where one decides whether the lookup set is dynamic and whether the dictionary will change over time. Even though the optimized Lookup is hard to maintain and debug, that would be OK if the list were static and did not change. But if it does change, then one has a choice between building a compiled expression and doing it the conventional way. The new idea is to take a conventional way of doing something and turn it into an expression that can be compiled, which may yield performance gains. I like how you take something as simple as a lookup, which one might assume is already optimized, and show that by building an optimized expression the performance can be even faster.

    Dictionary: 700.07 ms  (~7.8x slower than Lookup, ~4.1x slower than Compiled)

    Compiled: 170.017 ms   (~1.9x slower than Lookup)

    Lookup: 90.009 ms

    Thanks for the expression-building logic; it shows how the various static methods in System.Linq.Expressions are used to build a dynamic function.

  28. Paul Westcott says:

    Nothing really to add of value, but kind of interesting. If you define the following class:

    class EqualityCompareStrings : IEqualityComparer<string>

    {

       public bool Equals(string x, string y) { return string.Equals(x, y); }

       public int GetHashCode(string obj) { return obj.GetHashCode(); }

    }

    And then create the dictionary with an instance of that class like

    Dictionary<string, int> dict = new Dictionary<string, int>(new EqualityCompareStrings());

    Then you get (from my test!) a time on the non-SwitchCompiler code of about 80% of the original.

    So as I said, nothing really to do with anything much, although considering how often strings are used as keys it might be a trivial optimization to put into System.Collections.Generic.EqualityComparer<T>.CreateComparer() to handle it as a special case…

    Assuming that I haven’t done anything wrong 🙂

    (But then you might as well use Jomo’s special SwitchCompiler, unless you want to keep using Dictionary<string, …>.)

  29. Gustavo Guerra says:

    Hello

    I’ve also hit the bug that Raptor-75 pointed out. Using his StringCharComparer didn’t solve it all. If you have a complete fix, can you please post it?

    Best Regards,

    Gustavo Guerra

  30. On my current project, we’re in a ‘make it faster’ mode right now. We’ve already knocked out a

  31. Code Beside says:

    StaticStringDictionary – Fast Switching with LINQ revisited

  32. Jomo Fisher—A while back I wrote about using LINQ to get close-to-the-metal performance in a particular