switches and jump tables

In my last post I had discussed about how only constants can be used with C# switches. From the post's comments and later discussing with other folks I learnt something that came to me as a surprise. A lot of people working on managed code consider switch-case to be a stylistic variant of if-else, and that is all.

However, in the C/C++ world switch is not just a variant of if-else (neither is it in .NET), it's a fast (O(1)) variant of if-else (O(n)). Stating that switch is just a better way to express multiple comparison against the same variable is stating Dictionary<T> is just another form of List<T>. They are not (Dictionary can give you O(1) lookup results). 

For example C restricts the case to have constant-expression. This is done so that the compiler can generate optimized jump-table for its execution. Let's consider the following code.

 switch(i) {
    case 4:... 
    case 5:... 
    case 6:... 
    case 7:... 
    // many more cases...
    default:...
}

For this code the machine code generated is similar to the steps below.

  1. Compile time jump table creation: For each case statement a fixed block of memory is reserved. Say 8 bytes. These 8 bytes contain a jump (jmp) instruction to the location where the actual code for the case resides. The base address of this table is labeled as say JMPTABLEBASE.
  2. Normalizes the value of i as i = i - 4 (the first value of the case) 
  3. Boundary check: For the i it sees if the value is larger than the largest case (7-4 = 3), in case it is the execution flows to default.
  4. Jump to address JMPTABLEBASE + (i * 8)

As you can see from above the whole thing happens at constant time.

Some embedded C compilers (like the TI C-compiler) generates separate code section named .switch for the jumptable. Later this section can be targetted to the high-speed internal DARAM for faster execution in case the switch needs such special treatment.

Obviously compilers vary a lot in this manner