Need for speed:Loop performance

For those guys who havent heard about Loop unrolling here is a brief description about it. Loop unrolling is a way of optimizing the code by manipulating the iteration variable in the body of the loop thus reducing the number of checks each time the loop is iterated. Consider this following code.

           

      for(int i=0;i<100;i++)

      {

               myfunc();

      }

If you try to convert this code to assembly language you would see around half of the instructions were due to the loop overhead(i.e for  conditional branches and jumps). Hence to optimize this, I would change it to

 

      for(int i=0;i<100;)

     {

                myfunc();i++;

                myfunc();i++;

                myfunc();i++;

                myfunc();i++;

      }

This reduces the number of conditional branches and loops to 1/5th and hence reduces the overhead. Infact there are some compilers that perform loop unrolling. The Intel C/C++ compiler for linux gives you an option to specify at compile time the number of times the loop has to be unrolled. However the amount of loop unrolling and operations inside the loop body has to be carefully chosen. More unrolling may lead to increase in the register usage to store temporary variables.

 

You can try out timing the above piece of code to find out the difference using the technique specified in my earlier article