Efficiency of iteration over arrays?


I got the following question on email:


Question:


Which one of these three loops is the most efficient? How can I prove the answer?


I’ve listed the source code below. The three loops are:




  1. Foreach over an int array


  2. Simple for over an int array


  3. For over an int array, hoisting out the length value

On questions such as these, there are two ways to find the answer. The first is to build the code and look at the generated IL (or the resulting assembly, remembering that you get debug code if you run in the debugger. Attach to a running process, and then you get the optimized native code), or to time the code with a stopwatch.


In this case, I chose the IL approach, because it was quicker.  For arrays, versions #1 and #2 produce similar IL, and should produce very similar execution speed (I’m too lazy to actually time them today). So, I would choose the foreach version unless I needed the index.


In fact, I would advocate this position even if for loops are faster, to avoid the sin of premature optimization. It remains true that only 10% of the code needs optimization, and that algorithmic efficiency is often the driving factor.


The third option is to be avoided. The JIT looks for the pattern in version #2, and knows how to optimize it. If you pull the value out into a temporary, it may not optimize it. Of course, you know that because you’ve been measuring your important scenarios…


Code 

int [] foo = new int[100];
// version 1:
foreach ( int i in foo)
  Console.WriteLine( i.ToString( ));
 
// Version 2:
for ( int index = 0;
  index < foo.Length;
  index++ )
  Console.WriteLine( foo[index].ToString( ));
 
// Version 3:
int len = foo.Length;
for ( int index = 0;
  index < len; 
  index++ )
  Console.WriteLine( foo[index].ToString( ));

Comments (25)

  1. Would I be right in thinking that this only applies to arrays? I tried this on an ArrayList and got this from Reflector’s IL decompilation:

    ArrayList list1;

    string text1;

    IEnumerator enumerator1;

    IDisposable disposable1;

    list1 = new ArrayList();

    enumerator1 = list1.GetEnumerator();

    try

    {

    while (enumerator1.MoveNext())

    {

    text1 = ((string) enumerator1.Current);

    Console.WriteLine(text1);

    }

    }

    finally

    {

    disposable1 = (enumerator1 as IDisposable);

    if (disposable1 != null)

    {

    disposable1.Dispose();

    }

    Trying timing tests on this comapared to the simple for it does seem to be significantly slower looping over large arraylists.

  2. There is a somewhat detailed analysis of this on my blog. Eric, if you could be so kind as to point out any inaccuracies it would be very nice.

    http://weblogs.asp.net/justin_rogers/archive/2004/03/26/97230.aspx

    I realized just now I never posted any actual numbers and instead left the test code at the end for people to test if they wanted to, but that should be okay.

  3. Lavos says:

    In the last 4 weeks, in nearly every instance where I started with a foreach, I ended up switching to a for loop because I needed the index for one reason or another.

    Of course, I was mainly doing code-gen stuff and was trying to get my commas right, but I see that trend holding true at least part of the time.

    I’m beginning to just default to for’s for at least codegen to avoid constantly refactoring the loop. 🙁

    If IEnumerator/IEnumerable supported something like STL’s compare to the end/start properties so I can check if the enumeratorion was done, I’d be a happy camper.

  4. Eric Gunnerson has an interesting post about the efficiency of iterating over an array using different methods. The three methods listed are: foreach over an int array Simple fore over an int array forover an int array, hoisting out the…

  5. Joe Duffy says:

    Interesting. I suppose this is a holdover from C/C++, but I always choose a variation of #3 for loop iteration (should I need the current index during the loop, that is)…

    for (int i = 0, c = foo.Length; i < c; i++)

    Console.WriteLine(foo[i].ToString());

    Why would this impact post-JIT performance, though — how much more optimization can take place? c is already a stack-based variable and should probably end up as a register due to the frequent jump test (i < c), etc.

  6. B.Y. says:

    I think it’s going too far if "Simple for over an int array" is considered an optimization.

  7. Joe Duffy says:

    FYI, I did some timing tests myself, and was surprised at how large of a difference the "hand optimization" makes… And, not a good difference, either! 😉

    The wonders of modern-day JIT optimization.

    http://www.bluebytesoftware.com/blog/PermaLink.aspx?guid=16a1cf7f-73f6-45a9-9aff-e88264b13b0e

  8. One statement that I have heard incessantly is that foreach over an array is slower than for (int i = 0; i &lt; array.Length; i++). Finally, some good people have debunked this myth: Kevin Ransom — To foreach or not…

  9. Joshua Allen says:

    Eric, I did all three of these in ildasm, and the third option was two instructions shorter (in the loop) than the second, which in turn was two instructions shorter than the first. People love to memorize "rules of thumb" and apply them, but I really think this whole swarm of threads is doing the blogsphere a disservice. Not only are your numbers bad; so are Joe Duffy’s and Kevin Ransom’s. And people are now running around telling each other "best practices" based off of numbers that are flawed. Besides, there are differences in semantics of all three techniques, and these differences could very well be important. Hoisting the length is not just an attempt at optimization, for example. Finally, telling someone "don’t do loop ‘x’ because the JIT will not understand it" is just as bad as telling them to "hand optimize ‘foo’" — both are bad, bad, bad

  10. <p>In his blog, <span style="font-style: italic;">Better Living Through Software</span>, Joshua Allen looks at three different ways to loop in C# from an optimizing-for-performance point of view:</p><p><code>foreach (int i in foo) {}<br>

    <br>

    for (int i = 0; i &lt; foo.Length; i++) {}<br>

    <br>

    int len = foo.Length; for (int i = 0; …

  11. chrisbro says:

    #3 can be a huge massive performance gain, depending on the complexity of the .Length operator. For instance, in some collections (Hello, XmlNodeList) calling .Length actually recomputers the length of the array with every pass. If I’m doing work that I explicitly know will not alter it, then I should move .Length into a temporary variable and get a performance gain.

    For instance, I had an XML document with 2800 items. for(x=0; x<xmlnl.Length;x++){} took something like 900 million function calls to evaluate, because of the .Length.

  12. Moi says:

    So we are supposed to avoid certain códe constructs (in this case we are supposed to write 2 instead of 3) because the compiler sucks? Gimme a break.

  13. Dan says:

    so true Moi. Hopefully we will see some maturity with MS’s compiler… but I doubt that happens for a while. Hats off to them for the .NET framework though.

  14. MBA says:

    Helpful For MBA Fans.

Skip to main content