Foreach, Garbage, and the CLR Profiler

 I've been doing a lot of thinking lately about foreach, and in what situations it may or may not create garbage. There have been some posts on the forums asking about it, and I recently found out that for some types, the enumerator that is created is a value type, not a reference type. So, foreach loops over those types shouldn't make garbage at all. Why does this matter? In an XNA Framework forum thread, Stephen Strychak, an XNA Framework Developer, explains:

"...the performance of your game will degrade if you allow garbage to be generated during gameplay. The problem again is not the for or foreach. The problem is that the garbage collector on the Xbox 360 is less efficient than the one on Windows, and doing a collection can take a long time. When you have many objects in your game, it can take longer than 1/60th of a second to do a collection. That means that you will drop frames. The more garbage you generate, the more frequently the GC will have to do a collection, and the more frequently your game will drop frames.

So the bottom line is that you should avoid generating garbage, and using for instead of foreach is one way to do that."

Shawn posted this in a follow up, which is an important point:

"...bear in mind that just because foreach creates garbage, this does not neccessarily mean you should avoid using it! I think people often get too hung up on optimisation. Fast code is good, but readable code is more important. If your program is already running fast enough, it's sometimes ok to do things the easy way, even if you know that is less efficient than it could be..."

I agree with Shawn: when I write code, I want it to be maintainable, easy to read, and easy to modify. But still, it's good to know what things may have a potential speed impact. And plus, since I'm a nerd, I wanted to look into this: the whole situation seemed like a lot of special cases and conjecture: you will make garbage if you go over this collection, but you won't if you go over that collection (but only if x is true). I wanted to get the whole story. Plus, this was an ideal reason to learn to use the Windows and Xbox 360 CLR profiling tools, which I've put off doing for some time. So, I figured I'd do just that: do a foreach loop over a bunch of collections, use the profilers to see if I'm making garbage, and hopefully we'd all learn something in the process.

I started with Windows, so the first thing I needed was the Windows CLR Profiler. I needed to learn how to use it, too. It comes with a pretty hefty document explaining it, but I found this link on msdn tv, where a friendly man gave me an overview. It's about half an hour long, and was worth watching, I think.

After watching the video, I felt fairly confident that I would be able to get the information I needed. It was time to start mucking around, to see what I can find out. In C# express, I created a new console app, and filled it in with what is quite possibly the world's dumbest code ever:

 class Program
{
    class GameEntity
    {
        public void Update()
        {
        }
    }

    static GameEntity[] entities = new GameEntity[100];
    static Program()
    {
        for (int i = 0; i < entities.Length; i++)
        {
            entities[i] = new GameEntity();
        }
    }
    
    static void Main(string[] args)
    {
        byte[] byteArray = new byte[1];
        for (int i = 0; i < entities.Length; i++)
        {
            entities[i].Update();
        }
    }
}

I wanted to do this for a quick sanity check. If I had any idea what I was doing, when I use the CLR profiler on this app, I should see Main making one allocation, the 1 byte array. So I built it, and ran the CLR Profiler on it. Once it was done, I pulled up the allocation graph using by clicking this button:

 

My first look at the allocation graph was fairly terrifying. It's full of colored lines and boxes for methods I'm fairly sure I didn't write.

 

After fiddling for a minute, it turned out that as always, right clicking is the answer. I did "Find routine" and looked for Main, which gives me this view. Much better.

I've got no idea why my 1 byte array actually took up 13 bytes, but I'm not particularly worried about it. (I'm curious to know, however, so if anyone out there in readerland has any ideas, send them my way.)

 

As my next sanity check, I changed my program from an array to a Collection<GameEntity>, and changed Main to do a simple foreach over the Collection:

 

 static void Main(string[] args)
{
    byte[] byteArray = new byte[1];
    foreach (GameEntity e in entities)
    {
        e.Update();
    }
}

Under the hood, that should create an enumerator on the managed heap, which should show up in the profiler. So, following the exact same steps in the CLR profiler, I get this display:

There's that pesky enumerator right there, just as we expected!

Now, here comes the tricky part. Rumor has it that many of the types in System.Collections.Generic will not allocate an enumerator when using foreach. List's GetEnumerator, for example, returns a struct, which will just sit on the stack. Look for yourself with .NET Reflector, if you don't believe me. To prove to myself that a foreach over a List doesn't cause any heap allocations, I changed entities to be a List, did the exact same foreach loop, and ran the profiler. No enumerator! I tested a few other types as well, LinkedList and Queue, with the same result. Foreach loops over most of the types in System.Collections.Generic do not generate garbage. The most prominent exception is Collection. Collection<T> cannot declare its enumerator type as a struct, because Collection<T> is designed to be easily overridden.

However, there is definitely a caveat to the above. Foreach loops over Lists can still generate garbage. Take this code, for example:

 const int NumEntities = 100;
static List<GameEntity> list = new List<GameEntity>();

static Program()
{
    for (int i = 0; i < NumEntities; i++)
    {
        list.Add(new GameEntity());
    }
}

static void Main(string[] args)
{
    UpdateIEnumerable(list);
}

private static void UpdateIEnumerable(IEnumerable<GameEntity> enumerable)
{
    foreach (GameEntity e in enumerable)
    {
        e.Update();
    }
}

this will create garbage. Even though we're still doing a foreach over a List, when the list is cast to an interface, the value type enumerator must be boxed, and placed on the heap.

So, as far as I know, here's the final word:

When doing a foreach over a Collection<T> an enumerator WILL be allocated.
When doing a foreach over most other collections, including as arrays, lists, queues, linked lists, and others:
    if the collections are used explicitly, an enumerator will NOT be allocated.
    if they are cast to interfaces, an enumerator WILL be allocated.

Hope this helps some people, and maybe you're a little less terrified of the CLR profiler. Next time, I'd like to do the same thing for the .NET CF on the Xbox 360, or maybe figure out how to use the CLR profiler to track down the methods that are doing the most allocations. I have a pretty terrible track record with "Next times", though, so don't keep your fingers crossed.