Recursively iterate through files in a directory structure

I was playing a little with C#’s new yield keyword and thought I’d update the code from this article to do it the “Whidbey” way.

Now, as I am sure my good friends in perf land will tell you, this is likely not the most performant way to write this code, but for the application I am writing it made no difference. From looking at the IL, the major perf disadvantage appears to be creating the compiler-generated enumerable class that encapsulates the code in the iterator block. That is not a huge cost, but it is something to be aware of, because more is going on under the covers here.

Basically, I am writing an application that loads all the managed DLLs installed with LH and looks through them for certain naming patterns. The first thing I noticed is that the managed DLLs in LH are all grouped under subdirectories of “%windir%\Microsoft.NET\”. Here is the code I wanted to be able to write:

foreach (string assemblyPath in Helpers.GetFiles(path))
{
   Assembly assembly = Assembly.LoadFrom(assemblyPath);
   ...
}

The interesting bit of code I needed to write was the GetFiles() method. It needs to recursively iterate through files in a directory structure. Here is what I came up with.

      public static IEnumerable<string> GetFiles(string path)
      {
            // Yield every DLL that sits directly in this directory.
            foreach (string s in Directory.GetFiles(path, "*.dll"))
            {
                  yield return s;
            }

            // Then recurse into each subdirectory and yield its DLLs.
            foreach (string s in Directory.GetDirectories(path))
            {
                  foreach (string s1 in GetFiles(s))
                  {
                        yield return s1;
                  }
            }
      }

First we enumerate all the files in the directory and yield each one out (that turns flow of control back over to the foreach statement that called it). Once all the files are done, we start iterating through the subdirectories. For each subdirectory we get all of its files and yield them out in the same way. Notice the recursive call in that second foreach statement.
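To make that hand-off of control concrete, here is a small sketch that records the order in which the iterator and its caller run. The YieldDemo class, the Names() method, and the Log list are hypothetical names made up for this illustration; Names() just stands in for GetFiles() with two fixed strings so no file system is needed.

```csharp
using System;
using System.Collections.Generic;

public static class YieldDemo
{
    // Records the order in which the iterator body and the caller execute.
    public static readonly List<string> Log = new List<string>();

    // Hypothetical stand-in for GetFiles(): yields two fixed names.
    public static IEnumerable<string> Names()
    {
        Log.Add("iterator: before a.dll");
        yield return "a.dll";
        Log.Add("iterator: before b.dll");
        yield return "b.dll";
        Log.Add("iterator: done");
    }

    public static void Run()
    {
        Log.Clear();

        // Calling Names() only creates the enumerable; none of its body runs yet.
        IEnumerable<string> names = Names();
        Log.Add("caller: got enumerable");

        // Each trip around the loop resumes the iterator until its next yield.
        foreach (string name in names)
        {
            Log.Add("caller: " + name);
        }
    }
}
```

After calling YieldDemo.Run(), the log alternates between "iterator" and "caller" entries, starting with "caller: got enumerable". The body of Names() only advances when the foreach asks for the next item.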

I do recommend you take a look at the IL for this; it is very interesting. With yield there is a clear separation between what you feed the compiler and the IL that comes out on the back end. I believe this is a good and necessary language advancement, but it is worth being aware of.
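If the nested iterators ever did matter, one possible alternative is to keep a single iterator and track pending directories on an explicit stack. The recursive GetFiles() creates a new enumerator for every directory it descends into, and each file path is handed up through every level of nesting. What follows is only a sketch of that idea, not a measured improvement; IterativeHelpers and GetFilesIterative are names I made up for it.

```csharp
using System.Collections.Generic;
using System.IO;

public static class IterativeHelpers
{
    // Hypothetical alternative to the recursive GetFiles(): one iterator
    // plus an explicit stack of directories still to be visited.
    public static IEnumerable<string> GetFilesIterative(string path)
    {
        Stack<string> pending = new Stack<string>();
        pending.Push(path);

        while (pending.Count > 0)
        {
            string dir = pending.Pop();

            // Yield the DLLs that sit directly in this directory.
            foreach (string file in Directory.GetFiles(dir, "*.dll"))
            {
                yield return file;
            }

            // Queue the subdirectories instead of recursing into them.
            foreach (string sub in Directory.GetDirectories(dir))
            {
                pending.Push(sub);
            }
        }
    }
}
```

It can be used in the foreach loop exactly like GetFiles(). Because a stack is last-in, first-out, sibling subdirectories are visited in the reverse of the order Directory.GetDirectories() returned them, so the files can come back in a different order than with the recursive version.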

 

update: fixed yield link