“foreach” vs “ForEach”


A number of people have asked me why there is no Microsoft-provided “ForEach” sequence operator extension method. The List<T> class has such a method already of course, but there’s no reason why such a method could not be created as an extension method for all sequences. It’s practically a one-liner:

public static void ForEach<T>(this IEnumerable<T> sequence, Action<T> action)
{
  // argument null checking omitted
  foreach(T item in sequence) action(item);
}
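
For completeness, here is a minimal sketch of the same method with the omitted argument checks filled in. The class names `SequenceExtensions` and `ForEachDemo` are placeholders chosen for illustration, not anything that ships in the framework.

```csharp
using System;
using System.Collections.Generic;

public static class SequenceExtensions
{
    // Eagerly applies 'action' to every element of 'sequence', in order.
    public static void ForEach<T>(this IEnumerable<T> sequence, Action<T> action)
    {
        if (sequence == null) throw new ArgumentNullException(nameof(sequence));
        if (action == null) throw new ArgumentNullException(nameof(action));
        foreach (T item in sequence) action(item);
    }
}

public static class ForEachDemo
{
    public static void Main()
    {
        var squares = new List<int>();
        // An array has no instance ForEach, so the extension method is chosen.
        new[] { 1, 2, 3 }.ForEach(n => squares.Add(n * n));
        Console.WriteLine(string.Join(", ", squares)); // prints "1, 4, 9"
    }
}
```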

My usual response to “why is feature X not implemented?” is that of course all features are unimplemented until someone designs, implements, tests, documents and ships the feature, and no one has yet spent the money to do so. And yes, though I have famously pointed out that even small features can have large costs, this one really is dead easy, obviously correct, easy to test, and easy to document. Cost is always a factor of course, but the costs for this one really are quite small.

Of course, that cuts the other way too. If it is so cheap and easy, then you can do it yourself if you need it. And really what matters is not the low cost, but rather the net benefit. As we’ll see, I think the benefits are also very small, and therefore the net benefit might in fact be negative. But we can go a bit deeper here. I am philosophically opposed to providing such a method, for two reasons.

The first reason is that doing so violates the functional programming principles that all the other sequence operators are based upon. Clearly the sole purpose of a call to this method is to cause side effects. The purpose of an expression is to compute a value, not to cause a side effect. The purpose of a statement is to cause a side effect. The call site of this thing would look an awful lot like an expression (though, admittedly, since the method is void-returning, the expression could only be used in a “statement expression” context.) It does not sit well with me to make the one and only sequence operator that is only useful for its side effects.

The second reason is that doing so adds zero new representational power to the language. Doing this lets you rewrite this perfectly clear code:

foreach(Foo foo in foos){ statement involving foo; }

into this code:

foos.ForEach((Foo foo)=>{ statement involving foo; });

which uses almost exactly the same characters in slightly different order. And yet the second version is harder to understand, harder to debug, and introduces closure semantics, thereby potentially changing object lifetimes in subtle ways.
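
To make the closure point concrete, here is a small sketch. It uses the existing List<T>.ForEach; the class and method names are invented for the example. In the statement form `total` is an ordinary local variable. In the method form the lambda captures `total`, so the compiler moves it into a heap-allocated closure object whose lifetime is tied to whatever delegates reference it.

```csharp
using System;
using System.Collections.Generic;

public static class ClosureLifetimeDemo
{
    // Statement form: 'total' is an ordinary local that dies with the method.
    public static int SumWithForeach(List<int> numbers)
    {
        int total = 0;
        foreach (int n in numbers) { total += n; }
        return total;
    }

    // Method form: the lambda captures 'total', so it is hoisted into a
    // closure object that is shared with every delegate that uses it.
    public static Func<int> SumWithForEach(List<int> numbers)
    {
        int total = 0;
        numbers.ForEach(n => { total += n; });
        // The returned delegate keeps the closure (and 'total') alive for as
        // long as the caller holds on to it.
        return () => total;
    }

    public static void Main()
    {
        var numbers = new List<int> { 1, 2, 3 };
        Console.WriteLine(SumWithForeach(numbers));   // prints 6
        Console.WriteLine(SumWithForEach(numbers)()); // prints 6
    }
}
```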

When we provide two subtly different ways to do exactly the same thing, we produce confusion in the industry, we make it harder for people to read each other’s code, and so on. Sometimes the benefit added by having two different textual representations for one operation (like query expressions versus their underlying method call form, or + versus String.Concat) is so huge that it’s worth the potential confusion. But the compelling benefit of query expressions is their readability; this new form of “foreach” is certainly no more readable than the “normal” form and is arguably worse.

If you don’t agree with these philosophical objections and find practical value in this pattern, by all means, go ahead and write this trivial one-liner yourself.

UPDATE: The method in question has been removed from the “Metro” version of the base class library.

Comments (45)

  1. Eoin C says:

    Hey Eric,

    Eoin here from Stackoverflow, cheers for posting the article.

    Just going back to my original question though, phrased slightly differently…

    Since the ForEach<T> construct has been implemented for List<T> and Array, does that mean (from a philosophical point of view) that these data structures are considered/treated differently from the IEnumerable/IQueryable collections in your mind?

    Well, the VS Languages team does not have any influence on what goes into List<T>. I personally find the “ForEach” method on List<T> philosophically troubling for all the same reasons that I would find an extension method on IEnumerable<T> troubling. (And the VSL team does control that.) The one mitigating factor is that List<T> is clearly designed to be a mutable, not-side-effect-free data structure, so using expressions that mutate it seems slightly less bad. — Eric

    I suppose my original query stemmed from the fact that it is _already_ implemented in two places and not in the others. So perhaps the philosophical debate was had and these two were somehow considered different enough to deserve the out-of-the-box implementation…

    (That said, I have gone ahead and implemented them myself 😉 )

  2. Steve Cooper says:

    We’ve coded this up, and we like it. I think it leads to cleaner code if you often use these long sequences. Eg (off the top of my head);

       // takes a code file, removes
       // comments and blank lines,
       // and prints what is left
       codeContent
        .Split('\n')
        .Where(line => !line.StartsWith("#"))
        .Where(line => !Regex.IsMatch(line, "^ *$"))
        .ForEach(Console.WriteLine);

    I think that because it’s always at the end of one of these sequences, and you don’t assign the result to anything, it’s clear what’s going on.

    It also helps avoid the pattern where you just do this;

       sequence.ToList().ForEach(fnc);

    Which is what you’re tempted to do without this procedure. That would force the evaluation of the entire sequence before any action is committed, and for some longer tasks, this is a significant delay. This makes your app ‘choppier.’

    It can also use up significant memory. I saw this a lot in the ‘oxford comma’ code samples, where people were converting sequences to lists to count them, without concern for the size of the list in memory. A lack of ForEach leads to these kinds of shortcuts.
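
The difference described in this comment can be seen with a small sketch. The `Produce` iterator and the log strings are made up for the illustration; the point is only the order in which producing and consuming happen.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EagerVersusStreaming
{
    // A lazy sequence that records the moment each element is produced.
    private static IEnumerable<int> Produce(List<string> log)
    {
        for (int i = 1; i <= 3; i++)
        {
            log.Add("produce " + i);
            yield return i;
        }
    }

    // Streaming: each element is consumed as soon as it is produced.
    public static List<string> WithForeach()
    {
        var log = new List<string>();
        foreach (int i in Produce(log)) log.Add("consume " + i);
        return log;
    }

    // Buffered: ToList() drains the whole sequence before ForEach starts.
    public static List<string> WithToListForEach()
    {
        var log = new List<string>();
        Produce(log).ToList().ForEach(i => log.Add("consume " + i));
        return log;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", WithForeach()));
        // produce 1, consume 1, produce 2, consume 2, produce 3, consume 3
        Console.WriteLine(string.Join(", ", WithToListForEach()));
        // produce 1, produce 2, produce 3, consume 1, consume 2, consume 3
    }
}
```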

  3. Keith Patrick says:

    On the other hand, AsParallel().ForEach is extremely useful 🙂

  4. Hussein says:

    Are there any performance differences between foreach and .ForEach?

    Hussein

  5. configurator says:

    Hussein: Of course. In ForEach there is the cost of an extra delegate call for each item. The order is the same – O(n), but the net cost of the ForEach is always higher than that of foreach. That cost is very small though – in almost all cases it would be insignificant compared to the action being called.

    Also, and this is purely hypothetical, foreach can allow for some compiler optimizations that ForEach would not allow, so the difference can be a bit bigger.

    That said, micro optimizations here are useless – the idea of ForEach in my mind is simply wrong for normal scenarios. For the example Steve Cooper gave though, I find that it is usually a much needed ‘Expression’.

  6. pminaev says:

    List<T>.ForEach is actually not the same as foreach-statement internally (you can check that with Reflector). Instead, it’s a plain for-statement over list indices.

    The difference? When you use List<T>.Enumerator (as foreach-statement does), it checks on every iteration of the loop whether the collection has been modified. A plain indexing for-statement won’t do that, so it is slightly faster.

    In practice, though, the overhead of a virtual call via a delegate is higher than that of version checking.

  7. Ike says:

    This might be the first extension method I wrote. I don’t think the closure piece is important; all extension methods have that. As for side effects, that it’s void should make it pretty clear what it’s doing. In general, if basically everyone wants it, it should just be part of the platform, for the same reason that so many other friendly, generally useful methods are.

  8. Neil Whitaker says:

    Excellent post. As my response here suggests, I agree with you completely on the extension method issue:

    http://stackoverflow.com/questions/317874/existing-linq-extension-method-similar-to-parallel-for/318493#318493

  9. Ben says:

    Nice, I asked this question on Charlie Calvert’s blog a while back.

    In between times I worked out an answer for myself, which was that the ForEach implicitly ‘finalises’ (I don’t know what the correct term is?) the IEnumerable, and so breaks the functional model.

    I have preferred this solution, for logging/debug purposes (the name is clearer too):

           public static IEnumerable<T> SideEffects<T>(this IEnumerable<T> items, Action<T> pPerfomAction)
           {
               foreach (T item in items)
               {
                   pPerfomAction(item);
                   yield return item;
               }
           }
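
One property of a pass-through operator like this is that it is lazy: the action runs only when something downstream enumerates the sequence. A minimal sketch, restating the method with a plainly named parameter; the class name and the logging are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazySideEffectsDemo
{
    // Pass-through operator: runs 'action' on each item as it flows by.
    public static IEnumerable<T> SideEffects<T>(this IEnumerable<T> items, Action<T> action)
    {
        foreach (T item in items)
        {
            action(item);
            yield return item;
        }
    }

    public static void Main()
    {
        var log = new List<string>();
        IEnumerable<int> query = new[] { 1, 2, 3 }.SideEffects(n => log.Add("saw " + n));

        Console.WriteLine(log.Count); // prints 0: nothing has been enumerated yet
        int sum = query.Sum();        // enumerating the query triggers the action
        Console.WriteLine(log.Count); // prints 3
        Console.WriteLine(sum);       // prints 6
    }
}
```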

  10. Stefan Wenig says:

    > When we provide two subtly different ways to do exactly the same thing, we produce confusion in the industry, we make it harder for people to read each other’s code, and so on.

    Of course, the same disadvantage appears when you leave it to everyone to create their own (almost?) identical method. Having trivial extension methods included in the BCL is not about the cost of creating them, it’s all about creating a common language for all .NET developers, something that you can read without looking things up or guessing.

    (Generally speaking – I understand your specific reasons for not implementing ForEach. But at the same time I follow Steve Cooper here – ForEach might be a nice thing to have in some situations, from a pragmatic PoV. Whether it should be used in a specific situation is a difficult question, but providing it is only a really bad idea if you believe in the concept of protecting developers from themselves.)

  11. Per Erik Stendahl says:

    Ok, but when are you going to implement the pipe as seen in shell scripting and F#? I’m serious, that would be freaking useful! 🙂

    Using extension method chaining is kinda like using the pipe which is one reason many people miss the ForEach() method I guess. I even implemented a Tee() method that does what ForEach() does except it also passes the values on to the next step in the chain.

    Regards

  12. RJ says:

    Yesterday, I converted a bunch of code from using foreach to ForEach.

    Here’s what I had originally:

    public class MetaDataFixer
    {
           public List<Mp3File> Mp3FileList;

           private void CleanupTitle()
           {
               foreach(Mp3File file in Mp3FileList)
                   //do clean up of title
           }

           private void CleanupComments()
           {
               foreach(Mp3File file in Mp3FileList)
                   //do clean up of comments
           }

           //Other similar methods:
           private void CleanupAlbum();
           private void CleanupArtist();

           public void Cleanup()
           {
               if(_options.CleanupTitle)
                   CleanupTitle();
               if(_options.CleanupComments)
                   CleanupComments();
           }
    }

    I converted that in to:

    public class MetaDataFixer
    {
           // holds file names in this version; each file is opened in Cleanup()
           public List<string> Mp3FileList;

           private readonly Action<Mp3File> cleanupAlbum =
               delegate(Mp3File file) { file.Tag.Album = StringHelper.Cleanup(file.Tag.Album); };

           private readonly Action<Mp3File> cleanupTitle =
               delegate(Mp3File file) { file.Tag.Title = StringHelper.Cleanup(file.Tag.Title); };

           //more clean up methods.

           public void Cleanup()
           {
               List<Action<Mp3File>> actionsToPerform = GetActionsToPerform();

               Mp3FileList.ForEach(delegate(string fileName)
                           {
                               Mp3File file;
                               file = Mp3File.Create(fileName);
                               actionsToPerform.ForEach(delegate(Action<Mp3File> action) { action(file); });
                               file.Save();
                            });
           }

           private List<Action<Mp3File>> GetActionsToPerform()
           {
               List<Action<Mp3File>> actionsToPerform = new List<Action<Mp3File>>();

               if (_options.CleanupAlbum)
                   actionsToPerform.Add(cleanupAlbum);
               if (_options.CleanupArtist)
                   actionsToPerform.Add(cleanupArtist);
               //more options

               return actionsToPerform;
           }
    }

    I feel that this code is cleaner than the earlier version.

    I understand your point about closure being a potential issue with ForEach but if my delegate code is simple, the ForEach looks cleaner.

    Eric, I would be interested in any comments you may have about this refactoring.

    RJ

  13. Steve Cooper says:

    @RJ

    Create actions that work on a single file, like this;

       public static void CleanupAlbum(Mp3File file)
       {
         …
       }

    Then you can execute them like

       Mp3FileList.ForEach(CleanupAlbum);

    As an idea, you might also consider an ‘action combiner’ — something like this (not compiled or tested, just for illustration)

       public Action<T> Combine<T>(params Action<T>[] actions)
       {
         return item =>
         {
           foreach(var actOn in actions)
           {
             actOn(item);
           }
         };
       }

    Then you could write;

       allCleanupActions = Combine(CleanupAlbum, CleanupArtist, CleanupComments);

    and

       Mp3FileList.ForEach(allCleanupActions);

  14. Aaron G says:

    I use an extension method like this fairly infrequently, but it’s called "Update" and is used specifically for doing database updates using Linq to SQL, and also returns the number of entities (records) updated as a secondary effect.  It’s a useful method but I think it’s completely fair that I had to write it myself.

    I’m not particularly taken with the "code printing" example a few comments above.  First of all, I think it’s more difficult to read than the vanilla non-Linq version, and secondly the chained "wheres" will cause the entire array to be iterated twice, unless the compiler performs some kind of optimization on those particular extension methods (I doubt it).  A simple foreach with a nested if statement would have been easier to write, easier to read, way easier to debug, and would perform noticeably better for large input.  I know it was an off-the-cuff, contrived example ("off the top of my head"), but it’s nevertheless a perfect example of the inappropriate use of Linq – and having a "ForEach" extension would probably encourage this type of use.

  15. ais says:

    ForEach((x,index) => …) is useful.
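
A minimal sketch of what such an indexed overload could look like. This is a hypothetical helper, not a framework method; the zero-based index and the class name are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

public static class IndexedSequenceExtensions
{
    // Applies 'action' to each element together with its zero-based position.
    public static void ForEach<T>(this IEnumerable<T> sequence, Action<T, int> action)
    {
        if (sequence == null) throw new ArgumentNullException(nameof(sequence));
        if (action == null) throw new ArgumentNullException(nameof(action));
        int index = 0;
        foreach (T item in sequence) action(item, index++);
    }
}

public static class IndexedForEachDemo
{
    public static void Main()
    {
        var lines = new List<string>();
        new[] { "a", "b", "c" }.ForEach((s, i) => lines.Add(i + ":" + s));
        Console.WriteLine(string.Join(", ", lines)); // prints "0:a, 1:b, 2:c"
    }
}
```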

  16. wekempf says:

    @Aaron G: No, chained where’s will NOT cause iteration to occur multiple times. Remember that all LINQ operations use delayed evaluation via “yield return”. This means multiple predicate delegates will be called for each item, but you won’t be looping multiple times.

    @Ben: Depending on your definition of “finalizes”, there are other LINQ operations that finalize as well. Average, Aggregate, etc.  ForEach just finalizes with a void result, which is not possible in functional languages.

    Personally, I find the reasons given here for not having Enumerable.ForEach to be lacking.

    And that’s fine with me; you might want to re-read the last line of the post. — Eric

    1. Violation of functional principles: C# isn’t a functional language. Further, even functional languages include looping constructs to deal with impure functional operations, such as input/output (for example, “loop” in Lisp/Scheme). If they can get their hands dirty, it’s crazy not to do so in a non-functional language such as C#.

    2. No representational advantage/duplicate functionality: If you don’t remove the other ForEach methods, then this is a cop out for a personal preference. The cat’s already out of the bag, so at least be consistent.

    Everything stated in this blog post comes down to someone’s preferences, and not a technical argument, IMHO.

    You are correct. That’s why I twice called out that my objections were philosophical and not technical. Apologies if that was in some way unclear. — Eric

  17. wekempf says:

    The problem with implementing myself (which I have done), is that now EVERYONE implements this. Not very DRY.

  18. Greg says:

    Cost vs coolness

    Aaron is correct in that Linq (i.e., lisp like) expressions are much less readable and maintainable.    Lisp-like constructs (LINQ) are extremely powerful but do not scale well for extremely complex systems.  They work in environments where one focuses on business/logic rules for the system and not on writing code such as generating logic tables and autogenerating code from the tables.

    Code should be written with a total cost over the expected product lifetime in mind and not just the speed at which the original developer progresses.    This is why simpler constructs and algorithms last longer than higher level, trendy, pattern-of-the-week code.

    The next developer to take over the code should be your concern when writing or maintaining a system.

  19. commongenius says:

    I have had this debate personally many times. I tend to agree with Eric, that having multiple ways of accomplishing the same thing is inherently counter-productive; even if there are some small gains in being able to call ForEach functionally, they do not overcome that inherent downside. However, I am not sure that having thousands of developers implementing their own nearly-identical-but-subtly-different ForEach method is better in the long run than biting the bullet and creating a standard method in the platform.

    What bothers me most though is that there wasn’t more of an effort made to come to a philosophical agreement when considering the ForEach method on List<T>. IMHO, putting that method on List<T> was the worst possible decision; either it should have been an extension method on IEnumerable<T>, or it should have been left out altogether. The "different teams" argument doesn’t fly with me; .NET is a platform, and like it or not, its components ARE tightly coupled with each other. Someone needs to have the responsibility of maintaining consistency across the platform, so that things like this don’t happen.

  20. rh says:

    Ike: That is not correct; extension methods have nothing to do with closures.

  21. Puneet says:

    Eric,

    I understand that the extension methods such as Where(…), Select(..) etc do not prevent me from changing state of free variables (or having other side effects). Since you guys do not (or can not) enforce that philosophical underpinning of functional programming, why the objection to inclusion of ForEach due to its “Explicit desire” to have such side-effects.

    Your argument is “you cannot prevent some bad behaviours, therefore you should encourage them” ? — Eric 

    I would contend that the fact that ForEach expressly admits that it wants to change state unlike Where and Select is a good thing and makes its semantics obvious.

     

  22. AlexJ says:

    There is one thing you overlook in your comparison.

    I also overlook that the method version can take a method group as an argument. — Eric

    When you compare ForEach() with foreach, your ForEach() example for some reason includes the type. Which can be implied of course.

    If you let the compiler infer the type you get nice intellisense advantages too. Sure you can get those with a foreach block too. But I think

    “.F Tab ( c => c.”

    is a lot easier to write than

    “foreach (var c in sequence) c.”

    IMHO Intellisense is super important, and shouldn’t be left out of the debate.

    That said it is trivial to add the method. So I just add it whenever I need it. Problem is that I do this in every project because I don’t have a library I share across every project… and after a while all that trivial coding starts to add up.

  23. bongo says:

    I don’t like mixing in IntelliSense in the debate. Sure it’s nice but the language shouldn’t be designed around some external tool. The tool should be designed around the language.

  24. Steve Cooper says:

    @AaronG: "First of all, I think it’s more difficult to read than the vanilla non-Linq version"

    @Greg: "Aaron is correct in that Linq (i.e., lisp like) expressions are much less readable and maintainable."

    Both of these are really style questions. It’s not the style you’d see in older C# programs, but the style is fantastically productive and perfectly efficient. The alternative is complex nesting, which I find much harder to follow. Consider these two pieces of pseudocode;

       take all the lines
       remove comments
       remove blank lines
       print what’s left

    vs

       for every line:
         if it is not a comment:
           if it is not blank:
             print it.

    For me, the simple sequence is easier to get my head around, and that’s what you see with

        items
          .where
          .where
          .foreach

    Nesting makes it harder for me to follow the intent of the code.

    For larger processes this is especially pronounced. Let’s say you have a dozen steps to transform your source items into some target. You have a choice between twelve linear steps using Linq, and twelve nested ‘for’ and ‘if’ blocks without.

    For instance; let’s say you’re writing a code scanner and you start with a directory containing C# source code. What you want out is a list of class names defined in C# files in that directory structure. So you go;

       string[] classNames = rootDir
         .GetDirectories()         // all directories, recursively
         .GetFiles("*.cs")         // all c# files
         .Select(GetFileContent)   // the file content as a string
         .SelectMany(content => content.Split('\n')) // all the lines
         .Where(IsClassDefinition) // class definition lines
         .Select(GetClassName)     // class names
         .ToArray();

    The alternative is something like this;

       List<string> classNameList = new List<string>();

       // find all the directories, recursively
       foreach(var dir in GetDirectories(rootDir))
       {
           // find all the C# files in the directory
           foreach(var file in GetFiles(dir, "*.cs"))
           {
               // the file content as a string
               var content = GetFileContent(file);

               // all the lines in the file
               var lines = content.Split('\n');

               foreach(var line in lines)
               {
                   // is it a class definition line
                   if (IsClassDefinition(line))
                   {
                       // get the class name
                       var className = GetClassName(line);
                       classNameList.Add(className);
                   }
               }
           }
       }

       string[] classNames = classNameList.ToArray();

    Now, I’ve been programming using the former style for a while now, and I find it significantly easier to understand. Given the choice of a linear, stepwise process, or a large number of nested ‘if’ and ‘for’ blocks, I’ll take the linear steps.

    ———

    @AaronG: "secondly the chained "wheres" will cause the entire array to be iterated twice, unless the compiler performs some kind of optimization on those particular extension methods"

    As wekempf has commented, this isn’t true — the second Where only receives items that passed through the first Where. You get two interations, but the second is on a subset. Iteration is so fast that programming differently for it would be a premature optimisation.

  25. Paul Stancer says:

    Eric, seems your argument is more pedantic than semantic.

    Why have the foreach method at all then? For would suffice, the following code would work perfectly well:

    Perhaps you missed the bit where I said that sometimes the benefit of the improvement in textual representation pays for the cost of having two ways to do something. — Eric

               int[] loop = new[] { 4354, 234, 23, 34, 3, 4 };
               var e = loop.GetEnumerator();
               for (e.Reset(); e.MoveNext(); )
               {
                   Console.WriteLine(e.Current);
               }

    Or we could just use while and get rid of For as well.

    Of course the reason we use foreach is that it is easier.

    The same goes for the Linq version of ForEach. It makes code simpler and easier to read.

    loop.ForEach(i => Console.WriteLine(i));

    I use this all the time, as I find it very useful. It seems more natural when working on a collection, as I don’t need to output a collection and then run a foreach on it.

    Great. Perhaps you missed the bit at the end where I said that if you find this useful and don’t share my objections, you should go ahead and use it. — Eric

    Your side-effect argument is also a little weak, can you explain how ToArray() is side-effect free in that case?

    No, because I don’t understand the question. — Eric

    I have a direct link to the iterators here if anyone wants a full example: http://code.google.com/p/ximura/source/browse/trunk/Ximura%20Base/Helper/Linq/Linq.cs

    There are ForEach, ForIndex, and ForBigIndex iterators.

    Paul Stancer

  26. wekempf says:

    @Steve Cooper: "You get two interations (sic), but the second is on a subset. Iteration is so fast that programming differently for it would be a premature optimisation."

    No, there’s a single iteration. The first Where looks at the first item.  If it’s accepted, "yield return" creates a continuation point and immediately passes control to the second Where. The second Where then makes the same decision, and eventually returns to the continuation point in the first Where, where the second item is processed.  This means we "loop" only once, and pass the results up through the chain of operations, through the use of continuations created by "yield return".  This construct is highly efficient… at least as efficient as you’d code imperatively.
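
The single pass described here can be observed directly with a small trace. This is an illustrative sketch; the `Source` iterator, the predicates and the log strings are invented for the example. The source yields each element exactly once, and an element only reaches the second predicate after passing the first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SinglePassDemo
{
    // A lazy source that records every element it hands out.
    private static IEnumerable<int> Source(List<string> log)
    {
        for (int i = 1; i <= 4; i++)
        {
            log.Add("source yields " + i);
            yield return i;
        }
    }

    public static List<string> Trace()
    {
        var log = new List<string>();
        var query = Source(log)
            .Where(n => { log.Add("first Where tests " + n); return n % 2 == 0; })
            .Where(n => { log.Add("second Where tests " + n); return n > 2; });

        foreach (int n in query) log.Add("result " + n);
        return log;
    }

    public static void Main()
    {
        foreach (string line in Trace()) Console.WriteLine(line);
        // "source yields" appears four times in total: the source is walked once.
    }
}
```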

  27. One way to avoid the philosophical objection is to only manipulate sequences of Actions. For example, we can already do this:

       var purge = new DirectoryInfo(@"c:\stuff")
                   .GetFiles()
                   .Where(file => (file.Attributes == FileAttributes.Normal))
                   .Select(file => new Action(file.Delete));

    The "purge" is now a list of actions to be executed later. And I can tack on a few actions to send notification emails after the purge:

       string[] recipients = new[] {"fred@nowhere.com", "sid@somewhere.com"};

       purge = purge.Concat(
           recipients.Select(recipient =>
               new Action(() => new SmtpClient().Send("do-not-reply", recipient, "Hey!", "Finished purge"))));

    The only thing we need a new extension method for is this:

       Action doPurge = purge.Aggregate();

    Which can be implemented thusly:

       public static Action Aggregate(this IEnumerable<Action> source)
       {
           Action combined = null;
           foreach (Action action in source)
               combined += action;
           return combined;
       }

    Just turns the sequence of Actions into a single multi-cast delegate.

    Note that up to this point, there have been no side-effects whatsoever. We’ve only been manipulating lists of actions, not actually executing them. So nothing we’ve done can possibly invalidate that philosophical objection. 🙂

    The last thing we do is:

       doPurge();

    Which is where all the side-effects happen at last, and who could possibly be surprised by side-effects from a statement like that?
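
A self-contained sketch of the idea in this comment, using harmless logging actions in place of file deletion and email. One deliberate change from the comment's version, flagged here as an assumption: an empty sequence would leave the combined delegate null, so this sketch returns a no-op delegate in that case.

```csharp
using System;
using System.Collections.Generic;

public static class ActionSequenceExtensions
{
    // Folds a sequence of actions into one multicast delegate.
    public static Action Aggregate(this IEnumerable<Action> source)
    {
        Action combined = null;
        foreach (Action action in source)
            combined += action;
        // Assumption for this sketch: return a no-op rather than null when
        // the sequence is empty, so the result is always safe to invoke.
        return combined ?? (() => { });
    }
}

public static class DeferredActionsDemo
{
    public static void Main()
    {
        var log = new List<string>();
        var steps = new List<Action>
        {
            () => log.Add("purge"),
            () => log.Add("notify"),
        };

        Action run = steps.Aggregate();
        Console.WriteLine(log.Count);              // prints 0: nothing has run yet
        run();                                     // side effects happen here
        Console.WriteLine(string.Join(", ", log)); // prints "purge, notify"
    }
}
```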

  28. Greg says:

    Steve, the code to print all code lines neglects any error handling.  Trying to effectively determine what error happened and where it happened with your code example is difficult, and that greatly increases the cost to support and maintain it on an ongoing basis.

    Your example:

    string[] classNames = rootDir
        .GetDirectories()         // all directories, recursively
        .GetFiles("*.cs")         // all c# files
        .Select(GetFileContent)   // the file content as a string
        .SelectMany(content => content.Split('\n')) // all the lines
        .Where(IsClassDefinition) // class definition lines
        .Select(GetClassName)     // class names
        .ToArray();

    Errors/exceptions that may happen:

       – permission error at any one of
               – directory
               – file
       – Memory errors (file too big, too many lines)
       – File content errors (binary file with .CS extension)
       – Generic .net errors

    A generic exception.tostring() is not useful for supporting a production quality commercial product.

    This is what is lost in making calls a.b(w).c(x).d(y).e(z) since any particular call in the chain may fail or (much worse) induce side effects that affect later calls in the chain.  It was used early on in C++ systems but fell out of favor for that reason.

    The sequential code is much easier to test and maintain if you factor out into a new function the code to open a CS file and print its contents.

    For compactness, you could even do the same thing in the cmd shell without having to ship a C# application (using FOR, findstr, grep and if adventurous (SED)).

  29. @Greg – the example you quote from Steve does not have side effects. It only performs "read" operations and makes copies, so it is perfectly safe for it to only partially execute.

    Regarding exceptions, if an exception is thrown at any stage, that exception will capture a call stack that can be logged, and which precisely pinpoints where the error occurred; if customised handling or recovery is needed (unlikely, given that a partial result would be of no use) then the exception object can be queried for information.

    What would be gained by breaking it up into separate statements? Nothing, from an exception handling perspective, unless you wrapped each step in a separate try/catch. And even then, if every catch simply logged error information and aborted the whole operation, then nothing at all would be gained but readability would be shot to pieces.

    Finally, your suggestion of a pipeline of Unix tools would be logically (and in terms of maintenance and robustness) no different from what Steve’s example does within C#, except it wouldn’t be statically typed. He has simply constructed a similar pipeline of operations on statically typed objects.
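
Both sides of this exchange can be illustrated with one small sketch. An exception thrown inside a lazy pipeline surfaces where the pipeline is enumerated, so a single try/catch around that point catches it; and if a more meaningful message is wanted, the individual step can add context before rethrowing. The `ParseWithContext` helper and the input strings are hypothetical, chosen only for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PipelineErrorsDemo
{
    // A pipeline step that adds context to the failure it rethrows.
    private static int ParseWithContext(string text)
    {
        try
        {
            return int.Parse(text);
        }
        catch (FormatException ex)
        {
            throw new InvalidOperationException("Could not parse '" + text + "'", ex);
        }
    }

    public static string Run()
    {
        var inputs = new[] { "1", "2", "oops", "4" };

        // Building the query executes nothing, so nothing can fail here.
        IEnumerable<int> query = inputs
            .Select(text => ParseWithContext(text))
            .Where(n => n > 0);

        try
        {
            // The steps run, and can throw, only during enumeration.
            int[] numbers = query.ToArray();
            return "ok: " + numbers.Length;
        }
        catch (InvalidOperationException ex)
        {
            return "failed: " + ex.Message;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints "failed: Could not parse 'oops'"
    }
}
```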

  30. Aaron G says:

    @Steve:  I consider your new example a false dilemma, because the "sequential" version should be a 4-line method broken out into subroutines/methods.  As others have pointed out, the Linq-ified version is also impossible to debug in spite of being easier to read.  But even ignoring those two issues, how does it justify a "ForEach" extension method instead of a regular foreach loop?  If ForEach returns void (otherwise it would be a Select), then it would always have to be the last statement in a Linq construct, so there’s no way for it to eliminate nested loops or conditionals or other such verbosity.  At most, it would eliminate a little bit of whitespace and possibly a pair of braces.

    I’m not arguing against Linq here – I’ve been using it since the first CTPs – I just think that too many people are using it because it’s slightly easier for them to write, or just because they can, without paying any heed to testing or maintenance.  There are times when a functional style is appropriate, and there are times when object-oriented or straight procedural styles are more appropriate.

  31. Puneet says:

    Your argument is "you cannot prevent some bad behaviours, therefore you should encourage them" ? — Eric

    First, I think that describing the use of ForEach to perform actions on a sequence as "bad behavior" is not entirely legitimate, and ForEach cannot be put in the same box as Select() and Where(). That is because the intent of the latter constructs is different, and their names say so. You yourself contended that ForEach by its very nature would be used to have side effects. I am merely stating that since the semantics and the intention of ForEach w.r.t. being side-effect-full are explicit, the convenience and nicety (both subjective) of writing:

    seq.ForEach(s => s.Prop = val);

    trumps writing:

    foreach (var s in seq)
    {
        s.Prop = val;
    }

  32. Russ McClelland says:

    On the other hand, the goal of object-oriented programming is to keep data and behavior together.  Which class has the data?  Which data?  The collection elements, so the collection object has the data.  Then the collection object should be responsible for iterating over the collection of elements.  Why should the consumer of the collection know how to iterate over each element?

    Instead, providing an iteration method on the collections leads to a cleaner distribution of responsibilities:  The consumer knows what they want to do with each element, so they provide the action; the list knows about the elements so it does the work.

    Having this kind of mindset eliminates unnecessary language constructs like "for" and "foreach" and helps to reinforce good encapsulation.

  33. Greg says:

    @Daniel – You cannot easily query the exception object to provide a meaningful message to the end user.  Exception and error handling should provide a meaningful message to whoever reads it, be they an end user or an internal support person.  An exception giving the method, line number and generic error message (‘null reference exception’) does not convey that file d:\data\ABC.TXT is not accessible by the application due to permission issues.

    A generic exception.ToString() is not useful for supporting a production-quality commercial product.  I’ve seen this in many applications written both offshore and onshore (asp.net, c#, vb.net).

    @Puneet – You state that ForEach is self-documenting ("the semantics and the intention of ForEach w.r.t being side-effect-full is explicit").  Self-documenting code has been an ongoing and unresolved argument within software development since the early 1970s.  Advocates of self-documenting code often overlook that code whose significant business logic is documented in comments is much easier and much less costly to develop and maintain.

    General: LINQ-like constructs work well when you have a development tool high-level enough to develop the system using process workflows.  Tools that let business analysts design the workflow, develop business rules for it and generate code or state tables for developers work best.  They promote maintainability at the business level first and at the lower code level second.

  34. Puneet says:

    @Greg

    I don’t think my comments are geared towards making the case for ForEach() because it is "self-documenting". I usually hear that term in the context of developers NOT wanting to write documentation for their methods:) and so it has a negative connotation for me.

    My argument is based purely on the expressed intent of ForEach, communicated via its naming AND its documented behavior. This behavior is exactly what one would expect if they did sequence.ForEach(..) and so there are no "unintended side effects" vis-a-vis doing foreach(…) { }…. just overlooked ones.

  35. runefs says:

    I don’t mind that there’s no ForEach extension, but in a lot of the situations where I end up using foreach, it is to apply a function to all the elements to create a result (just as Sum or Average in effect do). An extension method for applying any function to all elements would be nice.
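    For what it's worth, LINQ's existing Aggregate operator appears to cover this case: it applies a function to every element and folds the results into one value. A minimal sketch follows; the class name, method names and sample data are illustrative assumptions, not anything from the post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AggregateSketch
{
    // Seeded overload: start from 0 and add each word's length to the total.
    public static int TotalLength(IEnumerable<string> words)
    {
        return words.Aggregate(0, (total, word) => total + word.Length);
    }

    // Seedless overload: the first element is the initial accumulator,
    // so this throws InvalidOperationException on an empty sequence.
    public static int Largest(IEnumerable<int> numbers)
    {
        return numbers.Aggregate((best, n) => n > best ? n : best);
    }
}
```

    Unlike a ForEach, the lambda here returns a value rather than causing a side effect, so it sits comfortably alongside the other sequence operators.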

  36. Steve Cooper says:

    @Greg:

    @Aaron:

    Thanks for the interesting discussion, guys.

    "Steve, code to print all code lines neglects any error handling … Try to effectivly determine what error happened and where it happened with your code example is difficult"

    Error handling can happily exist wherever you like. Here’s the code again, for reference;

       01 string[] classNames = rootDir
       02    .GetDirectories()         // all directories, recursively
       03    .GetFiles("*.cs")         // all c# files
       04    .Select(GetFileContent)   // the file content as a string
       05    .SelectMany(content => content.Split('\n')) // all the lines
       06    .Where(IsClassDefinition) // class definition lines
       07    .Select(GetClassName)     // class names
       08    .ToArray();

    Let’s take the GetFileContent() method called on line 04. I can quite happily wrap that code in any error handling I like;

       string GetFileContent(string path)
       {
           try
           {
               ….
           }
           catch
           {
               // your error handling here.
           }
       }

    So we don’t lose any expressive power in terms of adding error handling code.
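    One possible way to fill in that error handling, with Greg's point about meaningful messages in mind, is sketched below. The body is an assumption (the original elides it): it reads the file with File.ReadAllText and rethrows failures with the offending path in the message, keeping the original exception as InnerException.

```csharp
using System;
using System.IO;

public static class FileReading
{
    // Hypothetical body for GetFileContent; only the name and signature
    // come from the comment above.
    public static string GetFileContent(string path)
    {
        try
        {
            return File.ReadAllText(path);
        }
        catch (UnauthorizedAccessException ex)
        {
            // Permission problems: say which file could not be read.
            throw new IOException("Cannot read '" + path + "': access denied.", ex);
        }
        catch (IOException ex)
        {
            // Missing file, missing directory, sharing violation and so on.
            throw new IOException("Cannot read '" + path + "': " + ex.Message, ex);
        }
    }
}
```

    Because the wrapping happens inside the helper, the pipeline on lines 01 to 08 stays unchanged.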

    Secondly, if there is an unhandled exception in that function, your stack trace reflects it, with something like;

       FileNotFound at
         GetFileContent in
         PrintClassNames …

    Debugging is a little harder, but only really because of a generic debugging problem in Visual Studio. There’s no way I can inspect the return value of a function. So if I write;

       string GetFileContent(string path)
       {
           return System.IO.File.ReadAllText(path);
       }

    then there’s no way to debug that result. That’s annoying, and it’s more pronounced when you write a lot of small functions like this, but it’s more a problem with Visual Studio’s debugging facilities. You can still put a breakpoint in the code happily; so if you want to debug the call to IsClassDefinition() on line 06, you can put a breakpoint right there and debug away.

    "I consider your new example a false dilemma, because the "sequential" version should be a 4-line method broken out into subroutines/methods."

    My version is *already* broken out into subroutines; the methods GetDirectories(), GetFiles(), GetFileContent(), IsClassDefinition(), GetClassName() are all standard C# functions. But the actual gluing together of these functions appears as a linear sequence of steps which filter, convert, and correlate data together. All I’ve done is glue old-skool functions together using standard forms of glue.
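    To make the "ordinary functions plus glue" point concrete, here is a rough sketch of what two of those helpers might look like. The bodies are assumptions (the thread never shows them), the parsing is deliberately naive (a comment containing the word "class" would fool it), and the pipeline runs over in-memory strings instead of files so it can be tried in isolation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ClassNameFinder
{
    private static string[] Tokens(string line)
    {
        return line.Split(new[] { ' ', '\t', '\r' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // Hypothetical rule: a line defines a class if the token "class"
    // appears and is followed by at least one more token.
    public static bool IsClassDefinition(string line)
    {
        string[] tokens = Tokens(line);
        int index = Array.IndexOf(tokens, "class");
        return index >= 0 && index < tokens.Length - 1;
    }

    // Takes the token after "class" and trims any base list, type
    // parameters or opening brace glued onto it.
    public static string GetClassName(string line)
    {
        string[] tokens = Tokens(line);
        string name = tokens[Array.IndexOf(tokens, "class") + 1];
        int cut = name.IndexOfAny(new[] { ':', '<', '{' });
        return cut < 0 ? name : name.Substring(0, cut);
    }

    // The glue: the same Select/Where chain as lines 05 to 08 above.
    public static string[] FindClassNames(IEnumerable<string> fileContents)
    {
        return fileContents
            .SelectMany(content => content.Split('\n'))
            .Where(IsClassDefinition)
            .Select(GetClassName)
            .ToArray();
    }
}
```

    Each helper can be unit-tested and breakpointed on its own, which is the property being argued for here.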

    I suppose this is the important thing, from my POV; we have real custom work we need doing (getting file content, parsing files) and then we have glue. foreach(in) is a type of glue. So is Select(). What I’ve found is that I often end up making glue and reusing it in different parts of my app. Almost all this glue code deals with collections of things; I make custom filters, iterators, and conversions.

    So the question arises, really — let’s say you discover some great new kind of glue. For instance, you realise that this code is exactly the same thing as a SQL CROSS JOIN;

       foreach(var a in listOfA)
       {
           foreach(var b in listOfB)
           {
               if (a.x == b.y)
               {
                   // do something involving A,B
               }
           }
       }

    Let’s say this pattern appears fifteen times in different parts of your app. How do you avoid repeating yourself? How do you avoid the copying and pasting of this kind of structural or glue code?

    Myself, I’d write a CrossJoin() method with this signature;

       public static IEnumerable<Pair<T1, T2>> CrossJoin<T1, T2>(this IEnumerable<T1> seq1, IEnumerable<T2> seq2)

    And then call it like this;

       // assumes Pair<T1, T2> exposes its two halves as First and Second
       listOfA.CrossJoin(listOfB).Where(pair => pair.First.x == pair.Second.y);

    To my mind, I’ve introduced a useful abstraction (cross join) and been able to name it, call it throughout the application, and remove duplication.
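    A minimal sketch of how such a CrossJoin could be written is below. It is one possible implementation, not the commenter's: it returns C# value tuples as a stand-in for the Pair<T1, T2> type, whose definition is not shown, and it splits argument checking from the iterator so that null arguments fail at the call rather than on first enumeration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CrossJoinExtensions
{
    // Lazily yields every (First, Second) combination of the two sequences.
    public static IEnumerable<(T1 First, T2 Second)> CrossJoin<T1, T2>(
        this IEnumerable<T1> seq1, IEnumerable<T2> seq2)
    {
        if (seq1 == null) throw new ArgumentNullException(nameof(seq1));
        if (seq2 == null) throw new ArgumentNullException(nameof(seq2));
        return CrossJoinIterator(seq1, seq2);
    }

    private static IEnumerable<(T1 First, T2 Second)> CrossJoinIterator<T1, T2>(
        IEnumerable<T1> seq1, IEnumerable<T2> seq2)
    {
        foreach (T1 a in seq1)
        {
            // seq2 is re-enumerated once per element of seq1.
            foreach (T2 b in seq2)
            {
                yield return (a, b);
            }
        }
    }
}
```

    With this version a call would read listOfA.CrossJoin(listOfB).Where(p => p.First.x == p.Second.y). When the filter is a plain equality like that one, the built-in Join operator is probably the better fit, since it does not need to visit every combination.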

    So my question to you guys, and anyone else; let’s say you’d discovered this pattern that repeats over and over. How do you pull it out and reuse it without using a linq-like syntax?

  37. Many people ask me why Microsoft has not provided an extension method for sequences

  38. Some time after I gave my opinion on a potential extension method

  39. If you don’t read Eric Lippert’s Fabulous Adventures In Coding , you’re really missing out on some great

  40. If you don’t read Eric Lippert’s Fabulous Adventures In Coding , you’re really missing out

  41. Omer Mor says:

    Well, the Reactive Extensions team at Microsoft Research chose differently.

    They included 2 imperative operators in their extensions: Run (same as your ForEach), and Do.

    Check it out here: http://community.bartdesmet.net/blogs/bart/archive/2009/12/26/more-linq-with-system-interactive-the-ultimate-imperative.aspx

     Omer.

  42. Paul Betts says:

    Great blog – I generally agree with the post, ForEach is very imperative. One of the only redeeming qualities of it though, is that it means that when I'm writing mostly functional code, the order is *consistent*: every time I have to use foreach it feels "out of place" compared to the order I've been writing all my other statements in like Select and Zip (as the above poster mentions, I use Rx's "Run" instead). Of course, you could argue that it *should* feel weird, since foreach is bad 🙂

    The other advantage is that you can use Method Groups, so for example:

    new[] {1,2,3,4,5}.Select(x => x*5).Run(Console.WriteLine)
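    For readers without Rx installed, the one-liner above can be tried with a stand-in. The Run below is a hypothetical extension with the same shape as the ForEach from the post, not the Rx implementation; the Demo method is likewise only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RunExtensions
{
    // Hypothetical stand-in for Rx's Run: invoke the action on each element.
    public static void Run<T>(this IEnumerable<T> sequence, Action<T> action)
    {
        if (sequence == null) throw new ArgumentNullException(nameof(sequence));
        if (action == null) throw new ArgumentNullException(nameof(action));
        foreach (T item in sequence) action(item);
    }

    // Passes the method group seen.Add where a lambda such as
    // x => seen.Add(x) would otherwise be written.
    public static List<int> Demo()
    {
        var seen = new List<int>();
        new[] { 1, 2, 3, 4, 5 }.Select(x => x * 5).Run(seen.Add);
        return seen;
    }
}
```

    The method-group form works because T is inferred from the sequence, which then selects the matching overload of the method being passed.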

  43. Mark Rendle says:

    The Reactive Extensions library provides the Run extension method, which I think is a better name.

  44. Akash Kava says:

    This probably isn’t directly related to the answer, but what I have just found out is pretty funny.

    ForEach, delete (works)

    List<int> list = new List<int>(){ 1, 2, 3, 4, 5, 6};

    list.ForEach(x => {
       Console.WriteLine(x);
       list.Remove(x);
    });

    foreach, delete (crashes)

    // throws exception
    foreach (var x in list)
    {
       Console.WriteLine(x);
       list.Remove(x);
    }

    ForEach, insert (…)

    // goes in infinite loop…
    list.ForEach(x => {
       list.Add(1);
    });

    foreach, insert (crashes)

    // throws exception
    foreach (var x in list)
    {
       Console.WriteLine(x);
       list.Add(x);
    }

    So, to anyone here who is talking about mutability or different confusing layers etc.: I think it is a completely half-implemented feature by the Visual Team, because enumeration will always cause problems if the collection is modified.
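    The foreach half of this is easy to reproduce: List<T>'s enumerator throws InvalidOperationException once the list has been modified. The sketch below shows that, plus two common ways to sidestep it (iterating over a snapshot, or using RemoveAll). The class and method names are made up for illustration. How List<T>.ForEach reacts to modification has, as far as I know, differed between runtime versions, so the sketch leaves it out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ModifyWhileEnumerating
{
    // Returns true if removing inside a foreach throws.
    public static bool ForeachRemoveThrows()
    {
        var list = new List<int> { 1, 2, 3, 4, 5, 6 };
        try
        {
            foreach (var x in list)
            {
                list.Remove(x); // invalidates the enumerator
            }
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // Enumerate a copy (ToList) while mutating the original.
    public static List<int> RemoveEvensViaSnapshot(List<int> list)
    {
        foreach (var x in list.ToList())
        {
            if (x % 2 == 0) list.Remove(x);
        }
        return list;
    }

    // Let the list do the removal itself in a single pass.
    public static List<int> RemoveEvensViaRemoveAll(List<int> list)
    {
        list.RemoveAll(x => x % 2 == 0);
        return list;
    }
}
```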

    Despite the arguments, I still see no reason why ForEach should allow modifications; it is used purely for enumeration, and it makes no difference whether it’s the foreach(var item in x) syntax or the x.ForEach(x=>{}) syntax.

    I disagree with Eric. As I see it, the BCL team’s implementation of this feature, on IEnumerable as well as in list.ForEach, is simply faulty.

    "Why" is completely subjective. For example, Silverlight has added complex hashing algorithms and left MD5 behind, whereas everywhere else we use MD5 very widely. It’s more a matter of how much demand there is for something, and of who gets to choose whether or not to include it in the framework.

    There is no logical or philosophical reason at all for not having ForEach in IEnumerable. There are many such missing points which I think .NET will improve over time.

  45. Mmx says:

    Also the List<T>.ForEach method is *uselessly* slower than the foreach(…) equivalent (since under the hood it calls a delegate as pure overhead).

    The key word is "uselessly". If there was *any* gain (like in readability, conveying semantics, whatever), worrying about it would be premature optimization.

    But we have code which is already harder to read, semantically misleading, causing side-effects… and it also is slower!

    What worries me is that people are calling this a functional approach and praising it for being easier to parallelize (when it's not and should not be parallelized) and for being easier to optimize (when in reality it's just slower).