SYSK 390: String.Format vs. concatenation

Since .NET 1.0, we’ve all been told that strings are immutable, so for best performance we should avoid large numbers of string concatenations and use StringBuilder instead.  OK, but how large is “large”, and what about string.Format?
If you open string.Format in your favorite .NET decompiler, you’ll see that it does quite a bit of work, and it would be logical to expect to pay for that work in resource utilization.  And how bad is string concatenation really?  For example, if I just need to put together an address from its parts into one string, should I use StringBuilder?
Here are some numbers from some tests on my laptop:
- Doing string concatenation (e.g. result += x) doesn’t appear to be a problem until I deal with more than 10,000 strings.  Beyond that, performance gets significantly worse:

Number of concatenations    Time taken (ms)
-------------------------------------------
2,000                       2
5,000                       12
10,000                      46
100,000                     14,631

- Getting the data elements from an array results in about the same performance for small counts (up to 10,000), but at 100,000 concatenations the performance is almost 70% slower.

- Doing 10,000 concatenations with string.Format, using a format string that takes 1,000 elements at a time and looping 10 times (so we still end up with 10,000 concatenations), takes 18 ms, compared to 46 ms when appending one string at a time via traditional += concatenation.

- When the format string takes only 100 elements and the loop runs 100 times, the time taken for 10,000 concatenations drops to under 3 ms.

The performance numbers change slightly, but the ratios stay about the same when I add a GC.Collect call.
OK, so that’s for a large number of strings…  What about just a few concatenations, like putting together an address from its parts?
Running the Concat function vs. the Format function below 10,000 times resulted in roughly the same performance: about 3-4 ms total time taken.  Increasing the loop count to 10,000,000 resulted in 2,831 ms for 6 strings concatenated via the Concat function vs. 3,380 ms for the same number of concatenations via Format.
Interestingly, when concatenating only two or three strings (e.g. a person’s first and last name), concatenation took roughly a third of the time of the Format function: 546 ms for 10,000,000 concatenations of 2 strings with Concat vs. 1,501 ms via Format.
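The test code behind the numbers above isn't shown here, so what follows is only a rough sketch of how such a comparison could be set up. The class and method names (ConcatBenchmark, AppendWithPlusEquals, AppendWithFormat, AppendWithStringBuilder, TimeMs) are made up for illustration, and the StringBuilder variant is an addition that was not part of the measurements. Timings will differ by machine and runtime, so no expected numbers are given.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Text;

public static class ConcatBenchmark
{
    // Appends one string at a time with +=; each iteration allocates a new string.
    public static string AppendWithPlusEquals(string[] parts)
    {
        string result = string.Empty;
        foreach (string part in parts)
        {
            result += part;
        }
        return result;
    }

    // Appends in batches: one string.Format call per batchSize elements,
    // using a generated format string of the form "{0}{1}...{batchSize-1}".
    public static string AppendWithFormat(string[] parts, int batchSize)
    {
        string format = string.Concat(
            Enumerable.Range(0, batchSize).Select(i => "{" + i + "}"));
        string result = string.Empty;
        for (int offset = 0; offset < parts.Length; offset += batchSize)
        {
            // A short final batch leaves trailing slots null;
            // string.Format renders null arguments as empty strings.
            object[] batch = new object[batchSize];
            int count = Math.Min(batchSize, parts.Length - offset);
            Array.Copy(parts, offset, batch, 0, count);
            result += string.Format(format, batch);
        }
        return result;
    }

    // Not measured in this post; included only as the usual alternative.
    public static string AppendWithStringBuilder(string[] parts)
    {
        StringBuilder builder = new StringBuilder();
        foreach (string part in parts)
        {
            builder.Append(part);
        }
        return builder.ToString();
    }

    // Runs the action once and returns the elapsed wall-clock milliseconds.
    public static long TimeMs(Func<string> action)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        return stopwatch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        // 10,000 single-character strings; the content is an arbitrary placeholder.
        string[] parts = Enumerable.Repeat("x", 10000).ToArray();

        Console.WriteLine("+= : " + TimeMs(() => AppendWithPlusEquals(parts)) + " ms");
        Console.WriteLine("Format, 1,000 per call: " + TimeMs(() => AppendWithFormat(parts, 1000)) + " ms");
        Console.WriteLine("Format, 100 per call: " + TimeMs(() => AppendWithFormat(parts, 100)) + " ms");
        Console.WriteLine("StringBuilder: " + TimeMs(() => AppendWithStringBuilder(parts)) + " ms");
    }
}
```

A single timed run like this is noisy; repeating each measurement a few times and discarding the first (JIT warm-up) run should give steadier numbers.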
private static string Concat(string s1, string s2, string s3, string s4, string s5, string s6)
{
    return s1 + "\r\n" + s2 + "\r\n" + s3 + "\r\n" + s4 + ", " + s5 + " " + s6;
}
private static string Format(string s1, string s2, string s3, string s4, string s5, string s6)
{
    return string.Format("{0}\r\n{1}\r\n{2}\r\n{3}, {4} {5}", s1, s2, s3, s4, s5, s6);
}
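
For anyone who wants to repeat the comparison, here is a minimal timing harness around the two functions. It is a sketch under assumptions, not the exact test that produced the numbers above: the AddressBenchmark class, the TimeMs helper and the sample address values are invented for illustration, and calling through a delegate adds a small, equal overhead to both measurements.

```csharp
using System;
using System.Diagnostics;

public static class AddressBenchmark
{
    // Same logic as the Concat function above.
    public static string Concat(string s1, string s2, string s3, string s4, string s5, string s6)
    {
        return s1 + "\r\n" + s2 + "\r\n" + s3 + "\r\n" + s4 + ", " + s5 + " " + s6;
    }

    // Same logic as the Format function above.
    public static string Format(string s1, string s2, string s3, string s4, string s5, string s6)
    {
        return string.Format("{0}\r\n{1}\r\n{2}\r\n{3}, {4} {5}", s1, s2, s3, s4, s5, s6);
    }

    // Calls build() the given number of times and returns elapsed milliseconds.
    public static long TimeMs(int iterations, Func<string> build)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            build();
        }
        stopwatch.Stop();
        return stopwatch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        const int iterations = 10000000;

        // Placeholder address parts, chosen only for the example.
        long concatMs = TimeMs(iterations,
            () => Concat("Jane Doe", "1 Main St", "Apt 2", "Springfield", "IL", "62701"));
        long formatMs = TimeMs(iterations,
            () => Format("Jane Doe", "1 Main St", "Apt 2", "Springfield", "IL", "62701"));

        Console.WriteLine("Concat: " + concatMs + " ms");
        Console.WriteLine("Format: " + formatMs + " ms");
    }
}
```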

So, what’s the significance of all of this?  In my opinion, when putting together a small number of strings (e.g. a person’s name, address, etc.), even if you’re dealing with dozens or a few hundred strings, the difference is negligible from a performance point of view.  I used to favor string.Format because of the type conversion, but if the format string references more arguments than you pass in, you’ll end up with a runtime error (a FormatException).  Nowadays, the framework properly handles concatenating a number, a string and a null, e.g.
int number = 5;
string nullString = null;
string result = number + nullString + "abc";

will give you "5abc".  So string concatenations (in reasonable numbers) don’t appear to be such an evil…  However, with a large number of concatenations you should evaluate your case and consider whether the performance hit, as well as the CPU utilization, warrants a different way of handling your data.
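
To make the string.Format pitfall concrete, here is a small sketch that shows when the runtime error occurs and confirms the mixed concatenation result. The FormatPitfalls class and its helper names are invented for illustration.

```csharp
using System;

public static class FormatPitfalls
{
    // Returns true if string.Format throws a FormatException for these inputs.
    public static bool ThrowsFormatException(string format, params object[] args)
    {
        try
        {
            string.Format(format, args);
            return false;
        }
        catch (FormatException)
        {
            return true;
        }
    }

    // Concatenates an int, a null string and a literal, as in the snippet above.
    public static string ConcatMixed()
    {
        int number = 5;
        string nullString = null;
        return number + nullString + "abc";
    }

    public static void Main()
    {
        // The format string references {1}, but only one argument is supplied.
        Console.WriteLine(ThrowsFormatException("{0} {1}", "only one"));  // True

        // Extra arguments are silently ignored, so no exception here.
        Console.WriteLine(ThrowsFormatException("{0}", "one", "extra"));  // False

        Console.WriteLine(ConcatMixed());                                 // 5abc
    }
}
```

Note that only the "too few arguments" direction fails; passing more arguments than the format string uses goes unnoticed, which can hide a different kind of bug.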