StringBuilder performance issues in .NET 4.0 and 4.5

I have recently discovered interesting behavior change of StringBuilder class introduced in .NET framework 4.0 (and remaining in 4.5). There is also a easy workaround once you know the behavior.

Sample code exposing the problem

Let’s use the following test code:

         static void TestStringBuilder(bool reuse)
        {
            StringBuilder sb = new StringBuilder();
            Stopwatch stopWatch = new Stopwatch();
            stopWatch.Start();
            for (int i = 0; i < 16384; i++)
            {
                for (uint j = 0; j < 32; j++)
                {
                    sb.Append("a string");
                }

                string dummy = sb.ToString();

                if (reuse)
                {
                    sb.Length = 0; //sb.Clear() behaves same way
                }
                else
                {
                    sb = new StringBuilder();
                }
            }
            stopWatch.Stop();
            Console.WriteLine("Total time: {0}", stopWatch.Elapsed);
        }

        static void TestStringBuilder2(bool reuse)
        {
            StringBuilder sb = new StringBuilder();
            Stopwatch stopWatch = new Stopwatch();
            stopWatch.Start();
            for (int i = 0; i < 16384; i++)
            {
                for (uint j = 0; j < 32; j++)
                {
                    sb.Insert(0, "a string");
                }

                string dummy = sb.ToString();

                if (reuse)
                {
                    sb.Length = 0; //sb.Clear() behaves same way
                }
                else
                {
                    sb = new StringBuilder();
                }
            }
            stopWatch.Stop();
            Console.WriteLine("Total time: {0}", stopWatch.Elapsed);
        }


        static void Main(string[] args)
        {
            TestStringBuilder(true);
            TestStringBuilder(false);
            TestStringBuilder2(true);
            TestStringBuilder2(false);
        }

Now let’s try to build the code while targeting .NET 3.5 (or older) and then .NET 4.0 (or newer) and see what happens when we run it.

Output for .NET framework 3.5 (and below):

 Total time: 00:00:00.0251184
Total time: 00:00:00.0294951
Total time: 00:00:00.0477691
Total time: 00:00:00.0535298

Output for .NET framework 4.0 (and higher):

 Total time: 00:00:00.0167945
Total time: 00:00:00.0281705
Total time: 00:00:11.1285809
Total time: 00:00:00.0849541

Profiling results

Once you connect some .NET performance profiler (I used build-in performance profiler in VS 2012), you’ll notice that the vast majority of the time is spent inside the StringBuilder.set_Length property:

 

SbCodeProfiled

By using some .NET decompiler tool you’d find out that using StringBuilder.Clear() wouldn’t make any difference as this method just calls the setter of the Length property (so is effectively identical to the code we wrote).

Where is all that time spent?

That should always be our next question when finding similar behavior. We have few options of how to find this out:

  • Run performance profiler in sampling mode (to gain more extensive data) while pointing it to Microsoft symbols server in order to get detailed stack information
  • Inspect the code – by decompiling it by some .NET decompiler tool, or by pointing debugger to Microsoft symbol server and turning on source server usage.
  • Indirectly profile (AKA trace - meaning in sampling mode) the process and collect various metrics (especially .NET performance and memory ones).

By first two approaches we would find out that StringBuilder.set_Length code that is being executed is actually allocating a memory (and it seems to be mainly allocating memory on Large Objects Heap). By the third approach we would find the same indirectly. Let’s see how extensively is this code using memory and causing garbage collection:

EditedPerfMon_40

We can see that the offending code is spending enormous amount of time inside the Garbage collection and that almost all memory being allocated is actually being allocated at Large Objects Heap. If we would compare this with behavior in .NET 3.5:

EditedPerfMon_35

We would see more expectable results – reusing temporary variables between multiple iterations of small cycles causes garbage collection to do more work (as those local variables get promoted to higher generations and as a result garbage collection needs to collect and compact more generations in order to obtain enough memory), however it’s not so prohibitively expensive to do so.

Why is this happening?

If you would inspect code of StringBuilder in .NET framework 3.5 or earlier and compare it to StringBuilder implementation in .NET framework 4.0 or newer you would find out that previously the StringBuilder internal memory storage was implemented as one big array (this was done to optimize ToString operation which basically just gave the ‘pointer’ to the internal buffer), however this was changed to internally use linked list (in order to optimize appending/inserting operations).

The set_Length implementation then in case of shortening the StringBuilder tries to allocate chunk of memory to perform some consolidation of some nodes that should be left in string builder after shortening. However it fails to recognize that we’re trying to completely erase the internal buffer (as the Insert operation prepends the nodes to the list rather then appends – and set_Length logic doesn’t count on that). As a result set_Lenght will do some memory allocation and copying that will be actually thrown away right after that.

What’s the workaround for this?

As already mentioned – we shouldn’t generally reuse temporary local variables between short cycles with multiple iterations. That way we only cause those local variables to be promoted to higher generations and this will lead to more work for Garbage Collector (Unless we know that we will want the reuse the same variable multiple times even between different calls to the same function – then we might consider implementing some sort of Resource Pool design pattern). So we should simply use new StringBuilder in each iteration and this way we wouldn’t hit the problem.