Of Strings and Services

Here are some questions and answers from a recent internal performance talk that I thought were generally useful.

1) Is there a way to browse and clean up allocated resources on request completion for a web service?

It's generally a very bad idea to try to mess with the collector -- only very rarely is forcing a collection a good idea.  Any sort of regular policy of forcing collections periodically during your application's execution is especially doomed to fail.  The collector is self-tuning and it will run when, according to its policies, it will be economical to do so.  A good way of thinking of this is that using a garbage collector is kind of like having a batch processing approach to your memory management.  By waiting for a good amount of junk to build up you can handle it more efficiently than if you were aggressively cleaning up every little byte as it was released.

When your requests complete, as a natural part of their death, most if not all of the state associated with that request will be dead.  The memory will be reclaimed shortly afterwards as part of the next orderly collection.  This will happen when there's enough trash around that doing cleanup is economical.

If you were to try to force collections at request completion you’d basically have many different threads pounding on the collector, destroying the economies of batch memory recovery.

2) I have 35% of memory allocations going into strings.  The GC now has a mechanism to specify memory pressure.  How can we use this to put a ceiling on string allocations?

GC.AddMemoryPressure() and GC.RemoveMemoryPressure() [new in Whidbey] are there to help the GC understand the *unmanaged* space that is associated with an object (e.g. if many otherwise small objects own large unmanaged resources like bitmaps, the GC should collect more aggressively).  They aren't for use with Strings, which are 100% managed.  Using AddMemoryPressure() in ways other than the intended use (to inform the GC of the unmanaged cost of objects) is much more likely to get you into trouble than it is to help.
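For contrast, here is a minimal sketch of the kind of use these methods were intended for: a small managed object that owns a large unmanaged allocation.  The class name NativeBuffer and its members are hypothetical and only for illustration; the GC and Marshal calls are the real framework APIs.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper: a tiny managed object that owns a large unmanaged buffer.
public sealed class NativeBuffer : IDisposable
{
    private IntPtr _memory;
    private readonly long _size;

    public NativeBuffer(long sizeInBytes)
    {
        if (sizeInBytes <= 0) throw new ArgumentOutOfRangeException("sizeInBytes");
        _memory = Marshal.AllocHGlobal(new IntPtr(sizeInBytes));
        _size = sizeInBytes;
        // Tell the GC that this object costs more than its managed size suggests.
        GC.AddMemoryPressure(_size);
    }

    public long Size { get { return _size; } }
    public bool IsDisposed { get { return _memory == IntPtr.Zero; } }

    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this);
    }

    // Safety net in case Dispose is never called.
    ~NativeBuffer() { Release(); }

    private void Release()
    {
        if (_memory == IntPtr.Zero) return;   // already released
        Marshal.FreeHGlobal(_memory);
        _memory = IntPtr.Zero;
        // Every AddMemoryPressure must be balanced by a matching remove.
        GC.RemoveMemoryPressure(_size);
    }
}
```

Note that the pressure added and removed is exactly the size of the unmanaged allocation; nothing here tries to steer the collector's handling of managed objects such as strings.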

Overall comments:

35% of the allocations being strings is not especially unusual, and due to the relative ease of collecting strings (they have no embedded object references to trace) it is not especially problematic.

The real thing you have to consider is what the "% Time in GC" performance counter is reporting.  If this number is high (say >15%) you most likely have a "mid life crisis" problem; look at this article:

https://blogs.msdn.com/ricom/archive/2003/12/04/41281.aspx

and this one

https://blogs.msdn.com/ricom/archive/2004/02/11/71143.aspx

Reducing the overall churn rate on the GC heap is generally a good thing because it helps your objects to age more slowly, resulting in fewer collections, and those collections are more likely to be cheap.  Use CLRProfiler to find the main source of your string allocations.  Once you know where they are coming from you can see whether some of them can be avoided with alternate designs.  One change that often yields fruit is to avoid using String.Split and String.Substring as part of parsing your inputs.  Use the lower-level methods (such as IndexOf) to find offsets and then use the comparison functions that take the offset and length of string parts -- these are allocation free.
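As a rough sketch of that parsing idea, the helper below checks whether one field of a delimited line matches an expected value without creating any substrings.  The class and method names (FieldParser, FieldEquals) are made up for this example, and it assumes non-null inputs and an ordinal comparison; String.IndexOf and String.CompareOrdinal are the real framework methods.

```csharp
using System;

public static class FieldParser
{
    // Returns true if the zero-based field 'fieldIndex' of 'line' equals 'expected'.
    // Walks the line with IndexOf instead of calling Split or Substring.
    public static bool FieldEquals(string line, int fieldIndex, string expected, char separator)
    {
        int start = 0;
        for (int i = 0; i < fieldIndex; i++)
        {
            int sep = line.IndexOf(separator, start);
            if (sep < 0) return false;        // fewer fields than requested
            start = sep + 1;
        }

        int end = line.IndexOf(separator, start);
        if (end < 0) end = line.Length;       // last field runs to the end of the line
        int length = end - start;

        // Compare the field in place by offset and length; no new string is allocated.
        return length == expected.Length
            && string.CompareOrdinal(line, start, expected, 0, length) == 0;
    }
}
```

A call such as FieldParser.FieldEquals(requestLine, 0, "GET", ' ') answers the question directly, where requestLine.Split(' ')[0] == "GET" would allocate an array plus one string per field just to throw them away.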

Remember to make sure that you null out any state associated with your request (specifically member variables) as soon as possible.  Sometimes strings (and other objects) assigned to members of "this" are kept around beyond the point they are needed.  Members will appear live to the GC because they are still reachable even though you have no intention of using them further.
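Here is a small illustration of the pattern, assuming a hypothetical RequestContext class whose raw body is only needed until it has been parsed (the "parsing" is a stand-in that just records the length):

```csharp
using System;

// Hypothetical per-request object; the names are for illustration only.
public sealed class RequestContext
{
    private string _rawBody;      // potentially large, only needed until parsed
    private int _contentLength;

    public RequestContext(string rawBody) { _rawBody = rawBody; }

    public bool HasRawBody { get { return _rawBody != null; } }
    public int ContentLength { get { return _contentLength; } }

    public void Parse()
    {
        if (_rawBody == null) return;        // already parsed
        _contentLength = _rawBody.Length;    // stand-in for real parsing work

        // Drop the reference so the string can be collected even though
        // this RequestContext stays reachable for the rest of the request.
        _rawBody = null;
    }
}
```

Without the final assignment the string would stay reachable for as long as the RequestContext does, which is exactly the kind of thing that lets short-lived data age into an older generation.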

Generally, see Chapter 5 of the Performance and Scalability PAG (https://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnpag/html/scalenetchapt05.asp) for other good practices.

Other than that, it's really hard to give any meaningful advice without looking at a profile.