New In Orcas Part 3: GC Latency Modes

As you may know, there are different GC modes to choose from depending on the type of application you’re using:  Server GC, Workstation GC, and Concurrent GC (more info).  These settings are process-wide, set at the beginning of the process.  Once the GC mode is set, it cannot be changed. 
In Orcas, we’ve added the concept of GC Latency Modes, which are also process-wide but can be changed during the lifetime of the process to meet an application’s needs.
The Latency Mode is exposed as a new property on the GCSettings class:

System.Runtime.GCLatencyMode System.Runtime.GCSettings.LatencyMode { get; set; }

The values for GCLatencyMode are Batch, Interactive and LowLatency.

  • Batch:  This mode is designed for maximum throughput, at the expense of responsiveness.   It is best for applications with no UI or server-side operations and is equivalent to Workstation GC without Concurrent GC.   If Concurrent GC is enabled, switching to Batch mode will prevent any further concurrent collections.  This is the only valid mode for Server GC.
  • Interactive:  This mode balances responsiveness with throughput.  It is designed for applications with UI and is the default Latency Mode, equivalent to Workstation GC with Concurrent GC.   This mode is not available on Server GC.  
  • LowLatency:  This mode is meant for short-term, time-sensitive operations where interruptions from the GC may be disruptive, like animation rendering or data acquisition functions.  This mode is not available on Server GC.
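
To make the list above concrete, here is a minimal sketch that reads the current GC configuration and tries out a mode switch. The class name LatencyModeInfo and its methods are made up for this illustration; only GCSettings.IsServerGC and GCSettings.LatencyMode come from the framework. What Probe returns depends on the GC flavor the process was started with, so treat the result as something to inspect, not a guarantee.

```csharp
using System;
using System.Runtime;

// Hypothetical helper for illustration; not part of the framework.
static class LatencyModeInfo
{
    // Describes the GC flavor chosen at startup and the current latency mode.
    public static string Describe()
    {
        string flavor = GCSettings.IsServerGC ? "Server" : "Workstation";
        return flavor + " GC, latency mode: " + GCSettings.LatencyMode;
    }

    // Requests a mode, returns what the runtime reports afterwards,
    // and always restores the mode that was active before the call.
    public static GCLatencyMode Probe(GCLatencyMode requested)
    {
        GCLatencyMode previous = GCSettings.LatencyMode;
        try
        {
            GCSettings.LatencyMode = requested;
            return GCSettings.LatencyMode;
        }
        finally
        {
            GCSettings.LatencyMode = previous;
        }
    }
}
```

Calling Console.WriteLine(LatencyModeInfo.Describe()) at startup is a cheap way to confirm which configuration your process is actually running under before you rely on a particular mode.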

How does LowLatency mode work?

When you set the latency mode to LowLatency, the GC will perform almost no generation 2 collections, nor will it start any new concurrent collections.  Since generation 2 is unbounded and can become very large, collecting it can cause your managed threads to pause for short amounts of time.  This can be unacceptable for certain scenarios. 

To be clear, I’m not talking about real-time application requirements, rather requirements that a short-running block of code run smoothly with minimal interruptions from the runtime. LowLatency mode is not real-time mode.

I mentioned above that in LowLatency mode, the GC will perform almost no generation 2 collections, but there are situations when it will.  As we know, there are three things that cause the GC to perform a collection (more info):

  1. Allocation exceeds the Gen0 threshold – normally, these collections can escalate into full heap collections.  With LowLatency, generation 1 is the maximum generation that will be collected, possibly promoting objects to generation 2.
  2. System.GC.Collect is called – this will continue to work as expected.  If you specify to collect generation 2, the GC will honor your request regardless of the Latency Mode.
  3. The system is in a low-memory situation – the OS has raised an event telling the runtime that it is low on system memory.  In this case the GC will perform a generation 2 collection to attempt to free memory.  The alternative is to allow the OS to begin paging, which will generally have worse pause times than a full collection.
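
The second case in the list above can be checked with a small experiment. This is a sketch, and the class and method names are made up for the illustration: it switches to LowLatency, explicitly requests a generation 2 collection, and uses GC.CollectionCount to see how many generation 2 collections happened in between.

```csharp
using System;
using System.Runtime;

// Hypothetical demo class for illustration; not part of the framework.
static class ExplicitCollectDemo
{
    // Returns how many generation 2 collections ran while an explicit
    // GC.Collect(2) was issued under LowLatency mode.
    public static int ForcedGen2Collections()
    {
        GCLatencyMode oldMode = GCSettings.LatencyMode;
        try
        {
            GCSettings.LatencyMode = GCLatencyMode.LowLatency;

            int before = GC.CollectionCount(2);
            GC.Collect(2); // explicit request for a full collection
            return GC.CollectionCount(2) - before;
        }
        finally
        {
            // restore the previous mode even if something above throws
            GCSettings.LatencyMode = oldMode;
        }
    }
}
```

If explicit requests are honored as described, the returned count should be at least one. The practical consequence is that a stray GC.Collect() anywhere in the process will still trigger a full collection inside your time-sensitive block.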

How to safely use LowLatency mode

As you might have guessed, since generation 2 is rarely collected, OutOfMemoryExceptions are more likely under LowLatency mode.  Here are some guidelines to follow to avoid potential problems:

  • Keep the amount of time spent in LowLatency as short as possible.  Remember, you’re changing the behavior of the GC, which can lead to sub-optimal performance in the long run. 
  • While in LowLatency mode, minimize the number of allocations you make, in particular allocations onto the Large Object Heap and pinned objects.
  • Be mindful of other threads that could be allocating.  Remember, these settings are process-wide, so you could generate an OutOfMemoryException on any thread that may be allocating.
  • Wrap the LowLatency code in a CER (more info).
  • Remember to set the latency mode back to avoid hard-to-debug OutOfMemoryExceptions later.
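
One way to follow the allocation guideline is to create every buffer you need before entering LowLatency mode and only write into it during the time-sensitive block. The sketch below is an illustration under assumptions: SampleRecorder and its capacity are invented for the example, and the buffer is sized to stay under 85,000 bytes, the size at which objects are placed on the Large Object Heap.

```csharp
using System;

// Hypothetical preallocated recorder for illustration; not part of the framework.
sealed class SampleRecorder
{
    // 16,384 ints * 4 bytes = 65,536 bytes, below the 85,000-byte LOH threshold.
    private const int MaxSamples = 16384;

    // Allocated once, up front, before the time-sensitive block starts.
    private readonly int[] samples = new int[MaxSamples];
    private int count;

    public int Count { get { return count; } }

    // Stores a sample without allocating; returns false when the buffer is full.
    public bool TryRecord(int value)
    {
        if (count == samples.Length) return false;
        samples[count++] = value;
        return true;
    }

    // Computes the mean of the recorded samples (0 if none were recorded).
    public double Average()
    {
        if (count == 0) return 0.0;
        long sum = 0;
        for (int i = 0; i < count; i++) sum += samples[i];
        return (double)sum / count;
    }
}
```

Construct the recorder before switching modes, call TryRecord inside the time-sensitive block, and do any processing that allocates (formatting, logging, copying) only after the latency mode has been restored.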

Here’s a code sample of how to use LowLatency mode:

// requires: using System; using System.Runtime; using System.Runtime.CompilerServices;
// preallocate objects here
GCLatencyMode oldMode = GCSettings.LatencyMode;
RuntimeHelpers.PrepareConstrainedRegions();
try
{
    GCSettings.LatencyMode = GCLatencyMode.LowLatency;      

    // perform time-sensitive actions here

    /*
    minimize:
    -all allocations, especially LOH allocations
    -pinning
    -allocations on other threads
    */
}
catch (ApplicationException)
{
    // catch any exceptions you expect your application to throw
    // perform cleanup code
}
finally
{
    // always set the mode back! 
    GCSettings.LatencyMode = oldMode;
}
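
If you use this pattern in more than one place, it may be worth folding the save/set/restore steps into a helper so the restore can’t be forgotten. The following is only a sketch: LowLatencyScope is a made-up name, and it leaves out the CER and the catch block from the sample above for brevity.

```csharp
using System;
using System.Runtime;

// Hypothetical helper for illustration; not part of the framework.
static class LowLatencyScope
{
    // Runs 'work' with the GC in LowLatency mode and always restores
    // the previously active latency mode, even if 'work' throws.
    public static void Run(Action work)
    {
        if (work == null) throw new ArgumentNullException("work");

        GCLatencyMode oldMode = GCSettings.LatencyMode;
        try
        {
            GCSettings.LatencyMode = GCLatencyMode.LowLatency;
            work(); // keep this short and allocation-free where possible
        }
        finally
        {
            // always set the mode back
            GCSettings.LatencyMode = oldMode;
        }
    }
}
```

A call site then looks like LowLatencyScope.Run(delegate { /* time-sensitive work */ });.  Exceptions thrown by the delegate still propagate to the caller, so any expected exceptions need to be handled there.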

Remember, this mode can cause failures in your application, so please use good judgment when using it.