WP7 Mango: The new Generational GC


In my previous post “Mark-Sweep collection and how does a Generational GC help” I discussed how a generational Garbage Collector (GC) works and how it reduces collection latencies. Those latencies show up as long load times (at startup as well as in other load situations such as game level load) and as gameplay or animation jitter and glitches. In this post I want to discuss how those general principles apply specifically to the WP7 Generational GC (GenGC).

Generations and Collection Types

We use 2 generations on WP7, referred to as Gen0 and Gen1. A collection can be any of the following 4 types:

  1. An ephemeral or Gen0 collection that runs frequently and only collects Gen0 objects. Objects that survive a Gen0 collection are promoted to Gen1
  2. Full mark-sweep collection that collects all managed objects (both Gen1 and Gen0)
  3. Full mark-sweep-compact collection that collects all managed objects (both Gen1 and Gen0)
  4. Full-GC with code-pitch. This is run under severe low memory and can even throw away JITed code (something that desktop CLR doesn’t support)

The list above is in order of increasing latency (that is, the time each type takes to run).

Collection triggers

GC triggers remain the same as outlined in my previous post “WP7: When does the GC run”. The distinction between #2 and #3 above is that at the end of every full GC the collector considers memory fragmentation and may run the memory compactor as well.

  1. After significant allocation
    After a significant amount of managed allocation the GC is started. The amount today is 1MB (called the GC quanta) but is open to change. This GC can be an ephemeral or a full GC. In general it’s an ephemeral collection. However, it may be a full collection in the following cases
    1. After significant promotion of objects from Gen0 to Gen1 the collections become full collections. Today 5MB of promotion triggers a full GC (again this number is subject to change).
    2. The application’s total memory usage is close to the maximum memory cap that apps have (very little free memory left). This indicates that the application will get terminated if memory utilization is not cut back.
    3. Piling up of native resources. We use heuristics such as the native-to-managed memory ratio and the finalizer queue to detect whether the GC needs to switch to a full collection to release native resources that are being held up by Gen0-only collections
  2. Resource allocation failure
    Any resource allocation failure means that the system is under memory pressure, and hence such collections are always full collections. This can lead to code pitching as well
  3. User code triggered GC
    User code can start collections via the System.GC.Collect() managed API. This results in a full collection, as documented by that API. We have not added the method overload System.GC.Collect(generation). Hence there is no way for the developer to start an ephemeral or Gen0-only collection
  4. Sharing server initiated
    The sharing server can detect phone-wide memory issues and start a GC in all running managed processes. These are full GCs and can potentially pitch code as well.
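
The allocation-triggered decision in trigger 1 above can be sketched as a small model. This is only an illustrative sketch in Python, not the CLR's implementation. The 1MB quanta and the 5MB promotion threshold come from the text; the names HeapState and choose_collection, the two boolean flags, and the use of >= at the thresholds are assumptions standing in for heuristics the post does not spell out.

```python
from dataclasses import dataclass

MB = 1024 * 1024
GC_QUANTA = 1 * MB            # managed allocation that starts a GC (value from the post)
PROMOTION_THRESHOLD = 5 * MB  # Gen0 -> Gen1 promotion that forces a full GC (from the post)


@dataclass
class HeapState:
    """Hypothetical snapshot of the counters the trigger policy looks at."""
    allocated_since_gc: int = 0       # bytes allocated since the last collection
    promoted_since_full_gc: int = 0   # bytes promoted to Gen1 since the last full GC
    near_memory_cap: bool = False     # assumed flag: app is close to its memory cap
    native_pressure: bool = False     # assumed flag: native resources are piling up


def choose_collection(state: HeapState) -> str:
    """Return 'none', 'ephemeral' or 'full' for an allocation-triggered GC."""
    if state.allocated_since_gc < GC_QUANTA:
        return "none"  # not enough allocation yet to start a GC
    if (state.promoted_since_full_gc >= PROMOTION_THRESHOLD
            or state.near_memory_cap
            or state.native_pressure):
        return "full"  # one of the three escalation cases applies
    return "ephemeral"  # the common case: a Gen0-only collection
```

For example, choose_collection(HeapState(allocated_since_gc=2 * MB)) yields an ephemeral collection, while the same state with 6MB of promotion yields a full one. Resource allocation failures, System.GC.Collect() and sharing-server requests are not modeled here because they always result in full collections.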

 

So from all of the above, the 3 key takeaways are:

  1. Low memory or memory cap related collections are always full-collections. These could also turn out to be the more costly compacting collection and/or pitch JITed code
  2. Collections are in general ephemeral and become full-collection after significant object promotion
  3. There are no fundamental changes to the GC trigger policies, so an app written for WP7 will not see any major change in the number of GCs that happen. Some GCs will be ephemeral and others will be full GCs.

 

Write Barriers/Card-table

As explained in my previous post, to keep track of Gen1 to Gen0 references we use a write barrier and a card table.

The card table can be visualized as a memory bitmap. Each bit in the card table covers n bytes of the net address space. Each such bit is called a Card. For managed reference updates like A.b = C, in addition to JITing the real assignment, calls are added to write-barrier functions. The write barrier locates the Card corresponding to the address of the write and sets it. Later, during collection, the collector checks all Gen1 objects covered by a set card bit and marks the Gen0 references in those objects.
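
The bitmap idea can be illustrated with a short model. This is a minimal sketch in Python under stated assumptions, not the hand-tuned assembly the CLR uses: the card size of 256 bytes is an assumed value (the post only says "n bytes"), and CardTable, write_reference and the dictionary used as a heap are hypothetical names invented for the illustration.

```python
CARD_SIZE = 256  # bytes covered by one card; assumed value, the post only says "n bytes"


class CardTable:
    """Bitmap with one bit (a Card) per CARD_SIZE bytes of address space."""

    def __init__(self, heap_bytes: int) -> None:
        n_cards = (heap_bytes + CARD_SIZE - 1) // CARD_SIZE
        self.bits = bytearray((n_cards + 7) // 8)

    def mark(self, address: int) -> None:
        """Set the card that covers the given address."""
        card = address // CARD_SIZE
        self.bits[card >> 3] |= 1 << (card & 7)

    def is_set(self, address: int) -> bool:
        """Report whether the card covering the given address is set."""
        card = address // CARD_SIZE
        return bool(self.bits[card >> 3] & (1 << (card & 7)))

    def dirty_cards(self) -> list:
        """Card indices the collector would need to scan for Gen0 references."""
        n_cards = len(self.bits) * 8
        return [c for c in range(n_cards) if self.bits[c >> 3] & (1 << (c & 7))]

    def clear(self) -> None:
        """Reset every card, as a collector might after processing them."""
        for i in range(len(self.bits)):
            self.bits[i] = 0


def write_reference(heap: dict, table: CardTable, field_address: int, target) -> None:
    """Model of A.b = C: do the real store, then run the write barrier."""
    heap[field_address] = target   # the real assignment
    table.mark(field_address)      # the write barrier sets the covering card
```

With a 4096-byte heap, a reference store at address 300 sets card 1, which covers addresses 256 to 511; during a Gen0 collection only the objects under that card would need to be examined for Gen1 to Gen0 references.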

The write barrier essentially brings two additional costs to the system.

  1. Memory cost of adding those calls to the WB in the JITed code
  2. Cost of executing the write barrier when modifying a reference

Both of the above are optimized to ensure they have minimum execution impact. We only JIT calls to the WB when absolutely required, and even then the overhead is a single instruction to make the call. The write barriers are hand-tuned assembly code to ensure they take minimum cycles. In effect the net hit on process memory due to write barriers is well below 0.1%. The execution hit in real-world application scenarios is also not generally measurable (other than in specifically targeted testing).

Differences from desktop

In principle both the desktop GC and the WP7 GC are similar in that they use mark-sweep generational GC. However, there are differences based on the fact that the WP7 GC targets a more constrained device.

  1. 2 generations as opposed to 3 on the desktop
  2. No background or incremental collection supported on the phone
  3. WP7 GC has additional logic to track and handle application policies like application memory caps and total memory utilization
  4. The phone CLR uses a very different memory layout, which is pooled rather than linear. There is therefore no concept of a Large Object Heap, and the lifetime of large objects is handled no differently from that of other objects
  5. No support for particular generation collection from user code
Comments (6)

  1. Wil says:

    Any idea if/when this GC change is coming to .NET CF (not WP7)?  I ask because I have a high performance app that runs on a dedicated CE6/.NET CF 3.5 platform that has game like characteristics, and suffers from the GC latencies that you describe.  Silverlight and XNA are not suitable for the application.  We also suffer from the lack of profiling tools / difference between profiling on CF vs. the full framework (as mentioned in comments on your previous post).

  2. Tony says:

    Has the Windows Phone team actually considered an opt-out of garbage collection? I would like to manage my resources myself, because I know best what to do and WHEN to clean up 🙂

  3. Tony, I'm not sure what you are asking here. If you are referring to a non-GC'd system, you are referring to native programming (e.g. C++), and for that the answer is that there are currently no plans to support it.

    If you are referring to a non-GC'd system inside .NET, then the answer is that it's not possible, because "manage my resources" doesn't really mean anything there. If you call an API and get back a string, who owns it, and who is responsible for de-allocating it? Is it "mine", "yours" or "theirs"? Should the caller, the callee, or perhaps the downstream .NET BCL code that allocated it de-allocate it?

    "because I know best": unfortunately that is not 100% correct. Native programmers manage their own resources, and it's an extremely hard (if not the hardest) problem in any system of non-trivial size. E.g. you said "WHEN to cleanup", but what about WHO cleans up? For every API returning or accepting objects, an object lifetime contract covering who/when/cloning (deep vs shallow) has to be developed and adhered to. The hardest bugs to fix are memory leaks and memory corruption issues.

    So what I am trying to say is that native programming is not supported today and there are no plans to do so right now. Managed programming without GC doesn't exist because the CLR specification doesn't support it. GC is "the biggest" value addition of managed programming.

  4. Vijay says:

    Very good and informative article, Abhinaba. I have 1 question. When you state that "No background or incremental collection supported on the phone " ..

    Does that mean that even with Generational GC, when the GC occurs, all the applications are paused ?

  5. Vijay, GC is per-process. So the managed execution in the process in which the GC is happening is paused.

    Background GC is a way in which even the current process can execute "most" of the time when there is a GC happening in that process. This is supported on the desktop but not on the phone….

  6. Todd says:

    Hi, I'm wondering about the guidance around calling GC.Collect() manually. I keep hearing that it shouldn't be done or that there is no need to do it. But in one of my apps I will certainly go over the ApplicationPeakMemoryUsage limit of 90MB if I don't call it regularly.

    It's not an overly complex app. I just have a bunch of pictures in IsolatedStorage that I bring up and display, one after the next. I maintain a constant cache of WriteableBitmaps from .LoadJPG of currentPic, previousPic and nextPic (i.e. only those three are assigned to variables, so that when you click "Next", the currentPic becomes the previousPic, the nextPic becomes the currentPic and is displayed in an Image control, and a new nextPic is brought out of isolated storage). Pretty simple.

    If I don't call GC.Collect() after each new re-assignment of the pic variables, the ApplicationPeakMemoryUsage keeps growing and will eventually hit the 90 MB limit. If I call it with each new pic, I maintain a pretty low ApplicationPeakMemoryUsage of 25-27 MB.

    So, back to my question – is calling GC.Collect() recommended in some situations (like above)? What about GC.WaitForPendingFinalizers()?