W is for… WeakReference

With most of the major development technologies from Microsoft beginning with a “W” – Windows Presentation Foundation, Windows Communication Foundation, Windows Workflow Foundation – there was no shortage of choices for today’s post.  But somehow those just seemed too easy, and well, a bit large to tackle in one post.

So, from the depths of the .NET Framework, I settled on a class called WeakReference.  A weak reference is a reference to an object that you can still access but that does not keep the object from being a candidate for garbage collection.  It’s essentially a shade of grey between the absolutes of having an object in memory that has a strong reference (e.g., you’ve got a variable referencing the object) and an object that’s been marked for garbage collection but hasn’t yet been reclaimed.

By creating a weak reference to an object, you’re saying that you’d like to still be able to use it in the future, but it’s not *that* important that it can’t be reclaimed by the garbage collector – in which case, you’ll take the hit for recreating it.

Let’s take a look at an example here. It’s a simplified version of the sample from the MSDN reference page, and dare I say, a bit more interesting, since it forces a garbage collection so that you can actually see the weak reference scenario kick in.

The context here is a cache, specifically a wrapper for a Dictionary<int, WeakReference> in which each weak reference targets a String; of course, the String could be any object here.  The strings are initialized to just the word “Item” followed by their index in the dictionary.  The indexer (line 15) casts the weak reference's Target to a strong reference in line 19 to determine whether the object is still available.  If the null test (line 22) succeeds, that is, if the strong reference is null, we know the object has been garbage collected, and we have to create a new one (line 24).

Note, there is an IsAlive property on WeakReference which will return true if the object reference has not yet been garbage collected; however, be aware that in the instant between the positive test for IsAlive and the actual reference of the object, that object could be garbage-collected.  It’s recommended that you do the cast and then check for null as I did below.

    1: public class Cache
    2: {
    3:     Dictionary<int, WeakReference> Items;
    4:     public int Count { get { return Items.Count;} }
    5:  
    6:     String ItemFactory(int i) { return "Item " + i.ToString(); }
    7:  
    8:     public Cache(int count)
    9:     {
   10:         Items = new Dictionary<int, WeakReference>(count);
   11:         for (int i = 0; i < count; i++)
   12:             Items[i] = new WeakReference(ItemFactory(i), false);
   13:     }
   14:  
   15:     public String this[int index]
   16:     {
   17:         get
   18:         {
   19:             String d = (String) Items[index].Target;
   20:             Console.WriteLine("Object at {0,2}: {1}", index.ToString(), 
   21:                 d == null ? "Regenerated" : "Original");
   22:             if (d == null)
   23:             {
   24:                 d = ItemFactory(index);
   25:                 Items[index] = new WeakReference(d, false);
   26:             }
   27:             return d;
   28:         }
   29:     }
   30: }
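
To make the IsAlive caveat a bit more concrete, here is a minimal sketch of the two access patterns side by side.  The class and method names (WeakReferenceAccess, ReadWithIsAlive, ReadWithNullCheck) and the recreate delegate are my own inventions for illustration; only WeakReference, IsAlive and Target come from the framework.

```csharp
using System;

public static class WeakReferenceAccess
{
    // Risky pattern: the target can be collected between the IsAlive test
    // and the read of Target, so the caller may still get null back.
    public static string ReadWithIsAlive(WeakReference weak)
    {
        if (weak.IsAlive)
            return weak.Target as string;   // may already be null here
        return null;
    }

    // Recommended pattern: copy Target into a local first. The local is a
    // strong reference, so once it is non-null the object cannot be collected
    // while we are using it.
    public static string ReadWithNullCheck(WeakReference weak, Func<string> recreate)
    {
        string strong = weak.Target as string;
        if (strong == null)
            strong = recreate();            // target was collected (or never set)
        return strong;
    }
}
```

The second method is the same idea as lines 19 through 26 of the Cache indexer, just pulled out on its own.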

Now, let’s take a look at a simple console application to show the weak references in action.  I create a Cache object with ten items, and then launch a loop that accesses items in the cache ‘randomly’ (using the random number generator created in line 3).  The twist here is that on line 13, a garbage collection is forced on every fifth access to the cache.  That will result in the cached strings being garbage collected, since the weak references are the only thing pointing at them (apart from any string that something else still references strongly at that moment).

    1: public static void Main()
    2: {
    3:     Random r = new Random();
    4:     Cache c = new Cache(10);
    5:  
    6:     // Randomly access objects in the cache.
    7:     for (int i = 0; i < 100; i++)
    8:     {
    9:         Console.Write("{0,2}: ", i);
   10:  
   11:         String s = c[r.Next(c.Count)];
   12:  
   13:         if ((i + 1) % 5 == 0) GC.Collect();
   14:     }
   15:     Console.ReadLine();
   16: }

So, let’s take a look at the output, below.  The first five lines demonstrate accessing the weak references set up in the cache constructor; no garbage collection has yet occurred.  The garbage collection occurs at the end of the fifth iteration, and all the objects targeted by the weak references in the cache are reclaimed.  As a result, when the 3rd item is referenced at iteration 5, the strongly cast reference to Items[index].Target (line 19 in the first code snippet above) results in null, and a new object and weak reference are subsequently created.  On iteration 8, we get lucky, because the object at index 0 was just recreated in iteration 6.

One thing this example drives home is that the effectiveness of this approach in terms of implementing a cache pattern depends on the frequency of the garbage collection.  Here, I forced garbage collection for the sake of example, but it could well be the case that the frequency and cost of re-creating objects outweigh the benefits that the cache was meant to provide.  Jeffrey Richter in CLR via C# cautions:

The problem with this technique is the following: Garbage collections do not occur when memory is full or close to full.  Instead, garbage collections occur whenever generation 0 is full, which occurs approximately after every 256 KB of memory is allocated.  So objects are being tossed out of memory much more frequently than desired, and your application’s performance suffers greatly.

… Basically you want your cache to keep strong references to all of your objects and then, when you see that memory is getting tight, you start turning strong references into weak references…some people have had much success by periodically calling the Win32 GlobalMemoryStatusEx function and checking the returned MEMORYSTATUSEX structure’s dwMemoryLoad member.  If this member reports a value above 80, memory is getting tight, and you can start converting strong references to weak references based on whether you want a least-recently used algorithm, a most-frequently used algorithm, a time-base algorithm, or whatever.
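
Here is a rough sketch of what that suggestion might look like.  This is my own simplification, not code from the book: DemotingCache, Get, OnMemoryLoad and StrongCount are hypothetical names, the memory load percentage is passed in by the caller rather than read from GlobalMemoryStatusEx, and it demotes every entry at once instead of applying a least-recently-used or similar policy.  The threshold of 80 is the figure from the quote.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: entries start out strongly referenced and are
// demoted to weak references only when the caller reports memory pressure.
public class DemotingCache
{
    private readonly Dictionary<int, object> strong = new Dictionary<int, object>();
    private readonly Dictionary<int, WeakReference> weak = new Dictionary<int, WeakReference>();
    private readonly Func<int, object> factory;

    public DemotingCache(Func<int, object> factory) { this.factory = factory; }

    public int StrongCount { get { return strong.Count; } }

    public object Get(int key)
    {
        object value;
        if (strong.TryGetValue(key, out value)) return value;

        WeakReference w;
        if (weak.TryGetValue(key, out w))
        {
            value = w.Target;          // take a strong reference, then test it
            weak.Remove(key);
        }
        if (value == null) value = factory(key);   // collected, or never created
        strong[key] = value;           // promote back to a strong reference
        return value;
    }

    // memoryLoadPercent is assumed to come from something like dwMemoryLoad.
    public void OnMemoryLoad(int memoryLoadPercent)
    {
        if (memoryLoadPercent <= 80) return;       // memory is not tight yet
        foreach (KeyValuePair<int, object> pair in strong)
            weak[pair.Key] = new WeakReference(pair.Value, false);
        strong.Clear();                // only weak references remain
    }
}
```

A real implementation would presumably demote selectively and poll the memory load on a timer; the point here is only the strong-to-weak handoff.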

If you look closely at the documentation for WeakReference, you’ll note there is a property called TrackResurrection as well as an overloaded constructor to set this value when creating the weak reference.  The default value, which I relied on above, is false, indicating a short weak reference.  The alternative, a long weak reference, continues to track the object even after it has been finalized, so the object can still be retrieved at that point.

If your object has a finalizer (or destructor), the garbage collection process actually has to ‘resurrect’ the object to allow the finalizer to run.  If you have a long weak reference, you can still get the object back through Target at this stage; with a short weak reference, you can’t.  Note that if you did retrieve the object through a long reference at this point, the object has been finalized, so you may end up with an object that has an indeterminate state in terms of the unmanaged resources that the finalizer was cleaning up.  To cut to the chase, safe use of long weak references is not for the meek!
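
If you want to poke at the difference yourself, a small experiment along these lines should do.  Finalizable, ResurrectionDemo, Create and Run are names I made up; the only framework pieces are the WeakReference(object, bool) constructor, IsAlive, TrackResurrection and the GC methods.  I'm deliberately not showing expected output, since what gets printed depends on the runtime and on whether you build in debug or release mode.

```csharp
using System;
using System.Runtime.CompilerServices;

public class Finalizable
{
    // Stand-in for a finalizer that would release an unmanaged resource.
    ~Finalizable() { }
}

public static class ResurrectionDemo
{
    // Kept in its own non-inlined method so that no local variable in Run
    // holds a strong reference to the object.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void Create(out WeakReference shortRef, out WeakReference longRef)
    {
        Finalizable target = new Finalizable();
        shortRef = new WeakReference(target, false);  // short: trackResurrection = false
        longRef = new WeakReference(target, true);    // long: trackResurrection = true
    }

    public static void Run()
    {
        WeakReference shortRef, longRef;
        Create(out shortRef, out longRef);

        // First collection: the object is unreachable, but it has a finalizer,
        // so it is typically kept around until the finalizer has run.
        GC.Collect();
        Console.WriteLine("after first collection:  short alive = {0}, long alive = {1}",
            shortRef.IsAlive, longRef.IsAlive);

        // Let the finalizer run, then collect again to reclaim the memory.
        GC.WaitForPendingFinalizers();
        GC.Collect();
        Console.WriteLine("after second collection: short alive = {0}, long alive = {1}",
            shortRef.IsAlive, longRef.IsAlive);
    }
}
```

Calling ResurrectionDemo.Run() from Main is enough to try it out.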