WeakReference collections and the heavy WeakReference class


Briefly diverging from regular posts: here’s something really fun. As far as I know, prior to 4.0, .NET has had two built-in ways to do weak object references.

1) WeakReference

2) GCHandle, which is what WeakReference uses under the covers.
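For readers who have not used both, here is a minimal sketch of the two mechanisms side by side. The class and method names (`WeakBasics`, `BothSeeTarget`) are made up for illustration; only `WeakReference`, `GCHandle.Alloc`, `GCHandleType.Weak` and `GCHandle.Free` are framework APIs.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper class, for illustration only.
static class WeakBasics
{
    // Returns true when both weak mechanisms resolve to the same live object.
    public static bool BothSeeTarget(object target)
    {
        // 1) The class-based wrapper: a heap object with a finalizer.
        WeakReference weakRef = new WeakReference(target);

        // 2) The raw handle: a struct with no finalizer, so it must be
        //    freed explicitly or the underlying handle leaks.
        GCHandle handle = GCHandle.Alloc(target, GCHandleType.Weak);
        try
        {
            // 'target' is still strongly referenced by the parameter here,
            // so neither weak reference can have been cleared yet.
            return ReferenceEquals(weakRef.Target, target)
                && ReferenceEquals(handle.Target, target);
        }
        finally
        {
            handle.Free();
        }
    }
}
```

The practical difference is who cleans up: WeakReference releases its handle for you (via its finalizer), while a bare GCHandle leaves that job to the caller.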

If you have ever tried to use WeakReference in a collection, e.g. to maintain a ‘weak set’ or a ‘weak list’ as you might want in a weak event notification scheme, and then tried to stress test your solution for scalability, you will have found some slightly interesting problems.

 

Problem 1. How do you keep the collection clean of dead entries, which are otherwise a memory leak?

This is actually a tricky problem. Say we’re using a List<WeakReference> to track event listeners. Even after a listener in the weak event notification scheme has been collected, and the WeakReference.Target property that pointed to it returns null, the WeakReference itself is still an object, which sucks up memory by virtue of being stuck in our list. Perf geeks will also notice that the WeakReference objects the list points to can have poor locality of reference.

Anyway, there are three basic approaches I know of so far:

Approach one is to search for dead entries and evict them in response to regular calls, e.g. every time you search the list, or every time you add a new entry. If your list were large, this could potentially make searching or adding items quite slow.
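A minimal sketch of approach one, assuming a hypothetical `WeakList` class (not a framework type) that evicts on every add. The O(n) scan inside `Add` is exactly the cost described above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
class WeakList
{
    private readonly List<WeakReference> entries = new List<WeakReference>();

    // Number of entries currently stored, including any dead ones
    // that have not been evicted yet.
    public int Count { get { return entries.Count; } }

    public void Add(object target)
    {
        PurgeDead(entries);                      // O(n) scan on every add
        entries.Add(new WeakReference(target));
    }

    // Copies out strong references to the targets that are still alive.
    public List<object> LiveTargets()
    {
        List<object> live = new List<object>();
        foreach (WeakReference entry in entries)
        {
            object target = entry.Target;        // read once into a strong reference
            if (target != null) live.Add(target);
        }
        return live;
    }

    // Removes entries whose target is gone; returns how many were removed.
    public static int PurgeDead(List<WeakReference> list)
    {
        return list.RemoveAll(w => w.Target == null);
    }
}
```

Note that `Target` is read once into a local before the null check. Checking `IsAlive` and then reading `Target` separately would leave a window in which the collector could clear the reference in between.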

Approach two is to have a thread pool or dispatcher work item do eviction periodically. If you take the thread pool approach, you need to think about thread safety. If you take the dispatcher approach, you are binding yourself to client scenarios that have a dispatcher loop, and you run into other questions, such as what priority eviction should happen at.
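Here is a rough sketch of the thread pool flavor of approach two, using System.Threading.Timer (whose callbacks run on thread pool threads). `TimedWeakList` and its members are hypothetical names; the point is only that every access to the list now needs the same lock the eviction scan takes.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical helper, for illustration only.
class TimedWeakList : IDisposable
{
    private readonly object gate = new object();
    private readonly List<WeakReference> entries = new List<WeakReference>();
    private readonly Timer timer;

    public TimedWeakList(TimeSpan period)
    {
        // The callback fires on a thread pool thread once per period.
        timer = new Timer(state => { Evict(); }, null, period, period);
    }

    public void Add(object target)
    {
        lock (gate) { entries.Add(new WeakReference(target)); }
    }

    public int Count
    {
        get { lock (gate) { return entries.Count; } }
    }

    // Removes dead entries; returns how many were removed.
    // Callers of Add and Count are blocked while the scan runs.
    public int Evict()
    {
        lock (gate) { return entries.RemoveAll(w => w.Target == null); }
    }

    public void Dispose()
    {
        timer.Dispose();
    }
}
```

Adds are now cheap, but the scan holds the lock for its whole duration, so a large list can still stall callers once per period.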

Approach three is the ‘Who cares, this doesn’t apply to my app!’ approach. Ignore the problem, as in many apps we might not have really large listener subscription collections for a long enough time to increase our overall memory profile. (Yes, your app is failing some kind of stress test, but those are artificial scenarios.)

 

Problem 2. WeakReference is an object

Being an object can be bad for performance as I hinted above. To explain:

Imagine that we can use a WeakReference[] array and that WeakReference is a struct. Then, as we iterate through the array and dereference WeakReference.Target, all the different Target fields in the WeakReference structures should have good locality of reference. The targets of the weak references may have poor locality of reference, but there is probably nothing we can do about that.

On the other hand, suppose we use a WeakReference[] array and WeakReference is a class. Then, iterating through the array, we first have to dereference array[i] to get a WeakReference w. Then we also have to dereference w.Target, the target of the weak reference. Short story: we have two unknown-locality pointer dereferences instead of one.
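To make the struct case concrete, here is a sketch of what such a value type could look like if built on GCHandle. `WeakSlot` is a hypothetical name, not a framework type, and not necessarily a design to recommend. As far as I understand it, the handle still goes through the runtime’s own handle storage, so this only removes the hop through a separate heap object.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical value type, for illustration only.
struct WeakSlot
{
    private GCHandle handle;

    public WeakSlot(object target)
    {
        handle = GCHandle.Alloc(target, GCHandleType.Weak);
    }

    // Null if the slot was never allocated, was freed, or the target was collected.
    public object Target
    {
        get { return handle.IsAllocated ? handle.Target : null; }
    }

    // There is no finalizer to fall back on: the owner must call this,
    // otherwise the underlying handle leaks.
    public void Free()
    {
        if (handle.IsAllocated) handle.Free();
    }
}
```

In a WeakSlot[] the slots sit contiguously inside the array itself, so scanning them does not chase a pointer per entry. Call Free on the array element directly (slots[i].Free()); a List<WeakSlot> indexer would hand back a copy, and freeing the copy would leave a stale handle in the list.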

Yes, but all this has nothing to do with stress, perf and memory leaking… right? Yes and no. Yes: it does nothing to address the problem of keeping the collection clean of dead entries. No: it has everything to do with general perf.

If we are using approach one, then the locality of reference issue would be making all our array searches slower. (But hopefully our scans evict things well enough that we don’t actually run out of memory.)

If we are using approach two, then the locality of reference issue would be making all our eviction scans slower. And we might be locking during the scans.

If we are using approach three, then… it’s interesting.

For all three cases: is it literally cheaper to own, or to leak, a struct entry allocated in an array than an object on the heap? Well, the answer is yes! Objects have overhead. The heap doesn’t have to individually allocate array entries. The garbage collector doesn’t have to track array entries. But we will still run out of memory eventually.

(For more discussion of arrays of classes/structs and when to use which here’s a reference.)

Problem 3. WeakReference is finalizable (has a finalizer ~WeakReference())

We just mentioned that being an object incurs overhead. But being a finalizable object incurs even more overhead. This is why there are a lot of guidelines telling you never to write finalizers if you can avoid it. Seriously.

What overhead does it incur?

1) Your object goes on the ‘needs finalizing someday’ queue. (Calling GC.SuppressFinalize asks the runtime not to run the object’s finalizer.)

2) Your object eventually gets transferred from the ‘needs finalizing someday’ queue onto the ‘ready to finalize’ queue once there are no more references to it. The runtime processes this queue on a dedicated finalizer thread.

3) Until your object has been finalized, anything it references can’t be collected either. [For this WeakReference scenario, finalization may happen after it no longer refers to any live objects.]
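As a small illustration of the finalizer machinery listed above, here is a sketch of a finalizable class that opts out of finalization once it has been cleaned up explicitly. `TrackedResource` and its members are hypothetical names; `GC.SuppressFinalize` is the real framework call.

```csharp
using System;
using System.Threading;

// Hypothetical type, for illustration only.
class TrackedResource : IDisposable
{
    // Incremented only when a finalizer actually runs.
    public static int FinalizerRuns;

    public bool Disposed { get; private set; }

    public void Dispose()
    {
        Disposed = true;
        // Cleanup is already done, so ask the runtime not to call
        // the finalizer for this instance.
        GC.SuppressFinalize(this);
    }

    // Merely declaring this makes every instance finalizable,
    // whether or not the finalizer ever runs.
    ~TrackedResource()
    {
        // Finalizers run on another thread, hence the interlocked increment.
        Interlocked.Increment(ref FinalizerRuns);
    }
}
```

Suppressing finalization spares the object the trip through the ‘ready to finalize’ queue, but it still had to be registered as finalizable when it was created.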

For more detailed discussion of finalizers, there’s a good set of comprehensive guidelines via Joe Duffy’s blog, and general info via cbrumme’s blog. You can read all of that, and it’s basically going to tell you two things. Finalization is necessary for unmanaged resources. Finalization is probably a bad idea for most other cases.

If you read carefully, they do mention legitimate use cases for finalizers that do their work in managed code. But here is what I want to point out right now, based on my recent at-home experiments.

You may think using finalizers would be a clever way to figure out that there are some resources you can go clean up properly now. But you really don’t want to be creating a lot of finalizable objects. As in a potentially unbounded number. Because there is so much overhead in creating finalizable objects, it is a huge perf issue regardless of whether you have memory leaks or not. This, to me, makes Arrays, Lists, or Dictionaries based on WeakReference probably a bad idea in practice.

Using this intuition, from here on, I would be highly suspicious of any proposed object graph such as this, where the ‘N links’ (a set of N object references) are claimed to all be weak references using System.WeakReference.

[Image: object graph with ‘N links’]

Note that I intentionally qualified that statement by explicitly naming System.WeakReference as being just too heavy. There may indeed be some better alternatives…

