He’s live… he’s live not… he’s live…


I was making some code changes today and thought this was interesting to share. As you know, the WeakReference class has a getter and a setter method to get and set the Target which is what the weakref points to. See Using GC Efficiently – Part 3 for more details on WeakReference.


 


Note that the code below is only for illustration purposes – it is not necessarily what’s in the production code.


 


So let’s say the code used to look like this in the WeakReference class:


 


internal IntPtr m_handle;


 


public Object Target


{


    get


    {


        IntPtr h = m_handle;


        if (IntPtr.Zero == h)


            return false;


 


        Object o = GCHandle.InternalGet(h);


        h = Thread.VolatileRead(ref m_handle);


        GC.KeepAlive (this);


        return (h == IntPtr.Zero) ? null : o;


    }


 


   


}


 


m_handle is the weak GCHandle that we create to implemente the weakref funtionality. It’s a weak handle that points to the object that you want your weakref object to point to.


 


The problem is Thread.VolatileRead kind of a heavy weight thing – not very performant (there was a reason why we used this API in the first place…not necessarily a good one but it’s what we ended up with).


 


First of all let’s take a look at the old code. Notice that we have a GC.KeepAlive in there. Why would we do that? I mean if during the call of Thread.VolatileRead, the weakref object is dead and its finalizer sets m_handle to 0, we’d just read 0. That’s fine right?


 


But imagine this code:


 


Object o = new Object();


 


while (true)


{


    WeakReference wr = new WeakReference (o);


 


    if (wr.Target == null)


    {


       


    }


}


 


o.GetHashCode();


 


We know that o is live during the while loop which means it would be really nice if wr.Target is null (some would consider it a bug if wr.Target was ever null in this case). If we didn’t have the KeepAlive, object wr could be considered dead as soon as its address is passed in to the  Thread.VolatileRead call as the argument. So then h could be 0 and we’d return null. So we want to make sure that if the object the weakref points to is guaranteed to be live during the getter call, you will always get back that object instead of null.


 


Wouldn’t that achieve the same effect since m_handle is an instance data member and if it’s used, the *this* object should be kept alive where the last statement is?


 


Well, actually since m_handle is not a volatile, jit could generate some code that stores the value of m_handle in a register so it will not need to read the value of m_handle again when it’s at that last statement.


 


Another interesting thing about this code is that it does the IntPtr.Zero check after it first read the value of m_handle. But how can m_handle be 0? m_handle is only set to 0 in the WeakReference class’s finalizer code. If we are already KeepAlive-ing the weakref object, it means the object should be live therefore by definition the finalizer should have not been run, right?


 


Well, unfortunately there is a case where m_handle can be 0 while we are in the getter which is when the getter is called in an object’s finalizer. Imagine you have this object hireachy:


 


public class ObjectA


{


    WeakReference wr;


 


    public ObjectA(WeakReference wr0)


    {


       


        wr = wr0;


    }


 


   


 


    public ~ObjectA()


    {


       


        if (wr.Target == null)


        {


           


        }


       


    }


}


 


ObjectA a = new ObjectA();


 


When a is not used anymore, at some point both a’s and wr0’s finalizer (assuming a is the only object that contains a reference to wr0) will be put on the finalize queue.


 


Now if a’s finalizer gets to run first, we are fine ‘cause when we are in wr0’s getter, m_handle is still valid. But if wr0’s finalizer gets to run first, then when a’s finalizer is run, m_handle is already set to 0. Of course as we’ve been saying that a finalizer should do no more than releasing native resources, this shouldn’t be a common scenario.


 


So, the idea is to change m_handle to volatile and eliminate the need for calling Thread.VolatileRead. The resulting code looks like this:


 


internal volatile IntPtr m_handle;


 


public Object Target


{


    get


    {


        IntPtr h = m_handle;


        if (IntPtr.Zero == h)


            return null;


 


        Object o = GCHandle.InternalGet(h);


        return (m_handle == IntPtr.Zero) ? null : o;


    }


 


   


}


 


Notice that KeepAlive is gone because we are reading a volatile value which means the weakref object will be kept alive though out the getter. We still need to check if h is IntPtr.Zero at the beginning because we are still subject to be called from another object’s finalizer.


 


Some people don’t use volatile’s in fear of losing performance ‘cause volatile’s can’t be optimzed by the compiler. In reality though, if you read the volatile into a local when you need to access it frequently and that you are fine with the cached value, no reason to be afraid of using volatile’s.

Comments (10)

  1. dal says:

    Interesting. Couple of questions/observations…

    This is nitpicking, but…

    In the getter for property Target,

    if (IntPtr.Zero == h)

       return false;

    should probably be

    if (IntPtr.Zero == h)

       return null;

    Also, you state that if m_handle is not volatile then the jit could generate some code that caches the value of m_handle in a register, thereby subjecting it to garbage collection; thus, you simply cannot refer to m_handle after the call to Thread.VolatileRead to ensure that the "this" is kept alive till after that statement.

    This may be something that we must do but I question whether we should need to do so. This requires us to have very intimate knowledge about a part of the runtime (optimizations that JIT performs) that ought to be opaque. This makes it very difficult to reason about the correctness of the code. If inspection of the code makes it appear that a reference to "this" follows a method call, then the JIT should respect that and ensure that the object is kept alive till after that LOC executes. In other words, if the JIT wants to optimize and cache values in registers, it should also insert the equivalent of logical calls to KeepAlive(this) on our behalf. We would not then need to use the volatile modifier on instance members.

  2. Maoni says:

    dal, you misunderstood. I said if m_handle is not volatile then JIT could cache the value in the register so it will not need to read it again. So the object doesn’t need to be live.

  3. dal says:

    Maoni, I thought I did understand.

    I understand that currently the JIT may cache a non-volatile value in a register and not need to read it again. In this case the JIT no longer needs to have access to the underlying object from which it obtained the value, so the object does not explicitly need to be "live".

    However, this is up to the JIT – it could report the object alive until after the line of code that explicitly references the object, even if the JIT does not need the object in order to determine the value of the cached item. In other words, the determination of object lifetime is up to the JIT.

    IMO it ought to report the object alive until the last explicit reference to the object.

    If the JIT does not report the object as live until after the last explicit reference then it makes it nearly impossible to reason about object lifetime, as developers and reviewers will need to know both the documented rules of object lifetime in .NET and the undocumented caching behavior as implemented in the JIT. This could be version sensitive, and certainly varies with debug versus release builds.

    All the information I’ve read on object lifetime always discussed when the last reference to "this" occurred. The concern has been about when the last piece of data was extracted from "this" so that "this" was no longer needed. Adding caching behavior to this seems new to me.

    I only use volatile when multiple threads may access a field for reading/writing and the variable is not protected by a lock statement; this is part of reasoning about thread safety. If I understand you correctly, this may also affect object lifetime, as a side-effect of the JIT being unable to cache the value ina a volatile field.

    Please instruct me where my understanding is incomplete.  

    Thanks.

  4. nativecpp says:

    Hi Maoni,

    I was told that it is a good idea to null m before calling LongFoo. But I thought that GC can figure it out itself since m is a local var and should know that m is no longer referenced

    public void foo()

    {

      // allocate large memory

      HughMem m  = new HughMem();

      …

      …

      // do a long process and GC got trigger here

      LongFoo();

      // m is referenced anymore

      …

      …

    }

  5. nativecpp says:

    I meant // m is NOT referenced anymore

  6. dal, the point is you shouldn’t need to worry about the object lifetime yourself (most of the time anyway) ’cause JIT will do a correct analysis and keep things that need to be live live.

    nativecpp, yes, JIT (not GC 🙂 this is not GC’s job)will try very hard to determine when something is not used anymore it can be considered dead *before* it reaches the end of the scope. But it can’t guarantee that as soon as something is not used anymore it can always immediately discover that. You can help JIT by settings refs to null so JIT knows immediately it can be considered dead.

  7. nativecpp says:

    Yes. You are right that it is JIT not GC :-((

    One of the C# FAQs (http://blogs.msdn.com/csharpfaq/archive/2004/03/26/97229.aspx) stated that there is no need to set to NULL. And yet there are some who state it should set to NULL. I always thought there is no need to set to NULL for local variables as mentioned in the FAQ. There is time when you need to set to NULL as Rico stated in the comments (http://blogs.msdn.com/ricom/archive/2003/12/04/41281.aspx).

    That is why I am confused :-((

    I know that you are not in JIT team. But, I thought JIT would traverse at least to the end of function block and therefore would know if the local var is reachable or NOT when GC occurs.

    Thanks

  8. nativecpp says:

    My other thought was that what happened when GC occurs at

    m  = null;  // gc got triggered here

    Does JIT think m is still rooted ? Or does it have a special logic to ignore this assignment and therefore m will be considered as candidate for GC ???

    Thanks

  9. nativecpp, JIT is allowed to extend the lifetime of any local to the end of the method. The C# FAQ you mentioned is incorrect. For example, JIT doesn’t track all locals if there are too many (of course most methods do fall under the "not too many locals" case).

    The point is you shouldn’t count on JIT tracking the locals preciously. If you want to make sure a local is dead in the middle of a method you should manually set it to null. Most of the time you don’t really need to do this – I would only do this if I have an object ref that associates with a lot of resources and there’s a long way to go till end of the method.

  10. nativecpp says:

    Thanks for the clarification.