When do I need to use GC.KeepAlive?


Finalization is the crazy wildcard in garbage collection. It operates "behind the GC", running after the GC has declared an object dead. Think about it: Finalizers run on objects that have no active references. How can this be a reference to an object that has no references? That's just crazy-talk!

Finalizers are a Ouija board, permitting dead objects to operate "from beyond the grave" and affect live objects. As a result, when finalizers are involved, there is a lot of creepy spooky juju going on, and you need to tread very carefully, or your soul will become cursed.

Let's step back and look at a different problem first. Consider this class which doesn't do anything interesting but works well enough for demonstration purposes:

using System.IO;

class Sample1 {
 private StreamReader sr;
 public Sample1(string file) { sr = new StreamReader(file); }
 public void Close() { sr.Close(); }
 public string NextLine() { return sr.ReadLine(); }
}

What happens if one thread calls Sample1.NextLine() and another thread calls Sample1.Close()? If the NextLine() call wins the race, then you have a stream closed while it is in the middle of its ReadLine method. Probably not good. If the Close() call wins the race, then when the NextLine() call is made, you end up reading from a closed stream. Definitely not good. Finally, if the NextLine() call runs to completion before the Close(), then the line is successfully read before the stream is closed.

Having this race condition is clearly an unwanted state of affairs since the result is unpredictable.
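For comparison, here is a minimal sketch of how you might handle this race when Close() is an ordinary method that you call yourself: serialize the two operations with a lock and remember whether the stream has been closed. The class name LockedSample, the gate object, and the closed flag are illustrative assumptions, not part of the original sample.

```csharp
using System;
using System.IO;

// Hypothetical variant of Sample1: a lock serializes Close() and NextLine().
class LockedSample {
 private readonly object gate = new object();
 private readonly StreamReader sr;
 private bool closed;

 public LockedSample(string file) { sr = new StreamReader(file); }

 public void Close() {
  lock (gate) {
   if (closed) return;   // closing twice is harmless
   closed = true;
   sr.Close();
  }
 }

 // Returns null at end of file; throws if Close() has already run.
 public string NextLine() {
  lock (gate) {
   if (closed) throw new ObjectDisposedException("LockedSample");
   return sr.ReadLine();
  }
 }
}
```

With the lock in place the two calls can no longer overlap, although which one goes first is still up to the scheduler: the caller gets either a line or an ObjectDisposedException, never a stream that is closed in the middle of a read.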

Now let's change the Close() method to a finalizer.

class Sample2 {
 private StreamReader sr;
 public Sample2(string file) { sr = new StreamReader(file); }
 ~Sample2() { sr.Close(); }
 public string NextLine() { return sr.ReadLine(); }
}

Remember that we learned that an object becomes eligible for garbage collection when there are no active references to it, and that it can happen even while a method on the object is still active. Consider this function:

string FirstLine(string fileName) {
 Sample2 s = new Sample2(fileName);
 return s.NextLine();
}

We learned that the Sample2 object becomes eligible for collection during the execution of NextLine(). Suppose that the garbage collector runs and collects the object while NextLine is still running. This could happen if ReadLine takes a long time, say, because the hard drive needs to spin up or there is a network hiccup; or it could happen just because it's not your lucky day and the garbage collector ran at just the wrong moment. Since this object has a finalizer, the finalizer runs before the memory is discarded, and the finalizer closes the StreamReader.

Boom, we just hit the race condition we considered when we looked at Sample1: The stream was closed while it was being read from. The garbage collector is a rogue thread that closes the stream at a bad time. The problem occurs because the garbage collector doesn't know that the finalizer is going to make changes to other objects.

Classically speaking, there are three conditions which in combination lead to this problem:

  1. Containment: An entity a retains a reference to another entity b.
  2. Incomplete encapsulation: The entity b is visible to an entity outside a.
  3. Propagation of destructive effect: Some operation performed on entity a has an effect on entity b which alters its proper usage (usually by rendering it useless).

The first condition (containment) is something you do without a second's thought. If you look at any class, there's a very high chance that it has, among its fields, a reference to another object.

The second condition (incomplete encapsulation) is also a common pattern. In particular, if b is an object with methods, it will be visible to itself.

The third condition (propagation of destructive effect) is the tricky one. If an operation on entity a has a damaging effect on entity b, the code must be careful not to damage it while it's still being used. This is something you usually take care of explicitly, since you're the one who wrote the code that calls the destructive method.

Unless the destructive method is a finalizer.

If the destructive method is a finalizer, then you do not have complete control over when it will run. And it is one of the fundamental laws of the universe that events will occur at the worst possible time.

Enter GC.KeepAlive(). The purpose of GC.KeepAlive() is to force the garbage collector to treat the object as still live, thereby preventing it from being collected, and thereby preventing the finalizer from running prematurely.

(Here's the money sentence.) You need to use GC.KeepAlive when the finalizer for an object has a destructive effect on a contained object.
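As a sketch of what that looks like for the FirstLine function above: the Sample2 class is repeated so the snippet stands alone (with a null check added to the finalizer, since a finalizer still runs if the constructor throws), and the wrapper class KeepAliveDemo is just an illustrative name.

```csharp
using System;
using System.IO;

class Sample2 {
 private StreamReader sr;
 public Sample2(string file) { sr = new StreamReader(file); }
 // sr is still null if the constructor threw, so check before closing.
 ~Sample2() { if (sr != null) sr.Close(); }
 public string NextLine() { return sr.ReadLine(); }
}

static class KeepAliveDemo {
 public static string FirstLine(string fileName) {
  Sample2 s = new Sample2(fileName);
  string line = s.NextLine();
  // This call counts as a use of s, so the object stays reachable
  // (and its finalizer cannot run) until NextLine() has returned.
  GC.KeepAlive(s);
  return line;
 }
}
```

The call does nothing at run time. Its only job is to be a use of s that comes after the last point where the object needs to be alive.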

The problem is that it's not always clear which objects have finalizers which have destructive effect on a contained object. There are some cases where you can suspect this is happening due to the nature of the object itself. For example, if the object manages something external to the CLR, then its finalizer will probably destroy the external object. But there can be other cases where the need for GC.KeepAlive is not obvious.

A much cleaner solution than using GC.KeepAlive is to use the IDisposable interface, formalized by the using keyword. Everybody knows that the using keyword ensures that the object being used is disposed at the end of the block. But it is also the case (and it is this behavior that is important today) that the using keyword keeps the object alive until the end of the block. (Why? Because the object needs to be alive so that we can call Dispose on it!)
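Here is a sketch of the sample reworked along those lines. Sample3 and UsingDemo are made-up names; the class is Sample2 with the finalizer replaced by Dispose().

```csharp
using System;
using System.IO;

// Hypothetical Sample3: like Sample2, but cleanup is explicit via Dispose().
class Sample3 : IDisposable {
 private StreamReader sr;
 public Sample3(string file) { sr = new StreamReader(file); }
 public void Dispose() { sr.Dispose(); }
 public string NextLine() { return sr.ReadLine(); }
}

static class UsingDemo {
 public static string FirstLine(string fileName) {
  using (Sample3 s = new Sample3(fileName)) {
   return s.NextLine();
  } // the compiler-generated finally block calls s.Dispose() here
 }
}
```

Since there is no finalizer, nothing can close the stream behind your back, and the Dispose call at the closing brace is itself a use of s that keeps the object reachable for the whole block.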

This is one of the reasons I don't like finalizers. Since they operate underneath the GC, they undermine many principles of garbage collected systems. (See also resurrection.) As we saw earlier, a correctly-written program cannot rely on side effects of a finalizer, so in theory all finalizers could be nop'd out without affecting correctness.

The garbage collector purist in me also doesn't like finalizers because they prevent the running time of a garbage collector from being proportional to the amount of live data, as it is in, say, a classic two-space collector. (There is also a small constant associated with the amount of dead data, which means that the overall complexity is proportional to the amount of total data.)

If I ruled the world, I would decree that the only thing you can do in a finalizer is perform some tests to ensure that all the associated external resources have already been explicitly released, and if not, raise a fatal exception: System.Exception.ResourceLeak.
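A rough sketch of what such a check-only finalizer might look like with what exists today. There is no System.Exception.ResourceLeak, so Environment.FailFast stands in for the fatal exception; the class name CheckedSample, the released flag, and the IsReleased property are assumptions made for illustration.

```csharp
using System;
using System.IO;

// Hypothetical class whose finalizer releases nothing; it only checks.
class CheckedSample : IDisposable {
 private StreamReader sr;
 private bool released;

 public CheckedSample(string file) { sr = new StreamReader(file); }
 public bool IsReleased { get { return released; } }
 public string NextLine() { return sr.ReadLine(); }

 public void Dispose() {
  if (released) return;
  sr.Dispose();
  released = true;
 }

 ~CheckedSample() {
  // sr is null if the constructor threw, in which case nothing was leaked.
  if (sr != null && !released) {
   Environment.FailFast("CheckedSample was not disposed: resource leak");
  }
 }
}
```

The finalizer never touches the StreamReader itself. It only reads the object's own fields, so it cannot have a destructive effect on a contained object.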


Comments (34)
  1. Leo Davidson says:

    Can that finalizer even know that sr is a live object? It may have already been finalized and garbage collected.

    It's a while since I used .Net but I remember one of the rules was that finalizers for a class are not allowed to access any managed objects contained in the class for that reason.

    IMO, the line in blue should trigger a compiler warning. (Maybe it does already.)

  2. Jeroen Pluimers says:

    s/tread/thread/

    [In this case, I actually did mean "tread". -Raymond]
  3. Jason says:

    So, in this poorly designed program, is the correct behavior to modify NextLine() like this?

    public string NextLine() { string result = sr.ReadLine(); GC.KeepAlive(this); return result; }

    [Personally, I would remove the finalizer. -Raymond]
  4. Stephen Cleary says:

    @Leo: You're correct. Adding GC.KeepAlive will only work if AppDomain.IsFinalizingForUnload is false. And stays false throughout the finalizer. Which can't be checked.

  5. Mihai says:

    @Zarat: how could the reference still be there if the memory was recollected (compacted and addresses changed) at a previous GC collection (on a lower generation, just some objects freed, not a full GC)? Is there any guarantee that objects that refer to each other will be finalized and have their memory freed at the same GC collection?

  6. Michael Stone says:

    Fascinating article.  Makes me pine for the days of unmanaged code, when it seemed like the rules were at least clear.  Then I go back to my unmanaged code base and stop pining pretty quickly.

  7. RobSiklos says:

    In order to help squash memory leaks, I have a static MemoryTracker class which keeps weak references to any object which I register to it.

    I use finalizers in my objects to tell the MemoryTracker to unregister that object.

    At the end of the day, I can query my MemoryTracker and it will give me the list of all registered objects which are still "in memory".

    Would you consider this an appropriate use of finalizers?

    P.S.  All finalizers are enclosed in "#if (DEBUG)", so that my release builds don't get any performance hit from running finalizers.

  8. Stephen Cleary says:

    @Zarat: Almost the only thing a finalizer can do is p/Invoke a native free-resource function. AppDomain.IsFinalizingForUnload and Environment.HasShutdownStarted only indicate that you're probably doing something wrong in your finalizer. I can't think of a single use case for them.

    @Rob: That's not appropriate; sorry. You should not access any managed code from a finalizer (there are only a couple of exceptions to this rule, such as Console).

  9. Mmmh says:

    @Rob263

    Why ?

    If your objects have references to disposable resources either choose one (if the resource was not already disposed): free the resource (as the guidelines say) or throw/assert to signal the anomaly (Raymond says to throw, I prefer an assert here).

    If your objects do not have references to disposable resources, you are debugging the CLR..

  10. Joshua says:

    There's one other good use for GC.KeepAlive(): optimization barrier.

    When implementing double checked locking or certain thread-safe datastructures, controlling when local variables get written to class or global variables is a must. To prevent the optimizer from breaking it for me I insert a call to GC.KeepAlive(classvariable) right before classvariable=localvariable.

  11. Daniel Earwicker says:

    "If I ruled the world, I would decree that the only thing you can do in a finalizer is perform some tests to ensure that all the associated external resources have already been explicitly released, and if not, raise a fatal exception: System.Exception.ResourceLeak."

    Then I say: VOTE FOR RAYMOND!

  12. RobSiklos says:

    @Stephen, @Mmmh:  I should note that I'm not using disposable objects or unmanaged resources.  I want to know which of my objects aren't being collected because someone is still referencing them (then I can use windbg to find out who).  For instance, objectA subscribes to an event on objectB.  Now, objectA can't be collected until objectB is no longer referenced (or never, if the event is static).

  13. mikeb says:

    @Rob263:  You might want to consider making your MemoryTracker so that instead of relying on finalizers to unregister objects that it would periodically perform its own 'garbage collection'.  When certain timer intervals have passed (or some other heuristic) and when certain operations are performed on the MemoryTracker (like listing the objects that are still alive), have it go through its collection of WeakReferences discarding any that have a Target of NULL (or !IsAlive – this seems like a situation where the IsAlive property is legitimately useful – it would allow you to drop known dead WeakReferences without needlessly creating a strong reference to an object that might be just about to be GC'd).

    Of course, your MemoryTracker should already be doing at least some of this since finalizers aren't guaranteed to be called and since MemoryTracker is already dealing with WeakReferences it already has to handle the possibility that Target could be null even if the object weren't deregistered.

  14. RobSiklos says:

    @mikeb – yes – you are 100% correct – since MemoryTracker already checks IsAlive and drops known dead references, I don't really need the finalizers at all.

  15. kojiishi says:

    This is interesting. I understand objects are eligible to be collected while its method is executed, but I still don't know why CLR team took such design.

    Given that, even if I'm using "using" pattern, GC could collect the object while Dispose() is executing, so I guess "using" isn't still safe.

    But forcing every developer to use GC.KeepAlive() on every last method call in a function doesn't make a good sense.

    Do you have any idea why CLR team took such design? Couldn't this be changed and make "this" to be a reference?

  16. Zarat says:

    @kojiishi

    "this" doesn't have to be left anywhere on the stack, so the GC has no knowledge that "this" may still be in use.

    At another note and before I forget, after AppDomain shutdown doesn't mean the OS will clean up, because you can have more than one AppDomain per process. So if you mess up during shutdown but are using multiple AppDomains you are generating memory leaks ;)

  17. Joshua says:

    @kojiishi

    Actually, the using block uses the variable implicitly in the closing brace. If you disassemble you get a normal try/finally so GC.KeepAlive doesn't make a difference.

    If you think hard enough about it you would realize that if this were an issue adding GC.KeepAlive inside the using block wouldn't help anyway.

  18. Trillian says:

    Is there an MSDN article or a part of the CLI specification that says precisely under which circumstances an object is allowed to be reclaimed by the GC? I don't remember reading that an object can be collected while there are still references to it that are merely not "active references". I've tried and failed to reproduce a situation where an object is reclaimed while one of its methods is executing. Does that mean that MS's implementation doesn't do that but another implementation could, or I've just been lucky?

    [3.9 Automatic memory management. Clause 2 is the important one here. (And you've just been lucky.) -Raymond]
  19. Zarat says:

    Yes it can have been finalized, but the reference should still be there (as far as I know the GC doesn't reset references, after all Sample1 is about to be collected too). Also closing a finalized StreamReader is a no-op. So it should work.

    Though it is another question if it makes sense. Leo is totally right, the example makes no sense. The (documented) finalizer rule he mentioned says StreamReader must finalize itself and not someone outside. A finalizer disposing another managed object is usually something you don't want to do (StreamReader.Close is an alias for its IDisposable.Dispose implementation).

    IMHO GC.KeepAlive should only ever be used by the object containing the finalizer. Since the object knows what its finalizer does it also knows what methods need to protect an object from being collected before the method finishes. The alternative is having *every* caller doing GC.KeepAlive, which is plain stupid and something you don't want to rely on.

    And on a last note, writing proper finalizers where it matters (ie. for unmanaged objects/resources) is *hard* because of one thing: Application shutdown. During shutdown everything is collected at once, meaning all finalizers run in random order and you can't rely on any managed classes anymore. During runtime you can control finalizing order by having GCRoots or static references, but during shutdown there's nothing you can do if finalizing order matters for the unmanaged resource. In WinForms there's at least an event for shutdown, but that's just WinForms and not generally available. So the ugly hack I'm doing is to check in every finalizer if the AppDomain is shutting down (Environment.HasShutdownStarted) and if so just do nothing and hope the OS cleans up better than I can. I wished Microsoft had separated runtime GC and shutdown GC better, but I suppose it's too late now :(

    @Jason: Yes it is.

  20. JoeWoodbury says:

    I like Raymond's suggestion of the finalizer throwing an exception if it has to actually do something. I have a .NET assembly that references an unmanaged DLL. Odds are a managed code thread is in a wait in the unmanaged code. Using IDisposable, I can tear the whole thing down very gracefully. I'm not so confident I have the finalizer right, though my tests haven't exposed a problem. I did add a section to the documentation in big red letters that this object should be used in a using statement or otherwise explicitly disposed when no longer needed.

  21. Marquess says:

    I, for one, welcome our old new backwards compatible overlord!

  22. Zarat says:

    @Mihai

    The address can't change during compaction without the references being updated or nulled, otherwise you'd have created a giant security hole when the (badly written) finalizer tries to access the reference which points to an invalid location. I fully agree with you that there is no guarantee that managed references are not nulled, in my first post I just said that in my observations it doesn't happen with the current GC. Of course this is an implementation detail and may change anytime.

    @Stephen

    Except for shutdown it is perfectly valid to do more than P/Invoke in finalizers. The WinForms library does it a lot. What you have to do is to guarantee that the references you care about are not already collected. To do this you create a GCRoot or put them in a static collection. This effectively creates a strict order on finalizers, because objects I've put in a static collection are *guaranteed* to not be collected. So in my finalizer I can do anything I want with them, then remove them from the static collection, enabling them for GC in the next cycle.

    Such ordered finalizing is not just theory, it is very important whenever you have a hierarchic structure of managed resources and must ensure the children are released before the parents. In some native APIs, after you released the parent, the children will become invalid too, but your managed classes don't know about it, and trying to finalize the children after the parent will crash, or worse, release something unrelated which happens to use the same unmanaged resource slot/pointer/handle.

    I don't remember exactly where I've seen it in WinForms, but I think it was related to HWNDs and/or callbacks. One place where it definitely is required to finalize in order is the native API to read type libraries (TLBs) because once you release the root COM interface it automatically releases all interfaces describing TLB content. Yes this violates the COM rules but that's no reason to leak (or crash if you try to cleanup properly in a finalizer when the child was auto-released).

    The point where this all stops working is shutdown. Suddenly all static collections and GCRoots are collectible too, so no more ordering of finalizers. For the typelib API it will finalize in random order and usually crash because there's a high chance that the root interface is finalized somewhere in between its children. For WinForms this shutdown problem is why they have the Application.ApplicationExit event, allowing you to clean up properly *before* shutdown arrives with its madness of finalizer calls. This works only because the usual WinForms pattern is to start the Application in Main() and let it take over, so it knows once the top level Application.Run() returns it should better have cleaned up. But you don't have that in the general case, and if you write a library you don't have any shutdown notification and can't do much more than skip finalizers during shutdown by checking Environment.HasShutdownStarted.

    Bleh this was a long explanation :(

  23. klhuillier says:

    This does not seem like appropriate behavior for a garbage collector. I tried it out in both .NET 3 and Java 6, and neither the CLR nor the JVM would finalize the object while a thread is using a method of that object. They would finalize an object if there is no local variable reference in scope. Perhaps my test code was flawed?

    One reason why this would be inappropriate is if an object becomes a candidate for finalization, so might some of its referenced fields. This could leave the method that is currently running in a state where it will invoke methods on finalized objects. Furthermore, if the object's own finalizer disposes of its fields, the method will be stuck with an invalid state.

    I hope this was never the garbage collector's behavior in .NET, but it sounds like it was. Yet another reason to avoid finalizers (except in the cases of native code where the managed runtime can't clean up, in which case tread carefully).

    [Try this program:

    class Program {
      public static void Main()
      {
         Program p = new Program();
         p.GCMe();
      }
      public void GCMe()
      {
         System.GC.Collect();
         System.Threading.Thread.Sleep(1000);
         System.Console.WriteLine("returning from GCMe");
      }
      ~Program()
      {
         System.Console.WriteLine("finalized");
      }
    }
    

    This prints
    finalized
    returning from GCMe
    showing that the object was finalized while the GCMe method was still running. -Raymond
    ]

  24. Angstrom says:

    I'm impressed at the number of people who have misinterpreted what can happen to 'this' during method evaluation.

    If your method (or your caller, or any other reachable object) might use 'this', either explicitly or implicitly, then 'this' cannot be collected.

    This means that klhuillier's comment that "One reason why this would be inappropriate is if an object becomes a candidate for finalization, so might some of its referenced fields. This could leave the method that is currently running in a state where it will invoke methods on finalized objects." is false, because if your method might access those fields, then it needs 'this' around at least until it accesses the field, preventing the referent of 'this' from being collected. The same goes for kojiishi's remark that "Given that, even if I'm using "using" pattern, GC could collect the object while Dispose() is executing, so I guess "using" isn't still safe.": as Joshua points out, the using statement's closing brace also implicitly refers to the IDisposable you're "using" against, and the .Dispose() method is subject to the same reachability rules that prevent "this" from being collected before you've read all the fields you're going to read.

    In short, *stop worrying*, people. The GC is not going to yank the rug out from under you, ever, unless you ask it to by writing a finalizer (which may run at a surprising time) or use a weak reference (which can be cleared at any time).

  25. Gabe says:

    "The garbage collector is a rogue thread that closes the thread at a bad time."

    My guess is that the thread closes the *file* at a bad time.

    Anyway, why's everybody hatin' on finalizers? Like "unsafe" and "volatile", finalizers are parts of the language you don't use unless you know you need to. But when you need them, you're glad that they're there. Odds are that's the only way to know when you need to flush an output buffer or rollback an abandoned transaction.

    [I meant "closes the stream at a bad time." Fixed, thanks. -Raymond]
  26. Chris Oldwood says:

    I ran into the issue of an object being finalized whilst another thread was executing a method on it a few months back and found an excellent article called "Lifetime, GC.KeepAlive, handle recycling" by Chris Brumme [blogs.msdn.com/…/51365.aspx].

    FWIW Here is my blog post on the issue – chrisoldwood.blogspot.com/…/object-finalized-whilst-invoking-method.html

  27. klhuillier says:

    Raymond,

    Thanks for the example. That was similar to what I tried. Your program didn't work for me either. Every time I get "returning from GCMe" before "finalized". I tried several different build configurations as well.

    Angstrom wrote:

    "[…] if your method might access those fields, then it needs 'this' around at least until it accesses the field, preventing the referrent of 'this' from being collected."

    Ah, that is an excellent point. I forgot to consider that all objects referenced by a local variable are considered root objects.

    [The fact that you wrote "build configurations" tells me you're running it from inside Visual Studio. Remember, the GC changes its behavior when run under a debugger. Do it from the command line. "%windir%\Microsoft.NET\Framework\v3.5\csc Program.cs" and then "Program.exe". -Raymond]
  28. klhuillier says:

    Raymond,

    You are correct, I was using Visual Studio. Running it from the command-line worked as described. Thank you.

  29. Joe says:

    So the 'this' reference isn't treated as a GC root? That seems absolutely crazy.

    [That's because you're thinking about garbage collection the wrong way. -Raymond]
  30. Joe says:

    No I'm not – briefly accepting your "simulating a computer with an infinite amount of memory" definition of GC, if the user can observe a situation where memory that they were using has disappeared due to it being collected, then the simulation of infinite memory has failed.

  31. Mmmh says:

    @Joe

    If your code never refers to this, you are not using the memory and thus it can be collected, even if you are in an instance method.

  32. Joe says:

    If you're in an instance method, the code does refer to this. It's analogous to the as-if rule for C++ optimization – if you can tell that the optimization has happened, then the optimizer made a mistake. Likewise, if you can ever access a reference to an object that has been collected, the GC is buggy.

    Of course, it's arguable that the bug is actually the decision to provide finalizers at all.

  33. Mmmh says:

    @Joe

    > If you're in an instance method, the code does refer to this.

    Absolutely not.

    int Sum(int x, int y)
    {
     return x + y;
    }

    does not of course refer to this, even if it's an instance method. You need this to be valid to call it, but there is no side effect in reclaiming the memory during its execution (in fact in other languages – like C++ – you can call it without a valid this pointer! but it's an implementation detail I think).

    Even if your code uses a couple of members, it may still be considered not referencing this.

    For example

    int Sum(int x, int y)
    {
     return x + y + m_SomeReferenceType.GetSomeValueSomewhere();
    }

    could safely reclaim this memory, provided m_SomeReferenceType is not collected in the process.

  34. Joe says:

    @Mmmh: If the GC can deduce that the instance method doesn't actually use the 'this' reference, of course it can collect it – that's the "as if" optimization rule. In a similar fashion, the GC can treat a local variable as not being a GC root if it can deduce that the variable will never be accessed again.

    Raymond's example demonstrates a bug in the CLR's GC – or a bug in the relevant language specifications depending on your point of view – because legal well defined code can access memory that the garbage collector has incorrectly determined is inaccessible.
