Dispose Pattern and Object Lifetime [Brian Grunkemeyer]

The Dispose pattern is the way to think of object lifetime in the .NET Framework.  Admittedly, it can be a little subtle.  A customer asked a question on our MSDN documentation for implementing the Dispose pattern.  I’ll get to this question, but let’s review some basics.

Basics of Disposing, Finalizing, & Resurrection

The Dispose pattern exists to help impose order on the concept of object lifetimes.  You would naively think that object lifetime is relatively trivial, but there are some rather daunting subtleties.  Fortunately, the Dispose pattern will help lead the way.  The basics here are assumptions that need to be agreed upon by library authors, developers using libraries, and language designers, so it’s important that everyone is on the same page.  Perhaps in another world this could have been designed differently, but we don’t live in that world.

First, here’s a restatement of the Dispose pattern (though you can find more in the Framework Design Guidelines, which were excerpted in this blog post from Joe Duffy).  A disposable type needs to implement IDisposable & provide a public Dispose(void) method that ends the object’s lifetime.  If the type is not sealed, it should provide a protected Dispose(bool disposing) method where the actual cleanup logic lives.  Dispose(void) then calls Dispose(true) followed by GC.SuppressFinalize(this).  If your object needs a finalizer, then the finalizer calls Dispose(false).  The cleanup logic in Dispose(bool) needs to be written to run correctly when called explicitly from Dispose(void), as well as from a finalizer thread.  Dispose(void) and Dispose(bool) should be safely runnable multiple times, with no ill effects. 
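As a rough illustration of that restatement, here is a minimal sketch of an unsealed type that follows the pattern. The type name NativeBufferOwner, the IsDisposed property, and the raw IntPtr buffer are assumptions made up for this example. The IntPtr is there only to give the finalizer something to do; SafeHandle is the better choice for real native resources.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical type for illustration only.
public class NativeBufferOwner : IDisposable
{
    private IntPtr _buffer;   // Unmanaged memory owned by this instance.
    private bool _disposed;

    public NativeBufferOwner(int size)
    {
        _buffer = Marshal.AllocHGlobal(size);
    }

    public bool IsDisposed { get { return _disposed; } }

    // Public, non-virtual entry point that ends the object's lifetime.
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    // All cleanup logic lives here. "disposing" is true when called from
    // Dispose() and false when called from the finalizer.
    protected virtual void Dispose(bool disposing)
    {
        if (_disposed)
            return;   // Safe to run multiple times.

        // This sketch holds no managed disposable fields. If it did, they
        // would be released here only when "disposing" is true.

        if (_buffer != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_buffer);
            _buffer = IntPtr.Zero;
        }
        _disposed = true;
    }

    // Backstop for callers that never called Dispose().
    ~NativeBufferOwner()
    {
        Dispose(false);
    }
}
```

A caller would normally wrap such an instance in a using statement, which calls Dispose() at the end of the block.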

This pattern is part of the platform, and languages like managed C++ have assumed that library writers follow this pattern (mostly) correctly.

Next, let’s review the basics of how objects live & die, so we can avoid some unfortunate confusion that comes up later.  If you don’t know what finalization is, read this finalization intro on Maoni’s blog.  There are two distinct concepts that often overlap: the lifetime of the object (i.e., when it is in a usable state), and the duration of time that the GC commits memory for an object.  In a normal finalizable object’s lifetime, the GC commits memory for an object, then the CLR runs the object’s constructor, passing in the newly-committed memory as the “this” pointer for the object.  Note that the usable lifetime of the object is a subset of the lifetime for the committed memory in the GC heap.  Usually, developers think of this memory committing & constructor running as a single operation.  It is easy to merge the two if you’re coming from C#, Visual Basic, or Java, because in those languages there is no way of disentangling them.  C++ is more interesting, allowing you to reserve some memory on the stack, then run a constructor on that block of memory using the placement new operator.  (Also, the managed String class uniquely uses a different calling convention: we run a constructor which computes the length of the String instance necessary to hold data, allocates the memory, then returns the new instance as the “this” pointer.)  Merging these two concepts is a perfectly acceptable simplification for constructing object instances in most languages, but the same doesn’t hold true when you free objects.

The end of the object’s usable lifetime, according to our Dispose pattern, is when the user calls the Dispose(void) method.  Then, at a later point in time, the garbage collector will detect that there are no outstanding references to an object, and it will try freeing the memory.  But first, the GC provides the object with an opportunity to clean up resources, called finalization.  This is a backstop to ensure that resources are freed if someone did not explicitly call Dispose(void), to ensure that the object’s lifetime is correctly terminated before we release memory.  This isn’t necessarily where a programmer intended to end the lifetime of an object.  But the awkwardness runs deeper.

Finalization is fundamentally different from ending an object’s lifetime.  From a correctness point of view, there is no ordering between finalizers (outside of a special case for critical finalizers), so if you have two objects that the GC thinks are dead at the same time, you cannot predict which finalizer will complete first.  This means a finalizer can’t safely interact with any finalizable objects stored in instance variables.  Also, finalization happens on a completely different thread, sometimes at a different priority level.  In future versions, perhaps the GC will require multiple finalizer threads, running finalizers in parallel with one another.  Some managed hosts (like SQL Server) do not allow users to define finalizers on their types.  Chris Brumme included a more complete list of restrictions, limits & surprises in his finalization blog post.  Reading through this might help you understand an obscure stress bug.

Additionally, both normal process exit & appdomain unloading complicate the picture for finalizers.  As you know, an application domain is essentially a process within a process, and each appdomain gets a separate copy of static variables.  When we unload an appdomain, at some point, finalizable objects stored in static variables must be garbage collected.  At this phase during appdomain unloading, your finalizer cannot take a dependency on other finalizable objects, because any finalizable object reachable from static variables might already have been finalized.  Every method call might throw an ObjectDisposedException, or in a pathologically poorly written set of classes, stuff just doesn’t work right in weird ways.  Process exit should conceptually be similar to unloading all appdomains (but is subtly different: there’s no appdomain unload event) and runs into the same issue with statics being finalized.  Keep reading below for a solution.

There’s a complication to when the GC can release memory.  It’s possible that an object’s finalizer might store a reference to an object somewhere else in the GC heap, potentially even in a live object.  If so, during the next GC, the committed memory is still reachable from live GC roots, so the GC cannot release the memory.  This is called “resurrection”, where an object instance is raised from the dead to haunt the living with potentially inconsistent state.  Additionally, it’s possible that the finalizer might run again on the same instance, if someone called GC.ReRegisterForFinalize(). 
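To make resurrection concrete, here is a small sketch. The Zombie and ResurrectionDemo types are hypothetical names invented for this example. Whether and when the finalizer runs depends on the runtime and build settings, so treat Run() as something to experiment with rather than a guaranteed result.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical type whose finalizer resurrects the instance.
public class Zombie
{
    // A static field is a GC root. Storing "this" here makes the
    // instance reachable again after the GC considered it dead.
    public static Zombie Resurrected;

    public int FinalizeCount;

    ~Zombie()
    {
        FinalizeCount++;
        Resurrected = this;   // Resurrection happens here.
        // The finalizer will not run again for this instance unless
        // someone calls GC.ReRegisterForFinalize(this).
    }
}

public static class ResurrectionDemo
{
    // Allocate in a separate, non-inlined method so no local variable
    // in the caller keeps the instance alive.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void CreateAndDrop()
    {
        new Zombie();
    }

    // Returns true if the finalizer ran and resurrected the instance.
    public static bool Run()
    {
        CreateAndDrop();
        GC.Collect();
        GC.WaitForPendingFinalizers();
        return Zombie.Resurrected != null;
    }
}
```

If Run() returns true, Zombie.Resurrected refers to an object whose finalizer has already run. Any code that reaches it through the static field is using an instance in exactly the "potentially inconsistent state" described above.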

It should be obvious now that the Dispose(bool) method has two unrelated functions — ending an object’s lifetime, and a last-ditch attempt at ending an object’s lifetime in a more constrained environment.  Since the finalization logic is supposed to live in the Dispose(bool) method on the code path where the parameter is false, then it may be convenient to talk about finalization code to encompass both code in finalizers as well as in the Dispose(false) path. 

Developer Knowledge Gaps

Now, let’s talk about where the above really hurts people.  One example I’ve seen somewhat commonly in the .NET Framework is the assumption that the constructor for an object always completes successfully.  This is not true for two reasons.  The first is the obvious case where a constructor checks some precondition (such as whether a parameter is null) then throws an exception.  Most people write their finalization code to check one variable to see if it’s initialized, and if so, then clean up all the state in their object.  They happen to luck out usually in this first case, but not always (I’ve seen code that dereferences pointers that can be null without checking first).  The second reason is more subtle — asynchronous exceptions can occur basically between any two machine instructions in a managed method body, including constructors.  So it’s possible to initialize 2 of the 5 variables in your type, get a ThreadAbortException, then your finalizer runs.  This obviously doesn’t work.  Your finalization code needs to be more defensive than this.
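Here is a sketch of what "more defensive" could look like. The TwoBuffers type is hypothetical. The point is that the cleanup code checks each field on its own, instead of inferring the state of every field from a single one, so it still behaves if the constructor stopped partway through.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical type that owns two native buffers, allocated in sequence.
public class TwoBuffers : IDisposable
{
    private IntPtr _first;
    private IntPtr _second;

    public TwoBuffers(int size)
    {
        if (size <= 0)
            throw new ArgumentOutOfRangeException("size"); // Nothing allocated yet.

        _first = Marshal.AllocHGlobal(size);
        // If an exception occurs here (for example, the next allocation
        // fails), _first is set while _second is still IntPtr.Zero.
        _second = Marshal.AllocHGlobal(size);
    }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        // Check every field independently. The finalizer can run on an
        // instance whose constructor never completed.
        if (_first != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_first);
            _first = IntPtr.Zero;
        }
        if (_second != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_second);
            _second = IntPtr.Zero;
        }
    }

    ~TwoBuffers()
    {
        Dispose(false);
    }
}
```

Holding each buffer in a SafeHandle would remove the need for this finalizer entirely. The raw IntPtr fields are used here only to show the per-field checks.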

Speaking of other reasons to be defensive, I mentioned above that appdomain unloading & finalizers don’t interact particularly well, producing the restriction that finalizers shouldn’t rely on static variables that may use finalizable objects.  This may not be practical in all scenarios, or you may want to make an attempt at something anyway, such as writing to a log file.  There is a predicate exposed in the BCL to help you: Environment.HasShutdownStarted.  It exists solely to allow finalization code to figure out whether it can depend on static variables.
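A sketch of how that check might be used follows. CleanupLog and LoggedResource are hypothetical names. The static list stands in for any static state that could, in real code, hold finalizable objects such as a log file stream.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical static log. In real code this might wrap a finalizable
// object such as a FileStream.
public static class CleanupLog
{
    private static readonly List<string> s_entries = new List<string>();

    public static void Write(string message)
    {
        lock (s_entries) { s_entries.Add(message); }
    }

    public static int Count
    {
        get { lock (s_entries) { return s_entries.Count; } }
    }
}

public class LoggedResource : IDisposable
{
    private bool _disposed;

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (_disposed)
            return;

        // On the finalizer path, static state may already have been
        // finalized during shutdown, so only touch it when shutdown
        // has not started.
        if (disposing || !Environment.HasShutdownStarted)
            CleanupLog.Write(disposing ? "disposed" : "finalized");

        _disposed = true;
    }

    ~LoggedResource()
    {
        Dispose(false);
    }
}
```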

Resurrection is not something most developers plan for.  Resurrection can cause some extremely wacky behavior, and thinking about it will hurt your head.  Trust me, I know.  Resurrection is the best reason to defensively add checks along every public entry point to a type that ensure an object is not disposed.  Thread safety is a close second reason for explicitly checking for a disposed state.

In case people have missed this, SafeHandle is an excellent tool, giving you correctness benefits in addition to better reliability.  Please use it when accessing native resources. 
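For readers who have not written one, here is a minimal sketch of a SafeHandle subclass and a wrapper that uses it. SafeHGlobalHandle and BufferWrapper are hypothetical names, and this is not the SafeHandleDemoV2 sample from the other blog entry. It uses Marshal.AllocHGlobal as a stand-in for whatever native allocation your code really makes.

```csharp
using System;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

// Hypothetical SafeHandle that owns a block of unmanaged memory.
public sealed class SafeHGlobalHandle : SafeHandleZeroOrMinusOneIsInvalid
{
    public SafeHGlobalHandle(int size) : base(true)   // true: we own the handle.
    {
        SetHandle(Marshal.AllocHGlobal(size));
    }

    // Called at most once by SafeHandle, from Dispose or from its
    // critical finalizer, and only when the handle is valid.
    protected override bool ReleaseHandle()
    {
        Marshal.FreeHGlobal(handle);
        return true;
    }
}

// Hypothetical wrapper. It is sealed and needs no finalizer of its own,
// because the SafeHandle already acts as the backstop.
public sealed class BufferWrapper : IDisposable
{
    private readonly SafeHGlobalHandle _handle;

    public BufferWrapper(int size)
    {
        _handle = new SafeHGlobalHandle(size);
    }

    public bool IsClosed { get { return _handle.IsClosed; } }

    public void Dispose()
    {
        _handle.Dispose();   // Safe to call more than once.
    }
}
```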

In a future version, we might consider adding a public IsDisposed predicate on disposable objects, to serve as a publicly consumable state flag usable in preconditions, as part of a much larger effort.

What Did Our Customer Want to Know?

Now that I threw all of that information at you, let’s get back to the original motivation for this post.  A customer read our MSDN docs for the Dispose pattern and was confused by the sample code.  That customer asked if we could clarify the example, and it requires three pieces:

  1. A simple wrapper class that exposes an unmanaged resource
  2. A subclass of a disposable type
  3. A class that wraps a disposable type

What’s in our MSDN Documentation Today?

I’m happy to report that we have a pretty good example of the first, though perhaps not quite in the place you’d like to see.  My “How to use SafeHandle” blog entry contains a good example in the middle — look for the type named SafeHandleDemoV2.  Our MSDN documentation includes a sample using IntPtr to represent native resources like handles & memory.  That works, but we’d prefer that you use SafeHandle for this purpose.  Use SafeHandle to ensure your libraries don’t leak resources, both to ensure long-running servers stay up & running, as well as to fix some relatively obscure security concerns. 

For the second item, the MSDN documentation has a sufficient example showing how to derive from a disposable type — see the MyResourceWrapper class.  The key part of the example is showing how to override Dispose(bool), and importantly, to call the base class’s Dispose(bool) method at the end. 
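The shape of such a subclass can be sketched as follows. BaseHolder and DerivedHolder are hypothetical stand-ins, not the types from the MSDN sample. Wrapping the call to the base class in a finally block is my own choice for this sketch, so the base cleanup still runs if the derived cleanup throws.

```csharp
using System;
using System.IO;

// Hypothetical disposable base type.
public class BaseHolder : IDisposable
{
    private bool _disposed;

    public bool IsDisposed { get { return _disposed; } }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        _disposed = true;
    }
}

// Hypothetical subclass that adds its own disposable state.
public class DerivedHolder : BaseHolder
{
    private MemoryStream _stream = new MemoryStream();
    private bool _derivedDisposed;

    public bool StreamReleased { get { return _stream == null; } }

    protected override void Dispose(bool disposing)
    {
        try
        {
            if (!_derivedDisposed)
            {
                // Only touch other disposable objects when disposing eagerly.
                if (disposing && _stream != null)
                    _stream.Dispose();
                _stream = null;
                _derivedDisposed = true;
            }
        }
        finally
        {
            // Always chain to the base class at the end.
            base.Dispose(disposing);
        }
    }
}
```

Note that the subclass does not re-implement IDisposable or override Dispose(void). It hooks into cleanup only through Dispose(bool).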

What Should We Add?

So, what did we miss from our MSDN documentation?  Finalization is not free, and it’s a feature that we don’t want spread throughout libraries for no good reason.  One curious part of our sample on MSDN is the base class defines a finalizer, which of course calls Dispose(false).  But does the finalizer need to exist on the base type?  The sample code is admittedly contrived, so the base type actually allocates native resources & requires a finalizer to serve as a backstop for people that didn’t call Dispose(void).  This is realistic sometimes, but suffers from two problems that should mean this isn’t a common case.  First, the code isn’t reliable — if it did use SafeHandle, then SafeHandle’s critical finalizer would be sufficient to free the underlying resource, and you could remove the finalizer from the BaseResource class.

The second reason why a disposable base type often doesn’t include a finalizer is that they sometimes aren’t necessary, at a certain layer of abstraction.  One common pattern is using an abstract base class, like Stream.  In that case, the base class makes no policy decisions about how data is represented, and as such, isn’t capable of determining whether a finalizer is needed.  Instead, individual subclasses should figure out whether a finalizer is needed, and if so, add one that calls Dispose(false).  For example, MemoryStream uses a managed byte[] internally, so it doesn’t need a finalizer to release any resources.  If Stream defined a finalizer, then MemoryStream would be finalizable, meaning you’d pay a little unnecessary perf penalty every time a MemoryStream object went dead (unless it was explicitly disposed). 
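That division of responsibility can be sketched like this. Channel, InMemoryChannel, and NativeChannel are hypothetical names that loosely mirror the Stream and MemoryStream relationship. Only the subclass that holds a native resource pays for a finalizer.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical abstract base: implements the pattern, defines no finalizer.
public abstract class Channel : IDisposable
{
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);   // Harmless if no subclass adds a finalizer.
    }

    protected virtual void Dispose(bool disposing)
    {
    }
}

// Backed purely by managed memory, so no finalizer is needed.
public sealed class InMemoryChannel : Channel
{
    private byte[] _data = new byte[16];

    protected override void Dispose(bool disposing)
    {
        _data = null;
        base.Dispose(disposing);
    }
}

// Holds native memory directly, so this subclass adds the finalizer.
// (Using a SafeHandle field instead would remove that need.)
public class NativeChannel : Channel
{
    private IntPtr _memory = Marshal.AllocHGlobal(16);

    protected override void Dispose(bool disposing)
    {
        if (_memory != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_memory);
            _memory = IntPtr.Zero;
        }
        base.Dispose(disposing);
    }

    ~NativeChannel()
    {
        Dispose(false);
    }
}
```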

In terms of other information we should include in our MSDN documentation, pointing out the restrictions on finalizers is important when writing your Dispose(bool) method.  The lack of ordering among finalizers really hurts finalization code’s usefulness, while the shutdown issues mean you need to be at least cognizant of when you must use Environment.HasShutdownStarted.

For resurrection, the best thing we can mention is to ensure public entry points enforce the precondition that the current object instance has not been disposed.  Again, we should also remind people about SafeHandle.

Additionally, the Dispose pattern needs to be followed consistently.  We have some types in the .NET Framework that do not properly follow the Dispose pattern.  While we’d like to fix them, we didn’t fix some of them in our previous release, often for schedule reasons.  The reasons why we need consistency are, first & foremost, that languages (like managed C++) can and do take a dependency on the Dispose pattern.  Implementing the pattern correctly is critical for anyone subclassing your type, and the C++ compiler will convert the normal C++ idiom of a destructor into Dispose(true) code.  The compiler emits a lot of plumbing for developers, and this must be done in a sensible way.

The second motivator for consistency is the consequences of failing to consistently expose object cleanup to subclasses.  If a base class uses Close (or a virtual Dispose(void) method) for its cleanup code and then one subclass uses Dispose(bool), getting the wiring right between the methods is a little tricky.  With three subclasses each on different plans, you can either skip cleanup logic or run it multiple times, and it isn’t possible to disentangle the web.  Just follow the pattern: you’ll be better off for it later, and you won’t have to write graphs of hypothetical subclass chaining rules between multiple conflicting conventions.  I fixed up Stream to not use Close before .NET 2.0 shipped, and I had to review 60 subclasses of Stream within the Developer Division’s code base.  It was not pleasant.

So, where was your third item on the list above?

Our user wanted to see an example of a class that wraps a disposable resource.  So here’s how I would write that type.  I’ve included a deliberate syntax error here (the quotes around the field’s type), so you don’t forget to put in your own resource type for the instance field below.

// General-purpose skeleton of how a class should use a disposable resource.
// Real-world examples include StreamWriter, which wraps a disposable Stream.
// If you cut & paste this sample, replace the second “IDisposable” below with
// a real type that implements IDisposable. Also, consider whether you need to
// lazily initialize the resource, or whether allocating it in the constructor
// is sufficient.
// For more info on how to use SafeHandle in an IDisposable wrapper type like
// this, look at the SafeHandleDemoV2 on the CLR Base Class Library team’s blog:
// https://blogs.msdn.com/bclteam/archive/2006/06/23/644343.aspx
using System;

public class UsesADisposableResource : IDisposable
{
    private “IDisposable” _resource; // The field’s type should be some useful
    // type that implements IDisposable, such as Stream, TextReader, Process, EventLog, etc.
    private bool _disposed;
    // Invariant: This instance of UsesADisposableResource is disposed iff
    // _disposed is true. _resource may be lazily initialized or allocated in
    // the constructor, but its lifetime does not exceed the lifetime of
    // UsesADisposableResource.

    public UsesADisposableResource() {
        _resource = ...; // Often people initialize a resource here, or wrap
        // some disposable resource passed as a parameter to the constructor.
        _disposed = false;
    }

    public void Dispose() // Note that Dispose(void) is public and non-virtual
    {
        Dispose(true);
        GC.SuppressFinalize(this); // In case a subclass adds in a finalizer
    }

    protected virtual void Dispose(bool disposing) {
        // Note: If you need thread safety, use a lock around these operations,
        // as well as in your methods that use the resource.
        if (!_disposed) {
            // Dispose of the underlying resource, but only if we’re eagerly
            // disposing. If we’re finalizing this instance, then the underlying
            // type might get finalized before this instance (due to a lack of
            // ordering among finalizable objects). The only exception to this
            // rule would be critical finalizable objects like SafeHandle, where
            // there is a very weak ordering: critical finalizable objects are
            // finalized after normal finalizable objects that the GC detects
            // are unreachable during the same collection.
            if (disposing) {
                if (_resource != null)
                    _resource.Dispose();
            }
            // Additionally, if we’re finalizing, ensure that we don’t rely on
            // static variables, or during appdomain unloading, we might find
            // that our static variables point to finalized instances of objects!
            // We can protect ourselves from that possibility by never touching
            // static variables in our cleanup logic if Environment.HasShutdownStarted
            // returns true. Most people never have the need to touch disposable
            // static variables during finalization, so it is easy to overlook this restriction.
            _resource = null;
            _disposed = true; // Indicates this instance has now been disposed.
        }
    }

    // Precondition: This instance must not be disposed.
    public void DoSomethingWithResource() {
        if (_disposed)
            // ObjectDisposedException has no parameterless constructor;
            // pass the name of the disposed object.
            throw new ObjectDisposedException(GetType().Name);
        // If you are lazily initializing _resource, then ensure that _resource
        // has been initialized at this point, by calling a helper initialization
        // method.

        // Do something interesting here, like read or write to the underlying resource
    }
}

I hope this helps everyone understand the Dispose pattern & object lifetime better.  Here’s a set of related links: