A few questions and answers about using IDisposable


I wrote a little bit on Disposable objects and their uses in Chapter 5 of the Performance and Scalability PAG (look here and the following sections) and it triggered a lively discussion here at MS the other day.  Here’s the bit of interest and some of the discussion reduced to Q&A form:



“The reason you want to avoid finalization is because it is performed asynchronously and unmanaged resources might not be freed in a timely fashion. This is especially important for large and expensive unmanaged resources such as bitmaps or database connections. In these cases, the classic style of explicitly releasing your resources is preferred (using the IDisposable interface and providing a Dispose method). With this approach, resources are reclaimed as soon as the consumer calls Dispose and the object need not be queued for finalization. Statistically, what you want to see is that almost all of your finalizable objects are being disposed and not finalized. The finalizer should only be your backup.”


Q: The whole point of GC is to manage the lifetime of objects in a sensible way. Doesn’t this “guideline” (using IDisposable) suggest that objects in certain categories (with attached unmanaged resources of any kind) cannot participate in GC – and we require the client to manage it outside the GC apparatus? 


A: Yes, that is basically correct. Finalizable objects generally require the client to take additional action, such as wrapping their use in a "using" statement.  If the client does not do this, they will get sub-optimal performance.  It is not mandatory, however; what you want to do is Dispose all the objects that can readily be Disposed because they have an easily understood lifetime, and let the GC handle just the exceptions, which hopefully are much rarer.
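
For example, the "additional action" for the common case is just a "using" block. A minimal sketch (SqlConnection stands in for any expensive disposable resource, and connectionString is assumed to be defined elsewhere):

// assumes: using System.Data.SqlClient;
using (SqlConnection connection = new SqlConnection(connectionString))
{
    connection.Open();
    // ... use the connection ...
}   // Dispose runs here, even if an exception is thrown, so the unmanaged
    // resource is released immediately and the finalizer is only a backup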
 
Q: Doesn’t this make it much harder to use classes that have a finalizer?
 
A: That has not been my experience. Generally the developer who creates the class cannot say what the lifetime of its instances will be, but that lifetime is most often simple, so they should provide a facility (Dispose) which allows the customers of the class to take advantage of the common case and get superior performance.
 
Q: Consider the case where the current managed implementation has an underlying unmanaged implementation. Doesn’t that lead you to a design where there are unmanaged resources attached to virtually all objects?
 
A: This is an exceedingly bad situation to be in.  If you truly have finalizable state in all your objects you will swamp the finalizer thread and give the GC fits because all your objects will be living at least one generation longer than they otherwise would have had to live.  I would urge you to find a way to consolidate the unmanaged state into comparatively few objects which can then be recycled like (e.g.) database connections or something of that ilk.  If your objects are, on the other hand, not transient then you may be ok because recovery of the unmanaged resources is a rare event in any case.
 
The use of the Dispose pattern is not advice that is lightly given.  You may be very disappointed with the performance you get if you follow another pattern and are not very careful.
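
To make that concrete, here is a minimal sketch of the kind of finalizable leaf wrapper this guidance has in mind; NativeBitmap and ReleaseHandle are placeholders, not a real API:

using System;

// A leaf wrapper: its only state is the unmanaged resource it owns.
public sealed class NativeBitmap : IDisposable
{
    private IntPtr _handle;      // the unmanaged resource
    private bool _disposed;

    public NativeBitmap(IntPtr handle)
    {
        _handle = handle;
    }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);   // cleaned up; finalization is no longer needed
    }

    private void Dispose(bool disposing)
    {
        if (!_disposed)
        {
            ReleaseHandle(_handle);  // placeholder for the real release call
            _handle = IntPtr.Zero;
            _disposed = true;
        }
    }

    // Backup only: runs if the consumer never called Dispose.
    ~NativeBitmap()
    {
        Dispose(false);
    }

    private static void ReleaseHandle(IntPtr handle)
    {
        // P/Invoke the actual unmanaged cleanup here
    }
}

The consumer then writes "using (NativeBitmap bmp = new NativeBitmap(handle)) { ... }" and the finalizer serves only as the safety net.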

Comments (23)

  1. David Levine says:

    The topic of Finalization and Dispose gets complicated pretty quickly – there are lots of variations. One of the problems we’ve run into is that you can’t really get very far without using some underlying system resource or legacy API (e.g. COM) that has destructor semantics (Open/Close, or Create/Destroy) associated with it.

    We’ve gotten to the point where we have our own internal guidelines for classes that need either a finalizer or a Dispose.

    1. If you need either one then you must implement both.

    2. The Dispose method calls GC.SuppressFinalize(this) after performing the dispose operation. We usually provide a method Dispose(bool disposing) for this; it seems to be almost a standard approach anyway.

    3. The finalizer calls the same dispose method.

    4. Clients consuming a class with a Dispose method are required to treat it as part of the contract; if it has a Dispose() method it must call it when the object is no longer needed.

    5. In the debug build the finalizer will Assert, throw an exception, print to the debug port, etc., so that developers are made aware that they created an object that they did not properly dispose of and for which they did not honor the Dispose contract. We don’t have a guideline yet for the release build, but I tend toward thinking that a Trace statement is useful to provide breadcrumbs, on the expectation that it ought to happen rarely. (A sketch of this appears after the list.)

    6. Minimize the number of classes that need either a finalizer or a Dispose.

    As with all such rules, these are guidelines only – special cases require special handling.
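
    A sketch of how guideline 5 might look, assuming the Dispose(bool) shape from guideline 2 (TrackedResource is just an illustrative name):

    using System;
    using System.Diagnostics;

    public class TrackedResource : IDisposable
    {
        private bool _disposed;

        public void Dispose()
        {
            Dispose(true);
            GC.SuppressFinalize(this);
        }

        protected virtual void Dispose(bool disposing)
        {
            if (!_disposed)
            {
                // release unmanaged state here; if disposing, also Dispose managed members
                _disposed = true;
            }
        }

        ~TrackedResource()
        {
            // Reaching the finalizer means the consumer broke the Dispose contract.
            Debug.Fail("TrackedResource was finalized; Dispose was never called."); // debug builds only
            Trace.WriteLine("TrackedResource finalized without Dispose");           // breadcrumb in release builds
            Dispose(false);
        }
    }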

  2. Mike Dimmick says:

    I’d agree with that, but call GC.SuppressFinalize _first_, before doing anything else. I’ve actually seen (on Compact Framework) a finalizer called while in the middle of executing that object’s Dispose method. I can only assume that the JIT stopped reporting the reference and a collection happened, causing the object to get onto the freachable queue.

  3. Rico Mariani says:

    David writes:

    >> 1. If you need either one then you must implement both.

    If you have a finalizer you should have Dispose, but if you have Dispose you don’t necessarily need to be finalizable. Consider an object that has finalizable members. It need not itself be finalizable, because should it die its sub-objects go onto the finalization queue directly. However, it should be Disposable so that when Disposed it can in turn call Dispose on its members (see the sketch at the end of this comment).

    Re: 2&3. There’s a recommended pattern for Dispose methods that covers these. An example is in the PAG (see link in main article).

    Re: 5. That’s a great defensive mechanism but not always possible with objects that have complex lifetime. I like the Trace idea, it should be rare enough that it doesn’t cause a problem at that level.

    Re: 6. Amen.
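
    For example, re: 1, a sketch of such a non-finalizable container (LogWriter is illustrative; FileStream stands in for any finalizable member):

    using System;
    using System.IO;

    // Holds a finalizable member but is not itself finalizable: if it dies
    // un-Disposed, the FileStream goes onto the finalization queue directly.
    public sealed class LogWriter : IDisposable
    {
        private readonly FileStream _stream;

        public LogWriter(string path)
        {
            _stream = new FileStream(path, FileMode.Append);
        }

        public void Dispose()
        {
            // No unmanaged state of our own and no finalizer; just cascade.
            _stream.Dispose();
        }
    }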

  4. David Levine says:

    Mike,

    The argument I’ve heard in favor of calling GC.SuppressFinalize after disposing the object is that if the dispose throws an exception the call to SuppressFinalize will never be made, so you will get a 2nd chance to clean up on the finalizer thread. This never made much sense because if it threw the 1st time, when other managed objects were still valid, there would be an even greater chance it would throw the 2nd time (on the finalizer thread) when there was no guarantee that managed objects were still valid. But all the examples I’ve seen from MS show it done this way.

    The issue of thread safety is a different concern. If the object can be disposed by two different threads (not the finalizer) then the disposed object itself should handle the threading issues to prevent multiple simultaneous calls to Dispose from causing problems. If it’s because the Dispose method may get called from an object at the same time the finalizer thread disposes the object, then a fix may be very simple.

    In the dispose method add a flag that is used like this…

    public void Dispose()
    {
        if (!_disposed)
        {
            DoSomeCleanup();
            this._disposed = true; // set an instance field
        }
    }

    This should cause the JIT to report a reference to the GC preventing it from getting finalized until the _disposed flag has been set. It still isn’t thread safe but it should prevent it from being put on the freachable queue prematurely.

  5. David Levine says:

    Rico,

    re: 1) Dispose but no finalizer

    I agree that you don’t always need a finalizer but I would argue that it probably should have one if it has a Dispose. I contend that Dispose should be treated as a contract requirement and that not invoking it should be treated as an error. The finalizer provides a mechanism that allows us to catch these omissions. It may be that it isn’t strictly necessary and may add overhead, but if the object’s contract is honored the object will never get put on the freachable queue anyway.

    I would also argue that users of a class should not have incestuous knowledge of its inner workings – they should not "know" that it’s ok to not invoke the finalizer. Its implementation can change at any time and invalidate those assumptions.

    But in the end, it’s only a best practices guideline, not a requirement.

    regards,

  6. Rico Mariani says:

    public void Dispose()
    {
        if (!_disposed)
        {
            DoSomeCleanup();
            this._disposed = true; // set an instance field
        }
    }

    This sort of thing isn’t necessary: the "this" pointer is necessarily live while any member is running, including DoSomeCleanup — all manner of disasters would ensue if that were not the case.

    Sometimes people do see finalization happening concurrently with disposing. This isn’t a normal situation but it is possible with exotic kinds of cleanup — people often call the standard Dispose of another object in their finalizer — this is generally a bad idea because the object may itself have already been finalized though it is still live via your object.

    Remember all the object members of a finalizable object remain live but finalization order is not guaranteed so your members may have already been disposed or finalized. It’s best to ignore them entirely in your finalizer and only release unmanaged state you directly own.

    You avoid this problem entirely if you follow my guideline of never having a finalizable object have any state other than the unmanaged state which it owns (i.e. make it a leaf object).

    When Disposing this is not the case: you should Dispose your members, if any. Keep in mind that if you have members to dispose, and you’re following my guideline, then all you are doing is holding disposable objects, and you yourself would not be finalizable because you have no unmanaged state of your own to clean up.

    There is more detail on this in the PAG at this link

    http://msdn.microsoft.com/library/en-us/dnpag/html/scalenetchapt05.asp?frame=true#scalenetchapt05_topic13

  7. Andy says:

    Is there any way to detect if a developer forgot to call Dispose on an object that implements IDisposable without having a finalizer as someone else suggested? Can FXCop do this?

  8. Rico Mariani says:

    Can’t do it in general via static analysis (ala FXCop) but maybe some of the more common cases could be handled that way.

  9. David Levine says:

    Rico,

    re: finalizers running while in an instance method.

    Are you sure that it is not necessary to ensure the "this" reference is kept alive? My understanding of how the JIT reports references to the GC may be flawed or incomplete.

    The information I had stated that executing an instance method was not enough to ensure that the JIT would report that instance to be alive.

    Chris Brumme’s article on this <http://blogs.gotdotnet.com/cbrumme/permalink.aspx/e55664b4-6471-48b9-b360-f0fa27ab6cc0> indicates that an instance is eligible for collection even while executing an instance method.

    In other words, given the statement:

    new SomeObject();

    If the reference to the newly created SomeObject is not used (e.g. not stored in a root) then it can be collected while its constructor is still running.

    Am I reaching an invalid conclusion here, or has the implementation changed so that this is no longer the case?

  10. Rico Mariani says:

    I’ll talk to Chris about it, I think it’s worth getting a definitive answer out there.

    I can say this much though, and this is an important special case.

    If you write this (as you wrote above):

    new SomeObject();

    The code does not have the new object pointer until after the constructor has finished running. Whether you then store it afterwards or not cannot make a difference as to what happened during construction. If the JIT reported a member variable or local of your method as a live root that would be all fine and well, but that root couldn’t possibly be *holding* the value during the construction anyway because it doesn’t yet have the new object reference.

    So either the constructor itself, or else the CLR in the context of creating the object, must necessarily keep the object alive; otherwise, if a collection was triggered, your object would go away.

    Speaking very broadly, there are many complex lifetime issues like this and if you had to think about any of them the whole model would be a disaster. It has to "just work".

    Now you can imagine cases where, even though a method is running, the "this" pointer is no longer reachable. If there is truly such a case then collection is perfectly fine: you can’t get to the object anymore, so as a practical matter it’s impossible to know if it’s gone. If someone else can access the object then by definition it’s still reachable and it won’t be discarded even though the member can’t reach it.

    I do not see the point of code like this:

    DoSomeCleanup();

    this._disposed = true; // set a instance field

    Either DoSomeCleanup() needed the this pointer or it didn’t. As long as it needed it then it would be live. The normal lifetime management will do the job just fine.

  11. Rico Mariani says:

    Another quick followup: I read through Chris’s article, and his issue was with regard to the *unmanaged* state and the cleanup implications.

    So I think now I can safely amend what I’ve written.

    Lifetime extension tricks like this one:

    this._disposed = true; // set a instance field

    may be useful if you need to force the lifetime of an object to be extended so that it for sure doesn’t go away before the unmanaged resource it controls.

    Important: when I wrote this

    ‘The "this" pointer is necessarily live while any member is running’

    That was wrong (!)

    "this" is generally reachable for the whole method but it is possible that it might become unreachable and then you could have issues with *unmanaged* state as Chris wrote. As for managed state, you can’t tell the difference which is why in the usual cases it "just works".

  12. David Levine says:

    I’ve been thinking some about this and I came up with an example of how a constructor can run at the same time as the finalizer.

    using System;

    class MainClass
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Creating object");
            new MainClass(); // don’t save the reference
            Console.ReadLine();
        }

        public MainClass()
        {
            Console.WriteLine("MainClass.ctor calling collect");
            GC.Collect();
            Console.WriteLine("MainClass.ctor done");
        }

        ~MainClass()
        {
            Console.WriteLine("MainClass Finalizer");
        }
    } // MainClass

    A release version produces this output…

    Creating object
    MainClass.ctor calling collect
    MainClass Finalizer
    MainClass.ctor done

    This shows that the finalizer ran before the constructor had finished running.

    A debug build produces…

    Creating object
    MainClass.ctor calling collect
    MainClass.ctor done

    From what I’ve read, in a debug build all references are reported as live until the end of a method body (to make debugging easier), so the different outputs make sense.

    This is a contrived example but it does demonstrate that unless there is explicit code that keeps an object reference rooted it can be collected regardless of what is actually executing. I don’t expect many people to run into problems because of this (it’s definitely an edge case) but it does demonstrate that there are subtleties to the GC, and that there may be some differences in behavior between debug and release builds, especially differences related to memory management.

    This might also affect code written like this…

    new SomeObject().SomeInstanceMethod();

    It might also bite people who put their entire application into their constructor (I’ve seen some lo-o-ong constructors).

    I’ve read Chris’s revised blogs on lifetime issues, handles, and finalization, and I haven’t really processed all of it yet – there’s a lot there to digest – but I didn’t see anything coming in Whidbey that will invalidate the results here.

    Thanks for the attention you’ve given this and for digging into it.

  13. Rico Mariani says:

    This gets curiouser and curiouser.

    I think there’s a problem here but I’m going to have to do more digging to be sure. I don’t think the debug version of the code enters into it. I can make the debug version exhibit the same behaviour.

    I changed your test case a little bit to force the issue and to illustrate that storing the reference makes no difference as it is by then far too late.

    Here is the altered version:

    using System;

    class MainClass
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Creating object");
            MainClass mc = new MainClass(); // this doesn’t save you from getting a finalized object (!?!)
            Console.ReadLine();
        }

        public MainClass()
        {
            Console.WriteLine("MainClass.ctor calling collect");
            GC.Collect();
            Console.ReadLine();
            Console.WriteLine("MainClass.ctor done");
        }

        ~MainClass()
        {
            Console.WriteLine("MainClass Finalizer");
        }
    } // MainClass

  14. Rico Mariani says:

    I’m telling you blogs are so educational, everyone should have one :)

    When I saw your example I went into Red Alert Mode, because I thought it was hopeless for the poor caller to arrange to always get a valid object in these cases: there is no way for the caller to ensure that a reference is reported for the duration of the .ctor, because he has no reference to the object yet. Lacking that reference, it wouldn’t matter at all what you did after the .ctor; whether you use the object or not, you can’t report a reference to it, so you’d be doomed to have a chance of a broken finalizable object.

    The way I thought this worked was that while the .ctor was running the VM reported a reference to the pending object, so an object could never be cleaned up while it was being constructed. That would neatly avoid the problem, but it would have the undesirable effect of making .ctor invocation slightly magical, in that there is an extra burden when making such a call.

    It turns out that the real solution is a little bit different and it neatly matches what we’ve been seeing. So, on to the explanation.

    The reason this hasn’t been ruining everyone’s life is that construction of the object happens in two parts. First the memory is allocated by a helper; this interim state is not visible via normal managed code because the next thing we always do is call the .ctor. Now what’s good about this is that once the allocation is complete, the function that invokes the .ctor can report a reference to the as-yet-unconstructed object just like any other object, and normally it does exactly that.

    Here’s the annotated x86 that is generated in a more normal case where the object lives, showing why it works. The two phases are readily visible. I have changed the example so that the new reference is stored in a static variable, which keeps it alive.

    IN0000: 000000  push ESI
    // check for stdio initialization
    IN0001: 000001  cmp gword ptr [classVar[0x5b9dc6e8]], 0
    IN0002: 000008  jne SHORT G_M001_IG04
    // initialize console
    IN0003: 00000A  mov ECX, 1
    IN0004: 00000F  call System.Console.InitializeStdOutError(bool)
    G_M001_IG04:
    // write the ‘Creating object’ message to the console
    IN0005: 000014  mov ECX, gword ptr [classVar[0x5b9dc6e8]]
    IN0006: 00001A  mov EDX, gword ptr [08AD1054H] ‘Creating object’
    IN0007: 000020  call dword ptr [(reloc 0xde30014)]System.IO.TextWriter.WriteLine(ref)
    // allocate the MainClass object
    IN0008: 000026  mov ECX, 0xdd40e10
    IN0009: 00002B  call ALLOCATE_MEMORY_HELPER
    IN0010: 000030  mov ESI, EAX // save the new object in ESI
    // ESI is a reported GC root at this point, preventing disaster
    // set up for the .ctor call; it’s just a regular method call like all other methods
    IN0011: 000032  mov ECX, ESI
    IN0012: 000034  call [MainClass..ctor()]
    // get static storage for MainClass and write the reference
    IN0013: 00003A  lea EDX, bword ptr [classVar[0xdd40d3c]]
    IN0014: 000040  call ASSIGN_REF_ESI_HELPER
    // wait for a line of input from the console
    IN0015: 000045  call System.Console.get_In():ref
    IN0016: 00004A  mov ECX, EAX
    IN0017: 00004C  call dword ptr [(reloc 0xde3002c)]System.IO.TextReader.ReadLine():ref
    IN0018: __epilog:
            000052  pop ESI
    IN0019: 000053  ret

    Given the way I thought about this prior to about a half hour ago, you can see why I would be alarmed by the behaviour of your example. My expectation would have been that it would not matter a whit whether I stored the new object in a static variable or a local one, because even if I report the static or local, the .ctor hasn’t returned yet, so my variable doesn’t hold the actual object reference.

    However, because our model is that the allocation is separate from the construction in the generated code (but not in the IL), all you have to do is use the object and everything works out great.

    So now there are two cases where things might be weird. Let me dispose of those.

    1) My object is finalizable but has only managed state

    When you construct the object either it is live afterwards or it isn’t. If it is live then there is no chance that it will be finalized while the .ctor is running. If it is not live then we must have reached a point in the execution where the object is reachable from neither the .ctor nor the calling function. Since the object is unreachable the caller cannot observe the finalization and neither can the code in the .ctor, everything will look "normal" to all the parties. Modifications to the object are also necessarily finished so the object will look just as it will when the .ctor is finished so the finalizer also sees everything just as it should be.

    2) My object is finalizable and wraps unmanaged state

    As discussed earlier in this blog, when writing unmanaged wrapper objects it is important to arrange for the lifetime of the object to exactly match the lifetime of the underlying unmanaged resource. To accomplish this it is often necessary to use GC.KeepAlive(this). The .ctor must follow the same rules as every other method.
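
    A sketch of what that looks like in practice (NativeWidget, NativeDraw and ReleaseHandle are placeholders for a real wrapper and its P/Invoke calls):

    using System;

    public sealed class NativeWidget : IDisposable
    {
        private IntPtr _handle;   // the unmanaged resource this object owns

        public NativeWidget(IntPtr handle) { _handle = handle; }

        public void Draw()
        {
            NativeDraw(_handle);  // placeholder for the real P/Invoke call
            // Keep "this" reachable until the native call is done, so the finalizer
            // cannot run (and release the handle) out from under it.
            GC.KeepAlive(this);
        }

        public void Dispose()
        {
            if (_handle != IntPtr.Zero)
            {
                ReleaseHandle(_handle);
                _handle = IntPtr.Zero;
            }
            GC.SuppressFinalize(this);
        }

        ~NativeWidget()
        {
            if (_handle != IntPtr.Zero)
                ReleaseHandle(_handle);
        }

        private static void NativeDraw(IntPtr h) { /* P/Invoke here */ }
        private static void ReleaseHandle(IntPtr h) { /* P/Invoke here */ }
    }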

    I knew this was subtle but it’s even more subtle than I imagined.

  15. David Levine says:

    The mental model I’ve been using is similar to yours… when an object is newed, an allocator is called which allocates the memory on the heap, a reference is returned, and then the .ctor is invoked; once the memory allocator has returned an object reference it is immediately available for collection. In other words, it starts life with the same degree of aliveness as any other object.

    I’ve been looking at the assembly listing and I have a small concern. On line IN0009 it calls the memory allocator and on line IN0010 it stores the returned reference in ESI; if a collection was triggered after that point this should be enough to keep the object reachable, since ESI is a GC root. However, it isn’t clear to me what keeps the reference alive if a collection is triggered after IN0009 has been executed but before IN0010 begins. Unless there is a guarantee that the code cannot be interrupted between these two lines, this is a race that eventually we will lose. I wonder if there is a low-level lock that prevents this, or perhaps an implicit KeepAlive that keeps it alive until after the .ctor has run. I am curious.

    I agree with your conclusions with one minor addition. There may be code in the .ctor and finalizer that is not object instance related but which uses static methods on other classes or calls Win32 APIs. Things might get confusing if both the .ctor and finalizer ran simultaneously and executed conflicting code. In other words, special cases require special care. This is similar to your second conclusion – in this case the object doesn’t wrap an unmanaged resource but it does wrap an execution stream, and to prevent problems the lifetime of the object should match the requirements of the execution stream.

    Another aspect I find interesting (and which relates back to the original topic) is that all the examples I’ve seen have the Dispose method written like this…

    public void Dispose() // IDisposable implementation
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    One question that came up earlier is how the finalizer could run at the same time as the Dispose method; we’ve seen how that is possible. Given the two lines of code above, it is also clear that there is a race between the two lines, so that the finalizer may be called after the Dispose method has executed but before the call to SuppressFinalize suppresses finalization of the object.

    I prefer reversing the order of the code…

    GC.SuppressFinalize(this);
    Dispose(true);

    This should eliminate the race and also prevent the Dispose method and the finalizer from ever executing simultaneously. It might also be less error prone for the average developer because of the implicit elimination of the race and of double execution of the Dispose method.

    The only argument I’ve ever heard for putting them in the other order is that if the Dispose method throws an exception the finalizer will still run later, because SuppressFinalize will not have been called. I don’t consider this to be a valid reason, because I would expect it to be even more likely to throw an exception when invoked a second time from the finalizer; if there is code that needs to be exception resilient then it should be in the method body itself. This arrangement might also result in a small performance win by suppressing finalization of the object a little sooner.

    Yes, there are some subtleties here!

    Regards,

  16. Rico Mariani says:

    Hmmm, I’m not sure doing the GC.SuppressFinalize(this) first actually removes the race, though it does further restrict the window. I’m pretty sure the race is still there. I think the real solution to that problem is to use GC.KeepAlive(this) for those rare cases where there is a lifetime problem.

    So the choice of doing GC.SuppressFinalize(this) first or last should come down to the point at which you feel it’s safe for the finalizer not to run. I think our guidance has been that it’s generally safer to leave the finalizer in place until you’re sure cleanup is complete, which argues for doing the suppress last.

    When you consider the possibility that there could be derived classes in the picture which want to Dispose their own state and then call the base class Dispose method, you rapidly conclude that it must be the first class that introduces Dispose which does the suppress, and since it is the last to be called, no semantic other than suppress-last has much hope of being achieved.
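
    A sketch of that shape (Base and Derived are illustrative):

    using System;

    public class Base : IDisposable
    {
        public void Dispose()
        {
            Dispose(true);
            GC.SuppressFinalize(this);   // the class that introduces Dispose does the suppress, and it happens last
        }

        protected virtual void Dispose(bool disposing)
        {
            // base class cleanup here
        }

        ~Base() { Dispose(false); }
    }

    public class Derived : Base
    {
        protected override void Dispose(bool disposing)
        {
            // derived state is cleaned up first...
            base.Dispose(disposing);     // ...then the base class, last in the chain
        }
    }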

    Well, that’s how I see it from here anyway :)

  17. Jonathan Perret says:

    I don’t think there’s a problem with putting GC.SuppressFinalize(this) at the end of the method.

    GC.SuppressFinalize(this) implies GC.KeepAlive(this) (AFAIK GC.KeepAlive is just an empty method with a convenient name).

    In other words, there is no way the object can be eligible for finalization before execution reaches GC.SuppressFinalize(this), since this call requires access to "this" and therefore functions as an "anchor" that keeps it alive until that point.

    A related point regarding Rico’s "modified" example with the line :

    MainClass mc = new MainClass(); // this doesn’t save you from getting a finalized object (!?!)

    As far as I understand, just adding "MainClass mc=" in front of the allocation does not by itself change anything in the semantics of the program as passed by the compiler to the GC.

    As long as you don’t refer to ‘mc’ further, the variable can be safely removed.

    Chris Brumme’s weblog is indeed a great read for understanding this kind of problem…

  18. Akshay Kumar says:

    What happens if we call Dispose multiple times on an object?

    I tried it on SqlConnection and no exception etc. was thrown.

    The reason I ask is because I have written a wrapper for an object which provides a Dispose method, and I expose that object as a property.

    Now a developer can call the Dispose method on the underlying object.

    I also implemented a Dispose method and a finalizer, so that if a developer forgets to call Dispose on the underlying object it still gets cleaned up.

    So what are the performance implications?

  19. Rico Mariani's WebLog says:

    Two quick hrefs