FAQ: Why does FxCop warn against catch(Exception)? – Part 3 [Nick Guerrera]


This is the third installment in a three-part series on why FxCop warns against catch(Exception):
FAQ: Why does FxCop warn against catch(Exception)? – Part 1
FAQ: Why does FxCop warn against catch(Exception)? – Part 2
FAQ: Why does FxCop warn against catch(Exception)? – Part 3


I said from the beginning that this issue is controversial, and some of your feedback certainly confirms that. I want to make it clear that I respect everyone’s right to disagree with me, to exclude or disable the FxCop warning, and to implement a less rigid exception policy in their application code as they see fit. Furthermore, all of the opinions I’ve expressed are entirely my own and they represent a stricter interpretation of the Framework Design Guidelines than absolutely required. I believe strongly that the approach that I’ve outlined helps to find and fix defects sooner during development and testing and therefore helps to deliver more robust production software. Nevertheless, the choice to adopt a different strategy for your application is entirely yours.


 


On the other hand, as I mentioned in the previous post, this FxCop rule is of paramount importance for class library code. When class libraries call back to application code (via virtual methods, interface methods, events, or delegates), it is disastrous if they swallow or disguise the arbitrary exceptions that may be raised. (To see just how aggravating this can be for library consumers, witness the frustration with the WebBrowser control that readers expressed earlier.) Furthermore, class library code cannot make any assumptions about the implications of arbitrary exceptions for application code further up the call stack, so it must let them go so that the application can decide how to handle them.


 


With that said, let me now address the most recent comments with more of my own personal opinions. 🙂


 


 


Jeremy: “… I hope you do realize that a number of the recommendations you have given to people, myself included, basically amount to ‘let your app crash, your customers and business partners be damned.’ …”


 


No, I most certainly do not realize that, because it’s simply not true. Applications which catch only the precise exceptions that they are prepared to handle gracefully are easier to develop and maintain over the long run. When this strategy is combined with strong testing, the application will ship with fewer defects and there will be very few unhandled exceptions in the wild. The truth is that once an unexpected exception has been raised in your application, customers and partners are likely to suffer one way or another. The impact of corrupt program state can range from security vulnerability to plain incorrect output, which can end up costing customers more time and money than dealing with a crash.


 


Please don’t misconstrue this to mean that it’s ok to write sloppy software which crashes regularly in common cases. Clearly that’s unacceptable and I never said otherwise. However, the right way to avoid this fate is to design your software carefully, implement it methodically, and test it thoroughly.


 


 


Jeremy: “I think all of your readers know full well that applications need to be designed to fail fast and do so clearly in order to ease the diagnosis and correction of software defects.”


 


I’m glad that we agree that applications should be designed to “fail fast.” From my point of view, once an exception is raised in your application and you haven’t decided exactly how to handle it on its own terms, then the application has already “failed” and process termination represents the “fast” part of the equation. 🙂


 


 


Jeremy: “We do this, however, by modularizing software and testing it at an appropriate level of granularity, not by letting the whole app fall down in production.”


 


I do agree that it’s possible to sandbox components (for example, by using separate app-domains and ensuring that all shared data is immutable) such that the failure of a single component does not imply failure for the entire process. This then becomes, as you say, a question of granularity. You can think of each component as a mini-process, and then apply all of my arguments on a component-by-component basis. That is to say that when an isolated component throws an unexpected exception, its state must be discarded and it must be reloaded before it can reliably be consumed again. If your application is hardened in this way, then you would in fact need to catch (Exception) in all of the locations that you mentioned in your first comment in order to save the host process from termination. I missed that in my original reply and I apologize.


 


However, I’d like to point out that it’s difficult and costly to implement this strategy correctly and I still maintain that the approach I’ve been recommending is perfectly viable for most applications.


 


 


Jeremy: “Your recommendations will only work well for the people replying to your posts if you are interested in personally supporting their products when deployed in the field.”


 


Given the choice between an application that abuses catch (Exception) and another which catches only the exceptions that it can handle gracefully, I’d choose to support the second application in a heart-beat.


 


 


 


Jeremy: “Their customers and partners don’t care about what FxCop might have recommended to the developer.”


 


Not directly, but they do care about software quality, reliability, and overall correctness. Let’s take two familiar examples: Microsoft Excel and the C# compiler. I don’t want Excel to crash, but it would be much worse if it ever produced an incorrect value in a spreadsheet that I use to manage my personal finances. Similarly, I prefer to see the compiler ICE (Internal Compiler Error) in favor of emitting bad code. Of course, in an ideal world, Excel would never crash and the compiler would never ICE, but I’ve seen both happen and I’m sure it was in my best interest even if it upset me at the time.


 


 


Niall: “I think you have to realise that exception handling is not just for the developer’s benefit. The application is there to serve the user, not the developer, so simply bombing out because you can’t display a new form is going to require the user to restart the application more than is really required.”


 


Yes, the application is there to serve the user, not the developer and I couldn’t agree more. In that situation, you have to ask yourself why you couldn’t display the form. If the only reason you can provide is that your catch (Exception) block was hit, then you really have no idea whether it’s in the best interest of the user to continue executing. For example, what if one of your assemblies just failed to load, or you’ve run out of memory, or the root cause is actually a corruption in shared data? What will go wrong next? On the other hand, if you catch (IOException) while trying to read the CSV file that will ultimately be used to populate the new form, then you know beyond a shadow of a doubt that the correct application behavior is to inform the user that the file could not be opened and allow them to continue to use the application which remains in a consistent state.


 


 


Niall: “In a previous project I worked on, we had a lot of places where exceptions were caught and error reports were sent back to the dev team. Therefore, the exceptions weren’t just disappearing quietly, they were known to the developers.

In many circumstances, the application would then continue. This is because a lot of our handling of exceptions would result in the form the exception came from closing or not appearing. Basically that element of the workflow did not occur. For almost all cases, this left the application in a state that allowed it to continue without problems. Of course, it wasn’t 100% perfect. We did have a threshold of a certain number of exception reports in a certain amount of time would cause the handler to exit the app, though.”


 


Both exception logging and using an exception count threshold are popular approaches for applications which catch all exceptions in places. In fact, this is actually exactly how the current version of FxCop works. Our experience with FxCop’s current exception handling policy has not been positive at all and so I would not suggest this approach universally. Nevertheless, as the application developer, it’s up to you to choose the appropriate policy.


 


 


Niall: “Especially when the environment is such that you cannot deploy a fix at a day’s notice, having the users restarting the application frequently when you could provide better is basically developing to suit the developers instead of the users.”


 


If it turns out that your application is crashing “frequently” enough that users have to restart it all the time, then it wasn’t ready to ship in the first place. By the time you’re ready to ship, you should have uncovered nearly all of the exceptions that you need to handle or prevent, and those that remain should be extremely uncommon. Besides careful testing on your end during the development cycle, it’s important to provide regular preview releases for your customers and partners to pound on. During that time, it will be invaluable that exception-related bugs are easy to diagnose and fix quickly. By the time you ship, you will have software which neither crashes nor swallows bugs!


 


 


Niall: “Of course, it can make diagnosing problems more difficult, but the role of the software is to make the user’s job easier, not the developer’s. Sometimes the developer just has to work a bit harder because otherwise the work goes on the user’s plate.”


 


I never meant to imply that ease of diagnosis is an end unto itself. It is a means through which you can deliver better software with fewer defects, which most definitely serves to make the user’s life easier. Indeed, the developer often does have to work harder: uncovering the specific exceptions that need to be handled and the additional code that needs to be in place to prevent the others is often harder than slapping in an additional catch (Exception) block, but it’s the right thing to do…

Comments (22)

  1. I’ve made my first contributions to the FxCop team blog on a subject that’s near and dear to my heart…

  2. ploeh says:

    Thank you for these posts on exception handling. As a consultant, I see a lot of catch(Exception) in customer code, and also often need to explain why an application should be allowed to crash by an unhandled exception.

    I think that one of the reasons why you are meeting so tough resistance is because ‘leaving an application in an inconsistent state’ is a very abstract term. I usually illustrate the concept with this real-life anecdote:

    Back in the old classic ASP days, I had a colleque who always, despite being told otherwise, started his functions with On Error Resume Next (which is closely analogous with try/catch(Exception)). At one point, this led to the customer reporting a bug where all prices on their e-commerce site were 0!

    What had happened was that the code which pulled out prices from the database failed, but since the error was suppressed, all prices took on their data type’s default value, which was zero.

    This is what an incosistent state is. What is better: Letting the web site crash if it can’t connect to the price database, or continue on with all prices being zero?

    I know what the web site owner would prefer.

  3. Jeff Stong says:

    Nick, on the FxCop blog, follows up with some additional discussion about catch(Exception) here and here. …

  4. MSDNArchive says:

    Ploeh, thanks for sharing that example. I agree that it helps get the point across.

  5. Jeremy Gray says:

    Nick, not one person commenting on your posts has ever suggested catching Exception with some kind of hope that the software will be able to magically recover from the situation. That kind of thing takes major work, often more work than is necessary and almost always more than is desirable.

    As for how applications behave when in trouble, this is obviously highly dependent on the type of application in question and I can only conclude that I’m comfortable agreeing to disagree with you on the basis that we must obviously work on very different types of applications.

    Thanks for taking the time to reply in detail on the comments of so many of your readers. This discussion has proven very interesting indeed.

  6. Niall says:

    Nick,

    Of course you’re correct that deployed applications that crash frequently shouldn’t be deployed. However, there are many circumstances that lead to this happening anyway.

    It’s not all dev teams that have a user base and budget as large as MS’s that allow it to do something like have a public beta, that is very close to final functionality, in front of thousands of users, for months, as MS have done with Visual Studio. Even after all the CTPs, betas, RCs, etc, there are still plenty of bugs found pretty quickly in VS 2005.

    In our situation, the cost of taking the user’s time for testing (in terms of the value they’re not adding by doing their job) is far greater than the cost of development time. This leads to a business that has a low level of interest in providing much help in terms of serious testing. The cost of the software being late one day is much greater than the cost of a day of development.

    Of course, the end result is less quality. So as much as we’d like to have lots of testing, so bugs were few and far between, so we could fail fast when, once in a blue moon, an exception occurred, it’s just not going to be the reality. No amount of telling the business "This is not how you write good software" is going to change things.

    So while I agree that the Fail Fast method is the "correct" approach, because it precludes invalid state in the application, I’m not sure how realistic an approach it can be for quite a lot of common application dev scenarios where the business controls dev budget/time frame.

    So I guess if the rule is still relevant to some cases, then it should be kept, and the rest of us can just suppress it, which always gives me a nice warm fuzzy feeling inside 😛

  7. Jeremy: “Nick, not one person commenting on your posts has ever suggested catching Exception with some kind of hope that the software will be able to magically recover from the situation.”

    But when you catch Exception that’s what you are doing; praying that the application is still in a valid state even though you have no idea what went wrong, let alone how to recover from it. TypeLoadException and TypeInitializationException, no worries! InvalidProgramException, easy!, AccessViolationException, not a problem! OutOfMemoryException, StackOverflowException and ExecutionEngineException, without a concern! (Luckily for your users, the last three will now bring your process in .NET 2.0 by default, even in the face of a catch(Exception)).

    Can you see the pattern? The above exceptions are all problems that you more than likely can’t recover from. AccessViolationException is a particularly nasty one; when thrown can indicate that memory has been corrupted, how could you possibly handle that?

  8. JocularJoe says:

    > Applications which catch only the precise exceptions that they are prepared to handle gracefully are easier to develop and maintain over the long run.

    You haven’t provided evidence to back up this assertion.  Apart from the straw man comparison with applications that swallow exceptions willy-nilly, which no-one is suggesting is right.

    I maintain that a WinForms application that (a) uses catch(Exception) only in UI event handlers and (b) always logs the exception and displays an error message is just as easy to develop and maintain.

  9. bkg4211 says:

    I think that both camps are correct.  The approach that I have taken to this is based on the following design goals for a custom web portal that I have developed:

    1) Minimize use of try…catch, explicitly handle exceptions only if you can add value by catching the exception.

    2) Do not ignore exceptions so they ‘disappear’

    3) Do not allow ‘After market’ extensions (e.g. Portlets) to halt portlet processing

    4) Ensure consistency of state in Session and Application caches

    The exception handling strategy that I adopted as a result of these considerations is a hybrid of the two approaches discussed here.  And was implemented in this manner:

    1) The HttpApplication::OnError event is monitored for the entire lifetime of a request.  If an unhandled exception occurs, it will end up in this handler.  Here we log the error and we decide what will be presented to the user.  This handler also implements obfuscation algorithms to ensure that we do not display priviledged information to users on the web (e.g. File path, credentials, database connection strings, etc).

    2) The framework CORE catches explicit exceptions and allows all others to flow through.

    3) Framework EXTENSIBILITY components catch the generic Exception class.  In the case of a portlet, this processing was implemented in OnInit, OnLoad, etc and the generation of ANY exception caused the exception to be logged and a message to be dispalyed with that portlets content area.  This allows a page to be rendered even if one or more portlets that make up the page fail.  The Global Error handler and the portlet error processing both use the same component for logging and message generation.  Additionally once a portlet errors out, no further processing is performed for that particular portlet.

    4) Store only read-only state in caches.  Do not add/refresh objects in cache(s) if an error occurs.

    I take not that the guidance indicates that you should not catch Exception in ‘Most cases’.  I believe that the strategy outlined is an exception and addresses the very concerns that you have raised.

    I also take note that these blogs seem to have their own assumption that the exceptions ‘disappear’ or are masked.  

    I do not think that this is what people are advocating.  Often (Particularly for stateles/web architectures) where state does not persist across requests (With the possible exception of Session and/or Application caches) the objective of error handling is not necessarily return to a consistent state, but a is more about logging the error, notifying developers/administrators of the problem and providing the user a MEANINGFUL prompt describing the nature of the problem.

    I think a strategy (Particularly for a web app) that uses try…catch excessively will see it’s performance suffer and is not advisable.

    Likewise I think a strategy to catch only what you "know" is too risky, particularly for web apps where the consequences of inadvertent disclosure of priviledged information could be very damaging is not advisable.

  10. Joe says:

    > … no idea what went wrong, let alone how to recover from it. TypeLoadException and TypeInitializationException, no worries!

    Sure there are no worries.  Consider a button click event handler in the WinForms presentation tier of a typical n-tier business app.  It surrounds calls to the business tier with try/catch(Exception).  In the catch block, the exception is logged, and the user is informed that the business logic has not been executed.

    So what if the business tier throws a TypeLoadException or TypeInitializationException (perhaps because the wrong version of a component it uses is loaded due to a deployment error)?  Or a RemotingException or a SoapException?  Catching the exception as above is certainly no worse than crashing – and may be better if there is other independent business logic that can be accessed from the same client app.

    > AccessViolationException, not a problem!

    There are of course certain critical errors from which it may be impossible to recover, and an app that catches such an error is likely to fail again fairly soon after catching it.  But what is so bad about that?  Even failing fast here only amounts to slamming the stable door shut after the horse has bolted, as unsafe code has obviously already been run.

  11. Joe says:

    Just one last example 😉

    The sample code below is a simplified version of the pattern I expect developers in my team to use for a WinForms client app that synchronously retrieves and displays data from the business tier.  The pattern for saving data is similar.

    We have used this pattern in dozens of business applications.

    Calls to the business tier are wrapped in a try/catch(Exception).  The business tier is instantiated using the Service Locator pattern, and can be in-process (loaded using reflection), or remote (Remoting, WebService).

    The business tier is stateless.

    The client tier doesn’t know or care what exceptions can be thrown from the business tier itself, or from the service locator (e.g. RemotingException, SoapException, SomeNewException in the future when we use WCF).

    This is really easy to debug.  If an exception is thrown, it is logged, an error message is displayed, and the application stays running with its state unchanged.  All I need to do is attach a debugger and click the button again, and if the exception was due to a programming error, it’s almost certain it will happen again.

    No need to restart the application and get it in to the same state before debugging as there would be if I just let the application crash.

    Note that there is no exception handler wrapping the code to update the UI.  I don’t expect exceptions here and if controls are partially updated the application state may be inconsistent.

    I can see a possible benefit of "if (IsCriticalException(ex)) throw;" in the exception handler.  But our apps are close to 100% managed code and I can’t ever remember seeing any.  I really think you overstate the dangers of failing to catch critical exceptions – they are rare in managed business apps.

    private void myButton_Click(object sender, EventArgs e)

    {

       MyDataTransferObject dto;

       try

       {

           Cursor savedCursor = Cursor.Current;

           try

           {

               Cursor.Current = Cursors.WaitCursor;

               IMyBusinessService myService = ServiceLocator.GetMyBusinessService();

               dto = myService.RetrieveSomeData();

           }

           finally

           {

               Cursor.Current = savedCursor;

           }

       }

       catch (Exception ex)

       {

           // … log exception

           // … Display error message to end user

    // … and return without updating UI controls

           return;

       }

       // … update UI controls with data in dto

       // … typically using WinForms data binding

       // … NB no exception handler here

    }

  12. MSDNArchive says:

    bkg4211, Joe, I will grant that if your architecture is fault-tolerant using stateless components, immutable shared state, transactional state, etc., then it is possible to introduce general catch handlers without falling in to the pitfalls that I discussed in my blog entries. I meant to convey  exactly that in the third response to Jeremy above, but I apologize if it was unclear.

    So, to be absolutely clear, if you’ve thought hard about the potential impact on your application state and taken the right architectural steps to preserve its consistency, then I do not dispute that as a valid reason to exclude the FxCop warnings.

    If I have more time, I will write a follow-up post with code examples showing architectures that can afford catch (Exception) in places.

    Thank you for sharing these examples. I do think that it helps balance the discussion appropriately.

  13. Joe says:

    > … a follow-up post with code examples showing architectures that can afford catch (Exception) in places.

    I’d like to see examples of this kind find their way into the FxCop documentation.  By describing the few cases where it’s appropriate to catch Exception, you’re reinforcing the message that it’s almost always a bad idea to do it anywhere else.  Which is definitely a good thing – FxCop is invaluable for locating the lazy use of "catch{}" which I often see from inexperienced developers.

    My personal list includes:

    1. Top-level handlers in the presentation tier, provided application state has not become inconsistent as a result of the exception.  These handlers must log the exception and inform the user that an error has occurred.

    2. At a tier boundary, to log and rethrow the exception.  It is often better to log exceptions on the server in a distributed deployment.

    3. At a tier boundary, to wrap the exception to fulfill an exception contract.  E.g. a WebService always throws a SoapException.

    4. At a tier boundary, to log the exception and throw a different exception.  E.g. for security reasons, to prevent the client from receiving exception details from the server.

    5. To wrap calls to code which is not core to the main application functionality, as long as calling the code does not cause the application state to become inconsistent.  This definition is a bit woolly, examples include:

    a) Exceptions from calls to Logging/Instrumentation methods should generally not be propagated.  The designers of log4net were right to swallow exceptions by default (and probably also right to provide a configuration option to allow exceptions to be propagated for those who want it).

    b) Portlets and similar in web applications, like bkg4211’s example.

    c) Add-ins.  For example a VSTO add-in should not be allowed to crash Microsoft Word.

    d) An example from the .NET Framework: GetDesignTimeHtml() for WebControl Designers has a general exception handler.  An exception thrown while generating the design-time rendering of a control should not be allowed to crash Visual Studio.

  14. MSDNArchive says:

    I meant to point out one more thing…

    JocularJoe: "No need to restart the application and get it in to the same state before debugging as there would be if I just let the application crash."

    Actually, the unhandled exception dialog will allow you to click debug and take you right to the application as it’s about to throw. There’s no need to restart the application.

    JocularJoe: "… if the exception was due to a programming error, it’s almost certain it will happen again."

    This has not been my experience at all. Many programmer errors can result in race conditions and are therefore hard to reproduce. Missing an opportunity to debug such an issue can be very costly!

  15. Jeff Stong says:

    Krzysztof Cwalina notes that while Framework Design Guidelines ban the use of the ApplicationException…

  16. Keith Patrick says:

    I like the notion of catching only what is declared.  Even though it may lead to a crash, it lowers the risk of corrupting data due to unpredicted "exception state" execution as the execution path is blown anyway, and does not hide the exception (we’ve got some deeeep in base code, and it’s taken years to flush most of them out as data changes expose the breaks).

    What I’d like to see to get out of this mess with unknown exceptions is an FxCop rule that runs against the outputed XML documentation to determine if a method call is declaring all of the possible leaks.  So basically, if you aren’t handling, say, new Icon(…) throwing an ArgumentNullException, it will tell you either to wrap a catch around the call or if it needs to bubble up, declare a /// <exception/> block

  17. A few months ago there was a series of posts on the FxCop blog about the FxCop rule "Do not catch general

  18. This is the first installment in a three-part series on why FxCop warns against catch(Exception): FAQ:

  19. This is the second installment in a three-part series on why FxCop warns against catch(Exception): FAQ:

  20. Krzysztof Cwalina, owner of the Framework Design Guidelines , has written a great post on How to Design