The True Cost of .NET Exceptions — Solution

Well, once again my elite readers have made a solution posting by me nearly redundant.  There were many good viewpoints in response to my article, including some especially excellent comments that were spot on.

Now there are several things worth saying, but perhaps the most important one was hit on directly by Ian Griffiths and James Curran.

Ian: “Presumably throwing an exception causes the CPU to fetch code and data it would otherwise not have executed…”

James: “I think what Jon’s article is overlooking is that his timing is done in isolation…”

And these are crucial observations that I talked about in a previous article (see the section called “Fear the Interference” at the end).  In this particular case we’re doing nothing but throwing exceptions.  So naturally all the code associated with the throw path is in the cache, all the data for resolving the locations of the catch blocks is in the cache, etc. etc.  Basically what we have here is much closer to a measurement of the minimum an exception could possibly cost than a typical cost.   Fair enough, minimum cost is useful too, but it is sort of an understatement by definition.

I’ve mentioned some of the additional costs but let me make a list that’s somewhat more complete.

  • In a normal situation you can expect additional cache misses due to the thrown exception accessing resident data that is not normally in the cache
  • In a normal situation you can expect additional page faults due to the thrown exception accessing non-resident code and data not normally in your working set
    • notably, throwing the exception will require the CLR to find the location of the finally and catch blocks based on the current IP and the return IP of every frame until the exception is handled, plus the filter block (if any; VB can have one)
    • additional construction cost and name resolution in order to create the frames for diagnostic purposes, including reading of metadata etc.
    • both of the above items typically access “cold” code and data, hence hard page faults are probable if you have memory pressure at all
      • we try to put code and data that is used infrequently far from data that is used frequently to improve locality; this works against you if you force the cold to be hot
      • the cost of the hard page faults, if any, will dwarf everything else
  • Typical catch situations are significantly deeper than the test case, so the above effects would tend to be magnified (increasing the likelihood of page faults)
  • Finally blocks have to be run anyway, so their cost isn’t really “exceptional”; however, whatever code is in the filter blocks (if present) and catch block is subject to the same issues as above
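To make the “minimum cost” point concrete, here is the shape of a test like the one being discussed (a sketch of my own, not the original benchmark): after the first iteration, every piece of code and data on the throw path is hot, so the loop measures something close to the floor.

```csharp
using System;
using System.Diagnostics;

class ThrowCost
{
    static void Main()
    {
        const int iterations = 100000;
        Stopwatch sw = Stopwatch.StartNew();

        for (int i = 0; i < iterations; i++)
        {
            try
            {
                throw new Exception("test");
            }
            catch (Exception)
            {
                // Empty handler: after the first iteration the entire
                // throw/dispatch/catch path is already in cache, so this
                // loop measures close to the minimum possible cost.
            }
        }

        sw.Stop();
        Console.WriteLine("microseconds per throw/catch: " +
            sw.Elapsed.TotalMilliseconds * 1000.0 / iterations);
    }
}
```

None of the cold-page-fault or cache-miss costs listed above ever show up in a loop like this.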

Some people thought the ring transition cost was important; in my opinion it isn’t especially, and it is at least somewhat accounted for in the test as written (except to the same extent that everything else is cheaper because the whole test fits comfortably in L2).  In fact I’m not sure there even is a ring transition for a thrown exception; although some kernel32 code runs, I think everything that needs to be done can be done in user mode.  But in any case that cost would be included.

Generally, microbenchmarks are designed to hide certain costs and magnify others.  This is not a bad thing; the trick is to make sure you know what you’re getting.

  1. Is the metric the one of interest to you?
  2. Is the workload representative?
  3. Is the environment representative?

Just those three quick checks will get you far.

Comments (7)

  1. Jeff Stong says:

    Wondering about the performance costs of throwing and catching exceptions in your application? …

  2. nativecpp says:

I would think that unmanaged code also follows the same rules (page faults, etc). If so, why would this be such a big topic/argument in .NET and not in unmanaged code? I don’t recall people talking much about the cost of exceptions in unmanaged code, as we don’t throw needlessly.

I think it is important for .NET programmers (myself included) to know that just because a feature is easily available does NOT mean it should be used freely.

    I have been working with .NET/C# for a couple of years and see people abusing the power of C#/.NET.

Rico, keep the good topics coming. I love reading your blog, which I find thought-provoking.


  3. Al Tenhundfeld says:

    Hi Rico, I just discovered your blog recently, and I must say that I’m impressed by the quality of your posts and your readers.

I was reading back through some old posts on exceptions, and I came across this "almost rule": "…but on the other hand throwing an exception due to invalid user input, or badly formatted text from an external system, could be a bad idea."

    I recognize that you emphasize "could," but I’m curious what other strategies people use.

    [Long boring description of a previous project]

On a recent project, we decided to use exceptions in the facade layer to communicate business rule violations and invalid user input back to the web UI. We tried to validate as much as possible client-side in the web UI, but in the domain/facade layer we also validated all business rules and user input. Our strategy was to define a custom exception with a property that exposed a collection of error messages. We would validate as much as possible before throwing the exception.

    For example, we would validate a data object, appending any violations to a single custom exception. When validation was finished, we would throw that one exception, which contained a collection of violation objects, up to the UI. It was actually slightly more sophisticated, using a generational approach based on dependencies, but the gist is correct. Would you say this approach sounds reasonable from my description? To me, this strategy just seemed to have the most intuitive design, and we took steps to minimize the number of exceptions being thrown. We also instrumented the facade pretty well; so we could see which violations occurred more often and improve client-side validation or tweak UI flow to reduce them.
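A minimal sketch of the pattern described above (every name here is invented for illustration, not the actual project code):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical custom exception carrying all violations at once.
class ValidationException : Exception
{
    public IReadOnlyList<string> Violations { get; }

    public ValidationException(IReadOnlyList<string> violations)
        : base(violations.Count + " validation error(s)")
    {
        Violations = violations;
    }
}

class OrderFacade
{
    public void Submit(string customerName, decimal amount)
    {
        // Collect every violation before throwing, so the UI gets one
        // exception with the complete list rather than one per field.
        var violations = new List<string>();
        if (string.IsNullOrWhiteSpace(customerName))
            violations.Add("Customer name is required.");
        if (amount <= 0)
            violations.Add("Amount must be positive.");

        if (violations.Count > 0)
            throw new ValidationException(violations);

        // ... proceed with the business operation ...
    }
}
```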

    [/boring description]

    We considered forcing all of our facade API methods to return error information, but that felt hacky to us and put the burden on our API consumers to inspect the returned objects. We were also developing a platform, not just a single app; so we wanted to be certain the UI developers would always know when something was rejected. We also briefly considered an approach involving callback delegates, but that seemed to be incompatible with our SOA goals.

Anyhow, sorry if this is a little off-topic. I’m just at the point in my career where I’m transitioning from developer to architect, and I’ve never really found a satisfactory solution for this issue of communicating business rule/user input validations across layers/tiers.


  4. ricom says:

    Hi Al, glad you’re enjoying the postings.  Welcome to perf tidbits :)

It’s no surprise that you’ve never found a satisfactory solution to communicating errors across layers, because I doubt there is one.  Well, I haven’t found it, and I’ve been looking for probably as long as you, if not longer.

    What you did in the case above seems reasonable but there is no one universal answer.  Some factors:

    1) The volume of input and granularity of reporting (per field, per record, per batch, per disk drive :)

    2) The frequency of errors

    3) The time/latency requirements if any

    4) The amount of slack you otherwise have available (i.e. how perfect do you have to be)

That’s a short list; the real list is longer, but those basic factors touch on some key things.

Do you design an API like Parse? Or TryParse?  There’s nothing "wrong" with Parse; it’s a fine API, and plenty of times it’s what you need.  On the other hand, TryParse is indispensable when needed.
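For instance (a quick sketch; the contrast is the whole point):

```csharp
using System;

class ParseDemo
{
    static void Main()
    {
        string input = "not a number";

        // Parse: failure is exceptional, so bad input throws.
        try
        {
            int n = int.Parse(input);
            Console.WriteLine(n);
        }
        catch (FormatException)
        {
            Console.WriteLine("Parse threw FormatException");
        }

        // TryParse: failure is an expected outcome, reported via the
        // return value, so bad input costs only a branch.
        if (int.TryParse(input, out int value))
            Console.WriteLine(value);
        else
            Console.WriteLine("TryParse returned false");
    }
}
```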

    I don’t know a universal answer.

  5. Marcel Popescu says:

    Hmm… is the following conclusion correct, in that case?

New rule: throwing ONE exception, rarely, is very costly; throwing a LOT of exceptions, often, is not a big deal.

  6. ricom says:

No, I don’t think I can support that rule.  The logic in the sample, for instance, would probably have been 10,000 times faster without the exception, even at the discount rate.

You’re talking about the difference between an operation that you can do, say, O(10^4) times per second versus branching, which you can do, say, O(10^8) times per second.

    And if you’re doing it often, then probably you really should be branching rather than throwing.

Losing a factor of 10^4, conservatively, is a big deal.
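The gap is easy to demonstrate by timing the two approaches side by side (a sketch of my own, not the original test):

```csharp
using System;
using System.Diagnostics;

class ThrowVsBranch
{
    static void Main()
    {
        const int iterations = 100000;
        string bad = "xyzzy";  // input that always fails to parse

        var sw = Stopwatch.StartNew();
        int failures = 0;
        for (int i = 0; i < iterations; i++)
        {
            try { int.Parse(bad); }
            catch (FormatException) { failures++; }
        }
        Console.WriteLine("throwing:  " + sw.ElapsedMilliseconds +
            " ms for " + failures + " failures");

        sw.Restart();
        failures = 0;
        for (int i = 0; i < iterations; i++)
        {
            // Same check, reported by a branch instead of a throw.
            if (!int.TryParse(bad, out _)) failures++;
        }
        Console.WriteLine("branching: " + sw.ElapsedMilliseconds +
            " ms for " + failures + " failures");
    }
}
```

On a typical machine the throwing loop is several orders of magnitude slower than the branching loop, even with everything cache-warm.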

  7. I had an interesting email conversation here at work recently and as usual I can’t share it all but I