The Exception Model


I had
hoped this article would be on changes to the next version of the CLR which
allow it to be hosted inside SQL Server and other “challenging”
environments.  This is more
generally interesting than you might think, because it creates an opportunity
for other processes (i.e. your
processes) to host the CLR with a similar level of integration and control.  This includes control over memory usage,
synchronization, threading (including fibers), extended security models,
assembly storage, and more.


However,
that topic is necessarily related to our next release, and I cannot talk about
deep details of that next release until those details have been publicly
disclosed.  In late October,
Microsoft is holding its PDC and I expect us to disclose many details at that
time.  In fact, I’m signed up to be
a member of a PDC panel on this topic. 
If you work on a database or an application server or a similarly
complicated product that might benefit from hosting the CLR, you may want to
attend.




After
we’ve disclosed the hosting changes for our next release, you can expect a blog
on hosting in late October or some time in November.




Instead,
this blog is on the managed exception model.  This is an unusual topic for me.  In the past, I’ve picked topics where I
can dump information without having to check any of my facts or do any
research.  But in the case of
exceptions I keep finding questions I cannot answer.  At the top level, the managed exception
model is nice and simple.  But – as
with everything else in software – the closer you look, the more you
discover.


So for
the first time I decided to have some CLR experts read my blog entry before I
post it.  In addition to pointing
out a bunch of my errors, all the reviewers were unanimous on one point: I
should write shorter blogs.




Of
course, we can’t talk about managed exceptions without first considering Windows
Structured Exception Handling (SEH). 
And we also need to look at the C++ exception model.  That’s because both managed exceptions
and C++ exceptions are implemented on top of the underlying SEH mechanism, and
because managed exceptions must interoperate with both SEH and C++
exceptions.


style="mso-bidi-font-weight: normal">Windows
SEH


size=2> 


Since
it’s at the base of all exception handling on Windows, let’s look at SEH
first.  As far as I know, the
definitive explanation of SEH is still Matt Pietrek’s excellent 1997 article for
Microsoft Systems Journal:
http://www.microsoft.com/msj/0197/exception/exception.aspx.  There have
been some extensions since then, like vectored exception handlers, some security
enhancements, and the new mechanisms to support IA64 and AMD64.  (It’s hard to base exceptions on FS:[0]
chains if your processor doesn’t have an FS segment register).  We’ll look at all these changes
shortly.  But Matt’s 1997 article
remains a goldmine of information. 
In fact, it was very useful to the developers who implemented exceptions
in the CLR.


The SEH
model is exposed by MSVC via two constructs:

  1. __try {…} __except(filter_expression) {…}

  2. __try {…} __finally {…}


Matt’s
article explains how the underlying mechanism of two passes over a chain of
single callbacks is used to provide try/except/finally semantics.  Briefly, the OS dispatches an exception
by retrieving the head of the SEH chain from TLS.  Since the head of this chain is at the
top of the TIB/TEB (Thread Information Block / Thread Environment Block,
depending on the OS and the header file you look at), and since the FS segment
register provides fast access to this TLS block on X86, the SEH chain is often
called the FS:[0] chain.


Each
entry consists of a next or a prev pointer (depending on how you look at it) and
a callback function.  You can add
whatever data you like after that standard entry header.  The callback function is called with all
sorts of additional information related to the exception that’s being
processed.  This includes the
exception record and the register state of the machine which was captured at the
time of the exception.


To
implement the 1st form of MSVC SEH above (__try/__except), the
callback evaluates the filter expression during the first pass over the handler
chain.  As exposed by MSVC, the
filter expression can result in one of three legal values:




style="FONT-FAMILY: 'Lucida Console'">EXCEPTION_CONTINUE_EXECUTION
= -1


style="FONT-FAMILY: 'Lucida Console'">EXCEPTION_CONTINUE_SEARCH =
false 0


style="FONT-FAMILY: 'Lucida Console'">EXCEPTION_EXECUTE_HANDLER =
true 1


size=2> 


Of
course, the filter could also throw its own exception.  That’s not generally desirable, and I’ll
discuss that possibility and other flow control issues later.


But if
you look at the underlying SEH mechanism, the handler actually returns an
EXCEPTION_DISPOSITION:




style="FONT-FAMILY: 'Lucida Console'">typedef enum
_EXCEPTION_DISPOSITION


style="FONT-FAMILY: 'Lucida Console'"> size=2>{


style="FONT-FAMILY: 'Lucida Console'"> style="mso-spacerun: yes">  
ExceptionContinueExecution,


style="FONT-FAMILY: 'Lucida Console'"> style="mso-spacerun: yes">  
ExceptionContinueSearch,


style="FONT-FAMILY: 'Lucida Console'"> style="mso-spacerun: yes">  
ExceptionNestedException,


style="FONT-FAMILY: 'Lucida Console'"> style="mso-spacerun: yes">  
ExceptionCollidedUnwind


style="FONT-FAMILY: 'Lucida Console'">}
EXCEPTION_DISPOSITION;


size=2> 


So
there’s some mapping that MSVC is performing here.  Part of that mapping is just a trivial
conversion between the MSVC filter values and the SEH handler values.  For instance ExceptionContinueSearch has
the value 1 at the SEH handler level but the equivalent
EXCEPTION_CONTINUE_SEARCH has the value 0 at the MSVC filter level.  Ouch.


But the
other part of the mapping has to do with a difference in functionality.  For example, ExceptionNestedException
and ExceptionCollidedUnwind are primarily used by the OS dispatch mechanism
itself.  We’ll see the circumstances
in which they arise later.  More
importantly, MSVC filters can indicate that the __except clause should run by
returning EXCEPTION_EXECUTE_HANDLER. 
But we shall see that at the SEH level this decision is achieved by
having the exception dispatch routine fix up the register context and then
resuming execution at the right spot.


The
EXCEPTION_CONTINUE_EXECUTION case supports a rather esoteric use of SEH.  This return value allows the filter to
correct the problem that caused the exception and to resume execution at the
faulting instruction.  For example,
an application might be watching to see when segments are being written to so
that it can log this information. 
This could be achieved by marking the segment as ReadOnly and waiting for
an exception to occur on first write. 
Then the filter could use VirtualProtect to change the segment containing
the faulting address to ReadWrite and then restart the faulting
instruction.  Alternatively, the
application could have two VirtualAllocs for each region of memory.  One of these could be marked as ReadOnly
and the second could be a shadow that is marked as ReadWrite.  Now the exception filter can simply
change the register state of the CPU that faulted, so that the register
containing the faulting address is changed from the ReadOnly segment to the
shadowed ReadWrite segment.


Obviously anyone who is playing these games must have a lot of
sophistication and a deep knowledge of how the program executes.  Some of these games work better if you
can constrain the code that’s generated by your program to only touch faulting
memory using a predictable cliché like offsets from a particular
register.


I’ll
talk about this kind of restartable or resumable exception in the context of
managed code later.  For now, let’s
pretend that the filter either returns “true – I would like my ‘except’ clause
to handle this exception” or “false – my ‘except’ clause is uninterested in this
exception”.  If the filter returns
false, the next SEH handler is fetched from the chain and it is asked this same
question.




The OS
is pretty paranoid about corrupt stacks during this chain traversal.  It checks that all chain entries are
within the bounds of the stack. 
(These bounds are also recorded in the TEB).  The OS also checks that all entries are
in ascending order on the stack.  If
you violate these rules, the OS will consider the stack to be corrupt and will
be unable to process exceptions. 
This is one of the reasons that a Win32 application cannot break its
stack into multiple disjoint segments as an innovative technique for dealing
with stack overflow.


Anyway,
eventually a handler says “true – I would like my ‘except’ clause to handle this
exception”.  That’s because there’s
a backstop entry at the end of the chain which is placed there by the OS when
the thread is created.  This last
entry wants to handle all the exceptions, even if your application-level
handlers never do.  That’s where you
get the default OS behavior of consulting the unhandled exception filter list,
throwing up dialog boxes for Terminate or Debug, etc.




As soon
as a filter indicates that it wants to handle an exception, the first pass of
exception handling finishes and the second pass begins.  As Matt’s article explains, the handler
can use the poorly documented RtlUnwind service to deliver second pass
notifications to all the previous handlers and pop them off the handler
chain.


In other
words, no unwinding happened as the first pass progressed.  But during the second pass we see two
distinct forms of unwind.  The first
form involves popping SEH records from the chain that was threaded from
TLS.  Each such SEH record is popped
before the corresponding handler gets called for the second pass.  This leaves the SEH chain in a
reasonable form for any nested exceptions that might occur within a
handler.


The
other form of unwind is the actual popping of the CPU stack.  This doesn’t happen as eagerly as the
popping of the SEH records.  On X86,
EBP is used as the frame pointer for methods containing SEH.  ESP points to the top of the stack, as
always.  Until the stack is actually
unwound, all the handlers are executed on top of the faulting exception
frame.  So the stack actually grows
when a handler is called for the first or second pass.  EBP is set to the frame of the method
containing a filter or finally clause so that local variables of that method
will be in scope.


The
actual popping of the stack doesn’t occur until the catching ‘except’ clause is
executed.




So we’ve
got a handler whose filter announced in the first pass that it would handle this
exception via EXCEPTION_EXECUTE_HANDLER. 
And that handler has driven the second pass by unwinding and delivering
all the second pass notifications. 
Typically it will then fiddle with the register state in the exception
context and resume execution at the top of the appropriate ‘except’ clause.  This isn’t necessarily the case, and
later we’ll see some situations where the exception propagation gets
diverted.


How
about the try/finally form of SEH? 
Well, it’s built on the same underlying notion of a chain of
callbacks.  During the first pass
(the one where the filters execute, to decide which except block is going to
catch), the finally handlers all say EXCEPTION_CONTINUE_SEARCH.  They never actually catch anything.  Then in the second pass, they execute
their finally blocks.


style="mso-bidi-font-weight: normal">Subsequent
additions to SEH


size=2> 


All of
the above – and a lot more – is in Matt’s article.  There are a few things that aren’t in
his article because they were added to the model later.


For
example, Windows XP introduced the notion of a vectored exception handler.  This allows the application to register
for a first crack at an exception, without having to wait for exception handling
to propagate down the stack to an embedded handler.  Fortunately, Matt wrote an “Under The
Hood” article on this particular topic. 
This can be found at
http://msdn.microsoft.com/msdnmag/issues/01/09/hood/default.aspx.


Another
change to SEH is related to security. 
Buffer overruns – whether on the stack or in heap blocks – remain a
favorite attack vector for hackers. 
A typical buffer overrun attack is to pass a large string as an argument
to an API.  If that API expected a
shorter string, it might have a local on the stack like “char
filename[256];”.  Now if the API is
foolish enough to strcpy a malicious hacker’s argument into that buffer, then
the hacker can put some fairly arbitrary data onto the stack at addresses higher
(further back on the stack) than that ‘filename’ buffer.  If those higher locations are supposed
to contain call return addresses, the hacker may be able to get the CPU to
transfer execution into the buffer itself. 
Oops.  The hacker is
injecting arbitrary code and then executing it, potentially inside someone
else’s process or under their security credentials.


There’s
a new speed bump that an application can use to reduce the likelihood of a
successful stack-based buffer overrun attack.  This involves the /GS C++ compiler
switch, which uses a cookie check in the function epilog to determine whether a
buffer overrun has corrupted the return address before executing a return based
on its value.


However,
the return address trick is only one way to exploit buffer overruns.  We’ve already seen that SEH records are
necessarily built on the stack.  And
in fact the OS actually checks to be sure they are within the stack bounds.  Those SEH records contain callback
pointers which the OS will invoke if an exception occurs.  So another way to exploit a buffer
overrun is to rewrite the callback pointer in an SEH record on the stack.  There’s a new linker switch (/SAFESEH)
that can provide its own speed bump against this sort of attack.  Modules built this way declare that all
their handlers are embedded in a table in the image; they do not point to
arbitrary code sequences sprinkled in the stack or in heap blocks.  During exception processing, the
exception callbacks can be validated against this table.


Of
course, the first and best line of defense against all these attacks is to never
overrun a buffer.  If you are
writing in managed code, this is usually pretty easy.  You cannot create a buffer overrun in
managed code unless the CLR contains a bug or you perform unsafe operations
(e.g. unverifiable MC++ or ‘unsafe’ in C#) or you use high-privilege unsafe APIs
like StructureToPtr or the various overloads of Copy in the
System.Runtime.InteropServices.Marshal class.


So, not
surprisingly and not just for this reason, I recommend writing in managed
code.  But if you must write some
unmanaged code, you should seriously consider using a String abstraction that
eliminates all those by-rote opportunities for error.  And if you must code each strcpy
individually, be sure to use strncpy instead!


A final
interesting change to the OS SEH model since Matt’s article is due to
Win64.  Both IA64 and AMD64 have a
model for exception handling that avoids reliance on an explicit handler chain
that starts in TLS and is threaded through the stack.  Instead, exception handling relies on
the fact that on 64-bit systems we can perfectly unwind a stack.  And this ability is itself due to the
fact that these chips are severely constrained on the calling conventions they
support.


If you
look at X86, there are an unbounded number of calling conventions possible.  Sure, there are a few common well-known
conventions like stdcall, cdecl, thiscall and fastcall.  But optimizing compilers can invent
custom calling conventions based on inter-procedural analysis.  And developers writing in assembly
language can make novel decisions about which registers to preserve vs. scratch,
how to use the floating point stack, how to encode structs into registers,
whether to back-propagate results by re-using the stack that contained in-bound
arguments, etc.  Within the CLR, we
have places where we even unbalance the stack by encoding data after a CALL
instruction, which is then addressable via the return address.  This is a particularly dangerous game
because it upsets the branch prediction code of the CPU and can cause prediction
misses on several subsequent RET instructions.  So we are careful to reserve this
technique for low frequency call paths. 
And we also have some stubs that compute indirect JMPs to out-of-line RET
‘n’ instructions in order to rebalance the stack.


It would
be impossible for a stack crawler to successfully unwind these bizarre stacks
for exception purposes, without completely simulating arbitrary code
execution.  So on X86 the exception
mechanism must rely on the existence of a chain of crawlable FS:[0] handlers
that is explicitly maintained.




Incidentally, the above distinction between perfect stack crawling on
64-bit systems vs. hopeless stack crawling on X86 systems has deeper
repercussions for the CLR than just exception handling.  The CLR needs the ability to crawl all
the managed portions of a thread’s stack on all architectures.  This is a requirement for proper
enforcement of Code Access Security; for accurate reporting of managed
references to the GC; for hijacking return addresses in order to asynchronously
take control of threads; and for various other reasons.  On X86, the CLR devotes considerable
resources to achieving this.


Anyway,
on 64-bit systems the correspondence between an activation record on the stack
and the exception record that applies to it is not achieved through an FS:[0]
chain.  Instead, unwinding of the
stack reveals the code addresses that correspond to a particular activation
record.  These instruction pointers
of the method are looked up in a table to find out whether there are any
__try/__except/__finally clauses that cover these code addresses.  This table also indicates how to proceed
with the unwind by describing the actions of the method epilog.


style="mso-bidi-font-weight: normal">Managed
Exceptions


size=2> 


Okay,
enough about SEH – for now.  Let’s
switch to the managed exception model. 
This model contains a number of constructs.  Depending on the language you code in,
you probably only have access to a subset of these.


style="mso-bidi-font-weight: normal"> style="FONT-FAMILY: 'Lucida Console'">try {…} finally
{…}


This is
pretty standard.  All managed
languages should expose this, and it should be the most common style of
exception handling in user code.  Of
course, in the case of MC++ the semantics of ‘finally’ is exposed through
auto-destructed stack objects rather than through explicit finally clauses.  You should be using ‘finally’ clauses to
guarantee consistency of application state far more frequently than you use
‘catch’ clauses.  That’s because
catch clauses increase the likelihood that developers will swallow exceptions
that should be handled elsewhere, or perhaps should even be left unhandled.  And if catch clauses don’t actually
swallow an exception (i.e. they ‘rethrow’), they still create a poor debugging
experience as we shall see.
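
Here is a minimal C# sketch of that pattern; the account and lock details are invented purely for illustration:

using System.Collections;

public class Account
{
    private ArrayList pending = new ArrayList();

    public void Post(object entry)
    {
        System.Threading.Monitor.Enter(pending);
        try
        {
            pending.Add(entry);      // may throw
        }
        finally
        {
            // Runs whether or not an exception escapes the try block,
            // so the lock is always released and state stays consistent.
            System.Threading.Monitor.Exit(pending);
        }
    }
}

(The C# ‘lock’ and ‘using’ statements expand into exactly this sort of try/finally.)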


style="mso-bidi-font-weight: normal"> style="FONT-FAMILY: 'Lucida Console'">try {…} catch (Object o)
{…}


This is
pretty standard, too.  One thing
that might surprise some developers is that you can catch any instance that’s of
type Object or derived from Object. 
However, there is a CLS rule that only subtypes of System.Exception
should be thrown.  In fact, C# is so
eager for you to only deal with System.Exception that it doesn’t provide any
access to the thrown object unless you are catching Exception or one of its
subtypes.




When you
consider that only Exception and its subtypes have support for stack traces,
HRESULT mapping, standard access to exception messages, and good support
throughout the frameworks, then it’s pretty clear that you should restrict
yourself to throwing and processing exceptions that derive from
Exception.




In
retrospect, perhaps we should have limited exception support to Exception rather
than Object.  Originally, we wanted
the CLR to be a useful execution engine for more run-time libraries than just
the .NET Frameworks.  We imagined
that different languages would execute on the CLR with their own particular
run-time libraries.  So we didn’t
want to couple the base engine operations too tightly with CLS rules and
constructs in the frameworks.  Of
course, now we understand that the commonality of the shared framework classes
is a huge part of the value proposition of our managed environment.  I suspect we would revisit our original
design if we still could.


style="mso-bidi-font-weight: normal"> style="FONT-FAMILY: 'Lucida Console'; mso-bidi-font-family: Tahoma">try
{ style="FONT-FAMILY: 'Lucida Console'">… style="mso-bidi-font-weight: normal"> style="FONT-FAMILY: 'Lucida Console'; mso-bidi-font-family: Tahoma">} catch
(Object o) if (expression) { style="mso-bidi-font-weight: normal"> style="FONT-FAMILY: 'Lucida Console'">… style="mso-bidi-font-weight: normal"> style="FONT-FAMILY: 'Lucida Console'; mso-bidi-font-family: Tahoma">}


This is
invented syntax, though I’m told it’s roughly what MC++ is considering.  As far as I know, the only two .NET
languages that currently support exception filters are VB.NET and – of course –
ILASM.  (We never build a managed
construct without exposing it via ILDASM and ILASM in a manner that allows these
two tools to round-trip between source and binary forms).


VB.NET
has sometimes been dismissed as a language that’s exclusively for less
sophisticated developers.  But the
way this language exposes the advanced feature of exception filters is a great
example of why that position is too simplistic.  Of course, it is true that VB has
historically done a superb job of providing an approachable toolset and
language, which has allowed less sophisticated developers to be highly
productive.


Anyway,
isn’t this cool:




style="FONT-FAMILY: 'Lucida Console'"> size=2>Try


style="FONT-FAMILY: 'Lucida Console'"> style="mso-spacerun: yes">   …try
statements…


style="FONT-FAMILY: 'Lucida Console'">Catch e As
InvalidOperationException When expressionFilter


style="FONT-FAMILY: 'Lucida Console'"> style="mso-spacerun: yes">   …catch
statements…


style="FONT-FAMILY: 'Lucida Console'">End
Try


Of course, at the runtime
level we cannot separate the test for the exception type expression and the
filter expression.  We only support
a bare expression.  So the VB
compiler turns the above catch into something like
this, where $exception_obj is the
implicit argument passed to the filter.

Catch When (IsInst($exception_obj, InvalidOperationException)
            && expressionFilter)


While
we’re on the topic of exception handling in VB, have you ever wondered how VB
.NET implements its On Error statement?


      On Error { Goto { <line> | 0 | -1 } | Resume Next }


Me
neither.  But I think it’s pretty
obvious how to implement this sort of thing with an interpreter.  You wait for something to go wrong, and
then you consult the active “On Error” setting.  If it tells you to “Resume Next”, you
simply scan forwards to the next statement and away you go.


But in
an SEH world, it’s a little more complicated.  I tried some simple test cases with the
VB 7.1 compiler.  The resulting
codegen is based on advancing a _Vb_t_CurrentStatement local variable to
indicate the progression of execution through the statements.  A single try/filter/catch covers
execution of these statements.  It
was interesting to see that the ‘On Error’ command only applies to exceptions
that derive from System.Exception. 
The filter refuses to process any other exceptions.
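
To get a feel for the shape of that codegen, here is a rough C# sketch of the idea.  This is not what the VB compiler actually emits; the dispatch loop and the Step methods are my own stand-ins:

using System;

class OnErrorResumeNextSketch
{
    static void Step1() { Console.WriteLine("statement 1"); }
    static void Step2() { throw new InvalidOperationException("oops"); }
    static void Step3() { Console.WriteLine("statement 3"); }

    static void Main()
    {
        int currentStatement = 0;    // stands in for _Vb_t_CurrentStatement
        bool done = false;
        while (!done)
        {
            try
            {
                switch (currentStatement)
                {
                    case 0: Step1(); currentStatement = 1; goto case 1;
                    case 1: Step2(); currentStatement = 2; goto case 2;
                    case 2: Step3(); done = true; break;
                }
            }
            catch (Exception)    // the real VB filter also rejects non-Exception objects
            {
                currentStatement++;      // "Resume Next": carry on at the next statement
                if (currentStatement > 2) done = true;
            }
        }
    }
}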


So VB is
nicely covered.  But what if you did
need to use exception filters from C#?  Well, in V1 and V1.1, this would be quite difficult.  But C# has announced a feature for their
next release called anonymous methods. 
This is a compiler feature that involves no CLR changes.  It allows blocks of code to be mentioned
inline via a delegate.  This
relieves the developer from the tedium of defining explicit methods and state
objects that can be gathered into the delegate and the explicit sharing of this
state.  This and other seductive
upcoming C# features are described at
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dv_vstechart/html/vbconcprogramminglanguagefuturefeatures.asp.


Using a
mechanism like this, someone has pointed out that one could define delegates for
try, filter and catch clauses and pass them to a shared chunk of ILASM.  I love the way the C# compiler uses type
inferencing to automatically deduce the delegate types.  And it manufactures a state object to
ensure that the locals and arguments of DoTryCatch are available to the “try
statements”, “filter expression” and “catch statements”, almost as if everything
was scoped in a single method body. 
(I say “almost” because any locals or arguments that are of byref,
argiterator or typedbyref types cannot be disassociated from a stack without
breaking safety.  So these cases are
disallowed).


I’m
guessing that access to filters from C# could look something like
this:




style="FONT-FAMILY: 'Lucida Console'">public void delegate
__Try();


style="FONT-FAMILY: 'Lucida Console'">public Int32 delegate
__Filter();


style="FONT-FAMILY: 'Lucida Console'">public void delegate
__Catch();


style="FONT-FAMILY: 'Lucida Console'"> size=2> 


style="FONT-FAMILY: 'Lucida Console'">// this reusable helper would
be defined in ILASM or VB.NET:


style="FONT-FAMILY: 'Lucida Console'">void DoTryCatch(__Try t,
__Filter f, __Catch c)


style="FONT-FAMILY: 'Lucida Console'"> size=2> 


style="FONT-FAMILY: 'Lucida Console'">// And C# could then use it
as follows:


style="FONT-FAMILY: 'Lucida Console'">void
m(…arguments…)


style="FONT-FAMILY: 'Lucida Console'"> size=2>{


style="FONT-FAMILY: 'Lucida Console'"> style="mso-spacerun: yes">  
…locals…


style="FONT-FAMILY: 'Lucida Console'"> style="mso-spacerun: yes">  
DoTryCatch(


style="FONT-FAMILY: 'Lucida Console'"> style="mso-spacerun: yes">      { …try
statements…},


style="FONT-FAMILY: 'Lucida Console'"> style="mso-spacerun: yes">      { return
filter_expression; },


style="FONT-FAMILY: 'Lucida Console'"> style="mso-spacerun: yes">      { …catch
statements…}


style="FONT-FAMILY: 'Lucida Console'"> style="mso-spacerun: yes">   );


style="FONT-FAMILY: 'Lucida Console'"> size=2>}


size=2> 


You may
notice that I cheated a little bit. 
I didn’t provide a way for the ‘catch’ clause to mention the exception
type that it is catching.  Of
course, this could be expressed as part of the filter, but that’s not really
playing fair.  I suspect the
solution is to make DoTryCatch a generic method that has an unbound Type
parameter.  Then DoTryCatch<T>
could be instantiated for a particular type.  However, I haven’t actually tried this
so I hate to pretend that it would work. 
I am way behind on understanding what we can and cannot do with generics
in our next release, how to express this in ILASM, and how it actually works
under the covers.  Any blog on that
topic is years away.
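
If generics do allow it, the helper might look something like the following sketch.  The delegate shapes and the catch-plus-rethrow body are my own guesses; a real implementation in ILASM or VB.NET could use a true filter and so preserve the first pass:

using System;

public delegate void TryBlock();
public delegate bool FilterBlock();
public delegate void CatchBlock<T>(T exception) where T : Exception;

public static class TryCatchHelper
{
    // Hypothetical generic version of DoTryCatch, typed by the exception
    // it is willing to handle.
    public static void DoTryCatch<T>(TryBlock t, FilterBlock f, CatchBlock<T> c)
        where T : Exception
    {
        try
        {
            t();
        }
        catch (T e)
        {
            if (!f()) throw;    // not interested: let the exception keep propagating
            c(e);
        }
    }
}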


While we
are on the subject of interesting C# codegen, that same document on upcoming
features also discusses iterators. 
These allow you to use the ‘yield’ statement to convert the normal pull
model of defining iteration into a convenient push model.  You can see the same ‘yield’ notion in
Ruby.  And I’m told that both
languages have borrowed this from CLU, which pioneered the feature about the
time that I was born.


When you
get your hands on an updated C# compiler that supports this handy construct, be
sure to ILDASM your program and see how it’s achieved.  It’s a great example of what a compiler
can do to make life easier for a developer, so long as we’re willing to burn a
few more cycles compared to a more prosaic loop construct.  In today’s world, this is almost always a sensible
trade-off.
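
For the curious, the announced syntax looks roughly like this (the Fibonacci sequence is just a convenient example of mine):

using System.Collections;

class FibonacciSource
{
    // The compiler turns this method into a hidden state-machine class that
    // implements IEnumerator; each 'yield return' records where to resume on
    // the next MoveNext call.
    public static IEnumerable Fibonacci(int count)
    {
        int a = 0, b = 1;
        for (int i = 0; i < count; i++)
        {
            yield return a;
            int next = a + b;
            a = b;
            b = next;
        }
    }

    static void Main()
    {
        foreach (int value in Fibonacci(10))
            System.Console.WriteLine(value);
    }
}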


Okay,
that last part has nothing to do with exceptions, does it?  Let’s get back to the managed exception
model.


style="mso-bidi-font-weight: normal"> style="FONT-FAMILY: 'Lucida Console'">try {…} fault
{…}


Have you
ever written code like this, to restrict execution of your finally clause to
just the exceptional cases?




style="FONT-FAMILY: 'Lucida Console'">bool exceptional =
true;


style="FONT-FAMILY: 'Lucida Console'">try
{


style="FONT-FAMILY: 'Lucida Console'"> style="mso-spacerun: yes">   …body of
try…


style="FONT-FAMILY: 'Lucida Console'"> style="mso-spacerun: yes">   exceptional =
false;


style="FONT-FAMILY: 'Lucida Console'">} finally
{


style="FONT-FAMILY: 'Lucida Console'"> style="mso-spacerun: yes">   if (exceptional)
{…}


style="FONT-FAMILY: 'Lucida Console'"> size=2>}


style="FONT-FAMILY: 'Lucida Console'"> size=2> 


Or how
about a catch with a rethrow, as an alternate technique for achieving finally
behavior for just the exceptional cases:


style="FONT-FAMILY: 'Lucida Console'"> size=2> 


style="FONT-FAMILY: 'Lucida Console'">try
{


style="FONT-FAMILY: 'Lucida Console'"> style="mso-spacerun: yes">   …


style="FONT-FAMILY: 'Lucida Console'">} catch
{


style="FONT-FAMILY: 'Lucida Console'"> style="mso-spacerun: yes">   …


style="FONT-FAMILY: 'Lucida Console'"> style="mso-spacerun: yes">  
rethrow;


style="FONT-FAMILY: 'Lucida Console'"> size=2>}


size=2> 


In each
case, you are accommodating for the fact that your language doesn’t expose fault
blocks.  In fact, I think the only
language that exposes these is ILASM. 
A fault block is simply a finally clause that only executes in the
exceptional case.  It never executes
in the non-exceptional case.




Incidentally, the first alternative is preferable to the second.  The second approach terminates the first
pass of exception handling.  This is
a fundamentally different semantics, which has a substantial impact on debugging
and other operations.  Let’s look at
rethrow in more detail, to see why this is the case.


style="mso-bidi-font-weight: normal">Rethrow,
restartable exceptions, debugging


style="mso-bidi-font-weight: normal"> size=2> 


Gee, my
language has rethrow, but no filter. 
Why can’t I just treat the following constructs as equivalent?




style="FONT-FAMILY: 'Lucida Console'">try {…} filter (expression)
catch (Exception e) {…}


style="FONT-FAMILY: 'Lucida Console'">try {…} catch (Exception e) {
if (!expression) rethrow; …}


size=2> 


In fact,
‘rethrow’ tries hard to create the illusion that the initial exception handling
is still in progress.  It uses the
same exception object.  And it
augments the stack trace associated with that exception object, so that it
includes the portion of stack from the rethrow to the eventual catch.




Hmm, I
guess I should have already mentioned that the stack trace of an Exception is
intentionally restricted to the segment of stack from the throw to the
catch.  We do this for performance
reasons, since part of the cost of an exception is linear with the depth of the
stack that we capture.  I’ll talk
about the implications of exception performance later.  Of course, you can use the
System.Diagnostics.StackTrace class to gather the rest of the stack from the
point of the catch, and then manually merge it into the stack trace from the
Exception object.  But this is a
little clumsy and we have sometimes been asked to provide a helper to make this
more convenient and less brittle to changes in the formatting of stack
traces.
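
A sketch of that manual merge (the Thrower method is just a stand-in, and the string concatenation format is arbitrary, which is exactly the brittleness being complained about):

using System;
using System.Diagnostics;

class StackTraceMerge
{
    static void Thrower() { throw new InvalidOperationException("boom"); }

    static void Main()
    {
        try
        {
            Thrower();
        }
        catch (InvalidOperationException e)
        {
            // e.StackTrace only covers the frames from the throw to this catch.
            // A StackTrace captured here adds the frames below the catch.
            string fullTrace = e.StackTrace + Environment.NewLine +
                               new StackTrace(true).ToString();
            Console.WriteLine(fullTrace);
        }
    }
}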


Incidentally, when you are playing around with stack traces (whether they
are associated with exceptions, debugging, or explicit use of the StackTrace
class), you will always find JIT inlining getting in your way.  You can try to defeat the JIT inliner
through use of indirected calls like function pointers, virtual methods,
interface calls and delegates.  Or
you can make the called method “interesting” enough that the JIT decides it
would be unproductive or too difficult to inline.  All these techniques are flawed, and all
of them will fail over time.  The
correct way to control inlining is to use the
MethodImpl(MethodImplOptions.NoInlining) pseudo-custom attribute from the
System.Runtime.CompilerServices namespace.
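
For example (the WhoCalledMe helper is contrived):

using System.Runtime.CompilerServices;

class Helpers
{
    // Tells the JIT not to inline this method, so it keeps its own frame
    // and shows up reliably in stack traces.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static string WhoCalledMe()
    {
        return new System.Diagnostics.StackTrace().GetFrame(1).GetMethod().Name;
    }

    static void Main()
    {
        System.Console.WriteLine(WhoCalledMe());    // prints "Main"
    }
}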


One way
that a rethrow differs from a filter is with respect to resumable or restartable
exceptions.  We’ve already seen how
SEH allows an exception filter to return EXCEPTION_CONTINUE_EXECUTION.  This causes the faulting instruction to
be restarted.  Obviously it’s
unproductive to do this unless the filter has first taken care of the faulting
situation somehow.  It could do this
by changing the register state in the exception context so that a different
value is dereferenced, or so that execution resumes at a different
instruction.  Or it could have
modified the environment the program is running in, as with the VirtualProtect
cases that I mentioned earlier.


In V1
and V1.1, the managed exception model does not support restartable
exceptions.  In fact, I think that
we set EXCEPTION_NONCONTINUABLE on some (but perhaps not all) of our exceptions
to indicate this.  There are several
reasons why we don’t support restartable exceptions:





  • style="MARGIN: 0in 0in 0pt; tab-stops: list .5in; mso-list: l8 level1 lfo9"> face=Tahoma size=2>In order to repair a faulting situation, the exception
    handler needs intimate knowledge about the execution environment. style="mso-spacerun: yes">  In managed code, we’ve gone to great
    lengths to hide these details. 
    For example, there is no architecture-neutral mapping from the IL
    expression of stack-based execution to the register set of the underlying
    CPU.

size=2> 



  • style="MARGIN: 0in 0in 0pt; tab-stops: list .5in; mso-list: l8 level1 lfo9"> face=Tahoma size=2>Restartability is often desired for asynchronous
    exceptions.  By ‘asynchronous’ I
    mean that the exception is not initiated by an explicit call to ‘throw’ in the
    code.  Rather, it results from a
    memory fault or an injected failure like Abort that can happen on any
    instruction.  Propagating a
    managed exception, where this involves execution of a managed filter,
    necessarily involves the potential for a GC. style="mso-spacerun: yes">  A JIT has some discretion over the
    GC-safe points that it chooses to support in a method. style="mso-spacerun: yes">  Certainly the JIT must gather GC
    information to report roots accurately at all call-sites. style="mso-spacerun: yes">  But the JIT normally isn’t required to
    maintain GC info for every instruction. 
    If any instruction might fault, and if any such fault could be resumed,
    then the JIT would need GC info for all instructions in all methods. style="mso-spacerun: yes">  This would be expensive. style="mso-spacerun: yes">  Of course, ‘mov eax, ecx’ cannot fault
    due to memory access issues.  But
    a surprising number of instructions are subject to fault if you consider all
    of memory – including the stack – to be unmapped. style="mso-spacerun: yes">  And even ‘mov eax, ecx’ can fault due
    to a Thread.Abort.

size=2> 


If you
were paying attention to that last bullet, you might be wondering how
asynchronous exceptions could avoid GC corruption even without resumption.  After all, the managed filter will still
execute and we know that the JIT doesn’t have complete GC information for the
faulting instruction.


Our
current solution to this on X86 is rather ad hoc, but it does work.  First, we constrain the JIT to never
flow the contents of the scratch registers between a ‘try’ clause and any of the
exception clauses (‘filter’, ‘finally’, ‘fault’ and ‘catch’).  The scratch registers in this case are
EAX, ECX, EDX and sometimes EBP. 
Our JIT compiler decides, method-by-method, whether to use EBP as a
stack-frame register or a scratch register.  Of course, EBP isn’t really a scratch
register since callees will preserve it for us, but you can see where I’m
going.


Now when
an asynchronous exception occurs, we can discard the state of all the scratch
registers.  In the case of EAX, ECX
& EDX, we can unconditionally zero them in the register context that is
flowed via exception propagation. 
In the case of EBP, we only zero it if we aren’t using EBP as a frame
register.  When we execute a managed
handler, we can now report GC roots based on the GC information that’s
associated with the handler’s instruction pointer.




The
downside to this approach, other than its ad hoc nature, is that it constrains
the codegen of any method that contains exception handlers.  At some point we may have to model
asynchronous exceptions more accurately, or expand the GC information spewed by
the JIT compiler, or a combination, so that we can enable better code generation
in the presence of exceptions.


We’ve
already seen how VB.NET can use a filter and explicit logic flow from a catch
clause to create the illusion of restartable exceptions to support ‘On Error
Resume Next’.  But this should not
be confused with true restartability.




Before
we leave the topic of rethrow, we should briefly consider the InnerException
property of System.Exception.  This
allows one exception to be wrapped up in the state of another exception.  A couple of important places where we
take advantage of this are reflection and class construction.


When you
perform late-bound invocation via reflection (e.g. Type.InvokeMember or
MethodInfo.Invoke), exceptions can occur in two places:




style="MARGIN: 0in 0in 0pt 0.5in; TEXT-INDENT: -0.25in; tab-stops: list .5in; mso-list: l6 level1 lfo4"> style="mso-fareast-font-family: Tahoma; mso-bidi-font-family: Tahoma"> style="mso-list: Ignore">1) style="FONT: 7pt 'Times New Roman'">     
The reflection infrastructure may
decide that it cannot satisfy your request, perhaps because you passed the wrong
number of arguments, or the member lookup failed, or you are invoking on someone
else’s private members.  That last
one sounds vaguely dirty.


size=2> 


style="MARGIN: 0in 0in 0pt 0.5in; TEXT-INDENT: -0.25in; tab-stops: list .5in; mso-list: l6 level1 lfo4"> style="mso-fareast-font-family: Tahoma; mso-bidi-font-family: Tahoma"> style="mso-list: Ignore">2) style="FONT: 7pt 'Times New Roman'">     
The late-bound invocation might
work perfectly, but the target method you called may throw an exception back at
you.  Reflection must faithfully
give you that exception as the result of the call. style="mso-spacerun: yes">  Returning it as an outbound argument,
rather than throwing it at you, would be dangerous. style="mso-spacerun: yes">  We would lose one of the wonderful
properties of exceptions, which is that they are hard to ignore. style="mso-spacerun: yes">  Error codes are constantly being
swallowed or otherwise ignored, leading to fragile execution.


size=2> 


The
problem is that these two sources of exceptions are ambiguous.  There must be some way to tell whether
the invocation attempt failed or whether the target of the invocation
failed.  Reflection
disambiguates these cases by using an instance of
System.Reflection.TargetInvocationException for the case where the invoked
method threw an exception.  The
InnerException property of this instance is the exception that was thrown by the
invoked method.  If you get any
exceptions from a late-bound invocation other than TargetInvocationException,
those other exceptions indicate problems with the late-bound dispatch attempt
itself.
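
In code, the disambiguation looks something like this (the Widget type and its Poke method are invented for the example):

using System;
using System.Reflection;

class Widget
{
    public void Poke() { throw new InvalidOperationException("target failed"); }
}

class Program
{
    static void Main()
    {
        MethodInfo poke = typeof(Widget).GetMethod("Poke");
        try
        {
            poke.Invoke(new Widget(), null);
        }
        catch (TargetInvocationException e)
        {
            // The invoked method itself threw; the real failure is wrapped.
            Console.WriteLine("Target threw: " + e.InnerException.Message);
        }
        catch (Exception e)
        {
            // Anything else means the late-bound dispatch itself failed
            // (bad arguments, missing member, access denied, ...).
            Console.WriteLine("Dispatch failed: " + e.GetType().Name);
        }
    }
}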


Something similar happens with TypeInitializationException.  If a class constructor (.cctor) method
fails, we capture that exception as the InnerException of a
TypeInitializationException. 
Subsequent attempts to use that class in this AppDomain from this or
other threads will have that same TypeInitializationException instance thrown at
them.
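
A small illustration, with a deliberately failing class constructor (the Fragile type is contrived):

using System;

class Fragile
{
    // A class constructor that always fails.
    static Fragile() { throw new InvalidOperationException(".cctor failed"); }
    public static int Value = 42;
}

class Program
{
    static void Main()
    {
        for (int i = 0; i < 2; i++)
        {
            try
            {
                Console.WriteLine(Fragile.Value);
            }
            catch (TypeInitializationException e)
            {
                // Both attempts see the same wrapping exception; the original
                // .cctor failure is preserved in InnerException.
                Console.WriteLine(e.InnerException.Message);
            }
        }
    }
}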


So
what’s the difference between the following three constructs, where the
overloaded constructor for MyExcep is placing its argument into
InnerException:




style="FONT-FAMILY: 'Lucida Console'">try {…} catch (Exception e) {
if (expr) rethrow; …}


style="FONT-FAMILY: 'Lucida Console'">try {…} catch (Exception e) {
if (expr) throw new MyExcep(); …}


style="FONT-FAMILY: 'Lucida Console'">try {…} catch (Exception e) {
if (expr) throw new MyExcep(e); …}


size=2> 


Well,
the 2nd form is losing information.  The original exception has been
lost.  It’s hard to recommend that
approach.


Between
the 1st and 3rd forms, I suppose it depends on whether the
intermediary can add important information by wrapping the original exception in
a MyExcep instance.  Even if you are
adding value with MyExcep, it’s still important to preserve the original
exception information in the InnerException so that sophisticated programs and
developers can determine the complete cause of the error.
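
For completeness, here is a sketch of the 3rd form, where the hypothetical MyExcep preserves the original failure in InnerException:

using System;

// The invented wrapper exception, preserving the original as InnerException.
class MyExcep : Exception
{
    public MyExcep() { }
    public MyExcep(Exception inner)
        : base("operation failed", inner) { }
}

class Program
{
    static void Main()
    {
        try
        {
            try
            {
                throw new InvalidOperationException("root cause");
            }
            catch (Exception e)
            {
                throw new MyExcep(e);    // wrap, rather than discard, the original
            }
        }
        catch (MyExcep m)
        {
            Console.WriteLine(m.Message + " <- " + m.InnerException.Message);
        }
    }
}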




Probably
the biggest impact from terminating the first pass of exception handling early,
as with the examples above, is on debugging.  Have you ever attached a debugger to a
process that has failed with an unhandled exception?  When everything goes perfectly, the
debugger pops up sitting in the context of the RaiseException or trap
condition.


That’s
so much better than attaching the debugger and ending up on a ‘rethrow’
statement.  What you really care
about is the state of the process when the initial exception was thrown.  But the first pass has terminated and
the original state of the world may have been lost.  It’s clear why this happens, based on
the two pass nature of exception handling.


Actually, the determination of whether or not the original state of the
world has been lost or merely obscured is rather subtle.  Certainly the current instruction
pointer is sitting in the rethrow rather than on the original fault.  But remember how filter and finally
clauses are executed with an EBP that puts the containing method’s locals in
scope… and an ESP that still contains the original faulting method?  It turns out that the catching handler
has some discretion on whether to pop ESP before executing the catch clause or
instead to delay the pop until the catch clause is complete.  The managed handler currently pops the
stack before calling the catch clause, so the original state of the exception is
truly lost.  I believe the unmanaged
C++ handler delays the pop until the catch completes, so recovering the state of
the world for the original exception is tricky but possible.


Regardless, every time you catch and rethrow, you inflict this bitter
disappointment on everyone who debugs through your code.  Unfortunately, there are a number of
places in managed code where this disappointment is unavoidable.


The most
unfortunate place is at AppDomain boundaries.  I’ve already explained at
http://blogs.gotdotnet.com/cbrumme/PermaLink.aspx/56dd7611-a199-4a1f-adae-6fac4019f11b why the Isolation requirement of AppDomains forces us to
marshal most exceptions across the boundary.  And we’ve just discussed how reflection
and class construction terminate the first pass by wrapping exceptions as the
InnerException of an outer exception.


One
alternative is to trap on all first-chance exceptions.  That’s because debuggers can have first
crack at exceptions before the vectored exception handler even sees the
fault.  This certainly gives you the
ability to debug each exception in the context in which it was thrown.  But you are likely to see a lot of
exceptions in the debugger this way!


In fact,
throughout V1 of the runtime, the ASP.NET team ran all their stress suites with
a debugger attached and configured to trap on first-chance Access Violations
(“sxe av”).  Normally an AV in
managed code is converted to a NullReferenceException and then handled like any
other managed exception.  But
ASP.NET’s settings caused stress to trap in the debugger for any such AV.  So their team enforced a rule that all
their suites (including all dependencies throughout FX) must avoid such
faults.


It’s an
approach that worked for them, but it’s hard to see it working more
broadly.




Instead,
over time we need to add new hooks to our debuggers so they can trap on just the
exceptions you care about.  This
might involve trapping exceptions that are escaping your code or are being
propagated into your code (for some definition of ‘your code’).  Or it might involve trapping exceptions
that escape an AppDomain or that are propagated into an AppDomain.


The
above text has described a pretty complete managed exception model.  But there’s one feature that’s
conspicuously absent.  There’s no
way for an API to document the legal set of exceptions that can escape from
it.  Some languages, like C++,
support this feature.  Other
languages, like Java, mandate it. 
Of course, you could attach Custom Attributes to your methods to indicate
the anticipated exceptions, but the CLR would not enforce this.  It would be an opt-in discipline that
would be of dubious value without global buy-in and guaranteed
enforcement.
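
Such an opt-in annotation might look like the following sketch.  The attribute is entirely hypothetical; nothing in the CLR or the frameworks defines or checks it:

using System;

// Hypothetical, unenforced annotation of anticipated exceptions.
[AttributeUsage(AttributeTargets.Method, AllowMultiple = true)]
class ThrowsAttribute : Attribute
{
    private Type exceptionType;
    public ThrowsAttribute(Type exceptionType) { this.exceptionType = exceptionType; }
    public Type ExceptionType { get { return exceptionType; } }
}

class Parser
{
    [Throws(typeof(FormatException))]
    [Throws(typeof(OverflowException))]
    public static int ParseDigit(string s)
    {
        return int.Parse(s);    // nothing stops other exception types from escaping
    }
}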


This is
another of those religious language debates.  I don’t want to rehash all the reasons
for and against documenting thrown exceptions.  I personally don’t believe the
discipline is worth it, but I don’t expect to change the minds of any
proponents.  It doesn’t
matter.


What
does matter is that disciplines like this must be applied universally to have
any value.  So we either need to
dictate that everyone follow the discipline or we must so weaken it that it is
worthless even for proponents of it. 
And since one of our goals is high productivity, we aren’t going to
inflict a discipline on people who don’t believe in it – particularly when that
discipline is of debatable value. 
(It is debatable in the literal sense, since there are many people on
both sides of the argument).




To me,
this is rather like ‘const’ in C++. 
People often ask why we haven’t bought into this notion and applied it
broadly throughout the managed programming model and frameworks.  Once again, ‘const’ is a religious
issue.  Some developers are fierce
proponents of it and others find that the modest benefit doesn’t justify the
enormous burden.  And, once again,
it must be applied broadly to have value.


Now in
C++ it’s possible to ‘const-ify’ the low level runtime library and services, and
then allow client code to opt-in or not. 
And when the client code runs into places where it must lose ‘const’ in
order to call some non-const-ified code, it can simply remove ‘const’ via a
dirty cast.  We have all done this
trick, and it is one reason that I’m not particularly in favor of ‘const’
either.




But in a
managed world, ‘const’ would only have value if it were enforced by the
CLR.  That means the verifier would
prevent you from losing ‘const’ unless you explicitly broke type safety and were
trusted by the security system to do so. 
Until more than 80% of developers are clamoring for an enforced ‘const’
model throughout the managed environment, you aren’t going to see us add
it.


style="mso-bidi-font-weight: normal">Foray into
C++ Exceptions


size=2> 


C++
exposes its own exception model, which is distinct from the __try / __except /
__finally exposure of SEH.  This is
done through auto-destruction of stack-allocated objects and through the ‘try’
and ‘catch’ keywords.  Note that
there are no double-underbars and there is no support for filters other than
through matching of exception types. 
Of course, under the covers it’s still SEH.  So there’s still an FS:[0] handler (on
X86).  But the C++ compiler
optimizes this by only emitting a single SEH handler per method regardless of
how many try/catch/finally clauses you use.  The compiler emits a table to indicate
to a common service in the C-runtime library where the various try, catch and
finally clauses can be found in the method body.


Of
course, one of the biggest differences between SEH and the C++ exception model
is that C++ allows you to throw and catch objects of types defined in your
application.  SEH only lets you
throw 32-bit exception codes.  You
can use _set_se_translator to map SEH codes into the appropriate C++ classes in
your application.




A large
part of the C++ exception model is implicit.  Rather than use explicit try / finally /
catch clauses, this language encourages use of auto-destructed local
variables.  Whether the method
unwinds via a non-exceptional return statement or an exception being thrown,
that local object will auto-destruct.


This is
basically a ‘finally’ clause that’s been wrapped up in a more useful language
construct.  Auto-destruction occurs
during the second pass of SEH, as you would expect.




Have you
noticed that the C++ exception you throw is often a stack-allocated local?  And that if you explicitly catch it,
this catch is also with a stack-allocated object?  Did you ever wake up at night in a cold
sweat, wondering whether a C++ in-flight exception resides on a piece of stack
that’s already been popped?  Of
course not.


In fact,
we’ve now seen enough of SEH to understand how the exception always remains in a
section of the stack above ESP (i.e. within the bounds of the stack).  Prior to the throw, the exception is
stack-allocated within the active frame. 
During the first pass of SEH, nothing gets popped.  When the filters execute, they are
pushed deeper on the stack than the throwing frame.


When a
frame declares it will catch the exception, the second pass starts.  Even here, the stack doesn’t
unwind.  Then, before resetting the
stack pointer, the C++ handler can copy-construct the original exception from
the piece of stack that will be popped into the activation frame that will be
uncovered.


If you
are an expert in unmanaged C++ exceptions, you will probably be interested to
learn of the differences between managed C++ exceptions and unmanaged C++
exceptions.  There’s a good write-up
of these differences at
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vcmex/html/vccondifferencesinexceptionhandlingbehaviorundermanagedexceptionsforc.asp.


style="mso-bidi-font-weight: normal">A Single
Managed Handler


size=2> 


We’ve
already seen how the C++ compiler can emit one SEH handler per method and reuse
it for all the exception blocks in that method.  The handler can do this by consulting a
side table that indicates how the various clauses map to instruction sequences
within that method.


In the
managed environment, we can take this even further.  We maintain a boundary between managed
and unmanaged code for many reasons, like synchronization with the garbage
collector, to enable stack crawling through managed code, and to marshal
arguments properly.  We have
modified this boundary to erect a single SEH handler at every unmanaged ->
managed call in.  For the most part,
we must do this without compiler support since many of our transitions occur
through dynamically generated machine code.


The cost
of modifying the SEH chain during calls into managed code is quickly amortized
as we call freely between managed methods. 
So the immediate cost of pushing FS:[0] handlers on method entry is
negligible for managed code.  But
there is still an impact on the quality of the generated code. style="mso-spacerun: yes">  We saw part of this impact in the
discussion of register usage across exception clauses to remain
GC-safe.


Of course, the biggest cost of exceptions is when you actually throw one. I'll return to this near the end of the blog.


Flow Control

Here's an interesting scenario that came up recently.


Let's say we drive the first pass of exception propagation all the way to the end of the handler chain and we reach the unhandled exception backstop. That backstop will probably pop up a dialog during the first pass, saying that the application has suffered an unhandled exception. Depending on how the system is configured, the dialog may allow us to terminate the process or debug it. Let's say we choose Terminate.


Now the 2nd pass begins. During the 2nd pass, all our finally clauses can execute.


What if one of those 2nd pass 'finally' clauses throws a new exception? We're going to start a new exception propagation from this location – with a new Exception instance. When we drive this new Exception up the chain, we may actually find a handler that will swallow the second exception.


If this is the case, the process won't terminate due to that first exception. This is despite the fact that SEH told the user we had an unhandled exception, and the user told us to terminate the process.


This is surprising, to say the least. And this behavior is possible regardless of whether managed or unmanaged exceptions are involved. The mechanism for SEH is well-defined and the exception model operates within those rules. An application should avoid certain (ab)uses of this mechanism, to avoid confusion.


Indeed, we have prohibited some of those questionable uses in managed code.


In unmanaged code, you should never return from a finally. In an exceptional execution of a finally, a return has the effect of terminating the exception processing. The catch handler never sees its 2nd pass and the exception is effectively swallowed. Conversely, in a non-exceptional execution of a finally, a return has the effect of replacing the method's return value with the return value from the finally. This is likely to cause developer confusion.


So in managed code we've made it impossible for you to return from a finally clause. The full rules for flow control involving managed exception clauses can be found in Section 12.4.2.8 of ECMA Partition I (http://msdn.microsoft.com/net/ecma/).
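As a minimal C# illustration (mine, not from the ECMA text), the compiler enforces this rule statically:

    class NoReturnFromFinally
    {
        static int M()
        {
            try
            {
                return 1;
            }
            finally
            {
                // Uncommenting the next line fails to compile with
                // error CS0157: Control cannot leave the body of a finally clause.
                // return 2;
            }
        }
    }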


However, it is possible to throw from a managed finally clause. (In general, it's very hard to confidently identify regions of managed code where exceptions cannot be thrown). And this can have the effect of replacing the exception that was in flight with a new 1st and 2nd pass sweep, as described above. This is the ExceptionCollidedUnwind situation that is mentioned in the EXCEPTION_DISPOSITION enumeration.
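Here is a small C# sketch of that collision, assuming nothing beyond ordinary try/finally behavior. The exception thrown from the finally clause replaces the one that was in flight:

    using System;

    class CollidedUnwind
    {
        static void Main()
        {
            try
            {
                try
                {
                    throw new InvalidOperationException("original");
                }
                finally
                {
                    // This starts a brand new first and second pass; the
                    // original InvalidOperationException is abandoned.
                    throw new ApplicationException("thrown from finally");
                }
            }
            catch (Exception e)
            {
                // Prints "thrown from finally", not "original".
                Console.WriteLine(e.Message);
            }
        }
    }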


The C++ language takes a different approach to exceptions thrown from the 2nd pass. We've already seen that C++ auto-destructors execute during the 2nd pass of exception handling. If you've ever thrown an exception from a destructor that was executing as part of an exception unwind, then you have already learned a painful lesson. The C++ behavior for this situation is to terminate the process via a termination handler.


In unmanaged C++, this means that developers must exercise great discipline in the implementation of their destructors. Since those destructors might eventually run in the context of exception backout, they should never allow an exception to escape. That's painful, but presumably achievable.


In managed C++, I've already mentioned that it's very hard to identify regions where exceptions cannot occur. The ability to prevent (asynchronous and resource) exceptions over limited ranges of code is something we would like to enable at some point in the future, but it just isn't practical in V1 and V1.1. It's way too easy for an out-of-memory or type-load or class-initialization or thread-abort or appdomain-unload or similar exception to intrude.


Finally, it's possible for exceptions to be thrown during execution of a filter. When this happens in an OS SEH context, it results in the ExceptionNestedException situation that is mentioned in the EXCEPTION_DISPOSITION enumeration. The managed exception model took a different approach here. We've already seen that an MSVC filter clause has three legal return values (resume execution, continue search, and execute handler). If a managed filter throws an exception, we contain that exception and consider the filter to have replied "No, I don't want to handle this one. Continue searching for a handler".
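C# did not expose filters when this was written (VB.NET did), but using the filter syntax C# later gained, a sketch of the behavior might look like this. The filter itself faults, the CLR contains that failure, and the handler is treated as if it had declined:

    using System;

    class FilterThrow
    {
        static string s = null;

        static void Main()
        {
            try
            {
                try
                {
                    throw new InvalidOperationException("original");
                }
                // The filter throws a NullReferenceException (s is null).  The
                // CLR swallows that and treats the filter as having answered
                // "continue search", so this handler never runs.
                catch (InvalidOperationException) when (s.Length > 0)
                {
                    Console.WriteLine("never reached");
                }
            }
            catch (InvalidOperationException e)
            {
                // The original exception keeps propagating and lands here.
                Console.WriteLine("outer caught: " + e.Message);
            }
        }
    }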


This is a reasonable interpretation in all cases, but it falls out particularly well for stack overflow. With the historical OS support for stack overflow, it's very hard to reliably execute backout code. As I've mentioned in other blogs, you may only have one 4K page of stack available for this purpose. If you blow that page, the process is terminated. It's very hard to execute managed filters reliably within such a limited region. So a reasonable approach is to consider the filters to have themselves thrown a StackOverflowException and for us to interpret this as "No, I don't want to handle this one."


In a future version, we would like to provide a more defensible and useful mechanism for handling stack overflow from managed code.


Error Handling without Exceptions


So we've seen how SEH and C++ and managed exceptions all interoperate. But not all error handling is based on exceptions. When we consider Windows, there are two other error handling systems that the CLR can interoperate with. These are the Get/SetLastError mechanism used by the OS and the HRESULT / IErrorInfo mechanism used by COM.


Let's look at the GetLastError mechanism first, because it's relatively simple. A number of OS APIs indicate failure by returning a sentinel value. Usually this sentinel value is -1 or 0 or 1, but the details vary depending on the API. This sentinel value indicates that the client can call GetLastError() to recover a more detailed OS status code. Unfortunately, it's sometimes hard to know which APIs participate in the GetLastError protocol. Theoretically this information is always documented in MSDN and is consistent from one version of the OS to the next – including between the NT and Win95-based OSes.


The real issue occurs when you PInvoke to one of these methods. The OS API latches any failure codes with SetLastError. Now on the return path of the PInvoke, we may be calling various OS services and managed services to marshal the outbound arguments. We may be synchronizing with a pending GC, which could involve a blocking operation like WaitForSingleObject. Somewhere in here, we may call another OS API that itself latches an error code (or the absence of an error code) through its own call to SetLastError.


So by the time we return to some managed code that can generate a new PInvoke stub to call GetLastError, you can be sure that the original error code is long gone. The solution is to tag your PInvoke declaration to indicate that it should participate in the GetLastError protocol. This tells the PInvoke call to capture the error as part of the return path, before any other OS calls on this thread have an opportunity to erase or replace it.
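In C#, that tag is the SetLastError flag on the DllImport attribute, and the captured value is read back through Marshal.GetLastWin32Error rather than a second PInvoke to GetLastError. A minimal sketch:

    using System;
    using System.Runtime.InteropServices;

    class LastErrorDemo
    {
        // SetLastError = true tells the PInvoke stub to capture the Win32 error
        // code on the return path, before anything else can overwrite it.
        [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
        static extern bool DeleteFile(string path);

        static void Main()
        {
            if (!DeleteFile(@"c:\no\such\file.txt"))
            {
                int error = Marshal.GetLastWin32Error();
                Console.WriteLine("DeleteFile failed, Win32 error {0}", error);
            }
        }
    }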


This protocol works well for PInvokes. Unfortunately, we do not have a way to tag IJW VTFixup stubs in the same way. So when you make managed -> unmanaged calls via MC++ IJW, there isn't a convenient and reliable way to recover a detailed OS status code on the return path. Obviously this is something we would like to address in some future version, though without blindly inflicting the cost of a GetLastError on all managed -> unmanaged transitions through IJW.


COM Error Handling


To understand how the CLR interoperates with COM HRESULTs, we must first review how PreserveSig is used to modify the behavior of PInvoke and COM Interop.


Normally, COM signatures return an HRESULT error code. If the method needs to communicate some other result, this is typically expressed with an [out, retval] outbound argument. Of course, there are exceptions to this pattern. For example, IUnknown::AddRef and Release both return a count of the outstanding references, rather than an HRESULT. More importantly, HRESULTs can be used to communicate success codes as well as error codes. The two most typical success codes are S_OK and S_FALSE, though any HRESULT with the high bit reset is considered a success code.


COM Interop normally transforms the unmanaged signature to create a managed signature where the [out, retval] argument becomes the managed return value. If there is no [out, retval], then the return type of the managed method is 'void'. Then the COM Interop layer maps between failure HRESULTs and managed exceptions. Here's a simple example:


COM:  HRESULT GetValue([out, retval] IUnknown **ppRet)

CLR:  IUnknown GetValue()


However, the return value might be a DWORD-sized integer that should not be interpreted as an HRESULT. Or it might be an HRESULT – but one which must sometimes distinguish between different success codes. In these cases, PreserveSig can be specified on the signature and it will be preserved on the managed side as the traditional COM signature.
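In C# terms, this shows up as the PreserveSig attribute on an interop method. The interface below is hypothetical (the name and GUID are made up for illustration), but it contrasts the two shapes:

    using System;
    using System.Runtime.InteropServices;

    [ComImport]
    [Guid("5DFA9F1C-3E62-4A8B-9C41-7B1D2E9A0F55")]   // made-up GUID
    [InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
    interface IWidget
    {
        // Default transformation: the [out, retval] becomes the managed return
        // value, and failure HRESULTs are surfaced as managed exceptions.
        [return: MarshalAs(UnmanagedType.IUnknown)]
        object GetValue();

        // With PreserveSig, the raw COM signature is kept, so the caller can
        // distinguish success codes such as S_OK (0) from S_FALSE (1).
        [PreserveSig]
        int Next([MarshalAs(UnmanagedType.IUnknown)] out object value);
    }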


Of course, the same can happen with PInvoke signatures. Normally a DLL export like Ole32.dll's CoGetMalloc would have its signature faithfully preserved. Presumably the transformation would be something like this:


DLL:  HRESULT CoGetMalloc(DWORD c, [out, retval] IMalloc **ppRet)

CLR:  DWORD CoGetMalloc(DWORD c, ref IMalloc ppRet)


If OLE32 returns some sort of failure HRESULT from this call, it will be returned to the managed caller. If instead the application would prefer to get this error case automatically converted to a managed Exception, it can use PreserveSig to indicate this.
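A hedged C# sketch of the two PInvoke shapes for that export (the managed name CoGetMallocChecked is invented; only the DLL export name matters):

    using System;
    using System.Runtime.InteropServices;

    static class Ole32
    {
        // Default PInvoke transformation: the signature is preserved, so the
        // HRESULT comes back to the caller as an ordinary integer.
        [DllImport("ole32.dll")]
        internal static extern int CoGetMalloc(uint dwMemContext, out IntPtr ppMalloc);

        // With PreserveSig = false, a failure HRESULT becomes a managed
        // exception and the trailing out parameter becomes the return value.
        [DllImport("ole32.dll", EntryPoint = "CoGetMalloc", PreserveSig = false)]
        [return: MarshalAs(UnmanagedType.Interface)]
        internal static extern object CoGetMallocChecked(uint dwMemContext);
    }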


Huh? In the COM case PreserveSig means "give me the unconverted HRESULT signature", but in the PInvoke case PreserveSig means "convert my HRESULTs into exceptions." Why would we use the same flag to indicate exactly opposite semantics for these two interop layers? The reasons are, ahem, historical. The best way to think of PreserveSig is "give me the unusual transformation of my signature, as opposed to what is typical for the kind of interop I am doing."


So now we know how to obtain mappings between HRESULTs and managed exceptions for the typical COM Interop case (no PreserveSig) and the atypical PInvoke case (PreserveSig). But what are the details of that mapping?


The exception subsystem in the CLR has mappings between COM errors, OS errors, and managed exception types.


Of course, sometimes we have a situation which doesn't have a precise mapping. In the case of an HRESULT that isn't associated with a specific managed Exception class, we convert it to an instance of COMException. In the case of an OS status code that isn't associated with a specific managed Exception class, we convert it to an instance of SEHException.
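You can poke at this mapping from C# through the Marshal class. A small sketch, assuming the usual pairing of 0x80070002 (ERROR_FILE_NOT_FOUND) with FileNotFoundException and the COMException fallback for unrecognized values:

    using System;
    using System.IO;
    using System.Runtime.InteropServices;

    class HResultMapping
    {
        static void Main()
        {
            // A well-known HRESULT maps to a specific exception type...
            try
            {
                Marshal.ThrowExceptionForHR(unchecked((int)0x80070002));
            }
            catch (FileNotFoundException)
            {
                Console.WriteLine("specific mapping");
            }

            // ...while an HRESULT the CLR doesn't recognize falls back to COMException.
            try
            {
                Marshal.ThrowExceptionForHR(unchecked((int)0x87654321));
            }
            catch (COMException e)
            {
                Console.WriteLine("fallback: 0x{0:X8}", e.ErrorCode);
            }
        }
    }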


Even for cases where we have a correspondence between a managed and unmanaged representation, the mapping won't necessarily roundtrip. For instance, an AV in unmanaged code results in an SEH exception of code 0xC0000005. If this is driven through managed code, it will be mapped to the corresponding NullReferenceException class. If the propagation of this exception continues through managed code and further up the stack to an unmanaged SEH handler, the unmanaged code will see the original exception code of 0xC0000005. So, when propagating through that sequence of handlers, we see a perfect roundtrip.


But let's change the scenario slightly, so that the original AccessViolation occurs in managed code. Now we have a NullReferenceException that is being propagated out to an unmanaged SEH handler further back on the stack. But this time the NullReferenceException will be mapped to an SEH exception code of 0xE0434F4D. This is the managed exception code used for all managed exceptions.


Have you ever wondered where these exception codes come from? Well, 0xE0434F4D is 0xE0 + "COM". Originally the CLR was called COM+ 2.0. When we changed the project name, we neglected to change the exception code. The unmanaged C++ exceptions use 0xE06D7363, which is 0xE0 + "msc". You might also see 0xE0524F54 for 0xE0 + "ROT" on Rotor builds.
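If you want to convince yourself, a few lines of C# (my own toy, not CLR code) reassemble the codes from 0xE0 plus the three ASCII characters:

    using System;
    using System.Text;

    class ExceptionCodes
    {
        // Compose 0xE0 followed by three ASCII characters into an SEH exception code.
        static uint Code(string tag)
        {
            byte[] b = Encoding.ASCII.GetBytes(tag);
            return 0xE0000000u | (uint)(b[0] << 16) | (uint)(b[1] << 8) | (uint)b[2];
        }

        static void Main()
        {
            Console.WriteLine("0x{0:X8}", Code("COM")); // 0xE0434F4D - managed exceptions
            Console.WriteLine("0x{0:X8}", Code("msc")); // 0xE06D7363 - unmanaged C++ exceptions
            Console.WriteLine("0x{0:X8}", Code("ROT")); // 0xE0524F54 - Rotor builds
        }
    }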


The current mapping between OS status codes and managed exception types is quite limited. It contains standard transformations like:


STATUS_FLOAT_INEXACT_RESULT
STATUS_FLOAT_INVALID_OPERATION
STATUS_FLOAT_STACK_CHECK
STATUS_FLOAT_UNDERFLOW              => ArithmeticException

STATUS_FLOAT_OVERFLOW
STATUS_INTEGER_OVERFLOW             => OverflowException

STATUS_FLOAT_DIVIDE_BY_ZERO
STATUS_INTEGER_DIVIDE_BY_ZERO       => DivideByZeroException

STATUS_FLOAT_DENORMAL_OPERAND       => FormatException

STATUS_ACCESS_VIOLATION             => NullReferenceException

STATUS_ARRAY_BOUNDS_EXCEEDED        => IndexOutOfRangeException

STATUS_NO_MEMORY                    => OutOfMemoryException

STATUS_STACK_OVERFLOW               => StackOverflowException


The HRESULT mappings are far more extensive. They include standard mappings to the well-known HRESULT values like:


E_POINTER                           => ArgumentNullException


And they include mappings to CLR-defined HRESULTs in the 0x8013???? range that you've doubtless witnessed during your development and debugging. The managed platform has its own facility code for reserving a range of HRESULTs for our exclusive use.


COR_E_ENTRYPOINTNOTFOUND            => EntryPointNotFoundException


And our mappings include a gathering of similar HRESULTs to a single managed exception. Here's a particularly extensive gathering of 26 different HRESULTs to the FileLoadException class:


FUSION_E_REF_DEF_MISMATCH
FUSION_E_INVALID_PRIVATE_ASM_LOCATION
COR_E_ASSEMBLYEXPECTED
FUSION_E_SIGNATURE_CHECK_FAILED
FUSION_E_ASM_MODULE_MISSING
FUSION_E_INVALID_NAME
FUSION_E_PRIVATE_ASM_DISALLOWED
COR_E_MODULE_HASH_CHECK_FAILED
COR_E_FILELOAD
SECURITY_E_INCOMPATIBLE_SHARE
SECURITY_E_INCOMPATIBLE_EVIDENCE
SECURITY_E_UNVERIFIABLE
COR_E_FIXUPSINEXE
HRESULT_FROM_WIN32(ERROR_TOO_MANY_OPEN_FILES)
HRESULT_FROM_WIN32(ERROR_SHARING_VIOLATION)
HRESULT_FROM_WIN32(ERROR_LOCK_VIOLATION)
HRESULT_FROM_WIN32(ERROR_OPEN_FAILED)
HRESULT_FROM_WIN32(ERROR_DISK_CORRUPT)
HRESULT_FROM_WIN32(ERROR_UNRECOGNIZED_VOLUME)
HRESULT_FROM_WIN32(ERROR_FILE_INVALID)
HRESULT_FROM_WIN32(ERROR_DLL_INIT_FAILED)
HRESULT_FROM_WIN32(ERROR_FILE_CORRUPT)
FUSION_E_CODE_DOWNLOAD_DISABLED
CORSEC_E_MISSING_STRONGNAME
INIT_E_DOWNLOAD_FAILURE
MSEE_E_ASSEMBLYLOADINPROGRESS       => FileLoadException


There are some more observations we can make about the COM error handling approach. First, it should be obvious that the 32 bits of an HRESULT cannot uniquely define an arbitrary set of user-extensible error conditions. COM deals with this, in part, by including the interface that returns an HRESULT in the decision of how to interpret those 32 bits. This means that 0xE3021051 returned from IMyInterface is not the same error code as 0xE3021051 returned from IYourInterface. Unfortunately, it also means that each interface must be rigorous about the bit patterns it returns. Specifically, it would be very bad if the implementation of IMyInterface::m() happened to delegate to IYourInterface::n() and blindly return n's HRESULTs. Any HRESULT returned from 'n' must somehow be mapped to the bit patterns that are legal to return from IMyInterface::m(). If 'n' returns a bit pattern that IMyInterface::m() cannot map, then 'm' is obligated to convert the HRESULT to E_UNEXPECTED and return that.


In other words, the uniqueness constraint for HRESULTs forces a painful discipline on all COM implementations that return HRESULTs. And part of this discipline is to lose error information by mapping meaningful HRESULTs into E_UNEXPECTED if the context for interpreting those HRESULTs is being lost. (There is a well-defined set of system HRESULTs which are implicitly returnable from any interface. The bit pattern for E_UNEXPECTED is necessarily part of this set. The CLR facility code allows us to live in this privileged world with our own codes.)


The fact that most COM developers are unaware of this painful discipline, and don't follow it, just adds to the level of pain here.


Fortunately, COM supplements the limited expressibility and uniqueness of HRESULTs by using a second mechanism: IErrorInfo. And the COM Interop layer uses this supplementary mechanism when mapping to and from managed exception objects. In fact, System.Exception implements the IErrorInfo interface. When a managed exception is thrown to a COM client, the IErrorInfo of the Exception instance is available for the COM client to query.


Adam Nathan's excellent book ".NET and COM – The Complete Interoperability Guide" describes how the IErrorInfo state is filled in from a managed exception in Chapter 16.


There's one more detail of COM Interop HRESULT mapping that warrants discussion. It's good practice for all COM methods to return an HRESULT. But there are several famous violations of this rule, including IUnknown::AddRef and Release. More importantly, every developer can choose whether to follow this best practice. Some choose not to. And there are some typical cases, like event sinks, where we often see methods returning 'void' or 'bool'.


This presents the COM Interop error mapping layer with a problem. If an exception occurs inside a managed implementation of a method with one of these signatures, it's hard to convey the error information back to the COM caller. There are several choices available to that layer – none of them good:


1) Allow the managed exception to travel back through the COM caller, using the underlying SEH mechanism. This would work perfectly, but is strictly illegal. Well-behaved COM servers do not propagate exceptions out to their COM clients.


2) Swallow the managed exception. Propagate a return value of 0 out to the COM client. This 0 value might get interpreted as a returned Boolean, integer, pUnk or other data type. In the case of a 'void' signature, it will simply be ignored.


3) Convert the exception object into an HRESULT value. Propagate that HRESULT out as the return value to the COM client. In the 'void' case, this will again be ignored. In the pUnk case, it will likely be dereferenced and subsequently cause an AccessViolation. (Failure HRESULTs have the high bit set. On Win32 the high 2 GB of address space are reserved for the kernel and are unavailable unless you run a /LARGEADDRESSAWARE process on a suitably booted system. On Win64, the low couple of GB of address space are reserved and unavailable to detect this sort of mistake.)


As you can see, all of these solutions are broken. Unfortunately, the most broken of the three is the last one… and that's the one we currently follow. I suspect we will change our behavior here at some point. Until then, we rely on the fact that AddRef & Release are specially handled and that the other cases are rare and are typically 'void' or 'bool' returns.


Performance and Trends


Exceptions vs. error codes has always been a controversial topic. For the last 15 years, every team has argued whether their codebase should throw exceptions or return error codes. Hopefully nobody argues whether their team should mix both styles. That's never desirable, though it often takes major surgery to migrate to a consistent plan.


As with any religious controversy, there are many arguments on either side. Some of them are related to:


  • A philosophy of what errors mean and whether they should be expressed out-of-band with the method contract.
  • Performance. Exceptions have a direct cost when you actually throw and catch an exception. They may also have an indirect cost associated with pushing handlers on method entry. And they can often have an insidious cost by restricting codegen opportunities.
  • It's relatively easy to forget to check for a returned error code. It's much harder to inadvertently swallow an exception without handling it (though we still find developers doing so!).
  • Exceptions tend to capture far more information about the cause and location of an error, though one could envision an error code system that's equally powerful. (IErrorInfo anybody?)

So what's the right answer here?


Well, if you are building the kernel of an operating system, you should probably use error codes. You are a programming God who rarely makes mistakes, so it's less likely that you will forget to check your return codes. And there are sound bootstrapping and performance reasons for avoiding exceptions within the kernel. In fact, some of the OS folks here think that SEH should be reserved for terrible "take down the process" situations. That may have been the original design point. But SEH is such a flexible system, and it is so entrenched as the basis for unmanaged C++ exceptions and managed exceptions, that it is no longer reasonable to restrict the mechanism to these critical failures.


So, if you are not a programming God like those OS developers, you should consider using exceptions for your application errors. They are more powerful, more expressive, and less prone to abuse than error codes. They are one of the fundamental ways that we make managed programming more productive and less error prone. In fact, the CLR internally uses exceptions even in the unmanaged portions of the engine. However, there is a serious long-term performance problem with exceptions and this must be factored into your decision.


Consider some of the things that happen when you throw an exception:


  • Grab a stack trace by interpreting metadata emitted by the compiler to guide our stack unwind.
  • Run through a chain of handlers up the stack, calling each handler twice.
  • Compensate for mismatches between SEH, C++ and managed exceptions.
  • Allocate a managed Exception instance and run its constructor. Most likely, this involves looking up resources for the various error messages.
  • Probably take a trip through the OS kernel. Often take a hardware exception.
  • Notify any attached debuggers, profilers, vectored exception handlers and other interested parties.

This is light years away from returning a -1 from your function call. Exceptions are inherently non-local, and if there's an obvious and enduring trend for today's architectures, it's that you must remain local for good performance.


Relative to straight-line local execution, exception performance will keep getting worse. Sure, we might dig into our current behavior and speed it up a little. But the trend will relentlessly make exceptions perform worse.
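It is easy to get a feel for the gap with a crude micro-benchmark. This is only a rough sketch (Stopwatch granularity and JIT warm-up are ignored), but the ratio it prints is usually dramatic:

    using System;
    using System.Diagnostics;

    class ThrowCost
    {
        const int Iterations = 100000;

        static bool TryParseDigit(char c, out int value)
        {
            if (c >= '0' && c <= '9') { value = c - '0'; return true; }
            value = 0;
            return false;
        }

        static int ParseDigit(char c)
        {
            if (c < '0' || c > '9') throw new FormatException("not a digit");
            return c - '0';
        }

        static void Main()
        {
            int value;

            Stopwatch sw = Stopwatch.StartNew();
            for (int i = 0; i < Iterations; i++) TryParseDigit('x', out value);
            Console.WriteLine("error codes: {0} ms", sw.ElapsedMilliseconds);

            sw = Stopwatch.StartNew();
            for (int i = 0; i < Iterations; i++)
            {
                try { ParseDigit('x'); }
                catch (FormatException) { /* swallowed only to keep the timing loop going */ }
            }
            Console.WriteLine("exceptions:  {0} ms", sw.ElapsedMilliseconds);
        }
    }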


How do I reconcile the trend to worse performance with our recommendation that managed code should use exceptions to communicate errors? By ensuring that error cases are exceedingly rare. We used to say that exceptions should be used for exceptional cases, but folks pushed back on that as tautological.


If your API fails in 10% of all calls, you had better not use an exception. Instead, change the API so that it communicates its success or failure as part of the API (e.g. 'bool TryParse(String s)'). Even if the API fails in 1% of calls, this may be too high a rate for a service that's heavily used in a server. If 1% of calls fail and we're processing 1000 requests per second with 100 of these API calls per request, then we are throwing 1000 times a second. That's a very disturbing rate of exceptions. On the other hand, a 1% failure rate may be quite tolerable in a client scenario, if the exception occurs when a human user presses the wrong button.


Sometimes you won't know whether your API will be used in a client or a server. And it may be hard for you to predict failure rates when errors are triggered by bad data from the client. If you've provided a way for the client to check his data without triggering an exception (like the TryParse() example above) then you've done your part.
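A sketch of that Try-pattern shape (DigitParser is a made-up example, not a framework API): expose the throwing form for callers that expect success, and the Try form for callers that expect failure to be common:

    using System;

    public static class DigitParser
    {
        // Throws: appropriate when bad input is truly exceptional.
        public static int Parse(string s)
        {
            int value;
            if (!TryParse(s, out value))
                throw new FormatException("Input is not a single decimal digit.");
            return value;
        }

        // Reports failure through the return value: appropriate when bad input
        // shows up in a measurable fraction of calls.
        public static bool TryParse(string s, out int value)
        {
            value = 0;
            if (s == null || s.Length != 1 || s[0] < '0' || s[0] > '9')
                return false;
            value = s[0] - '0';
            return true;
        }
    }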


As usual, there's so much more to say. I still haven't talked about unhandled exceptions. Or about undeniable exception propagation (Thread.Abort). Or how undeniable propagation interacts with propagation through unmanaged code via PInvoke, IJW or COM Interop. And I carefully avoided explaining why we didn't follow our own rules when defining and using the Exception class hierarchy. And there's plenty to say about our special treatment of OutOfMemoryException and StackOverflowException.


If you are still reading and actually want to know more, perhaps you should just apply for a job on the CLR team.

Comments (92)

  1. haacked says:

    Are they hiring?

  2. Anonymous says:

    Dude you really need to write a book!

  3. Chris, I enjoy blogging but I barely find the time to do it. How do you find the time to write small dissertations? 😉 Nice article as always. I plan on reading it in detail later.
    -Mathew Nolton

  4. Excellent article, as always! Thank you for discussing briefly how Thread.Abort interacts with the JIT. I, for one, am interested in the undeniable exception propagation (Thread.Abort) as I have fought the ThreadAbortException problem before. I have seen it documented in the Rotor (SSCLI) code, and tried to understand what was going on and why.

  5. Anonymous says:

    Aww, we have to get hired to learn more? Seems a bit over the top to have to move to the states just to learn about exceptions. 🙂

    Keep the articles long I say. Readers can always skim them if they don’t want to know the details, but if you do want to know the details and there’s only a short article, you’re out of luck. Better too long than too short.

  6. Jeroen says:

    I agree with the previous poster. I can find the general facts in the MSDN docs, blogs like these are great for getting the gritty details when you really want to know how something works. Thanks alot for writing stuff like this!

  7. I should write shorter blogs.

    Please don’t!

    Your going to love my exception handling in IKVM.NET (not!). I think I violate all of the best practices.

    Since Java compiles try { } finally {} as try {} catch() {} (and also allows returning from a finally block), I cannot use (at least it would be difficult) the CLR finally construct.

    I also create way too many exception handlers, because in Java it is legal to branch in to and out of exception handlers, so I have to split them. For a particularly nasty example, check out http://www.frijters.net/run.il

    (the Java source is at: http://www.frijters.net/run.txt)

  8. John Cavnar-Johnson says:

    Don’t waste any time trying to make your blogs shorter. The information density in your entries is the highest I’ve ever seen. There is no fluff to be cut. You are tackling complex issues at an incredibly detailed level and you have an excellent sense for what a single entry should be. You could conceivably divide this article at your section headings, but what’s the point? One of the benefits of blogging is that there are no artificial constraints on length (unlike a book or an MSDN article).

    I was an English major in college and an editor before becoming a software geek. Very few technical writers (even with good editors) can achieve the kind of clarity I see in your articles. Your target audience is small, but it is one that is very important to the success of .NET.

  9. Frank Hileman says:

    Please keep the long blogs. I don’t think any of your readers want shorter ones. Your internal reviewers were probably thinking of a different kind of audience.

    Your comments on the cost of exception throw/catch has reinforced my feeling that exceptions are are too frequently being recommended as an alternative to return values. I have seen how exception catching can bog down applications.

    I think that when designing a commonly used API, an exception should not be thrown when performing an operation, if there is no alternative way to determine that the exception will be thrown. That is, throwing an exception when passed a null reference is acceptable, because the caller can easily check for a null reference. But if an algorithm within a function throws an exception, and there is no way to pre-determine the exception will be thrown, without the API user duplicating the algorithm externally, an alternative, non-exception throwing function should be provided, or a return value should communicate failure instead. It is impossible for the API designer to predict the context in which the function will be used, and how often the exception may be thrown.

    Brad Adams started some discussions like this here:
    http://blogs.gotdotnet.com/brada/commentview.aspx/c9c61dbf-62a9-474f-a5fe-c171cdedb4f6

  10. Sean McLeod says:

    On the topic of buffer overruns on the stack and getting code to execute from the stack, how come Intel haven’t added an option in their newer CPUs to allow the OS to mark the stack VM pages with Read/Write permission but ‘No Execute’ permission?

  11. Steve says:

    Excellent article Chris – complete with a flashback or two from the past (I tried a ‘Jonny Mnemonic’ erasure on WinNT SEH)

    Thanks!

  12. Steve says:

    Excellent article Chris – complete with a flashback or two from the past (I tried a ‘Jonny Mnemonic’ erasure on WinNT SEH)

    Thanks!

  13. Mike Dimmick says:

    Sean: about Intel (i.e. x86) CPUs.

    Firstly, I don’t think there’s room in the page table entry for an additional flag, you’d have to persuade the other manufacturers to do it too, and obviously this would only help on new processors (which the OS would have to detect correctly). IIRC, Windows already uses all three bits (11 to 9) of the page table entry for its own purposes – although I’d have to check Inside Windows 2000 to be sure.

    The x86 processor only executes code through the memory pointed to by the code segment register, but a segment can only be a contiguous region of virtual memory – it can’t consist of multiple non-contiguous regions. Existing programs expect not to have to manipulate the CS register; Windows currently sets all the segment registers apart from FS to point to segments with appropriate permissions set, but which address all 4GB of virtual memory (i.e. offset = 0, size = 0xFFFFFFFF). As Chris said, the FS register points to the Thread Information Block (actually, it points to an entry in the Segment Descriptor Table which describes a segment whose starting offset is the start address of the TIB). The FS register is the only segment register which gets used by typical Windows application code. Everything else is referenced by its default selector – all code is accessed through the CS selector, data through DS, stack through SS and string-destinations (for string-copy instructions such as MOVS) through ES. AMD64 ignores CS, DS, ES and SS in 64-bit mode; FS and GS still exist and are widened to reference a 64-bit base address, although the limit is now ignored.

    The new processor architectures, AMD64 and IA64, both permit execute/no execute permissions to be set on a per-page basis.

    Also, some libraries use areas of the stack for thunk code. I’m thinking particularly of ATL 6.0, which uses a dynamically-generated thunk as the actual window procedure passed to Windows. The thunk replaces the hWnd argument with the ‘this’ pointer of the referenced object, then calls the class’s static WindowProc function. Since the thunk is declared as a member of CWindowImplRoot, if you allocate a window object on the stack (perhaps for a modal dialog), you end up with the thunk on the stack requiring execute permissions.

    I note that in ATL 7.x for x86, the thunk will actually now be allocated in the process’s default heap rather than on the stack, although it’s still on the stack for other processors. See ATLWIN.H for CWndProcThunk and ATLBASE.H for CStdCallThunk (ATL 7.x only).

  14. Erick Sgarbi says:

    " I should write shorter blogs"
    Well, I believe Lincoln has said once, "I did not have time to write you a short message so I merely wrote a long one" which means that takes time to write something short and meaniful. I am sure you’re a very busy person yourself, which is taking your own private time to speak your mind and helping all of us (CLR wolves) to understand better this machinery.

    Put this way, if you take your time to write a very long article I will ensure to make myself time to read it full.

    Great work!

  15. Sean McLeod says:

    x86 no-execute VM pages.

    Yes it would only apply to newer cpus, but at 100 million plus new PCs per year the sooner it got in the more useful it would be.

    In terms of some code like ATL actually generating executable code on the stack etc. the option is to allow it to be an option per process so that a minimum ‘critical’ processes hosting things like rpcss etc. could run with this extra level of safety.

    Does anyone know whether the 64 bit versions of Windows use the execute permission bit to prevent the execution of code from certain pages under any circumstances?

    Cheers

  16. Chris Brumme says:

    I agree that taking advantage of "No Execute" pages on modern CPUs is essentially a no-brainer, in light of the current (and future) security environment. If I had to guess, it would be more a matter of when rather than if. Obviously a lot of software would have to change before we could support this broadly. For instance, the CLR itself is loaded into a lot of processes. And on X86 the CLR executes out of random heap blocks. I wouldn’t be surprised to find that we execute out of the stack in some cases, also. But we already needed to clean this up for 64-bit, so the incremental work for 32-bit may not be too bad.

  17. Chris Brumme says:

    It’s hard to know whether the question "Are they hiring?" was serious. Generally, Microsoft is always hiring. And I think the CLR team has had a couple of open head count pretty constantly for many years. The actual jobs change over time, between the different sub-teams (security, debugging, perf, etc.) and the different disciplines (dev, architect, mgmt, tester & pm). I suspect we’re a bit more rigorous & painful on our interviews than the typical Microsoft team. But if you are really serious about a job on the CLR, by all means email me directly. (If you can’t decode my email address from the NOSPAM version, you’ve already failed the first test!)

  18. Rick Byers says:

    Another excellent paper Chris! I agree you shouldn’t worry about making them any shorter. People are used to reading short blog entries, but I see your posts more like journal articles/papers. I print them out and read them during my commute, instead of reading them on-line like other blogs. You should definitely consider collecting your posts into book form at some point!

    But in a managed world, ‘const’ would only have value if it were enforced by the CLR.

    I disagree. Sure there would be incredible value in having the CLR enforce const-ness, but as you say the cost-benefit tradeoff is too high for the average case. However, I believe there would also be incredible value in having unverified const method parameters that are checked only at compile time.

    Generally, during development the caller of a method needs to know if the reference types being pased into the method will be modified in any way. For example, if they’re going to be modified, the caller may not be able to re-use it next time the current operation needs to be performed. More generally, knowing when objects are modified and when they’re not (or aren’t supposed to be) is necessary to keep track of the state of an algorithm.

    Today in .NET, this is specified in the API documentation. For example, the static method Array.Clear indicates the Array argument is modified, but Array.BinarySearch implies its argument is not. Of course the description of the method usually makes it clear, but that’s not always the case. It’s also common, even in high quality API references like the .NET MSDN docs, for the documentation to omit this detail.

    Since the programmer needs to know this information when manipulating the code, why not make the documentation explicit with machine-readable meta-data? If I declare some method ‘void Foo( const MyType m )’, I’m just saying that Foo doesn’t intend to modify the state of m. If I change my mind later and decide I am going to modify ‘m’, I’d like some automated way to determine (within my application) which of my callers are relying on the previous behaviour. With compile-time verified ‘const’, any such callers would now get a compiler error (or even just a warning). Of course, I’m not advocating semantic changes to an API once its in a production environment where versioning is an issue. But during development and refactoring, it’s incredibly important to be able to identify the call sites that will be affected by a semantic change. I’ve seen many developers ‘#if false’ out a method (or apply System.ObsoleteAttribute) just to accurately find all the static references to the method within the solution (VS.NET’s ‘Go to Reference’ is horribly inadequate here) – but that’s a much bigger problem.

    My point is just that many people find ‘const’ useful in C++, despite the fact it’s not enforced. The C++ standard says that the results of using const_cast<> are implementation-defined, but in my experience implementations intentionally allow the use of const_cast without any side effects, specifically because people use it as I’ve described. Many people that make religious use of ‘const’ in C++ find it to be a useful tool in API design and bug prevention.

    Here is a great example of the problem that has actually caused me real grief in the distributed-computing platform I work on (http://www.kinitos.com/). I had code something like this (simplified for demonstration purposes):

    IClientChannelSinkProvider ccsp = new BinaryClientFormatterSinkProvider();
    ccsp.Next = new MyCustomProvider();
    if( <we need a Tcp channel> ) {
    channel = new TcpChannel( properties, ccsp, null );
    RegisterChannel( channel );
    }

    if( <we need an Http channel> ) {
    channel = new HttpChannel( properties, ccsp, null );
    RegisterChannel( channel );
    }

    This code works great if we’re creating EITHER a TcpChannel or an HttpChannel. However, if we need both channels, we’ll get a run-time exception creating the HttpChannel because the TcpChannel constructor modified my sink provider chain (appended a TcpClientTransportSinkProvider). The TcpChannel documentation doesn’t indicate the IClientChannelSinkProvider argument will be modified. I actually had to decompile System.Runtime.Remoting.dll (or look at the SSCLI – I forget which) to fully understand the problem here. Had C# supported const, I would have written the first 2 lines as:

    IClientChannelSinkProvider newCcsp = new BinaryClientFormatterSinkProvider();
    newCcsp.Next = new MyCustomProvider();
    const IClientChannelSinkProvider ccsp = newCcsp;

    And then, if the framework made use of ‘const’, I would have gotten a compiler error passing a ‘const IClientChannelSinkProvider’ to the TcpChannel constructor, and realized the problem right away.

    This functionality could even be added to the language in a backwards compatible way using attributes on the method parameters. The compiler could default to not checking const, but allow checking to be enabled. Perhaps only calls to classes/assemblies marked with [UsesConst] would be checked, or there would be both a [const] and [notconst] attribute. This would allow a project to benefit internally from the functionality, without requiring the entire .NET framework be updated with ‘const’ attributes.

    I’ve always justified the C# team’s decision to not support const by telling myself that it would probably provoke confusion because some developers would assume it was runtime-verified and end up introducing security problems. However, I don’t really believe this argument is sufficient. Some developers are surprised that fully trusted code can use reflection to access private members. In reality if you’re writing a trusted library whose clients may be untrusted and hostile you’re going to need a much stronger understanding of .NET security to get it right and are unlikely to make such a simple mistake.

    I would take this whole argument even further and say that compilers and languages could be doing a lot more to support compile-time and run-time contract verification, and that it would improve our ability to write robust and reliable software.

    Am I missing something here Chris? Is there some other reason I haven’t thought of that the C# team decided not to support compile-time verified const method parameters?

    Sorry for the long post. As I said, this is a religious issue <grin>. Keep up the awesome work!

  19. Zohar says:

    >(IErrorInfo anybody?)
    No thanks , I’ve just eaten…

  20. Chris Brumme says:

    Rick,

    There actually is an unenforced notion of ‘const’ available for compilers to share. Look at IsConstModifier in Microsoft.VisualC.dll. MC++ uses this as you would expect. For example, if you dump System.EnterpriseServices.Thunk.dll you will see it applied to static fields, pointers, and other parts of signatures. The CLR only uses these custom signature modifiers for binding / overload purposes.

    I actually think there’s a real risk with providing features like this without CLR enforcement. Or — in the case of ‘initonly’, which is ‘readonly’ in C# — providing features where the CLR enforces something different from what a developer might expect. During our V1 security push, we found many many examples where our framework developers had misinterpreted a C# compiler enforcement for a CLR enforcement and had erroneously built those assumptions into their security model. In some cases, we scrambled to add CLR enforcement at the end of V1. In other cases, we used FxCop and detailed source audits to remove those erroneous assumptions.

    For me, this was a real eye opener. The FX developers are very sophisticated. But it’s just too easy to confuse language model restrictions with execution model restrictions.

  21. Mark Morrell says:

    Chris –

    I’m completely changing the subject on you. You started this article with:

    —-8<—-

    I had hoped this article would be on changes to the next version of the CLR which allow it to be hosted inside SQL Server and other “challenging” environments. This is more generally interesting than you might think, because it creates an opportunity for other processes (i.e. your processes) to host the CLR with a similar level of integration and control.

    —-8<—-

    I’m about to embark on a significant development project, and right now am considering C#. Do you have any sort of criteria to decide what sorts of project should or should not run in the CLR? For instance, I would not expect MS Word, Excel, etc. to be written in C#. How do you decide?

    Also, do you know whether Microsoft is going to continue the older C++ environment? Assuming Microsoft continues to write non .NET software, will the older development platform continue to evolve and be supported?

    (And I’ll throw my vote in with the others. Keep ’em long. Cutting them shorter would only detract from their value.)

  22. Mark Hurd says:

    Mark Morrell: If I were starting Word or Excel from scratch VB.NET would be my language of choice. Sure some specific issues (performance, interop, etc) may arise that causes some other .NET language to be used for parts but VB is still the best at doing visual things in Windows.

  23. Chris Brumme says:

    I’m highly biased here. Long term, I would like to see managed code as the basis of Visual Studio, Office, device drivers and even the OS kernel. That’s a very long term. You shouldn’t be making your current product plans based on what we hope eventually to support. It’s hard for me to comment specifically on your application without some details. If you want to email me with those details, feel free.

    As for the language choice, I’m biased there too. I personally like C# because I have a long background with C and C++. I sometimes get frustrated that C# won’t let me do some of the obscure things the CLR supports. But I think they struck an excellent balance between a clean powerful language that has most of the dangerous edges removed. If I had a background with VB instead of C, I suspect I would be telling you something different.

  24. Rick Byers says:

    Thanks for the response Chris, I see your point and the insight into the FX team is valuable. I probably underestimate how easy it would be to fall into that trap (need that ‘pit of success’). Although I still think that if const violation was just a compiler warning (instead of an error), it would be harder for developers to think it was runtime enforced.

    Perhaps we need some kind of contract meta-information (eg. specified in the XML comments of a method) that could be verified outside the compiler (perhaps with a tool like FxCop). It just seems like a wasted opportunity to me to have significant aspects of a types contract specified only in non-machine readable form (i.e. in the documentation), where is can’t be verified automatically. This applies to much more than just const (my personal favourite topic is concurrency – its way too easy to have deadlock potential in a large concurrent C#/VB/C++/Java program).

  25. Chris Brumme says:

    I’m a huge believer in developer-authored statements about the semantics of their programs, particularly if there can be some level of automated checking of these statements. I agree with you that concurrency is a natural area to apply this sort of thing. Personally, I would like to see Eiffel-style pre-, post-conditions and invariants which can be selectively enabled. In the case of concurrency, locks could be ranked to avoid deadlocks. Recursion could be prevented with extra checks (on most locks).

    We’ve been looking at other places we could apply this sort of mechanism. I think it’s a ripe area for innovation both inside and outside Microsoft. But it’s important to avoid any confusion over whether these statements are runtime-enforced and therefore whether they can be a solid basis for building security.

  26. Chris Dern says:

    Are you comming ( here ) to the PDC? Ask the Experts?

  27. David Levine says:

    I don’t know if you still monitor these old sites once you generate a new blog but I’ll take a chance and hope you do and feel like answering one simple question.

    I was just wondering how the runtime handled simple try-finally semantics (no catch block). I know that the underlying OS traps an exception, propagates it to user level via the debugger ports (1st chance exception), to the app, etc. And then MSVC exposes try-finally and try-except semantics, and then the runtime maps those into the C# exception mechanism.

    I ask because I’ve seen a lot of references in documents, including here, about how expensive exception handling is in terms of performance and that the ratio of try-finally blocks to try-catch block should be on the order of 10:1.

    I would expect it would be easier (and far more consistent) to use the same mechanism for implementing try-finally semantics as is done for try-catch, but since this would involve round-tripping through the kernel (which would mean that all finallys are non-local) I thought that perhaps the runtime handled it differently, perhaps by directly running the chain of handlers via manipulating the FS:[0] register and stepping through the EXCEPTION_REGISTRATION records.

    If the runtime does use the OS to unwind the stack then where is the performance gain? It would still run the chain of frame-based handlers (twice – the second for finally semantics); the only gain I can think of is that the runtime would not need to generate a stack trace.

    Are there games played where if the execution stream reaches a known "good" point it knows it can execute the finally block and then JMPs directly to that code?

    What am I missing here? Thanks.

    Dave

  28. Chris Brumme says:

    I think your question is about try/finally in a stack fragment that’s made up entirely of managed activation records. In that case, there’s only a single SEH handler which guards all the managed code. So we participate with the OS SEH mechanism in order to get first and second pass notifications to that single SEH handler. But then we do our own thing (on X86) while distributing the notification to the managed exception constructs within that range of managed code.

    Having said that, even in managed code on X86 where we can avoid the burden of SEH on every method, the cost of processing an exception remains high. I suspect that at some point we will do some work to make it faster. We have a number of ideas here. Some are incremental and some are fairly radical. But at the boundaries with unmanaged code, we are forced to participate in a very specific manner that is dictated by SEH.

    Regardless, the long term trend is for exception handling to become slower — in relative terms — than it is now. In other words, computers will get faster at doing local operations and relative to this they will become slower at doing non-local operations. Exception handling will always involve non-local processing.

  29. David Levine says:

    Thanks Chris, and you are correct that I was referring to managed stack fragments.

    If you ever get the desire to revisit this subject in a future blog I think an interesting topic would be on how the new 64 bit architectures will affect exception processing within the CLR. For example, you mentioned how it performs perfect stack unwinding without getting into the mechanisms. Would this be compatible with existing mechanims or would interop become more proplematic then it is now? And, getting back to the original question, how would this impact performance?

    Dave
    PS: Keep the long blogs.

  30. Wallym says:

    You are the exception man in my book.

    Wally

  31. Nicu Georgian Fruja says:

    Dear Chris, you did an excelent blog about the CLR Exception model! I found in your material so many useful details.

    I would still have two questions concerning the mechanism of the two passes. If I got your explanations right, you claimed that during the first pass, no frames are popped off from the stack. On the other hand, I’ve read in the ECMA 335 standard for CLR that, there are frames which are popped off also in the first pass: I suppose this happens when the calling position is not embedded within a protected block (try).

    Also, you stated that, in the first pass, all the handlers are executed on the faulting exception frame but still pointing to the corresponding frame. Does this mean, in particular, that every handler is actually executed on the operand stack of the method it belongs? I suppose it should be so: one reason could be the maxstack of each operand stack.

    Once again, many thanks for your great article.

  32. Chris Brumme says:

    I just glanced through the ECMA spec. I see where it states that the exception object is popped off the stack when various exception handlers complete execution. And it’s certainly the case that the frames for the filters themselves will be popped off the stack during first pass evaluation, as each filter completes execution. But no other frames should be popped during the first pass.

    You are correct that first pass handlers are pushed over the faulting frame, but they have the context of the method that defines the filter. In unmanaged SEH, this is often achieved by having ESP point to the top of the stack (where it belongs), but EBP points back to the stack frame that defined the filter. This use of EBP can bring all the locals of that containing method back into scope for the filter code to access.
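
    A minimal sketch of this behavior in unmanaged MSVC SEH (the names Filter, Fault and RunGuarded are just illustrative, not from the post). The filter expression is evaluated during the first pass, while the faulting frames are still on the stack, yet it can read a local of the function that defined it because the compiler restores EBP to that function's frame while ESP stays at the top of the stack.

    #include <windows.h>
    #include <stdio.h>

    static int Filter(int depth)
    {
        // 'depth' is a local of RunGuarded; the filter sees it even though
        // the exception was raised several activation records deeper.
        printf("first pass: filter sees depth = %d\n", depth);
        return EXCEPTION_EXECUTE_HANDLER;
    }

    static void Fault(int n)
    {
        if (n == 0)
            RaiseException(0xE0000001, 0, 0, NULL);  // user-defined SEH code
        else
            Fault(n - 1);
    }

    static void RunGuarded()
    {
        int depth = 3;            // local that the filter will read
        __try {
            Fault(depth);         // raises the exception three frames deeper
        }
        __except (Filter(depth)) {
            printf("second pass: handler runs after those frames are unwound\n");
        }
    }

    int main()
    {
        RunGuarded();
        return 0;
    }

    The "first pass" line should print before the handler's line, showing that the filter observed depth = 3 while the Fault frames were still live.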

  33. Thank you very much for the prompt answer.

    I found in ECMA (Partition II) the following: “Stack frames are discarded either as this second walk occurs or after the handler completes, depending on information in the exception handler array entry associated with the handling block.”

    You said something about the frames of the filters, but ECMA says nothing about non-function frames (for the case of the filters), and consequently I don't see why the above sentence in ECMA would refer to them.

    When I first read the ECMA text, my understanding was the following: in the first pass, the ONLY stack frames discarded are those of methods with filters whose execution is complete, which are not embedded in protected blocks with finally/fault handlers, and which do not want to handle the exception.

  34. Chris Brumme says:

    Nicu, I just re-read section 12.2.4.5 of the ECMA spec. It’s hard to put algorithms into words, but I think that section does a decent job. To me, it seems clear that the frames are discarded during the second pass. The filters cannot be discarded as they are executed, because they are buried on the stack underneath finally blocks.

    I suppose an implementation that uses linked frames in the heap could selectively unlink and discard filter blocks. But that approach was never part of our design.
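
    To make the two-pass ordering concrete, here is a small unmanaged MSVC SEH sketch (the name Inner is illustrative). The filter runs during the first pass, while Inner's frame and its finally block are still live; only during the second pass does the intervening __finally execute and its frame get discarded, after which the handler body runs.

    #include <windows.h>
    #include <stdio.h>

    static void Inner()
    {
        __try {
            RaiseException(0xE0000002, 0, 0, NULL);   // user-defined SEH code
        }
        __finally {
            // Runs during the SECOND pass, after the filter below has already
            // elected to handle the exception.
            printf("2: Inner's finally (second pass, frame about to be discarded)\n");
        }
    }

    int main()
    {
        __try {
            Inner();
        }
        __except (printf("1: filter (first pass, Inner's frame still on the stack)\n"),
                  EXCEPTION_EXECUTE_HANDLER) {
            printf("3: handler (after the unwind)\n");
        }
        return 0;
    }

    Run on Windows, this should print the lines in the order 1, 2, 3, which matches the reading that frames are discarded during the second pass rather than the first.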

  35. Both Word and Excel OMs have APIs that allow closing documents programmatically. I suppose there are…

  37. Jeff Stong says:

    Tim, a fellow Compuware blogger, forwarded me a link to this Microsoft blog entry: Are you aware that…

  38. This may seem like a preposterous statement, but unfortunately it’s all too common.

    In my work I go…

  39. http://blogs.msdn.com/cbrumme/archive/2003/10/01/51524.aspx

    I haven't read it yet either, but a team member I work with told me to read it, so…

  40. I’ve seen a couple of posts concerning the evils that VB.NET has concerning the following: Try Catch

  41. In-depth Articles - Matt Pietrek on the internals of SEH. Matt Pietrek on Vectored Exception Handling

  42. Since I started monitoring traffic on this blog a little more closely about a week ago, I had the unexpected

  43. CoqBlog says:

    If you use a debugger like windbg on managed applications, you may already have seen

  44. This may seem like a preposterous statement, but unfortunately it’s all too common. In my work I go through

  45. Introduction This was originally intended to be a post on identifying and troubleshooting Exceptions

  47. Rick Byers says:

    Often, when an unexpected exception occurs in production code, applications want to generate (and potentially

  48. Three Reasons Not to use Exceptions for Model Validation

  49. System.Threading.ThreadAbortException is just plain weird. For instance, most exceptions happen because