Pop quiz: Who wins in finally vs. return?

Question 1) What's the return value from this C# function:

 
        static int Test()
        {
            int val = 1;
            try
            {
                return val;
            }
            finally
            {
                val = 2;
            }
        }

Question 2) What about this (using a static instead of a local)?

 
        static int s_val;
        static int Test()
        {
            s_val = 1;
            try
            {
                return s_val;
            }
            finally
            {
                s_val = 2;
            }
        }

The real lesson here is: never write code like this!
Even if you have the C# spec memorized, chances are the next person who maintains this code won't, and it will just confuse everybody.

 

 

The answer...

The IL gives it away. Here's the IL for question #1:

 .method private hidebysig static int32  Test() cil managed
{
  // Code size       16 (0x10)
  .maxstack  1
  .locals init ([0] int32 val,
           [1] int32 CS$1$0000 // $ret
)
//000010:         {
  IL_0000:  nop
//000011:             int val = 1;
  IL_0001:  ldc.i4.1
  IL_0002:  stloc.0
//000012:             try
//000013:             {
  .try
  {
    IL_0003:  nop
//000014:                 return val;
    IL_0004:  ldloc.0 // val
    IL_0005:  stloc.1 // $ret
    IL_0006:  leave.s    IL_000d
//000015:             }
//000016:             finally
//000017:             {
  }  // end .try
  finally
  {
    IL_0008:  nop
//000018:                 val = 2;
    IL_0009:  ldc.i4.2
    IL_000a:  stloc.0 // val
//000019:             }
    IL_000b:  nop
    IL_000c:  endfinally
  }  // end handler
  IL_000d:  nop
//000020:         }
  IL_000e:  ldloc.1 // $ret
  IL_000f:  ret 
} // end of method Program::Test

Basically, the return value for a function is evaluated and cached at the time of the return statement.

The IL codegen demonstrates this: the local variable, val, is in IL local slot 0. The C# 'return' statement really just a) copies the return value to a hidden local variable in slot 1 ("CS$1$0000"), and b) jumps (via that leave.s instruction) to a real return instruction that's outside the finally block. The finally runs next and updates val in slot 0, but it leaves the cached return value in slot 1 untouched. So as far as the return value is concerned, the finally here is a giant nop. The function then returns using the cached return value.

So in question 1, the above C# snippet will return 1. 
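To make the IL easier to follow, here is a hand-written C# approximation of what the compiler emitted. It is only a sketch: the class name LoweringSketch, the method name TestLowered, and the local name ret (standing in for the hidden local CS$1$0000) are made up for illustration, and the real compiler output is the IL above, not this source.

```csharp
public static class LoweringSketch
{
    // The original method from question 1, unchanged.
    public static int Test()
    {
        int val = 1;
        try
        {
            return val;
        }
        finally
        {
            val = 2;
        }
    }

    // Hand-written approximation of the IL for Test().
    // 'ret' is a hypothetical name playing the role of CS$1$0000.
    public static int TestLowered()
    {
        int val = 1;
        int ret;
        try
        {
            ret = val;   // ldloc.0 / stloc.1: cache the return value
        }                // leave.s: exit the try, which runs the finally
        finally
        {
            val = 2;     // ldc.i4.2 / stloc.0: only slot 0 changes
        }
        return ret;      // ldloc.1 / ret: return the cached value
    }
}
```

Both LoweringSketch.Test() and LoweringSketch.TestLowered() return 1.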

Question 2 is actually the same as question #1. In both cases, the 'return' statement caches the return value into a hidden local. So in question #2, the function returns 1, but s_val is updated to 2 by the time the caller gets control back. That may seem a little confusing, but it's the same evil as using a post-increment operator like { return i++; }.
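A small sketch that puts the two side by side. The class name ReturnDemo, the method names, and the counter field s_i are made up for illustration; the post-increment example uses a static field rather than a local only so that the side effect can be observed from outside.

```csharp
public static class ReturnDemo
{
    public static int s_val;

    // Question 2, renamed so it can sit next to the post-increment example.
    public static int TestStatic()
    {
        s_val = 1;
        try
        {
            return s_val;   // the value 1 is cached for the return here
        }
        finally
        {
            s_val = 2;      // changes the static, not the cached return value
        }
    }

    // Hypothetical counter used to show the same effect with post-increment.
    public static int s_i;

    public static int PostIncrement()
    {
        return s_i++;       // returns the old value; the increment still happens
    }
}
```

Calling ReturnDemo.TestStatic() returns 1, and reading ReturnDemo.s_val afterwards gives 2. Likewise, with s_i set to 5, PostIncrement() returns 5 and leaves s_i at 6.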

My thoughts:
I think C#'s semantics are the only viable answer for a non-dynamic language. The only potential rival design I see would be having the return expression be evaluated after the finally (instead of at the time of the return statement). I think that sort of dynamic evaluation is very dangerous. E.g., imagine trying to do the codegen for:

 
        static int s_val1;
        static int s_val2;
        static int Test()
        {
            s_val1 = 1;
            try
            {
                if (Something())
                {
                    return s_val1;
                }
                else
                {
                    return s_val2;
                }
            }
            finally
            {
                s_val1 = 2;
            }
        }
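To get a feel for what the rival design would cost, here is a rough emulation of it in ordinary C#. This is NOT how C# behaves; it is a sketch under assumptions: the class name DeferredReturnSketch, the delegate local pending, and the trivial body of Something() are all made up for illustration. The point is that the method has to remember which return expression was chosen and evaluate it only after the finally has run.

```csharp
using System;

public static class DeferredReturnSketch
{
    public static int s_val1;
    public static int s_val2;

    // Hypothetical stand-in for the Something() call in the snippet above.
    public static bool Something()
    {
        return true;
    }

    // Emulates "evaluate the return expression after the finally".
    public static int TestDeferred()
    {
        s_val1 = 1;
        Func<int> pending;  // remembers which return expression was picked
        try
        {
            if (Something())
            {
                pending = () => s_val1;
            }
            else
            {
                pending = () => s_val2;
            }
        }
        finally
        {
            s_val1 = 2;
        }
        return pending();   // evaluated late, so it sees the finally's write
    }
}
```

With Something() returning true, TestDeferred() returns 2 rather than 1: the finally has effectively rewritten the return value, and the compiler would need this kind of bookkeeping for every return inside a try.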

I think C#'s semantics are the best because:

  1. If the finally were allowed to "intercept" the return value, that would make things very unpredictable. Put in some IL filters and things get really fun.
  2. It also gives simpler semantics that are easy to describe: "the return value is evaluated at the time of the return statement"
  3. It gives predictable codegen.
  4. It's more consistent across a variety of constructs.
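One caveat to rule 2, sketched below: what gets cached is the value of the return expression, and for a reference type that value is the reference. So a finally can't swap in a different object, but it can still mutate the object being returned. The class name ReferenceReturnDemo and the method Build are made up for illustration.

```csharp
using System.Collections.Generic;

public static class ReferenceReturnDemo
{
    public static List<int> Build()
    {
        var list = new List<int> { 1 };
        try
        {
            return list;              // the reference is cached here
        }
        finally
        {
            list.Add(2);              // mutates the object the cached reference points to
            list = new List<int>();   // reassigning the local does not affect the return
        }
    }
}
```

The caller of ReferenceReturnDemo.Build() gets the original list, which now contains 1 and 2.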

However, I wouldn't mind a few compiler warnings in this case. For example, a warning like "a local variable was modified in a finally block but nobody will see the effect" would be cool. Though hopefully, since nobody writes code like this in the first place, such a warning would be a low priority.

Anyway, if you're interested in exploring alternative codegen patterns, you can always edit the IL and try it yourself.