The other day I was tasked with understanding and fixing a bug in error reporting for foreach iteration variables, and it got me thinking about local variable scoping rules in C# in general. First, the bug.
Consider the following code:
static void Main(string[] args)
{
    foreach (int myCollection in myCollection)
    {
        // Your code here
    }
}
This code should clearly not compile, because myCollection is used before it is declared. But it does! On VS2008 Beta2, this code currently compiles, and at run time it produces a TypeLoadException.
Well, first let us consider how a foreach statement is expanded. According to the C# language specification section 8.8.4, the foreach statement is expanded as follows:
E enumerator = (collection).GetEnumerator();
try {
    while (enumerator.MoveNext()) {
        ElementType element = (ElementType)enumerator.Current;
        statement;
    }
}
finally {
    IDisposable disposable = enumerator as System.IDisposable;
    if (disposable != null) disposable.Dispose();
}
Notice that the call to "collection.GetEnumerator" happens outside both the try scope and the while scope, but the declaration of the iteration variable "element" occurs inside the while scope. The bug was that when we bound the collection expression in order to look up its GetEnumerator method, we bound it inside the while scope instead of outside the try scope. As a result, when we asked the local symbol table to resolve the name "myCollection", it returned the iteration variable declared inside the while scope. The compiler therefore accepted the program instead of reporting an error. When one tries to run the program, however, the CLR detects the problem with the type and throws the TypeLoadException, as expected.
The fix was simply to move the binding of the collection outside the try scope, and correctly report that "myCollection" is not defined in the outer scope.
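To make the two scopes concrete, here is a minimal sketch of the expansion written out by hand. The class name and the sample data are my own placeholders, and the collection is given a name that differs from the iteration variable so that the program compiles. The point is only that the collection expression is bound in the outer scope, while "element" lives inside the loop body.

```csharp
using System;
using System.Collections.Generic;

class ForeachExpansionDemo
{
    static void Main()
    {
        // Hypothetical sample data, declared in the outer scope.
        List<int> myCollection = new List<int> { 1, 2, 3 };

        // The collection expression is bound here, outside the try and while scopes.
        IEnumerator<int> enumerator = myCollection.GetEnumerator();
        try
        {
            while (enumerator.MoveNext())
            {
                // The iteration variable exists only inside the while scope.
                int element = enumerator.Current;
                Console.WriteLine(element);
            }
        }
        finally
        {
            // IEnumerator<T> implements IDisposable, so no cast is needed here.
            enumerator.Dispose();
        }
    }
}
```

Renaming "element" to "myCollection" in this hand-written form is rejected by the compiler, which is the behavior the foreach statement should have had as well.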
That got me thinking about scoping rules in general. Where do we introduce local variables and scopes that may not be intuitive to the user? What exactly are our local variable scoping rules to begin with?
I'll deal with the latter (important!) question in a subsequent post.
In addition to variable declaration statements, the language provides four other mechanisms to declare local variables.
- Foreach iteration variables
- Lambda parameters/Anonymous method parameters
- Catch exception variables
- Using statement variables
For those interested, I'll briefly describe how the remaining three mechanisms declare their locals, and how they are scoped.
Lambda parameters work as one would expect - the parameters are declared as local variables inside the scope of the body of the lambda. From section 7.14.1 of the C# language specification:
The optional anonymous-function-signature of an anonymous function defines the names and optionally the types of the formal parameters for the anonymous function. The scope of the parameters of the anonymous function is the anonymous-function-body. (§3.7) Together with the parameter list (if given) the anonymous-method-body constitutes a declaration space (§3.3). It is thus a compile-time error for the name of a parameter of the anonymous function to match the name of a local variable, local constant or parameter whose scope includes the anonymous-method-expression or lambda-expression.
In essence, the last sentence in that statement says that you cannot declare a parameter of a lambda or anonymous method (I'll refer to the two simply as the lambda) which has the same name as any local variable in the scope of the lambda's declaration. Why? Because it would give a different meaning to the name inside the body of the lambda. I'll elaborate on this more in my next post.
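A small sketch of the rule, using names I made up for illustration. The offending line is left commented out so that the program compiles; it shows the rule as quoted above, and compilers for later versions of the language may treat it differently.

```csharp
using System;

class LambdaScopeDemo
{
    static void Main()
    {
        int count = 10;

        // 'x' is a local of the lambda body only.
        Func<int, int> addCount = x => x + count;

        // A second lambda may reuse 'x', because the two scopes do not overlap.
        Func<int, int> twice = x => x * 2;

        // Under the rule quoted above this would be rejected, because 'count'
        // is a local variable whose scope includes the lambda expression:
        // Func<int, int> bad = count => count + 1;

        Console.WriteLine(addCount(1)); // prints 11
        Console.WriteLine(twice(4));    // prints 8
    }
}
```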
Exception variables declared by a catch clause are scoped to the catch block:
When a catch clause specifies both a class-type and an identifier, an exception variable of the given name and type is declared. The exception variable corresponds to a local variable with a scope that extends over the catch block.
Any exception variables declared in the catch block must have a type that is System.Exception, is derived from System.Exception, or is a type parameter that has System.Exception (or a subclass thereof) as its effective base class.
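As a rough illustration (the class name and exception types are just examples I picked), each catch block gets its own scope, so the same identifier can be reused from one clause to the next and is not visible after the try statement.

```csharp
using System;

class CatchScopeDemo
{
    static void Main()
    {
        try
        {
            throw new InvalidOperationException("first");
        }
        catch (InvalidOperationException e)
        {
            // 'e' is in scope only inside this catch block.
            Console.WriteLine(e.Message);
        }
        catch (Exception e)
        {
            // A separate catch block has its own scope, so 'e' can be reused.
            Console.WriteLine(e.Message);
        }

        // 'e' is not in scope here; referring to it would be a compile-time error.
    }
}
```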
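Using statements work pretty much as expected as well. For a resource of a reference type, a statement of the form "using (ResourceType resource = expression) statement" is expanded as follows:
{
    ResourceType resource = expression;
    try {
        statement;
    }
    finally {
        if (resource != null) ((IDisposable)resource).Dispose();
    }
}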
Note that for using statements, the local variable that is declared for the resource acquisition is read-only. It is a compile-time error to attempt to modify any local variable declared in this manner. Note also that a using statement that acquires more than one resource is really syntactic sugar for nested using statements, and is bound as a series of nested try blocks.
Any resource acquisition variables must be of a type that implements System.IDisposable.
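Here is a short sketch of both points, with StringReader standing in as an arbitrary disposable resource and all names chosen only for illustration. The assignment is commented out so that the program compiles.

```csharp
using System;
using System.IO;

class UsingScopeDemo
{
    static void Main()
    {
        // Two resources acquired in a single using statement...
        using (StringReader first = new StringReader("a"),
                            second = new StringReader("b"))
        {
            Console.WriteLine(first.ReadLine() + second.ReadLine());

            // Would not compile: the resource variable is read-only.
            // first = null;
        }

        // ...behave like two nested using statements. The names can be
        // reused because the previous using statement's scope has ended.
        using (StringReader first = new StringReader("a"))
        using (StringReader second = new StringReader("b"))
        {
            Console.WriteLine(first.ReadLine() + second.ReadLine());
        }
    }
}
```

Both statements print "ab", and in both cases the two readers are disposed in reverse order of acquisition once the block exits.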