The Anatomy of a Late Bound Expression

When I first arrived at Microsoft to work on the Visual Basic team, I had no idea what Late Binding was. My manager at the time explained it to me: "Late Binding is all about figuring out which methods to call while the program runs. It's complicated. You're going to work on something else." This spawned more questions. How is it done? What happens behind the scenes? What's involved? For a while, Late Binding remained a black box to me, but eventually I learned the answers to these questions.

 

The process of figuring out what methods to call and fields to use is called "member lookup and resolution", or "binding". Most of the time, the compiler binds method calls during compilation, a process we call Early Binding (well, some of us do). However, if your program uses types that aren't known during compilation, binding is deferred and instead performed while the program runs. This process is called Late Binding and is best described with an example:

 

    Class Queen
        Sub Move(ByRef x As String, ByVal y As String)
            ...
        End Sub

        Sub Replay(ByRef x As String, ByVal y As Date, ByVal z As Integer)
            ...
        End Sub
    End Class

    Class Knight
        Sub ResetPosition()
            ...
        End Sub
    End Class

    Sub Main()
        Dim a As String = ...
        Dim b As Integer = ...
        Dim x As Queen = ...
        Dim o As Object = ...

        x.Move(a, b)
        o.Move(a, b)
    End Sub

 

The expression x.Move(a, b) is a call to the method Move defined on Queen (determined from the type of variable x). The compiler, via the process of Early Binding, figures out that the text "Move" refers to method Move on type Queen, taking two String parameters. The compiler successfully analyzes this statement (i.e., binds the call) and moves to the next line.

 

The expression o.Move(a, b) is also a call to a method named Move, but on which type? The variable o is typed as Object and can hold anything, depending on the program's run-time behavior. Sometimes it might hold instances of Queen; sometimes instances of Knight (or anything else, for that matter). This is where Late Binding comes into action.

 

The compiler always tries Early Binding first. Move isn't a member of System.Object, so Early Binding fails. Normally, compilation stops here and a compile error results. However, because o is typed as Object and can therefore hold anything, the compiler defers binding and turns this expression into a Late Bound expression. The deferral is made by changing o.Move(a, b) into a call to a helper function. This helper lives in the Late Binder, which itself is found in Microsoft.VisualBasic.dll, and looks something like this:

 

    Public Sub LateCall( _
        ByVal Instance As Object, _
        ByVal MethodName As String, _
        ByVal Arguments() As Object, _
        ByVal ArgumentNames() As String, _
        ByVal CopyBack() As Boolean)

        'implementation
    End Sub

 

The corresponding VB code generated for o.Move(a, b) looks something like:

 

        Dim _args As Object() = {a, b} 'temporary variable
        Dim _copyback As Boolean() = {True, True} 'temporary variable
        Microsoft.VisualBasic.CompilerServices.LateCall(o, "Move", _args, Nothing, _copyback)

 

The method we're trying to call is Move. At compilation time we know only its name, so the MethodName parameter is the string "Move". The Instance parameter is o, the object on which we want to invoke Move. And the call has two arguments, a and b, so the compiler packages them into an array and passes them to the Arguments parameter. (For now, we will skip discussion of the ArgumentNames and CopyBack parameters.)

 

When the program runs and the line o.Move(a, b) is reached, the call into the Late Binder is made. Because the object sitting in o has a definite type, the Late Binder has all the information it needs to bind the call to Move. It performs the full member lookup and resolution process, a process which in many ways is analogous to the Early Binding done by the VB compiler. For example, o might be an object of type Queen, so the Late Binder figures out that the string "Move" refers to method Move on type Queen, taking two String parameters. Success! The method Queen.Move is invoked and the program continues running. However, o might be an object of type Knight. Knight has no method Move, so in this case, member lookup and resolution fails and the Late Binder throws a MissingMemberException. This is the equivalent of getting the compile error "'Move' is not a member of 'Knight'" were the variable o typed as Knight instead of Object.
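
Both outcomes can be observed in a small program. The following is a minimal sketch, assuming Option Strict Off (late binding does not compile otherwise). The body of Move is hypothetical and exists only to make the effect of the call visible. The classes are declared Public here purely to keep the sketch unambiguous.

```vb
Option Strict Off   'late-bound calls only compile with Option Strict Off

Imports System

Public Class Queen
    Sub Move(ByRef x As String, ByVal y As String)
        x = x & " -> " & y   'hypothetical body, makes the ByRef effect visible
    End Sub
End Class

Public Class Knight
    Sub ResetPosition()
    End Sub
End Class

Module LateBindingDemo
    Sub Main()
        Dim a As String = "d1"
        Dim b As Integer = 8
        Dim o As Object = New Queen()

        o.Move(a, b)              'resolved at run time against Queen
        Console.WriteLine(a)

        o = New Knight()
        Try
            o.Move(a, b)          'Knight has no Move, so resolution fails at run time
        Catch ex As MissingMemberException
            Console.WriteLine("Late binding failed: " & ex.Message)
        End Try
    End Sub
End Module
```

The same source line, o.Move(a, b), succeeds or fails depending only on what o holds when the line executes.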

 

Now for some of the details that my manager, back in the day, was alluding to: ByRef parameters cause trouble. Because a Late Bound expression such as o.Move(a, b) works with copies of a and b (remember the Arguments parameter), we have to take special care to copy the values back out if any of them match a ByRef parameter. This is done by using the CopyBack parameter and some conditional statements. The Late Binder uses the CopyBack parameter to communicate back to the call site which arguments ended up matching ByRef parameters. After the call to LateCall completes, the Boolean values stored in the CopyBack array are checked and, where True, the corresponding values are copied back out. This means that along with the helper call, a Late Bound expression also consists of several If..Then statements that check the CopyBack array:

 

        Dim _args As Object() = {a, b} 'temporary variable
        Dim _copyback As Boolean() = {True, True} 'temporary variable
        Microsoft.VisualBasic.CompilerServices.LateCall(o, "Move", _args, Nothing, _copyback)
        If _copyback(0) Then
            a = _args(0)
        End If
        If _copyback(1) Then
            b = _args(1)
        End If

 

Each True value in the CopyBack array means that the corresponding argument in the Arguments array matched a ByRef parameter and potentially changed during the call invocation. For example, since the zeroth parameter of Queen.Move is ByRef, the zeroth value in CopyBack will be set to True by the Late Binder, thus causing a to be assigned the new value.
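
To see why the copy-back step is needed at all, it helps to invoke a method through Reflection directly. The sketch below is only an illustration, not what the compiler emits. The body of Move and the module name are hypothetical. It relies on the documented behavior of MethodInfo.Invoke, which writes ByRef results into the argument array rather than into the caller's variables.

```vb
Imports System
Imports System.Reflection

Public Class Queen
    Sub Move(ByRef x As String, ByVal y As String)
        x = "moved to " & y   'hypothetical body
    End Sub
End Class

Module CopyBackDemo
    Sub Main()
        Dim a As String = "start"
        Dim args As Object() = {a, "e4"}   'the callee only ever sees these copies
        Dim moveMethod As MethodInfo = GetType(Queen).GetMethod("Move")

        moveMethod.Invoke(New Queen(), args)

        Console.WriteLine(a)         'still "start": the variable itself was never passed
        Console.WriteLine(args(0))   '"moved to e4": the ByRef result lives in the array

        'Manual copy-back, mirroring the If..Then checks at the call site.
        Dim parameters As ParameterInfo() = moveMethod.GetParameters()
        If parameters(0).ParameterType.IsByRef Then
            a = CStr(args(0))
        End If
        Console.WriteLine(a)         'now "moved to e4"
    End Sub
End Module
```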

 

But why initialize the CopyBack array with True values? Because there's no point checking and copying back values when the original argument isn't a storage location. To communicate this information to the Late Binder, the compiler initializes the CopyBack array with True values for each argument that comes from a variable, field, array, etc. If the argument isn't a storage location, such as a constant, function, ReadOnly property, etc., the compiler will specify False and omit the If..Then check corresponding to that argument:

 

        o.Move(10, b)

 

becomes:

 

        Dim _args As Object() = {10, b}
        Dim _copyback As Boolean() = {False, True}
        Microsoft.VisualBasic.CompilerServices.LateCall(o, "Move", _args, Nothing, _copyback)
        If _copyback(1) Then
            b = _args(1)
        End If

 

There are even more complications to consider when the user specifies named arguments. For example:

 

        o.Replay(GetA(), z:=TimeOfDay(), y:=GetB())

 

Because named arguments affect binding, the Late Binder needs to know which names were specified and for which arguments. This information is communicated using the ArgumentNames string array. For example, the ArgumentNames array for the expression above would be {"z", "y"}. VB has a rule that once a named argument is specified in the argument list, all subsequent arguments must be named as well. Naturally, this rule would force all named argument values to the end of the Arguments array. However, for simplicity, we would like the same index into the Arguments array and the ArgumentNames array to refer to a matched value-name pair. This means the compiler must move the named argument values to the beginning of the Arguments array, which would look like {TimeOfDay(), GetB(), GetA()}. Yet this tweak contains a hidden complication: order of evaluation. The argument list should be evaluated lexically, left to right, so GetA is invoked before TimeOfDay, which is itself invoked before GetB. If the Arguments array were initialized starting from the zeroth index, TimeOfDay and GetB would be invoked before GetA! This could cause serious trouble if these functions had side effects. By initializing the Arguments array in lexical order, the order of evaluation is preserved:

 

        Dim _args As Object() = New Object(2) {}
        _args(2) = GetA()
        _args(0) = TimeOfDay()
        _args(1) = GetB()
        Dim _argnames As String() = {"z", "y"}
        Dim _copyback As Boolean() = {True, False, False}
        Microsoft.VisualBasic.CompilerServices.LateCall(o, "Replay", _args, _argnames, _copyback)
        If _copyback(0) Then
            TimeOfDay() = _args(0)
        End If

 

(Note that TimeOfDay is a read/write property and can be changed if it matches a ByRef parameter, hence the value True in the CopyBack array.)
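
The effect of filling the array out of index order can be checked with functions that record when they run. This is a small sketch with assumed names: GetTime is a hypothetical stand-in for TimeOfDay, and the Calls list exists only to log the order of invocation.

```vb
Imports System
Imports System.Collections.Generic

Module EvaluationOrderDemo
    Private ReadOnly Calls As New List(Of String)

    Function GetA() As String
        Calls.Add("GetA")
        Return "a"
    End Function

    Function GetTime() As Date   'hypothetical stand-in for TimeOfDay
        Calls.Add("GetTime")
        Return Date.Today
    End Function

    Function GetB() As Integer
        Calls.Add("GetB")
        Return 1
    End Function

    Sub Main()
        'Models the argument array for o.Replay(GetA(), z:=GetTime(), y:=GetB()).
        Dim args As Object() = New Object(2) {}
        args(2) = GetA()      'positional value goes last in the array, but runs first
        args(0) = GetTime()
        args(1) = GetB()

        Console.WriteLine(String.Join(", ", Calls))   'prints: GetA, GetTime, GetB
    End Sub
End Module
```

The array ends up in the layout the Late Binder wants, while the functions still run in the order the programmer wrote them.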

 

So far, I have discussed only the simplest Late Binding scenario. Matters get more complicated in other scenarios, such as when the expression o.Move(a, b) appears on either side of an assignment:

 

        o.Move(a, b) = c

 

In scenarios such as these, c must be packaged as another parameter and the whole expression evaluated as a possible Property or Field assignment (with a potential intermediate Default Property access). In fact, each of the following forms represents a unique Late Binding scenario:

 

        o.Move(a, b)
        o.Move(a, b) = c
        c = o.Move(a, b)
        o(a, b) = c
        c = o(a, b)

 

Only the first statement corresponds to the LateCall helper we've been analyzing. The other scenarios each have their own helper with various additional arguments to control the unique semantic differences that arise.
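
All five forms can be tried against an ordinary framework type. The sketch below assumes Option Strict Off and uses a Dictionary typed as Object purely as a convenient target. The choice of type and the key "pawn" are illustrative and do not come from the examples above.

```vb
Option Strict Off

Imports System
Imports System.Collections.Generic

Module LateBoundForms
    Sub Main()
        Dim o As Object = New Dictionary(Of String, Integer)()
        Dim c As Object

        o.Item("pawn") = 1      'form: o.Member(args) = c  (property assignment)
        c = o.Item("pawn")      'form: c = o.Member(args)  (property or method read)
        o("pawn") = 2           'form: o(args) = c         (default property assignment)
        c = o("pawn")           'form: c = o(args)         (default property read)
        o.Clear()               'form: o.Member(args) used as a statement

        Console.WriteLine(c)
    End Sub
End Module
```

Each line compiles to a different helper call, because the compiler cannot know in advance whether a name refers to a method, a property, or a field.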

 

I want to briefly describe what the LateCall helper actually does. The following is a very rough implementation and demonstrates how the VB Late Binder interacts with System.Reflection:

 

    'Assumes Imports System and Imports System.Reflection
    Public Sub LateCall( _
        ByVal Instance As Object, _
        ByVal MethodName As String, _
        ByVal Arguments() As Object, _
        ByVal ArgumentNames() As String, _
        ByVal CopyBack() As Boolean)

        Dim T As Type = Instance.GetType
        Dim Members As Reflection.MemberInfo() = T.GetMember(MethodName)

        Dim Result As Reflection.MemberInfo = _
            PerformMemberResolution(Members, Arguments, ArgumentNames)

        Select Case Result.MemberType
            Case MemberTypes.Method
                Dim MethodResult As MethodInfo = DirectCast(Result, MethodInfo)

                MethodResult.Invoke(Instance, Arguments)

                For Each P As ParameterInfo In MethodResult.GetParameters
                    If P.ParameterType.IsByRef Then
                        'index_of_P is pseudocode: the index of the argument matched to P
                        CopyBack(index_of_P) = True
                    End If
                Next

            Case ...
        End Select
    End Sub

 

First, Reflection is used to fetch all the members matching MethodName into an array. This array, along with Arguments and ArgumentNames, is passed to a function PerformMemberResolution. This function is responsible for implementing all of Visual Basic's binding semantics, including name shadowing, method overload resolution, named argument matching, and various other checks. However, selecting the method is only half the work. Once a selection is made, the member must be invoked. In the case of methods, this is done via the Reflection.MethodInfo.Invoke member. Once execution of the method is complete, the CopyBack array is populated with the correct values and the Late Bound expression is complete.
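
For readers who want something they can run, here is a heavily simplified sketch of the same flow. It is not the real Late Binder: SimpleLateCall is a hypothetical name, named arguments are left out, and PerformMemberResolution is replaced by a naive match on name and argument count. It therefore ignores overload resolution, shadowing, and type conversions. The body of Move is also hypothetical.

```vb
Imports System
Imports System.Reflection

Public Class Queen
    Sub Move(ByRef x As String, ByVal y As String)
        x = x & " -> " & y   'hypothetical body
    End Sub
End Class

Module LateCallSketch
    'Hypothetical, heavily simplified helper; not the real Late Binder.
    Sub SimpleLateCall( _
        ByVal Instance As Object, _
        ByVal MethodName As String, _
        ByVal Arguments() As Object, _
        ByVal CopyBack() As Boolean)

        Dim T As Type = Instance.GetType()
        Dim Chosen As MethodInfo = Nothing

        'Naive resolution: first public method with a matching name and arity.
        For Each M As MethodInfo In T.GetMethods()
            If String.Equals(M.Name, MethodName, StringComparison.OrdinalIgnoreCase) _
               AndAlso M.GetParameters().Length = Arguments.Length Then
                Chosen = M
                Exit For
            End If
        Next

        If Chosen Is Nothing Then
            Throw New MissingMemberException(T.FullName, MethodName)
        End If

        'Invoke writes ByRef results back into the Arguments array.
        Chosen.Invoke(Instance, Arguments)

        'Keep True only where the caller allowed copy-back and the parameter is ByRef.
        For Each P As ParameterInfo In Chosen.GetParameters()
            CopyBack(P.Position) = CopyBack(P.Position) AndAlso P.ParameterType.IsByRef
        Next
    End Sub

    Sub Main()
        Dim a As String = "d1"
        Dim args As Object() = {a, "d8"}
        Dim copyBack As Boolean() = {True, False}   'second argument is a constant

        SimpleLateCall(New Queen(), "Move", args, copyBack)
        If copyBack(0) Then
            a = CStr(args(0))
        End If

        Console.WriteLine(a)   'prints: d1 -> d8
    End Sub
End Module
```

Using P.Position as the CopyBack index works here only because every argument is positional. Once named arguments are involved, the argument index and the parameter position no longer line up.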

 

I should have listened to my manager. :)