Identifying the Candidate Function Set

Sorry it is taking me so long these days. I am in the throes of more formal writing – a book on our CLI binding for C++, and a series of articles for our Visual C++ MSDN website on STL.NET. And my translation tool is happily going through a formal test cycle – thank you Mitchell and Arjun – and so I've been fixing bugs and being a developer once again. So the blog gets bogged down.

I thought I would follow up on the issue of overloaded function resolution. And so this entry discusses how a candidate function list is built up. (I realize this is somewhat esoteric, but, well, I hope someone finds it worth a spin around the page.)

The first question, of course, is: what the heck is a candidate function list? Literally, a candidate function list is the set of functions sharing the same name that are visible at a call point. As we'll see, the complexity, when present, has to do with determining the actual set of visible functions.

It's always good to begin with a simple example – hopefully, if we can master this one, our confidence will surge, and we can march on to the question of namespace, qualified type names in the signature of a function, and the using declaration in general. Here is our example,

void f(); // (1)
void f( String^ ); // (2)
void f( const string& ); // (3)
void f( String^, array<String^>^ ); // (4)

int main( array<String^> ^args )
{
    if ( args == nullptr )
        handle_invalid_command_line();

    for each ( String^ s in args )
        f( s );
}

There are four candidate functions for the call of f() within main(). By inspection, we can see that only two of them take a single argument, so only the (2) and (3) instances can possibly match the actual invocation. (And (2), an exact match for a String^ argument, is the best viable function.)
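
The same narrowing, from candidates to viable functions to the best viable function, can be watched in plain standard C++. What follows is only a rough analogue, not the C++/CLI example itself: int and double stand in for the two string types, and each hypothetical f() returns a tag naming itself so the selected instance is observable.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for declarations (1) through (4). Each returns a
// tag naming itself so that the selected overload can be observed.
std::string f()         { return "(1) f()"; }
std::string f(int)      { return "(2) f(int)"; }
std::string f(double)   { return "(3) f(double)"; }
std::string f(int, int) { return "(4) f(int, int)"; }

// Candidates: all four, since all are named f and visible here.
// Viable for a single int argument: (2) and (3).
// Best viable: (2), an exact match; (3) needs an int-to-double conversion.
std::string call_with_one_int() { return f(42); }

// The assertion documents which instance the call selects.
void demo_candidates() {
    assert(call_with_one_int() == "(2) f(int)");
}
```

Calling demo_candidates() from main() runs the check.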

So that was simple. Things start getting somewhat more complicated if the type of a function argument is declared within a namespace. Let's first look at a fully qualified name. In this case, the functions within the namespace that have the same name as the called function are added to the set of candidate functions. For example:

namespace CLITypes
{
    public ref class C {…};
    void takeC( C^ );
}

// …

void f( CLITypes::C^ cobj )
{
    // ok: calls CLITypes::takeC( C^ )
    takeC( cobj );
}

There is no takeC() function declared within the global scope in which f() is defined. There is, as well, no using declaration or using directive bringing in names from a namespace. So, at first glance, this appears to be an illegal invocation: the candidate set appears to be empty.

However, because the type of the argument is declared within the CLITypes namespace, the functions declared within that namespace are considered as well. That is, the full set of candidate functions under this circumstance is the union of the functions visible at the point of call and the functions declared within the namespaces of the argument types.
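
Standard C++ calls this rule argument-dependent lookup, and it can be tried without any CLI types. The sketch below is an assumption-laden analogue: cli_types, the plain struct C, and call_takeC() are hypothetical stand-ins for the ref class example above.

```cpp
#include <cassert>
#include <string>

namespace cli_types {   // hypothetical stand-in for CLITypes
    struct C {};        // a plain struct instead of a ref class
    std::string takeC(const C&) { return "cli_types::takeC"; }
}

// No takeC() is declared at global scope, and there is no using declaration
// or using directive. The unqualified call still compiles: the type of the
// argument is a member of cli_types, so that namespace is searched as well.
std::string call_takeC(const cli_types::C& cobj) {
    return takeC(cobj);
}

void demo_adl() {
    assert(call_takeC(cli_types::C{}) == "cli_types::takeC");
}
```

If the parameter were of a type declared outside cli_types, an int say, the same unqualified call would fail to compile, because the candidate set really would be empty.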

There are three general cases in which functions with the same name do not overload – at least currently. (There is some activity within the ECMA committee that hasn't jelled as yet one way or another – at least as far as I'm aware.)

1. A derived class function that reuses the name of a base class virtual function overrides rather than overloads the base class instance. Except when the new slot modifier is applied, the derived class instance must conform to the signature of the base class function. It substitutes for the base class instance within the derived class virtual table.

2. A derived class function that reuses the name of a non-virtual base class function hides rather than overloads the base class instance. The signatures of the base and derived class instance are not considered. This is because the overload candidate set for a function does not extend across scope boundaries. (This is the guy currently under siege, I believe.)

3. A function declared within a local block hides rather than overloads all named instances of that function within the enclosing scopes for the extent of that block. This is the most esoteric of the three cases, so let me provide a quick example,

String^ Marshall( int ); // (1)

String^ g()
{
    // these puppies hide global instance …
    String^ Marshall( double ); // (2)
    String^ Marshall( char* ); // (3)

    return Marshall( 1024 ); // resolves to (2)
}

In this example, the global instance of Marshall() is not visible within g(); the candidate functions are limited to the two declarations within g() itself. The char* instance is not a viable candidate function for an actual argument of 1024. This leaves us with instance (2), which matches the formal parameter of type double through a standard conversion, even though the global instance, if it were considered, would represent an exact match.
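
The second case in the list above, a derived class function hiding a non-virtual base class function, shows the same effect across class scopes. Here is a minimal sketch in standard C++ with native classes; Base, Hiding, and Overloading are hypothetical names, and the member using declaration shown is the standard C++ remedy, not a statement about what the CLI binding does for ref classes.

```cpp
#include <cassert>
#include <string>

struct Base {
    std::string marshall(int) const { return "Base::marshall(int)"; }
};

// The derived declaration hides Base::marshall. Signatures are not compared,
// because the candidate set does not extend across the class scopes.
struct Hiding : Base {
    std::string marshall(double) const { return "Hiding::marshall(double)"; }
};

// A using declaration brings the base instance into the derived scope, so
// both functions become candidates and overload resolution picks between them.
struct Overloading : Base {
    using Base::marshall;
    std::string marshall(double) const { return "Overloading::marshall(double)"; }
};

void demo_hiding() {
    // Only marshall(double) is a candidate: 1024 is converted to double.
    assert(Hiding{}.marshall(1024) == "Hiding::marshall(double)");
    // Both are candidates: marshall(int) is an exact match and wins.
    assert(Overloading{}.marshall(1024) == "Base::marshall(int)");
}
```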

The candidate functions also depend on which using declarations are visible at the call point. This is because a using declaration makes a name declared in a namespace visible in the scope in which the using declaration appears. For example,

namespace libs_R_us {
    int max( int, int );
    double max( double, double );
}

char max( char, char );

void func()
{
    // namespace functions not visible
    // the three calls resolve to global max( char, char )

    max( 4096, 8192 );
    max( 35.1, 35.9 );
    max( 'J', 'L' );
}

In this case, the only function visible is the function declared in global scope. It is therefore the only candidate function, and is the instance invoked by all three calls within func(). This results in a loss of information in both arithmetic invocations, since the int and double arguments are converted to char. We have two choices for correcting this, both of which make use of a using declaration to bring the namespace instances into view. The question is where we should place it.

One possibility is to place the using declaration in global scope. For example,

char max( char, char );
using libs_R_us::max; // using declaration

All three instances of max() are now visible within the global scope and are placed in the set of candidate functions. The three invocations are now each an exact match to a separate instance, as follows,

void func()
{
    max( 4096, 8192 ); // libs_R_us::max(int,int);
    max( 35.1, 35.9); // libs_R_us::max(double,double);
    max( 'J', 'L' ); // ::max( char, char );
}
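
Which instance each call selects, before and after the global using declaration, can be checked by asking the compiler for the return type of the call. This is a sketch in standard C++; the helper names only_global_visible() and all_three_visible() are hypothetical, and the trick relies on the three max() instances having three different return types.

```cpp
#include <cassert>
#include <type_traits>

namespace libs_R_us {
    int    max(int, int);
    double max(double, double);
}

char max(char, char);

// decltype does not evaluate its operand, so the declarations above need no
// definitions. At this point only ::max(char, char) is visible.
bool only_global_visible() {
    return std::is_same<decltype(max(4096, 8192)), char>::value
        && std::is_same<decltype(max(35.1, 35.9)), char>::value
        && std::is_same<decltype(max('J', 'L')), char>::value;
}

using libs_R_us::max;   // using declaration at global scope

// From here on, all three instances are candidates and each call is an
// exact match to a different one.
bool all_three_visible() {
    return std::is_same<decltype(max(4096, 8192)), int>::value
        && std::is_same<decltype(max(35.1, 35.9)), double>::value
        && std::is_same<decltype(max('J', 'L')), char>::value;
}

void demo_global_using() {
    assert(only_global_visible());
    assert(all_three_visible());
}
```

Note that only_global_visible() keeps resolving to the char instance even though the using declaration follows it in the same file: the candidate set is fixed by what is visible at the call point.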

Alternatively, we might choose to place the using declaration within the local scope of func(). Why would we do that? Primarily to limit the extent of the changes in our program due to the larger set of candidate functions. By adding to the candidate function set at global scope within an existing program, we are potentially changing the function invoked at each call point that does not involve an exact match. This may be a more invasive change than what we are ready to support. The alternative declaration looks as follows,

void func()
{
    // local using declaration
    using libs_R_us::max;

    // same function calls as above
}

Surprisingly, we now get a different set of candidate functions. This is because using declarations nest.[1] With the using declaration in local scope, the global function is now hidden. The only visible functions at the call points are the two declared within the namespace, and so our character comparison resolves to the namespace instance max(int,int) through a promotion of the two character arguments.
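
The hiding effect of the local using declaration can be confirmed the same way, by inspecting the return type of each call. Again this is a standard C++ sketch, and local_using_hides_global() is a hypothetical helper name.

```cpp
#include <cassert>
#include <type_traits>

namespace libs_R_us {
    int    max(int, int);
    double max(double, double);
}

char max(char, char);

bool local_using_hides_global() {
    using libs_R_us::max;   // local using declaration: hides ::max
    // decltype does not evaluate the calls; it only reports which
    // instance overload resolution selects.
    return std::is_same<decltype(max(4096, 8192)), int>::value
        && std::is_same<decltype(max(35.1, 35.9)), double>::value
        // char arguments are promoted to int: ::max(char, char) is not
        // in the candidate set at all.
        && std::is_same<decltype(max('J', 'L')), int>::value;
}

void demo_local_using() {
    assert(local_using_hides_global());
}
```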

There are two possible solutions to getting the three functions into the candidate set. We originally chose a nested using declaration in order to localize the inclusion of the functions to just the call points within func(). One solution, of course, is to move it back to global scope. But this opens the entire assembly to potential change. Alternatively, we can add the global instance to our nested set of declarations,

void func()
{
    // now we have all three in the candidate set
    using libs_R_us::max;
    extern char max( char, char );

    // same function calls as above
}
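
For comparison, here are two other local arrangements that, in standard C++, also leave all three instances in the candidate set. Neither is the form shown above: the first names the global instance with a second using declaration, and the second uses the using directive described in the footnote. The helper names are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

namespace libs_R_us {
    int    max(int, int);
    double max(double, double);
}

char max(char, char);

// Variation 1: a second local using declaration names the global instance.
bool two_using_declarations() {
    using libs_R_us::max;
    using ::max;
    return std::is_same<decltype(max(4096, 8192)), int>::value
        && std::is_same<decltype(max(35.1, 35.9)), double>::value
        && std::is_same<decltype(max('J', 'L')), char>::value;
}

// Variation 2: a local using directive. It introduces no declarations into
// the block, so nothing at global scope is hidden.
bool using_directive() {
    using namespace libs_R_us;
    return std::is_same<decltype(max(4096, 8192)), int>::value
        && std::is_same<decltype(max(35.1, 35.9)), double>::value
        && std::is_same<decltype(max('J', 'L')), char>::value;
}

void demo_three_candidates() {
    assert(two_using_declarations());
    assert(using_directive());
}
```

The trade-off with the directive is that it makes every member of libs_R_us visible within func(), not just max().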

The set of candidate functions is therefore the union of the functions visible at the point of the call (including the functions introduced by using declarations and using directives) and the functions declared in the namespaces associated with the types of the arguments. For example,

namespace basicLib {
    void print( String^ );
    void print( Object^ );
}

namespace matrixLib {
    public ref class Matrix { /* ... */ };
    void print( Matrix^ );
}

void display()
{
    using basicLib::print;
    matrixLib::Matrix ^mObj;

    print( mObj ); // matrixLib::print( Matrix^ )
    print( "literal" ); // basicLib::print( String^ )
    print( 1024 ); // basicLib::print( Object^ )
}

Which functions are the candidate functions for the call print(mObj)? The two basicLib functions introduced by the local using declaration are candidate functions because they are visible at the point of the call. Because the function call argument is of type matrixLib::Matrix^, the print() function declared within the namespace matrixLib is also a candidate function.
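
A standard C++ analogue of display() shows the union at work. The names below are hypothetical stand-ins: const char* and double replace String^ and Object^ (so the last call involves an int-to-double conversion rather than boxing), and each print() returns a tag instead of printing.

```cpp
#include <cassert>
#include <string>

namespace basic_lib {    // hypothetical stand-in for basicLib
    std::string print(const char*) { return "basic_lib::print(const char*)"; }
    std::string print(double)      { return "basic_lib::print(double)"; }
}

namespace matrix_lib {   // hypothetical stand-in for matrixLib
    struct Matrix {};
    std::string print(const Matrix&) { return "matrix_lib::print(const Matrix&)"; }
}

// Three candidates: the two basic_lib functions through the local using
// declaration, plus matrix_lib::print through the type of the argument.
std::string display_matrix() {
    using basic_lib::print;
    matrix_lib::Matrix m;
    return print(m);
}

// Built-in argument types have no associated namespace, so only the two
// basic_lib functions are candidates for these calls.
std::string display_literal() { using basic_lib::print; return print("literal"); }
std::string display_number()  { using basic_lib::print; return print(1024); }

void demo_union() {
    assert(display_matrix()  == "matrix_lib::print(const Matrix&)");
    assert(display_literal() == "basic_lib::print(const char*)");
    assert(display_number()  == "basic_lib::print(double)");
}
```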

Once the candidate functions are identified, the next step – which begins to involve type checking – is to determine the viable functions within the candidate set. That topic is a candidate for a subsequent blog.


[1] Using directives do not nest. That is, using namespace libs_R_us makes the namespace members visible as if they were declared outside the namespace, at the location where the namespace is defined. Therefore, whether the using directive is nested or global makes no difference to which functions are hidden.