Dynamic Initialization of variables

 

Hi, I'm Chaitanya Raje, a developer on the Compiler and Tools team for Windows Mobile and Windows Embedded CE. This is my first blog post on MSDN. Through these posts I hope to share some insights into new features and commonly encountered issues with the compilers and related tools.

I would like to start with a write-up on dynamic initialization of variables in C++. C++ (but not C) allows you to initialize global variables with non-constant initializers. For example:

Foo.cpp

#include <stdio.h>

int alpha(void)
{
    return 20;
}

int i = alpha(); // dynamic initialization

int main()
{
    printf("i = %d", i);
    return i;
}

According to the C and C++ standards, global variables should be initialized before main() is entered. In the above program, the variable 'i' should be initialized with the return value of the function alpha(). Since that value is not known until the program actually executes, this is called dynamic initialization of the variable.

The Problem:

Let us compile the above program and link with entrypoint ‘main’.

cl Foo.cpp /link /entry:main

Here’s your output when you run the exe –

i = 0

Surprised? We all expected the output to be “i = 20”. Let us try to understand why we got an unexpected output.

The Theory:

The global ‘i’ has a dynamic initializer, so it does not receive its intended value until initialization code runs at program startup. Because we linked the exe with ‘main’ as the entry point, execution began directly in ‘main()’, and the C Runtime's startup code, which normally runs such initializers, was bypassed. ‘alpha()’ was never invoked, so ‘i’ kept the zero value that a global holds before its dynamic initializer runs, hence the unexpected output.
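The printed 0 is consistent with globals being zero-filled before any dynamic initializer runs. The following is a minimal, portable sketch of that two-phase behavior, not the CRT's own code. All names in it (alpha_sketch, i_sketch, seen_before_init, check_two_phase_init) are hypothetical and only mirror Foo.cpp. It relies on initializers in one source file running in definition order; the C++ standard leaves the value seen by the early read unspecified, so the 0 checked here is what typical implementations produce rather than a guarantee.

```cpp
#include <cassert>

// Hypothetical names that mirror Foo.cpp; nothing here is a CRT symbol.
int alpha_sketch() { return 20; }

extern int i_sketch;  // declared here, defined further down

// First dynamic initializer in this file: it reads i_sketch before
// i_sketch's own initializer has run, so it typically sees the zero fill.
int seen_before_init = i_sketch;

// Second dynamic initializer: this is the one that stores 20.
int i_sketch = alpha_sketch();

// Call this from main() once startup has completed normally.
void check_two_phase_init() {
    assert(seen_before_init == 0);  // value observed before dynamic initialization
    assert(i_sketch == 20);         // value after the initializer ran
}
```

When the initializers are skipped altogether, as with /entry:main above, a variable like i_sketch simply stays at the first of these two values.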

Now the question is how do we invoke these dynamic initializers before ‘main()’ and still keep the entry point of our program as ‘main()’?

The Solution:

The answer lies in the C Runtime's startup routines. The C Runtime (CRT) defines a startup routine for each of the standard entry points, as follows:

Your entrypoint     CRT entrypoint
main                mainCRTStartup
wmain               wmainCRTStartup
WinMain             WinMainCRTStartup
wWinMain            wWinMainCRTStartup
DllMain             _DllMainCRTStartup

The CRT startup routines above are designed to first invoke the dynamic initializers in your program, which initializes the global variables, and then call the corresponding standard entry point. So, if your program uses dynamic initializers and you set the entry point explicitly while linking, set it to the CRT startup routine that corresponds to your real entry point in the table above. If you use a standard entry point instead of the CRT startup routine, the global variables that need dynamic initialization are left without their intended values.
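To make the difference between the two entry points concrete, here is a small portable model of the startup sequence. It is only a sketch under stated assumptions: sketch_raw_entry and sketch_crt_startup are hypothetical stand-ins for linking with /entry:main and /entry:mainCRTStartup, the initializer table is written by hand rather than generated by a compiler, and the real CRT startup code does considerably more than this.

```cpp
#include <cassert>

// Hypothetical model; none of these names are real CRT symbols.
using Initializer = void (*)();

static int sketch_i = 0;                        // stands in for the global 'i'
static int sketch_alpha() { return 20; }
static void init_sketch_i() { sketch_i = sketch_alpha(); }

// Stands in for the user's main(): it just reports the global.
static int sketch_user_main() { return sketch_i; }

// Models /entry:main: the user's entry point runs first, no initializers.
int sketch_raw_entry() {
    sketch_i = 0;                               // state of the global at load time
    return sketch_user_main();
}

// Models /entry:mainCRTStartup: run every initializer, then call main.
int sketch_crt_startup() {
    sketch_i = 0;                               // state of the global at load time
    const Initializer initializers[] = { init_sketch_i };
    for (Initializer init : initializers) {
        init();
    }
    return sketch_user_main();
}

// Convenience check that compares the two startup paths.
void compare_entry_points() {
    assert(sketch_raw_entry() == 0);
    assert(sketch_crt_startup() == 20);
}
```

The two functions differ only in whether the initializer list is walked before the user's entry point is called, which is the whole difference between the two link commands in this post.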

Now let’s compile and link Foo.cpp with CRT entrypoint –

cl Foo.cpp /link /entry:mainCRTStartup

Here’s your output as expected –

i = 20

NOTE: The above program will generate a compiler error if compiled as a C program (instead of C++), because the C language does not allow dynamic initializers.

Here are a few more examples of dynamic initializers:

1.

class B {
public:
    int i;

    B() {
        i = 10;
    }

    ~B() {}
};

B b; // requires a dynamic initializer to call the constructor B()

A global object is a classic example of dynamic initialization: the constructor of a global object needs to be invoked before we enter main().
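A quick way to see that the constructor really runs during startup is to have it leave a trace. The sketch below extends class B with a counter; g_constructor_calls and check_global_object are hypothetical additions for illustration and are not part of the original example.

```cpp
#include <cassert>

// Hypothetical counter. It has a constant initializer, so it is already
// set up before any constructor of a global object runs.
static int g_constructor_calls = 0;

class B {
public:
    int i;

    B() : i(10) { ++g_constructor_calls; }
    ~B() {}
};

B b;  // dynamic initializer: the startup code calls B::B() for this object

// Call this from main() once startup has completed normally.
void check_global_object() {
    assert(g_constructor_calls == 1);  // the constructor ran exactly once
    assert(b.i == 10);                 // and it ran before this check
}
```

By the reasoning in this post, if such a program were linked with /entry:main, the constructor would presumably be skipped and both checks would fail, with b.i still holding 0.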

2.

extern char ValueKnown[];

char* Name1 = ValueKnown; // statically initialized with &ValueKnown[0]

#if defined(__cplusplus)
    extern char* ValueUnknown;

    char* Name2 = ValueUnknown; // requires dynamic initializer
#endif

Although ValueKnown and ValueUnknown look very similar, there is a subtle difference between them. ValueKnown is a statically initialized array, so the expression ValueKnown denotes its address, and that address is guaranteed to be known when linking with the module that defines it (the array lives in that module's .data section). Name1 can therefore be initialized statically. ValueUnknown, on the other hand, is a char pointer variable whose value may or may not be known at compile time or when linking with the module that defines it: it could point to a constant string, or it could itself have a dynamic initializer in the defining module. The compiler therefore generates a dynamic initializer for Name2.
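One way to let the compiler confirm this distinction is constexpr, which only accepts initializers that can be resolved without running code. The sketch below assumes a C++11 or later compiler, which is newer than the toolchain this post was written for. It also places the definitions of ValueKnown and ValueUnknown in the same file purely so that it builds on its own; in the scenario above they would live in another module.

```cpp
#include <cassert>

extern char ValueKnown[];
extern char* ValueUnknown;

// Accepted: the address of an array with static storage is a constant.
constexpr char* Name1 = ValueKnown;

// Rejected if uncommented: this would have to read the value of a
// non-const variable, which is not a constant expression.
// constexpr char* Name2 = ValueUnknown;
char* Name2 = ValueUnknown;  // so this one needs a dynamic initializer

// Definitions, included here only to make the sketch self-contained.
char ValueKnown[] = "known";
char* ValueUnknown = ValueKnown;

// Call this from main() once startup has completed normally.
void check_names() {
    assert(Name1 == ValueKnown);
    assert(Name2 == ValueUnknown);
}
```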

More details:

Some of you might be curious to know how the CRT finds the dynamic initializers. The compiler sets things up for the CRT: every time it finds a dynamic initializer in your code, it adds an entry to a section named .CRT$XCU in your object file. This section is essentially a list of function pointers (including pointers to code that invokes class constructors), one for each dynamic initializer in your program. The CRT simply loops through this list and invokes each entry in turn.

The section name is .CRT, and XCU is the name of the group within it.

The CRT also defines two pointers:

- __xc_a in section .CRT$XCA

- __xc_z in section .CRT$XCZ

The linker then merges all .CRT groups into one section and orders them alphabetically by group name. This causes the pointers to be laid out as follows -

.CRT$XCA
            __xc_a
.CRT$XCU
            Pointer to Global Initializer 1
            Pointer to Global Initializer 2
.CRT$XCZ
            __xc_z

__xc_a and __xc_z thus mark the start and end of the dynamic initializer list, and the CRT can loop through the list at startup. Note that the order of initialization across modules is neither defined nor easily predictable.
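The merge-and-walk scheme can be imitated in portable C++ to see why the two boundary pointers are enough. This is a simulation only: merge_crt_groups plays the role of the linker by concatenating groups in alphabetical order of their names, run_initializers plays the role of the CRT loop, and null entries stand in for __xc_a and __xc_z. All function and variable names are hypothetical, and skipping null entries is an assumption about how such a loop would typically be written.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

using InitFn = void (*)();

static std::vector<int> g_call_order;  // records which initializers ran
static void init_one() { g_call_order.push_back(1); }
static void init_two() { g_call_order.push_back(2); }

// Simulated linker step: std::map iterates its keys in sorted order, so
// the groups come out as XCA, XCU, XCZ regardless of insertion order.
std::vector<InitFn> merge_crt_groups(
        const std::map<std::string, std::vector<InitFn> >& groups) {
    std::vector<InitFn> merged;
    for (const auto& group : groups) {
        merged.insert(merged.end(), group.second.begin(), group.second.end());
    }
    return merged;
}

// Simulated CRT step: call every non-null entry in [first, last).
void run_initializers(const InitFn* first, const InitFn* last) {
    for (; first != last; ++first) {
        if (*first != nullptr) {
            (*first)();
        }
    }
}

std::vector<int> demo_call_order() {
    g_call_order.clear();
    const InitFn sentinel = nullptr;                   // boundary marker
    std::map<std::string, std::vector<InitFn> > groups;
    groups["XCZ"].push_back(sentinel);                 // stands in for __xc_z
    groups["XCU"].push_back(init_one);
    groups["XCU"].push_back(init_two);
    groups["XCA"].push_back(sentinel);                 // stands in for __xc_a
    const std::vector<InitFn> table = merge_crt_groups(groups);
    assert(table.size() >= 2);                         // both markers present
    run_initializers(&table.front(), &table.back());
    return g_call_order;
}
```

Because the walk only needs a start and an end address, any number of object files can contribute entries to the middle group without the startup code knowing about them in advance.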

I hope this has given you some insight into the C Runtime's initialization mechanism, but the real point I wanted to convey in this post is: try to use the CRT entry points instead of the standard main/WinMain to avoid surprises in your output.

If you have any questions or comments regarding this topic, please let us know. We'll be more than happy to answer them! If you would like us to write about any particular topic related to compilers and related tools such as the linker or runtime libraries, we are open to recommendations.

Thanks.

Chaitanya Raje

on behalf of Windows Devices Compiler Team