Anatomy of STL vector: Module Size

If you need a dynamic array in C++, a widely used class is the vector template class in STL. There are even books recommending replacement of plain C++ array with STL vector.

 

This series is going to look at how STL vector is implemented and what is the cost associated with using STL vector.

 

Let’s start with an empty console program:

 

#include <stdio.h>

#include <tchar.h>

int _tmain(int argc, _TCHAR* argv[])

{

  int sum = 1;

   

    return sum;

}

When compiled in release mode (full optimization, favor small code, multi-threaded runtime), it generates an EXE with 49,152 bytes. Here is a break down of various sections in the EXE.

Code .text section 24,920 bytes

Read-only data .rdata section 6,892 bytes

Read-write data .data section 6,524 bytes

Resource .rsrc section 176 bytes

 

If we add a vector of integer, the code becomes:

 

#include <stdio.h>

#include <tchar.h>

#include <vector>

int _tmain(int argc, _TCHAR* argv[])

{

  int sum = 1;

   

    std::vector<int> intvector;

    intvector.push_back(3);

    sum += intvector[0];

    intvector.push_back(5);

    sum += intvector[1];

    intvector.push_back(7);

    sum += intvector[2];

   

    return sum;

}

 

When the introduction of STL, we have to enable C++ exception handling in our code (/EHsc compile option is recommended). To support vector, STL also brings other classes like STL string, allocator, etc. Now the EXE is 61,400 bytes with the following breakdown:

 

Code .text section 33,375 bytes

Read-only data .rdata section 10,450 bytes

Read-write data .data section 6,880 bytes

Resource .rsrc section 176 bytes

 

STL vector is defined as a template class, so adding new instances of it may duplicate code unless compiler/linker the new instance has the same characteristics of the old ones so that the compiler/linker can merge multiple instances together. The most important factor is the size of element you put into a vector. Let’s try adding a vector of difference size, 16-bit integer.

 

#include <stdio.h>

#include <tchar.h>

#include <vector>

int _tmain(int argc, _TCHAR* argv[])

{

  int sum = 1;

   

    std::vector<int> intvector;

    intvector.push_back(3);

    sum += intvector[0];

    intvector.push_back(5);

    sum += intvector[1];

    intvector.push_back(7);

    sum += intvector[2];

   

    std::vector<short> shortvector;

    shortvector.push_back(3);

    sum += shortvector[0];

    shortvector.push_back(5);

    sum += shortvector[1];

    shortvector.push_back(7);

    sum += shortvector[2];

    return sum;

}

 

When you compile this third version, the EXE generated is still 61,440 bytes. But two sections within it have actually grown. It’s just that sessions are normally allocated on 4 kilo-bytes blocks, so small growth may not be visible in final EXE size. Here is the new break-down.

 

Code .text section 34,366 bytes

Read-only data .rdata section 10,458 bytes

Read-write data .data section 6,880 bytes

Resource .rsrc section 176 bytes

 

So adding vector<short> adds 991 bytes to code section and 8 bytes to read-only data section. If you dig deep in PDB file generated, you will see the last version adds the following into the code section (total 769 bytes):

Tag Size Name

func 86 void vector<short>::push_back(short const &)

func 125 class _Vector_iterator<short> std::vector<short>::insert

(class std::_Vector_iterator<short>,short const &)

func 36 short * vector<short>::_Umove<short *>(short *,short *,short *)

func 415 void vector<short>::_Insert_n(_Vector_iterator<short>,unsigned int,short const &)

func 33 short * vector<short>::_Ufill(short *,unsigned int,short const &)

func 74 short * allocator<short>::allocate(unsigned int)

Constructor for the STL vector is inlined here, destructor is implemented as a function. But destructors for vector<int> and vector<short> are exactly the same in binary code, so they are merged into one. Push_back is implemented with a bunch of supporting functions, while accessing the vector itself is inlined. The rest of the increment in code section is due to added test case for vector<short>.

The 8-byte increment in read-only section is due to change in exception handling table for std::bad_alloc, caused by adding allocator<short>::allocate.

The complete list of functions for STL vector is quite long. Some of them may be implemented as out-of-line functions, some of them may be inlined. Some of them may be merged with another instantiation, some may not.

For the record, here is the complete list of functions for STL vector<short>:

void vector<short>(const class vector<short> &)

void vector<short>(unsigned int, const short &, const class allocator<short> &)

void vector<short>(unsigned int, const short &)

void vector<short>(unsigned int)

void vector<short>(const class allocator<short> &)

void vector<short>()

void _Construct_n(unsigned int, const short &)

void ~vector<short>()

class vector<short> & operator=(const class vector<short> &)

void reserve(unsigned int)

unsigned int capacity()

class _Vector_const_iterator<short> begin()

class _Vector_iterator<short> begin()

class _Vector_const_iterator<short> end()

class _Vector_iterator<short> end()

class reverse_iterator<_Vector_const_iterator<short> > rbegin()

class reverse_iterator<_Vector_iterator<short> > rbegin()

class reverse_iterator<_Vector_const_iterator<short> > rend()

class reverse_iterator<_Vector_iterator<short> > rend()

void resize(unsigned int, short)

void resize(unsigned int)

unsigned int size()

unsigned int max_size()

bool empty()

class allocator<short> get_allocator()

short & at(unsigned int)

const short & at(unsigned int)

short & operator[](unsigned int)

const short & operator[](unsigned int)

const short & front()

short & front()

const short & back()

short & back()

void push_back(const short &)

void pop_back()

void assign(unsigned int, const short &)

void insert(class _Vector_iterator<short>, unsigned int, const short &)

class _Vector_iterator<short> insert(class _Vector_iterator<short>, const short &)

class _Vector_iterator<short> erase(class _Vector_iterator<short>, class _Vector_iterator<short>)

class _Vector_iterator<short> erase(class _Vector_iterator<short>)

void clear()

void swap(class vector<short> &)

void _Assign_n(unsigned int, const short &)

bool _Buy(unsigned int)

void _Destroy(short *, short *)

void _Tidy()

void _Insert_n(class _Vector_iterator<short>, unsigned int, const short &)

short * _Ufill(short *, unsigned int, const short &)

static void _Xlen()

static void _Xran()

static void _Xinvarg()

void * __vecDelDtor(unsigned int)

Now we can have a rough ideal of the cost of using STL vector to module size, if you only use its push_back method and [] operator:

 

Using STL vector, if you just use push_back and [] only (N: number of unique instantiations):

Module Size Cost = Cost of dealing with C++ exception handling +

                   769 bytes * N + 7476 bytes in code section +

                   8 bytes * N + 3350 bytes in read-only data section +

                         356 bytes in read-write data section

 

Adding 11 kilobytes to get STL dymanic array support is not a big deal. But if your code is happily running without using C++ exception handling, or you're using lots of unique instantiations of STL vector template (at around 800 bytes each, if you just use push_back and []), then you should consider if the cost is too high for you.