The Alpha AXP, part 15: Variadic functions

As noted in the initial plunge, the first six integer parameters are passed in registers, and the first six floating point parameters are passed in a different set of registers. So how does the callee know at function entry which registers to spill, and in what order?¹

Answer: It doesn't. So it just spills everything.

First, a detail on the calling convention: The first six parameters are passed in registers, and if you pass a parameter in an integer register, then the corresponding floating point register is unused, and vice versa. In other words:

• The first parameter is passed in either a0 or f16.
• The second parameter is passed in either a1 or f17.
• The sixth parameter is passed in either a5 or f21.
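To make the pairing concrete, here is a minimal sketch that maps a parameter position and type class to the register named in the list above. The `param_class` enum and the `param_register` helper are hypothetical names invented for illustration; they are not part of any compiler or SDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical classification of a parameter, for illustration only. */
typedef enum { PARAM_INTEGER, PARAM_FLOATING } param_class;

/* Writes the register name for a 1-based parameter position into buf.
   Returns 1 if the parameter travels in a register, 0 if it is
   passed on the stack (position 7 and beyond). */
int param_register(int position, param_class cls, char *buf, size_t size)
{
    assert(buf != NULL && size >= 4);
    if (position < 1 || position > 6) {
        return 0;
    }
    if (cls == PARAM_INTEGER) {
        snprintf(buf, size, "a%d", position - 1);   /* a0 .. a5 */
    } else {
        snprintf(buf, size, "f%d", 15 + position);  /* f16 .. f21 */
    }
    return 1;
}
```

Note that the position alone picks the slot. The type only decides which of the two register files is used for that slot.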

On entry to a variadic function, the function first spills all the integer parameter registers onto the stack, and then spills the floating point parameter registers. The result is a stack that looks like this:

| Stack contents | Stack pointer |
|---|---|
| ⋮ | |
| param 10 | |
| param 9 | |
| param 8 | |
| param 7 | ← stack pointer on function entry |
| integer param 6 (a5) | |
| integer param 5 (a4) | |
| integer param 4 (a3) | |
| integer param 3 (a2) | |
| integer param 2 (a1) | |
| integer param 1 (a0) | |
| floating point param 6 (f21) | |
| floating point param 5 (f20) | |
| floating point param 4 (f19) | |
| floating point param 3 (f18) | |
| floating point param 2 (f17) | |
| floating point param 1 (f16) | ← stack pointer after spilling |
| local variable | |
| local variable | |
| local variable | |
| local variable | ← stack pointer after prologue complete |
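One way to picture the spilled region is as a C structure. This is only a sketch: the `spilled_params` type and the `reach_back_distance` helper are made-up names, each slot is assumed to be eight bytes, and the count of four stack parameters is arbitrary. The structure lists members from the lowest address upward, so it reads bottom-to-top relative to the diagram.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the spill area, lowest address first. */
typedef struct {
    uint64_t fp_home[6];      /* f16..f21: floating point params 1..6 */
    uint64_t int_home[6];     /* a0..a5: integer params 1..6 */
    uint64_t stack_params[4]; /* params 7..10; the real count varies per call */
} spilled_params;

/* Byte distance from the integer slot of a register parameter (1..6)
   back to the floating point slot for the same position. */
size_t reach_back_distance(int position)
{
    assert(position >= 1 && position <= 6);
    size_t index = (size_t)(position - 1);
    size_t int_slot = offsetof(spilled_params, int_home) + 8 * index;
    size_t fp_slot = offsetof(spilled_params, fp_home) + 8 * index;
    return int_slot - fp_slot;
}
```

The distance comes out the same for every position, because the two spill arrays are parallel and each holds six eight-byte slots.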

The `va_list` type is a structure:

```
typedef struct __va_list
{
    char* base;
    size_t offset;
} va_list;
```

The `va_start` macro initializes `base` to point to "integer param 1" and `offset` to 8 × the number of non-variadic parameters.

If you invoke the `va_arg` macro with a non-floating point type as the second parameter, then it operates in an unsurprising manner: It retrieves the data from `base + offset` and then increases the `offset` by the size of the data (rounded up to the nearest multiple of eight).

But invoking the `va_arg` macro with a floating point type as the second parameter is weirder: If the `offset` is less than 48, then it retrieves the data from `base + offset - 48`, resulting in a "reach-back" into the parallel array of spilled floating point registers. If the `offset` is greater than or equal to 48, then it retrieves the data from `base + offset` as usual. Regardless of where the data is read from, the `offset` increases by the size of the data (rounded up to the nearest multiple of eight).
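The behavior described above can be simulated in portable C. This is a sketch, not the actual macro implementation (which relied on compiler intrinsics): every `sim_` name is hypothetical, the frame is an ordinary byte buffer rather than a real stack, and only 8-byte integers and doubles are modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SLOT = 8, HOME_BYTES = 6 * SLOT };  /* 48 bytes per register bank */

/* Simulated frame: bytes [0,48) hold spilled f16..f21, bytes [48,96)
   hold spilled a0..a5, bytes [96,128) hold stack params 7..10. */
typedef struct { unsigned char bytes[2 * HOME_BYTES + 4 * SLOT]; } sim_frame;

typedef struct { char *base; size_t offset; } sim_va_list;

/* Place a parameter (1-based position) where the caller and the
   spilling prologue would have left it. */
void sim_put_int(sim_frame *f, size_t position, int64_t v)
{
    assert(position >= 1 && position <= 10);
    memcpy(f->bytes + HOME_BYTES + SLOT * (position - 1), &v, sizeof v);
}

void sim_put_double(sim_frame *f, size_t position, double v)
{
    assert(position >= 1 && position <= 10);
    size_t off = SLOT * (position - 1);
    if (position > 6) off += HOME_BYTES;  /* stack params share one area */
    memcpy(f->bytes + off, &v, sizeof v);
}

/* base points at "integer param 1"; offset skips the fixed parameters. */
void sim_va_start(sim_va_list *ap, sim_frame *f, size_t fixed_params)
{
    ap->base = (char *)f->bytes + HOME_BYTES;
    ap->offset = SLOT * fixed_params;
}

int64_t sim_va_arg_int(sim_va_list *ap)
{
    int64_t v;
    memcpy(&v, ap->base + ap->offset, sizeof v);
    ap->offset += SLOT;
    return v;
}

double sim_va_arg_double(sim_va_list *ap)
{
    double v;
    const char *src = ap->base + ap->offset;
    if (ap->offset < (size_t)HOME_BYTES) {
        src -= HOME_BYTES;  /* reach back into the floating point spill area */
    }
    memcpy(&v, src, sizeof v);
    ap->offset += SLOT;
    return v;
}
```

In this model, a double in position 2 is read from the floating point spill area, while a double in position 7 is read from the same place an integer in position 7 would be, because by then the `offset` has reached 48.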

The implementations of the `va_start` and `va_arg` macros took advantage of special-purpose compiler intrinsics that did a lot of the magic.

There are a few optimizations possible here. For one thing, the compiler doesn't need to spill non-variadic parameters, though it does need to reserve space for them on the stack so that the `va_arg` macro continues to work.² Furthermore, if the compiler can observe that `va_arg` is never invoked with a floating point type, then it doesn't need to spill the floating point registers at all. (Similarly, if `va_arg` is always invoked with floating point types, then the integer registers don't need to be spilled.)
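For illustration, here is the kind of function that would qualify for the second optimization: a standard C variadic function that only ever invokes `va_arg` with an integer type. The function name `sum_ints` is made up for this example, and nothing here says what any particular compiler actually did with such a function.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical example: every variadic argument is read as an int,
   so the spilled floating point registers would never be consulted. */
int sum_ints(int count, ...)
{
    assert(count >= 0);
    va_list ap;
    int total = 0;
    va_start(ap, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(ap, int);
    }
    va_end(ap);
    return total;
}
```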

I don't remember whether the Microsoft compiler actually implemented any of these optimizations.

¹ It turns out that this question is not Alpha-specific. It applies to any architecture that passes variadic parameters differently depending on their type.

² If the compiler can observe that `va_arg` is never invoked with a floating point type, then it doesn't even need to reserve space for the non-variadic parameters. It can just point the `base` at where the first integer parameter would have been, even though it now points into the local variables. Those local variables will never be read as parameters because the initial `offset` skips over them.
