Here are some of the integer computational operations available on the Alpha AXP. I'm going to cover only the instructions used in general-purpose programming, since that's the sort of thing you're most likely to encounter when debugging everyday application code. In particular, I'm not going to cover the various multimedia instructions.

Integer arithmetic operations come in two flavors, one that operates on the full 64-bit register, and another that operates on the least significant 32 bits of the value. As I noted earlier, the general rule in the Alpha AXP is that if the result of an operation is a 32-bit value and the destination is a register, then the value is sign-extended to a 64-bit value. This means that if you use the 32-bit versions of these instructions, the results will be sign-extended to 64-bit values.

The general notation for calculations is to provide the source operands first, and the destination operand last.

ADDQ Ra, Rb, Rc ; Rc = Ra + Rb ADDQ Ra, #b, Rc ; Rc = Ra + b ADDL Ra, Rb, Rc ; Rc = (int64_t)((int32_t)Ra + (int32_t)Rb) ADDL Ra, #b, Rc ; Rc = (int64_t)((int32_t)Ra + b)

The `ADD`

instruction has four variants.
The 64-bit versions add the two source values and puts the
result in the destination `Rc` register.
The 32-bit versions add the least significant 32-bit values
in the source registers,
calculates a 32-bit result,
and then sign extends that result to a 64-bit value,
putting the final result in the `Rc` register.
You can add two registers, or you can add a register and a small constant
in the range 0 to 255.

In the future, I'm going to write

to mean "__x__`L`

or `Q`

",
and `Rb/#b`

to mean "a register (`Rb`)
or a small constant in the range 0 to 255."

SUBxRa, Rb/#b, Rc ; Rc = Ra - Rb MULxRa, Rb/#b, Rc ; Rc = Ra * Rb UMULH Ra, Rb/#b, Rc ; Rc = (Ra *U Rb) >> 64

The `SUB`

instructions perform subtraction,
and the `MUL`

instructions perform multiplication.
The `UMULH`

instruction performs a 64×64
unsigned multiplication,
and stores the high 64 bits of the 128-bit intermediate result.
(If you want the low 64 bits, then use the regular `MULQ`

instruction.)

Note that there is no integer division operation. There are three common workarounds:

- Use a helper function.
- If dividing by a constant
`n`, you may be able to use the`UMULH`

instruction to multiply by (2⁶⁴÷`n`) and then extract the high 64 bits (which means to divide by 2⁶⁴). - Convert both values to floating point, perform a floating point division, and then convert the result back to an integer.

So hopefully you don't do a lot of integer division.

S4ADDxRa, Rb/#b, Rc ; Rc = Ra * 4 + Rb/#b S8ADDxRa, Rb/#b, Rc ; Rc = Ra * 8 + Rb/#b S4SUBxRa, Rb/#b, Rc ; Rc = Ra * 4 - Rb/#b S8SUBxRa, Rb/#b, Rc ; Rc = Ra * 8 - Rb/#b

The scaled addition and subtraction instructions multiply
`Ra` by 4 or 8 before adding or subtracting `Rb`/#b.
These are commonly used to calculate effective addresses
as part of an array indexing operation.

Next come the bit-twiddling instructions. Note that these instructions always operate on full 64-bit registers. (But if both inputs are in canonical form, then so too will the result.)

AND Ra, Rb/#b, Rc ; Rc = Ra & Rb/#b BIS Ra, Rb/#b, Rc ; Rc = Ra | Rb/#b "bit set" XOR Ra, Rb/#b, Rc ; Rc = Ra ^ Rb/#b BIC Ra, Rb/#b, Rc ; Rc = Ra & ~Rb/#b "bit clear" ORNOT Ra, Rb/#b, Rc ; Rc = Ra | ~Rb/#b EQV Ra, Rb/#b, Rc ; Rc = Ra ^ ~Rb/#b "bit equivalence"

Officially, the `C`

in `BIC`

stands for "complement",
but I find it easier to remember if I pretend that it stands for
"clear",
because it
clears the bits in `Ra`

as selected by `Rb/#b`

.
For example,

BIC t0, #3, t2 ; clear bottom two bits of t0

This takes the value in `t0`,
clears the bottom two bits (#3),
and puts the result into `t2`.

The `EQV`

and `ORNOT`

instructions are not
widely used, but I included them for completeness.

There are three bit-shifting instructions.

SLL Ra, Rb/#b, Rc ; Rc = Ra << (Rb/#b % 64) SRL Ra, Rb/#b, Rc ; Rc = (uint64_t)Ra >> (Rb/#b % 64) SRA Ra, Rb/#b, Rc ; Rc = ( int64_t)Ra >> (Rb/#b % 64)

The right-shift has two variants, depending on whether you want the shifted value to be zero-filled (unsigned, or logical shift) or sign-filled (signed, or arithmetic shift). Note that there are no 32-bit versions of the bit shifting instructions. They always operate on the full 64-bit register.

There are some rarely-used computation instructions that I'm not going to go into, like "count number of leading zero bits" and all the multimedia instructions. There are also some other computation instructions that are closely related to other functions of the processor, so I'll defer those to the appropriate section. Next time, we'll look at memory access, including the computation instructions tailored to support memory operations.

**Bonus chatter**:
There are a number of idioms that let you express other
concepts in terms of the instructions above.

BIS zero, zero, zero ; NOP (writes to zero are ignored) BIS zero, zero, Rc ; Set Rc to zero ADDL zero, #b, Rc ; Set Rc to a small constant SUBL zero, #b, Rc ; Set Rc to a small negative constant BIS Ra, Ra, Rc ; Copy Ra to Rc BIS zero, Ra, Rc ; Copy Ra to Rc BIS Ra, zero, Rc ; Copy Ra to Rc SUBxzero, Ra, Rc ; Rc = -Ra ORNOTxzero, Ra, Rc ; Rc = ~Ra ADDL zero, Rb, Rc ; Rc = (int64_t)(int32_t)Rb

Note that I gave three ways to copy one register to another. The first is the one recommended by DEC. The second is the one the Microsoft compiler generates. Windows NT requires that copying registers in function prologues and epilogues must be performed with one of the three given formats in order for the instruction to be unwindable.

I showed idioms for loading small positive and negative constants, but we'll see next time that there's something that works for medium-sized constants.

The last idiom is an important one because it forces a 32-bit value into canonical form. This is useful when there isn't a 32-bit version of the instruction you want, such as a shift instruction.

odd that the compiler generates a form that DEC didn’t recommend. I know they go out of their way to do that sort of optimization on x86

Integer division by a constant can be converted to a reciprocal multiplication and a correction step (usually a shift or add). Division by Invariant Integers using Multiplication is a widely cited paper on this topic.

Presumably that would be the part in the article where he says:

• If dividing by a constant n, you may be able to use the UMULH instruction to multiply by (2⁶⁴÷n) and then extract the high 64 bits (which means to divide by 2⁶⁴).

Ouch, the destination operand on the right!

If it weren’t for the fact I can’t get a modern HTTPS stack I would consider running NT4 on Alpha for a web server and laugh as memory corruption exploits that should yield SYSTEM don’t because their shellcode is for the wrong processor.

Those logical operations look rather like a bitwise-logic unit that can switch between AND, OR, and XOR, plus an optional inversion on the right operand. That plus the choice of which way round the operands are gives you all of the bitwise operations except nand and nor.

I like your observation so much I reordered the table to reflect it.

The instruction encoding supports this. The function codes (using Raymond’s ordering) are:

0x00 AND

0x20 BIS

0x40 XOR

0x08 BIC

0x28 ORNOT

0x48 EQV

I wonder that since all registers are 64 bit, will there be exception/interrupt thrown when you add 2 32-bit values with ADDL and result in carry/overflow.

If not, I think the “count number of leading zero bits” may in fact have some use.

How does the compiler produce 32-bit logical right shifts? If loading 32-bit value from memory will sign extend the value to 64-bit, the upper half might contain 1 bits. Then, using SRL would shift in 1s instead of zeroes.

To perform that, the compiler would need to zero out the upper half first. How does it do that optimally?

I can think only of a very inefficient way with the instructions presented so far (zero a register, shift logically left 32, then AND with the actual value).

Stay tuned!