The MIPS R4000, part 4: Constants


Since the MIPS R4000 has a fixed 32-bit instruction size, it cannot have a generalized "load 32-bit immediate constant" instruction. (There would be no room in the instruction for the opcode!)

If you look at the integer calculations available, you see that there are some ways of generating constants in a single instruction.

Constants in the range 0x00000000 to 0x0000FFFF can be generated in one instruction by using ORI, which treats its 16-bit immediate as an unsigned value.

    ORI     rd, zero, imm16

Constants in the range 0xFFFF8000 to 0xFFFFFFFF can be generated with the ADDIU instruction, which treats its 16-bit immediate as a signed value.

    ADDIU   rd, zero, imm16
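
To make the two extension rules concrete, here is a small Python model of what each instruction leaves in the destination register when the source register is zero. This is only a sketch for checking the ranges above; the function names are made up for illustration and are not part of any assembler or toolchain.

```python
MASK32 = 0xFFFFFFFF

def ori_from_zero(imm16):
    """Model ORI rd, zero, imm16: the immediate is zero-extended."""
    assert 0 <= imm16 <= 0xFFFF
    return imm16

def addiu_from_zero(imm16):
    """Model ADDIU rd, zero, imm16: the immediate is sign-extended."""
    assert 0 <= imm16 <= 0xFFFF
    signed = imm16 - 0x10000 if imm16 & 0x8000 else imm16
    return signed & MASK32

print(hex(ori_from_zero(0xFFFF)))    # 0xffff
print(hex(addiu_from_zero(0xFFFF)))  # 0xffffffff
print(hex(addiu_from_zero(0x8000)))  # 0xffff8000
```

The same 16 bits of immediate produce different 32-bit results depending on which instruction interprets them.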

If we had a NORI instruction, then we could have used it to generate constants in the range 0xFFFF0000 to 0xFFFFFFFF:

    NORI    rd, zero, imm16

But alas that instruction doesn't exist.

To build 32-bit values that cannot be created with these one-instruction tricks, you can use the LUI instruction, which means "load upper immediate".

    LUI     rd, imm16           ; rd = imm16 << 16

It loads the 16-bit immediate value into the upper 16 bits of the destination register and zeroes out the bottom 16 bits. You can then follow this up with an ORI to finish the job:

    LUI     rd, XXXX            ; rd = XXXX0000
    ORI     rd, rd, YYYY        ; rd = XXXXYYYY
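
The arithmetic behind the XXXX/YYYY split can be sketched in Python. The helper names here (split_constant, lui, ori) are hypothetical and exist only to mirror the two instructions above.

```python
MASK32 = 0xFFFFFFFF

def split_constant(value):
    """Split a 32-bit value into its upper (XXXX) and lower (YYYY) halves."""
    return (value >> 16) & 0xFFFF, value & 0xFFFF

def lui(imm16):
    """Model LUI rd, imm16: rd = imm16 << 16, bottom 16 bits zeroed."""
    return (imm16 << 16) & MASK32

def ori(rs, imm16):
    """Model ORI rd, rs, imm16: OR in the zero-extended immediate."""
    return (rs | imm16) & MASK32

xxxx, yyyy = split_constant(0x12345678)
rd = lui(xxxx)       # rd = 0x12340000
rd = ori(rd, yyyy)   # rd = 0x12345678
print(hex(rd))       # 0x12345678
```

Because ORI zero-extends its immediate, the two halves are independent of each other. If the second instruction were a sign-extending ADDIU, the upper half would have to be adjusted whenever bit 15 of the lower half is set.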

There is a data dependency here, and you might expect a pipeline bubble because the ORI depends on the result of the previous instruction, which won't be available until the write-back stage four cycles later. However, the processor supports integer arithmetic forwarding: The result of an arithmetic operation produced in the execute stage can be fed directly to the execute stage of the next instruction, thereby avoiding a stall.

Since the constant is loaded up 16 bits at a time, when a module needs to be relocated, moving it by a multiple of 64KB permits the fixup to be applied only to the XXXX part, leaving the YYYY part alone. (Previous discussion.) This is a very useful property, because in practice, these two instructions may not be adjacent to each other. The compiler might choose to interleave other instructions between them.
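
Here is a rough Python illustration of why a 64KB-aligned move touches only the upper half. The fixup_upper function is a hypothetical stand-in for what a loader would do; it is not taken from any real loader.

```python
def fixup_upper(xxxx, delta):
    """Apply a relocation delta to the LUI immediate only.

    Assumes delta is a multiple of 64KB, so the low 16 bits
    of the address (the YYYY part) cannot change.
    """
    if delta % 0x10000 != 0:
        raise ValueError("delta must be a multiple of 64KB")
    return (xxxx + (delta >> 16)) & 0xFFFF

xxxx, yyyy = 0x1001, 0x2345          # original address 0x10012345
delta = 3 * 0x10000                  # module moved up by 192KB

new_xxxx = fixup_upper(xxxx, delta)
relocated = (new_xxxx << 16) | yyyy  # YYYY is reused untouched

print(hex(new_xxxx))   # 0x1004
print(hex(relocated))  # 0x10042345
```

The loader never needs to locate the ORI at all, which is what makes it harmless for the two instructions to be separated.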

There are a few pseudo-instructions provided by the assembler for loading 32-bit constants.

    LI      rd, imm32           ; rd = imm32 (by whatever means)
    LA      rd, global_variable ; rd = address_of global_variable

The LI pseudo-instruction loads a 32-bit immediate into rd using a single-instruction trick if available; otherwise, it uses the two-instruction sequence.
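
As a sketch of that decision, the following Python function picks an expansion using only the rules described in this article. The function name expand_li and the register name t0 are illustrative assumptions; a real assembler may make different choices, such as using ADDIU for small positive values.

```python
def expand_li(rd, imm32):
    """Return the instructions a hypothetical assembler might emit for LI."""
    imm32 &= 0xFFFFFFFF
    if imm32 <= 0x0000FFFF:
        # Fits in a zero-extended 16-bit immediate.
        return [f"ORI     {rd}, zero, 0x{imm32:04X}"]
    if imm32 >= 0xFFFF8000:
        # Fits in a sign-extended 16-bit immediate (shown as a negative number).
        return [f"ADDIU   {rd}, zero, {imm32 - 0x100000000}"]
    # Otherwise fall back to the two-instruction sequence.
    hi, lo = imm32 >> 16, imm32 & 0xFFFF
    return [f"LUI     {rd}, 0x{hi:04X}",
            f"ORI     {rd}, {rd}, 0x{lo:04X}"]

for value in (0x1234, 0xFFFFFFFF, 0x12345678):
    print("\n".join(expand_li("t0", value)))
```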

The LA pseudo-instruction does the same thing, but the 32-bit value comes from the address of a global variable and is consequently subject to a relocation fixup.

Next time, we'll look at aligned memory access.

Comments (11)
  1. Medinoc says:

    At least this looks easier than on Alpha AXP.

    1. kantos says:

      Given just how often constants are used in code you’d assume RISC ISAs would make loading them easier. I suppose you could have a constants page for each process/module that you just always keep loaded in a register. But that has cache issues, not to mention requiring an extra save/restore on any external function for a module.

      1. Justin says:

        How would you propose they do it? With fixed length instructions it’s impossible to load a full word (or quad, for 64-bit archs) in a single instruction. On PowerPC there are similar instructions to MIPS — addi/addis ori/oris which are used to load constants, and if there are a bunch of constants needed, the compiler will often generate a PC-relative load, using the construct:

        bl . + 4
        mflr rN

        … load all constants via lwz or ld, relative to rN

      2. Antonio Rodríguez says:

        Most constants actually used in code are 16-bit or less, so they can be loaded in a single instruction.

      3. Voo says:

        As Antonio says, most constants fit into 16 bits (and by far the most used constant, 0, is as easy to use as possible), and if you really need a larger constant in a function you can just put it before the code, which means it will be almost guaranteed to be in the L2 cache (that’s how some compilers handle this for AArch64 in any case).

      4. Evan says:

        > I suppose you could have a constants page for each process/module that you just always keep loaded in a register.

        The usual way of doing this on ARM at least (or the usual way before movt and movw instructions became available; movt (“mov top”) loads the upper 16 bits of a register from an imm16, and movw loads the lower 16 bits from an imm16) is to have constant pools after each function, or for huge functions, in the middle of one. You can load from a location within some distance of the current program counter. It’s sorta like your idea, except that you don’t need to burn a register to remember that page — you just use the program counter for that register.

        (It does mean that you have non-executable data mixed in with your executable data, which could in theory have security implications. I suspect that’s not much of a concern in practice, but I can’t say that for sure.)

        1. Kevin says:

          > It does mean that you have non-executable data mixed in with your executable data, which could in theory have security implications. I suspect that’s not much of a concern in practice, but I can’t say that for sure.

          It would interfere with DEP, except that to the best of my knowledge, this architecture never had DEP in the first place. If you really had to do this on a DEP-aware system and architecture, you could probably leave the page marked executable, and presumably unmarked as writable (These are constants, right? Did you trick FORTRAN 77 into changing the value of 3 or something?).

          1. Voo says:

            While the API distinguishes between PAGE_EXECUTE and PAGE_EXECUTE_READ, in practice no architecture I know of supports such a thing.
            You’d need a strict Harvard processor for the distinction to make sense.

            Having a constant pool doesn’t harm DEP in any way. What it might do is allow being used in spectre and meltdown attacks, but that’s way too complicated for me to comment on.

  2. Evan says:

    > Having a constant pool doesn’t harm DEP in any way. What it might do is allow being used in spectre and meltdown attacks, but that’s way too complicated for me to comment on.

    What I was actually thinking of with my statement was that it would increase the pool of available options for ROP gadgets. My guess is it wouldn’t meaningfully affect things, but I could be wrong.

    1. mikeb says:

      It seems like an attacker would need two things:

      1) to discover constant data that turned out to consist of one or more useful instructions
      2) figure out a way to return (or jump) to those instructions in a useful way

      But since the code pages are already full of actual instructions, it’s not clear to me how having constant data in among those instructions increases the chance of an exploit in any significant way.

      Then again, I’m no expert on exploits – I’m the kind of guy that would submit a bug that turned out to involve “being on the other side of this airtight hatchway thing”.

      1. Evan says:

        > But since the code pages are already full of actual instructions, it’s not clear to me how having constant data in among those instructions increases the chance of an exploit in any significant way.

        I strongly suspect it’s not significant as well, but there could in theory be some bit pattern that is a useful instruction for a ROP program that does not otherwise appear. I did say that this is probably more of a theoretical weakness than a practical one. :-)

