Why does the compiler generate a MOV EDI, EDI instruction at the beginning of functions?


I recently noticed that on the XP SP2 Beta I am running, function prologs look like this:


     MOV    EDI, EDI
     PUSH   EBP
     MOV    EBP, ESP


The PUSH EBP and MOV EBP, ESP instructions are the standard stack-frame setup, but what is the purpose of the MOV EDI, EDI instruction? It looks like a 2-byte NOP.


MOV EDI, EDI is indeed a 2-byte no-op, and it is there to enable hot-patching: applying a hot-fix to a function without a reboot, or even a restart of the running application. At runtime, the 2-byte no-op is replaced by a short jump to a long jump instruction, which in turn jumps to the hot-fix function. The first instruction must be a single 2-byte instruction so that, while the patch is being applied, the instruction pointer can never point into the middle of an instruction that is being overwritten.
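To make the two-step patch concrete, here is a minimal sketch that simulates it on a plain byte buffer rather than on a live process. The helper names (apply_hot_patch, follow_jumps) and the buffer layout are hypothetical and chosen only for illustration; in particular, it assumes the 5 bytes for the long jump sit immediately before the function entry, which the post does not state. The opcode encodings are the standard x86 ones: 8B FF for MOV EDI, EDI, EB rel8 for a short jump, and E9 rel32 for a long jump.

```python
import struct

MOV_EDI_EDI = b"\x8B\xFF"   # the 2-byte no-op at the function entry


def apply_hot_patch(image, func_off, hotfix_off):
    """Redirect the function at func_off to hotfix_off inside `image`.

    Assumption: the 5 bytes just before func_off are unused padding.
    """
    # Step 1: write the 5-byte long jump (E9 rel32) into the padding.
    # rel32 is relative to the end of the long jump, which is func_off.
    long_jmp_off = func_off - 5
    rel32 = hotfix_off - func_off
    image[long_jmp_off:func_off] = b"\xE9" + struct.pack("<i", rel32)

    # Step 2: swap the 2-byte no-op for a short jump back to the long jump.
    # rel8 is relative to the end of the short jump (func_off + 2),
    # so it is (func_off - 5) - (func_off + 2) = -7, encoded as 0xF9.
    if bytes(image[func_off:func_off + 2]) != MOV_EDI_EDI:
        raise ValueError("function does not start with MOV EDI, EDI")
    image[func_off:func_off + 2] = b"\xEB" + struct.pack("<b", -7)


def follow_jumps(image, off):
    """Follow EB/E9 jumps starting at off; return the first non-jump offset."""
    while True:
        opcode = image[off]
        if opcode == 0xEB:      # short jump: 2 bytes long
            off = off + 2 + struct.unpack_from("<b", image, off + 1)[0]
        elif opcode == 0xE9:    # long jump: 5 bytes long
            off = off + 5 + struct.unpack_from("<i", image, off + 1)[0]
        else:
            return off
```

Only step 2 touches bytes that a thread might be executing, and it replaces one whole 2-byte instruction with another. A real patcher would also need to make that 2-byte write atomic and change page protections, which this simulation leaves out.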


Comments (3)

  1. Ziv Caspi says:

    It’s interesting they’ve chosen this method for patching. It not only requires 2 extra bytes per entry, but it also requires you to put gaps every once in a while between methods for the 5-byte long jump (in theory, one gap per method, if you need to patch them all).

    It does have an advantage if you have extremely short functions (shorter than the 5 bytes required for a long jump) or in cases where functions share code, but these could have been taken care of by a simple modification of the compiler in any case.

    Do you know why they didn’t rely on in-place patching like what is done with Detours?

    (BTW — Welcome aboard, Ishai!)

  2. Ishai says:

    Detours changes a binary file offline. Hot patching is done on a running executable, and they want to guarantee that the instruction pointer does not point into the middle of the patched area.

    Using the Detours method on a live process would require suspending threads, making sure no thread's instruction pointer is pointing at the second, third, fourth, or fifth byte of a function that is being Detoured, and handling the case where one is.

    A Detour would also put a limitation on code generation (i.e., never jump to instructions in bytes 2-5).

    It seems possible, but more complicated than placing a gap between functions and ensuring a 2-byte first instruction.