Remember that in a stack trace, the addresses are return addresses, not call addresses

You may be faced with a stack trace like this:

00000000`001ebff0 00000000`ff6e2a94 ABC!CUIController::UpdateDisplay+0x156 [c:\src\abc\uicontroller.cpp @ 152]
00000000`001ec060 00000000`ff6e2f70 ABC!CUIController::displayEvent+0xea [c:\src\abc\uicontroller.cpp @ 930]
00000000`001ec090 00000000`ff6e2eef ABC!CEventRouter::fire+0x34 [c:\src\abc\eventrouter.cpp @ 998]
00000000`001ec0d0 00000000`ff6e3469 ABC!CEngineState::storeAndFire+0x126 [c:\src\abc\enginestate.cpp @ 653]
00000000`001ec110 00000000`ff6e4149 ABC!CEngine::SetDisplayText+0x39 [c:\src\abc\engine.cpp @ 749]

But when you go to look at, say, line 930 of the file uicontroller.cpp, you don't see a call to UpdateDisplay:

// line 930
    DoSomethingElseEntirely(GetCurrentWidget(), false);

Why is the debugger saying that the call to UpdateDisplay came from line 930 when there's no call to UpdateDisplay anywhere in sight?

Recall that the stack trace extracts information from the stack. And the thing on the stack is the return address; in other words, it's the instruction that will execute after the called function returns. It's the instruction after the call instruction.

This means that if the call instruction was the last instruction for a line of code, then the return address will point to the first instruction of the next line of code.

Note also that the previous line of code might not be the one that comes before it in the source code.

    if (some_condition) {
        UpdateDisplay(); // line 924
    } else {
        OtherFunction(); // line 926
    }

    // line 930
    DoSomethingElseEntirely(GetCurrentWidget(), false);

If the optimizer concludes (perhaps as a result of profiling feedback) that some_condition is nearly always true, it may decide to move the entire else clause out-of-line:

    cmp [some_condition], 0
    jz  rare_case
    call UpdateDisplay
line_930:
    call GetCurrentWidget
    push 0
    push eax
    call DoSomethingElseEntirely
    ...

rare_case:
    call OtherFunction
    jmp  line_930

To see what line of code issued the call, you need to look at the address of the call.

0:00> u 0xff6e2a94-5 L1
ABC!CUIController::displayEvent+0xe5 [c:\src\abc\uicontroller.cpp @ 924]
00000000`ff6e2a8f call CUIController::UpdateDisplay

Aha, it was line 924.

Now, the actual call instruction might not have been a direct call. It could have been a memory indirect call, or a call through a register, and those encodings have different lengths. I would just subtract 1 to land on the last byte of the previous instruction. It will disassemble as garbage, but that's okay, because all you want is the source line that the address maps to.

0:00> u 0xff6e2a94-1 L1
ABC!CUIController::displayEvent+0xe9 [c:\src\abc\uicontroller.cpp @ 924]
00000000`ff6e2a93 add al, ch
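
The reason the subtract-1 trick works can be sketched with a toy line table. This is only an illustration, not how any particular debugger stores its symbols: the LineEntry type, the LineForAddress and SampleTable functions, and the start addresses for lines 924 and 926 are all made up for the example. Only the return address ff6e2a94 and the call at ff6e2a8f come from the trace above.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <vector>

// One row of a hypothetical line table: the address at which the
// code generated for a source line begins.
struct LineEntry {
    std::uint64_t start;
    int line;
};

// Return the source line whose code contains the given address,
// or -1 if the address lies before the first entry.
// The table must be sorted by start address.
int LineForAddress(const std::vector<LineEntry>& table, std::uint64_t address)
{
    // Find the first entry that starts after the address,
    // then step back one entry to get the line that contains it.
    auto it = std::upper_bound(table.begin(), table.end(), address,
        [](std::uint64_t a, const LineEntry& e) { return a < e.start; });
    if (it == table.begin()) return -1;
    return std::prev(it)->line;
}

// Sample data. Only ff6e2a94 is taken from the stack trace;
// the other two start addresses are assumptions for illustration.
std::vector<LineEntry> SampleTable()
{
    return {
        { 0xff6e2a80, 924 }, // assumed start; the call at ff6e2a8f ends this line
        { 0xff6e2a94, 930 }, // the return address is the first byte of this line
        { 0xff6e2b40, 926 }, // assumed address of the out-of-line else clause
    };
}
```

With this table, LineForAddress(SampleTable(), 0xff6e2a94) reports line 930, while LineForAddress(SampleTable(), 0xff6e2a94 - 1) reports line 924. The lookup never needs to know where the call instruction starts. Any byte inside it will do.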

Bonus chatter: This reminds me of a quirk of the 6502 processor: When it pushed the return address onto the stack, it actually pushed the return address minus one. This is an artifact of the way the 6502 is implemented, but it results in the nice feature that the stack trace gives you the line number of the call instruction.

Of course, this is all hypothetical, because 6502 debuggers didn't have fancy features like stack traces or line numbers.
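
For the curious, here is a rough sketch of that quirk as a toy simulation. It models only JSR and RTS, and the Cpu6502 type and its member names are invented for the example. It is nowhere near a full emulator.

```cpp
#include <array>
#include <cstdint>

// Toy model of the 6502's JSR/RTS behavior. Hypothetical type,
// just enough state to show what lands on the stack.
struct Cpu6502 {
    std::array<std::uint8_t, 65536> memory{};
    std::uint16_t pc = 0;   // program counter
    std::uint8_t sp = 0xFF; // stack pointer, offset into page 1

    void Push(std::uint8_t value) { memory[0x0100 + sp] = value; --sp; }
    std::uint8_t Pull() { ++sp; return memory[0x0100 + sp]; }

    // JSR absolute: pc points at the opcode byte (0x20),
    // followed by the low and high bytes of the target.
    void Jsr() {
        std::uint16_t lo = memory[static_cast<std::uint16_t>(pc + 1)];
        std::uint16_t hi = memory[static_cast<std::uint16_t>(pc + 2)];
        // The pushed value is the address of the last byte of the JSR,
        // which is the true return address minus one.
        std::uint16_t pushed = static_cast<std::uint16_t>(pc + 2);
        Push(static_cast<std::uint8_t>(pushed >> 8));   // high byte first
        Push(static_cast<std::uint8_t>(pushed & 0xFF)); // then low byte
        pc = static_cast<std::uint16_t>((hi << 8) | lo);
    }

    // RTS: pull the saved address and add one to resume after the JSR.
    void Rts() {
        std::uint16_t lo = Pull();
        std::uint16_t hi = Pull();
        pc = static_cast<std::uint16_t>(((hi << 8) | lo) + 1);
    }

    // Read the most recently pushed 16-bit address without popping it.
    std::uint16_t PeekPushedAddress() const {
        std::uint16_t lo = memory[0x0100 + static_cast<std::uint8_t>(sp + 1)];
        std::uint16_t hi = memory[0x0100 + static_cast<std::uint8_t>(sp + 2)];
        return static_cast<std::uint16_t>((hi << 8) | lo);
    }
};
```

If you place a JSR $0700 at an arbitrarily chosen address like $0600, the value on the stack after Jsr() is $0602, which still lies inside the JSR instruction, and Rts() resumes at $0603.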

Comments (19)
  1. Pierre B. says:

    That’s the easy case. I understand that some of your posts are of an easier level but I have a hard time imagining a programmer proficient enough to use a debugger and yet not smart enough to see the preceding lines.

    The mysterious cases we encounter again and again involve either inlining or tail-call optimizations. In the first case, the call is invisible in the source because it is actually within another function that got inlined. In the second case, the calling function is not on the stack because it removed all traces of itself before calling as a tail optimization.

    1. Joshua says:

      If you think tail call is bad, try crashing in a trampoline.

  2. Antonio Rodríguez says:

    I was thinking of the -1 thing in the 6502 all the time while reading the article. I learned it the hard way while trying to implement a BRK instruction handler to implement virtual instructions, but that’s another story for another time :-) .

    Surely most programmers didn’t get those luxuries, because the most widely used 6502 debugger, the Apple II Monitor, didn’t have them. Wozniak managed to fit a basic OS with switchable I/O streams (remember typing 1 to direct Monitor’s output to the printer on slot 1 or 3 to activate the 80-column card on slot 3?), a basic toolbox, a debugger and a disassembler, all of it in just under 2 KB of ROM!

    1. Thomas Hate says:

      On the 6502 topic though: since most instructions are multi-byte, PC-1 is rarely the start of the line before. It’s much more often half or two-thirds of the way through it.

      This pedantic point, I could not contain.

      1. Antonio Rodríguez says:

        The 6502 only has one subroutine call instruction (JSR, opcode 0x20), only with absolute addressing mode (meaning that the operand is always a 16-bit address, two bytes wide), so you can safely assume that the address stored in the stack is the second byte of the call address, and that the previous instruction (the JSR) is two bytes before. In case you doubt if the call has been forged by, say, pushing a 16-bit address to the stack and then jumping to the subroutine, you can double check that the byte two bytes before the pointed address contains a 0x20.

        If you are writing an interrupt service routine, things can get more interesting, because, obviously, the program can be interrupted at any point (the code interrupted may not even be in your own code, but in the OS or the firmware). But in that context, stack dumps don’t make so much sense.

        1. smf says:

          “The 6502 only has one subroutine call instruction (JSR, opcode 0x20)”

          Officially, but……


          It’s impossible to find the calling instruction on any cpu.

          1. Stephen says:

            This is a synthetic JMP rather than a JSR.

          2. Simon Farnsworth says:

            There’s no way to save PC without using JSR – while you could JSR to a synthetic JMP like the one you’ve described, you’d still know where the calling instruction was.

            You could work around this by pushing the intended return address and JMPing to a subroutine, if you were obfuscating things.

          3. Stephen says:

            An IRQ also saves it. The handler could then do some stack cleanup followed by a CLI and a JMP.

          4. Simon Farnsworth says:

            To abuse an IRQ needs external hardware that supports generating an IRQ on demand, and is thus a system property, not a 6502 property.

            Not that that would stop people writing “copy protection” systems for games and the like…

  3. voo says:

    This seems pretty obvious, but the same applies also when using continuation passing style and similar trickeries in which case the return address usually has nothing to do with what called you.

    One of the most prominent examples of this is C#’s async implementation which can lead to lots of confusion among programmers during debugging sessions. Admittedly it does make debugging more complicated since some valuable information is lost.

    1. Alex Cohn says:

      Lambdas and shared_ptr make it great fun in C++, too.

  4. laonianren says:

    Funnily enough, in the last month I’ve just been looking at adding fancy features to a 6502 debugger. It would be fun, but I don’t do enough 6502 debugging to really justify the effort. Though every time I do, I wish the debugger was better.

    1. Antonio Rodríguez says:

      These days, the debuggers built into emulators are wonderful. They allow code and data breakpoints, memory and register examination and change, and step-by-step interactive execution. Many emulators have snapshot capabilities, which let you capture a known state and go back to it as many times as needed. All those features are nearly impossible to implement in a native debugger, running in the same machine.

      We live in wonderful times!

  5. Neil says:

    If having a return address minus one makes 6502 stack traces better, why can’t x86 debuggers subtract one from return addresses before calculating the source line number?

    1. poizan42 says:

      The length of call instructions varies wildly on x86, so it would have to use heuristics to try and guess where the previous instruction started.

    2. @Neil: Yeah, that would be nice, but they would also have to rename the column from “Return address” to “Approximate caller address”.

      @poizan42: The debugger doesn’t need to find the exact call instruction. It just needs to find something that is part of the call instruction, so it can convert it to a symbol better.

      1. poizan42 says:

        That’s assuming that symbols are available. I was thinking a bit about that, it would be inconsistent to show call address when symbols are available and return address when they are not. Maybe a better solution would be to add another column and show both, a heuristic probably wouldn’t be so bad then either.

  6. Tony Konzel says:

    In the course of working with 6502 code, I started to see this little quirk used to implement a cheap jump table. Get low byte from a table using an index register, push it, get high byte with the same register, push it, RTS. Just had to be careful that the addresses in the table were all the desired address – 1.

