What happened in real-mode Windows when somebody did a longjmp into a discardable segment?


During the discussion of how real-mode Windows handled return addresses into discarded segments, Gabe wondered, "What happens when somebody does a long­jmp into a discardable segment?"

I'm going to assume that everybody knows how long­jmp traditionally works so I can go straight to the analysis.

The reason long­jmp is tricky is that it has to jump to a return address that isn't on the stack. (The return address was captured in the jmp_buf.) If the code segment containing that address got relocated or discarded, then the jump target is no longer valid. Had the return address been on the stack, it would have been patched to a return thunk, but the stack walker never looks inside a jmp_buf, so the saved address is left pointing at wherever the code used to be. (There is a similar problem if the data segment or stack segment got relocated. Exercise: Why don't you have to worry about the data segment or stack segment being discarded?)
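For anybody who wants a refresher after all, here is a minimal sketch of the traditional pattern in standard C. The names g_env, deep_work and came_back_via_longjmp are made up for illustration. The point is only that the resume address lives in the jmp_buf rather than in a stack frame, which is exactly why a stack walker would never find it.

```c
#include <assert.h>
#include <setjmp.h>

/* Holds the saved registers and the address to resume at. */
static jmp_buf g_env;

/* Hypothetical helper: bails out of a nested call. */
static void deep_work(void)
{
    longjmp(g_env, 1);   /* never returns; control resumes at the setjmp */
}

/* Returns 1 if control came back through longjmp, 0 otherwise. */
int came_back_via_longjmp(int should_jump)
{
    if (setjmp(g_env) != 0) {
        return 1;        /* arrived here by way of longjmp */
    }
    if (should_jump) {
        deep_work();
    }
    return 0;            /* ordinary fall-through path */
}
```

Calling came_back_via_longjmp(1) takes the longjmp path, and came_back_via_longjmp(0) takes the ordinary one.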

Recall that when a segment got discarded, all return addresses which pointed into that segment were replaced with return thunks. I didn't mention it explicitly in the original discussion, but there are three properties of return thunks which will help us here:

  • It is safe to invoke a return thunk even if the associated code segment is in memory. All that happens is that the "ensure the segment is present" step is a nop, and the return thunk simply continues with its work of recovering the original state.
  • It is safe to abandon a return thunk without needing to do any special cleanup. All the state used by the return thunk is stored in the patched stack itself, so if you want to abandon a return thunk, all you need to do is free the stack space.
  • It is safe to reuse a return thunk. Since they are statically allocated, you can use them over and over as long as the associated code segment has not been freed.

The first property (idempotence of the return thunk) is no accident. It's required behavior in order for return thunks to work at all! After all, if the segment was loaded (say by a direct call or some other return thunk), then the return thunk needs to say, "Well, I guess that was easy," and simply skip the "load the target segment" step. (It still needs to do the rest of the work, of course.)

The second property (abandonment) is also no accident. An application might decide to exit without returning all the way to Win­Main (the equivalent of calling Exit­Process instead of returning from Win­Main). This would abandon all the stack frames between the exit point and the Win­Main.

The third property (reuse) is a happy accident. (Well, it was probably designed in for the purpose we're about to put it to right here.)
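The first and third properties can be illustrated with a toy model. This is only a sketch under stated assumptions: the Segment and ReturnThunk types, the load counter, and the function names are all invented here and are not how the real-mode kernel represented anything. "Returning" is modelled by simply handing back the saved offset.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a discardable code segment (hypothetical). */
typedef struct {
    bool present;     /* is the segment currently in memory? */
    int  load_count;  /* how many times it had to be brought in */
} Segment;

/* Toy model of a statically allocated return thunk (hypothetical). */
typedef struct {
    Segment *target;
} ReturnThunk;

/* The "ensure the segment is present" step: a no-op if already loaded. */
static void ensure_present(Segment *seg)
{
    if (!seg->present) {
        seg->present = true;   /* stands in for reloading from disk */
        seg->load_count++;
    }
}

/* Invoke the thunk: make sure the code is there, then resume at saved_ip. */
unsigned invoke_thunk(const ReturnThunk *thunk, unsigned saved_ip)
{
    assert(thunk->target != 0);
    ensure_present(thunk->target);
    return saved_ip;
}

/* Uses one thunk three times and reports how many loads were needed. */
int thunk_demo(void)
{
    Segment seg = { false, 0 };
    ReturnThunk thunk = { &seg };

    invoke_thunk(&thunk, 0x10);   /* segment was discarded: one load */
    invoke_thunk(&thunk, 0x10);   /* segment is present: nothing to do */
    seg.present = false;          /* discarded again */
    invoke_thunk(&thunk, 0x20);   /* same thunk reused: second load */
    return seg.load_count;
}
```

The same thunk is invoked three times but the segment is loaded only twice, because the middle call finds the segment already present.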

Okay, now let's look at the jump buffer again. If you've been following along so far, you may have guessed the solution: Pre-patch the return address as if it had already been discarded. If it turns out that the segment was discarded, then the return thunk will restore it. If the segment is present (either because it was never discarded, or because it was discarded and reloaded, possibly at a new address), the return thunk will figure out where the code is and jump to it.

Actually, since the state is being recorded in a jmp_buf, the tight space constraints of stack patching do not apply here. If it turns out you need 20 bytes of memory to record this information, then go ahead and make your jmp_buf 20 bytes. You don't have to try to make it all fit inside an existing stack frame.

The jmp_buf therefore doesn't have to try to play the crazy air-squeezing games that stack patching did. It can record the return thunk, the handles to the data and stack segments, and the return IP without any encoding at all. And in fact, the long­jmp function doesn't need to invoke the return thunk directly. It can just extract the segment number after the initial INT 3Fh and pass that directly to the segment loader.
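Here is a rough sketch of that idea in C. Everything in it is an assumption made for illustration: the ToyJmpBuf layout, the toy segment table, and the thunk encoding (the two INT 3Fh opcode bytes followed by a one-byte segment number) are not taken from the actual kernel. It shows only that a buffer which records the thunk and the return IP still resolves to a valid target after the segment has been discarded and reloaded somewhere else.

```c
#include <assert.h>
#include <stdint.h>

#define SEG_COUNT  4
#define NOT_LOADED 0u

/* Toy segment table: current base of each code segment, or NOT_LOADED. */
static uint16_t g_seg_base[SEG_COUNT];
static uint16_t g_next_base = 0x1000;

/* Stand-in for the segment loader: load if needed, report where it is now. */
static uint16_t load_segment(unsigned segno)
{
    assert(segno < SEG_COUNT);
    if (g_seg_base[segno] == NOT_LOADED) {
        g_seg_base[segno] = g_next_base;   /* pretend we read it from disk */
        g_next_base += 0x0100;
    }
    return g_seg_base[segno];
}

static void discard_segment(unsigned segno)
{
    g_seg_base[segno] = NOT_LOADED;
}

/* Assumed thunk layout: INT 3Fh (CD 3F), then a one-byte segment number. */
static const uint8_t g_thunk_for_seg2[] = { 0xCD, 0x3F, 2 };

/* Hypothetical jump buffer: every field stored plainly, no bit-squeezing. */
typedef struct {
    const uint8_t *thunk;      /* return thunk for the target code segment */
    uint16_t       ip;         /* return IP within that segment */
    uint16_t       ds_handle;  /* handle to the data segment */
    uint16_t       ss_handle;  /* handle to the stack segment */
} ToyJmpBuf;

typedef struct { uint16_t cs; uint16_t ip; } FarAddress;

/* Read the segment number out of the thunk and ask the loader directly. */
static FarAddress resolve_target(const ToyJmpBuf *buf)
{
    FarAddress target;
    assert(buf->thunk[0] == 0xCD && buf->thunk[1] == 0x3F);
    target.cs = load_segment(buf->thunk[2]);
    target.ip = buf->ip;
    return target;
}

/* Returns 1 if the target moved with the segment but kept its offset. */
int resolve_demo(void)
{
    ToyJmpBuf buf = { g_thunk_for_seg2, 0x0042, 0, 0 };
    FarAddress before = resolve_target(&buf);
    discard_segment(2);                       /* segment goes away... */
    FarAddress after = resolve_target(&buf);  /* ...and comes back elsewhere */
    return before.cs != after.cs && before.ip == after.ip;
}
```

Because the buffer never stored a raw segment address, nothing in it goes stale when the segment moves.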

(There is a little hitch if the address being returned to is fixed; in that case, there is no return thunk. But that just makes things easier: The lack of a return thunk means that the return address cannot be relocated, so there is no patching needed at all!)

This magic with return thunks and segment reloading is internal to the operating system, so the core set­jmp and long­jmp functionality was provided by the kernel rather than the C runtime library in a pair of functions called Catch and Throw. The C runtime's set­jmp and long­jmp functions merely forwarded to the kernel versions.

Comments (14)
  1. 9k08 says:

    It continues to amaze me how much real mode Windows tried to accomplish with so little to work with…

  2. Joshua says:

    Well there's two more functions to add to the list of what to expect as platform functions:

    we now have: div, _ldiv, memcpy, memmove, memset, sbrk, setjmp, longjmp

  3. Theo says:

    After all the work that had to be done before this, it's quaint to see that in this case it was basically a non-issue.

  4. CarlD says:

    It is fun to remember the pains we went through "way back when".  Having implemented an overlay manager for 16 bit CP/M back in the days before Windows, we had to solve all these same problems – and came up with very similar solutions.

  5. Ken Hagan says:

    @9k08: Agreed, but you tell that to the kids these days and they just don't believe you.

  6. jonwil says:

    I do wonder if the special kernel functions were publicly documented (allowing other compiler vendors to use them where necessary), "documented" (in that some compiler vendors and others were given the info and/or it existed out there but not officially) or kept secret so only Microsoft had proper working code.

    Even today it's (to the best of my knowledge) not possible to find official documentation describing how the machinery behind __declspec(thread) or win32 SEH works (the relevant code is not included in the CRT source with any version of Visual Studio I have ever seen)

  7. Joshua says:

    @jonwil: Try looking in the PE file format specifications.

  8. Yuhong Bao says:

    @jonwil: Ah, reminds me of this: antitrust.slated.org/…/PX00342.pdf

    I don't think "Open Tools" with documentation for things like InitTask, InitApp, and WaitEvent came until 1991.

  9. Yuhong Bao says:

    @jonwil: In fact, it also reminds me of how Skywing described how Windows x64 SEH implemented setjmp/longjmp as a "layering violation":

    http://www.nynaeve.net/?p=105

  10. poizan42 says:

    @Joshua: Well according to msdn.microsoft.com/…/ms686708(v=vs.85).aspx:

    You should not directly access this structure. To access the values of the TlsSlots and TlsExpansionSlots members, call TlsGetValue.

    It seems like someone at Microsoft thinks that third party compilers should not be allowed to generate as efficient code as Microsoft's own compilers.

    [The "you" in that sentence is "you, the application writer." Selected parts of the TEB are documented for compiler authors. But most people reading MSDN are not compiler authors. If you are a compiler author, then you already have the "extra stuff just for compiler authors" document because you also need to know about all sorts of stuff that isn't interesting to an application author, like the rules on code generation so the exception unwinder will know how to unwind your stack. There are other categories of specialty software like accessibility tools vendors and anti-malware software vendors. -Raymond]
  11. Danny says:

    Ahhh, long jump in a discardable segment – the best way to write an undetected polymorphic virus. Then SoftIce came along and those discardable segments were accessible for a debugger.

  12. Damien says:

    Exercise: Why don't you have to worry about the data segment or stack segment being discarded?

    I was thinking along these lines, and then went back and read the original article that started this all off – code segments could only *be* discarded because it was easy to regenerate the content of those segments – just reload the code from disk. Data and stack segments were updatable and never got discarded.

  13. Neil says:

    For those of you who are still annoyed that Firefox doesn't follow comment links, the squeaky wheel finally got too much for me and I've submitted a patch. Expect to see it working in Firefox 29.

  14. Deduplicator says:

    I cannot find the original Microsoft document which started this series way back when. Can it still be found on the web?

