Microspeak: Zap


You may hear an old-timer developer use the verb zap.

That proposed fix will work. Until everybody gets the fix, they can just zap the assert.

The verb to zap means to replace a breakpoint instruction with an appropriate number of NOP instructions (effectively ignoring it).

The name comes from the old Windows 2.x kernel debugger. (Actually, it may be even older, but that's as far back as I was able to trace it.) The Z (zap) command replaces the current instruction with a NOP if it is an int 3 (the x86 single-byte breakpoint instruction), or replaces the previous instruction with NOPs if it is an int 1 (the x86 two-byte breakpoint instruction).
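
In terms of raw bytes, the operation is simple: int 3 assembles to the single byte CCh, int 1 assembles to the two bytes CDh 01h, and zapping overwrites those bytes with one-byte nop instructions (90h). Here's a rough before-and-after sketch (illustrative only, not actual debugger output):

        int     3               ; CC      - one-byte breakpoint
        int     1               ; CD 01   - two-byte breakpoint

        nop                     ; 90      - int 3 zapped with a single NOP
        nop                     ; 90      - int 1 zapped with two NOPs
        nop                     ; 90        to cover both of its bytes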

This operation was quite common back in the days when lots of code was written in assembly language. A technique used by some teams was to insert a hard-coded breakpoint (called a TRAP) into every code path of a function. Here's an example (with comments and other identifying characteristics removed and new ones made up):

xyz8:   mov     bl,[eax].xyz_State
        cmp     bl,XYZSTATE_IGNORE
        TRAPe
        je      short xyz10     ; ignore this one
        or      bl,bl
        TRAPe
        je      short xyz11     ; end of table

        mov     bh,[eax].xyz_Flags
        test    bh,XYZFLAGS_HIDDEN
        TRAPz
        jz      short xyz10     ; skip - item is hidden
        test    bh,XYZFLAGS_MAGIC
        TRAPe
        je      short xyz10     ; skip - not the magic item
        TRAP
        bts     [esi].alt_flags,ALTFLAGS_SEENMAGIC
        TRAPc
        jc      short xyz10     ; weird - we shouldn't have two magic items

There were a variety of TRAP macros. Here we see the plain vanilla TRAP and a bunch of fancy traps that trigger only when certain conditions are met. For example, TRAPc traps if the carry is set. Here's its definition:

TRAPc   MACRO
        local   l
        jnc     short l
        int     3
l:
        ENDM

Hardly rocket science.
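
The other variants follow the same pattern: jump around the breakpoint unless the interesting condition holds. For illustration, here are plausible definitions of the variants that appear in the example above, assuming they all mirrored TRAPc (the actual macros may well have differed):

TRAP    MACRO                   ; unconditional trap
        int     3
        ENDM

TRAPe   MACRO                   ; trap if equal (ZF set)
        local   l
        jne     short l         ; skip the breakpoint unless ZF is set
        int     3
l:
        ENDM

TRAPz   MACRO                   ; trap if zero - the same test as TRAPe,
        local   l               ; spelled to match the jz that follows it
        jnz     short l
        int     3
l:
        ENDM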

When you were the first person to trigger a particular code path, you hit the trap, and you either stepped through the code yourself or (if you weren't familiar with the code) contacted the author of the code to verify that it successfully handled this "never seen before" case. Once you were sufficiently satisfied that a code path operated as expected, you removed the corresponding TRAP from the source code.

Of course, most TRAPs were removed before the code got checked in, but the ones related to error handling or recovering from data corruption tended to remain (such as the TRAP above that fires if we encounter two magic items, which is theoretically impossible).

When you trigger one trap, you usually trigger it a lot, and you usually trigger a lot of related traps as well. The Z command was quite handy at neutering each one after you checked that everything was working. You zapped the trap.

That's why old-timers refer to patching out a hard-coded breakpoint as zapping, even though the zap command hasn't existed for over a decade.

Update: As far as I can tell, the earlier uses of the word zap referred to patching binaries, not to removing hard-coded breakpoints after they stopped in the debugger.

Comments (23)
  1. Anonymous says:

    I do a similar thing as I write a new function or class.  I set breakpoints at the beginning of each branch.  Then I run my unit tests in the debugger, removing each breakpoint as I verify the operation.  When I’m done, if there are still breakpoints, then I’ve either got a bug, or I need to add more cases to the unit test.

    Everything old is new again.

  2. Anonymous says:

    @Adrian:

    So, that would be The New Old Thing?

  3. Anonymous says:

    but the brain cells are getting rusty.

    As are the typing fingers, apparently.

  4. Anonymous says:

    AMASPZAP aka "Superzap" is an IBM mainframe utility dating back to at least the mid sixties.  I believe the original SuperZAP came from SLAC (Stanford Linear Accelerator Center) and was repackaged by IBM as a service tool IMASPZAP.

    The basics were control statements to locate a module/program, verify content by offset, and replace content by offset.  

    It would also dump modules in a pretty printed format to help a patch creator get oriented, and would work on disk tracks (raw data) as well as programs.

    PS:  Us old timers would also "zap" programs in memory using the console switches and lights.  

    Perhaps the hardest thing for us to get used to about them new fangled PCs was the idea of zero lights and only 1 switch!

  5. Anonymous says:

    "The name comes from the old Windows 2.x kernel debugger."

    Was that SYMDEB?

    And on the matter of the TRAP* macros, in fact x86 has a built-in INTO instruction to cause INT 4 when the overflow flag is set.

  7. Anonymous says:

    Isn’t the x86 2-byte breakpoint instruction also int 3 (cdh, 03h vs. cch)? Since int 1 is the single-step interrupt, you wouldn’t need to put in code, as it is automatically triggered after every instruction when active.

    [The assembler automatically chooses the 1-byte encoding for “int 3”. There’s no easy way to ask for the 2-byte version. -Raymond]
  8. Anonymous says:

    Is it a coincidence that "Z" is ASCII 90 and 0x90 is also the instruction code for "NOP" in x86?

  9. Anonymous says:

    “[The assembler automatically chooses the 1-byte encoding for “int 3”. There’s no easy way to ask for the 2-byte version. -Raymond]”

    Apart from “db cdh, 03h” of course. That’s not easy?

    [I don’t consider generating code via “db” to be “easy”. Or readable for that matter. -Raymond]
  10. Anonymous says:

    As an amusing aside…The term "zap" lives on in Windows thanks to Driver Verifier, which for some errors will ask if you want to:

    Break, Ignore, Zap, Remove, Disable all (bizrd)?

    Though in this context it simply causes Verifier to no longer break into the kernel debugger when this violation occurs; instead it just prints a message and moves on.

    -scott

  11. Anonymous says:

    So under what circumstances would an int 1 appear in code?

    [Some teams used “int 3” to mean “code coverage breakpoint” and “int 1” to mean “assertion failure breakpoint.” -Raymond]
  12. Anonymous says:

    I remember using various "zap" utilities on the TRS-80 in the early ’80s, running under a Microsoft operating system (TRS/DOS 1.3), so it seems reasonable that Microsoft knew and used the term then.

    I was working with someone at the time who had worked at NCR and IBM in the ’60s and ’70s, and remember talking about the history of zap and Superzap.  Definitely older than the Windows 2.x debugger.

  13. Anonymous says:

    @janm

    TRSDOS, and its many and better compatible alternatives (such as NewDOS/80, UltraDOS, LDOS, VTOS, etc. etc.) had nothing whatsoever to do with Microsoft.  TRSDOS was written by Randy Cook under contract from Radio Shack, and shows a strong DEC TOPS influence.  The TRS-80 Superzap you used was introduced in the original NewDOS from Apparat but picked up by other DOSes later.  Superzap was a hex editor for disks.  Probably the term came from the IBM mainframe usage mentioned earlier.

  14. Anonymous says:

    Agree with Electron Shepherd; ‘zap’ is the generic term for what is also termed ‘patch’.  There was no implication that the zapped instructions would be nop’d.

    Numerous DEC operating systems in the 1970s had a program called ‘zap’ that modified binaries.

    I’m pretty certain IBM opsyses (what *is* the lural of opsys?) did too, but the brain cells are getting rusty.

  15. Yuhong Bao says:

    [Some teams used “int 3” to mean “code coverage breakpoint” and “int 1” to mean “assertion failure breakpoint.” -Raymond]

    And INT 1 has an undocumented single-byte form too, 0xF1.

    [And this is relevant how exactly? -Raymond]
  16. Anonymous says:

    Hi Raymond, I am terribly sorry if this is not the right place for suggestions or queries. The Suggestion Box post seems to be closed for comments.

    Anyway … a Mac friend pointed out to me how the Refresh option is not present on OS X. Why does the Explorer on Windows need a Refresh option? Why isn’t the Refresh automatic and instantaneous? I would love to know the historic reasons for this :-)

  17. Anonymous says:

    @Ashwin Nanjappa:

    Probably for similar reasons to the ones that cause people to write OSX extensions, eg. http://lifehacker.com/252956/download-of-the-day-refresh-finder-mac.

    Not all network drives broadcast updates — I imagine that would be a performance nightmare in some cases — so you sometimes might need to refresh manually.

    Windows (at least since Vista — it’s been too long since I used XP to remember) does do auto-refresh, which works most of the time and about as well as the one introduced in (IIRC) OSX 10.4 in my experience.

  18. Anonymous says:

    [I don’t consider generating code via “db” to be “easy”. Or readable for that matter. -Raymond]

    But… your post suggests you’re using macros already. It’s not going to harm readability that way, right?

    I remember that when the Pentium CPU with MMX was released, the community on the Borland newsgroup published macro add-ins that allowed TASM to support MMX immediately, before the next release. And I think that was good.

    [True, if you’re using a macro then it doesn’t matter. But who would want to go out of their way to use a “large breakpoint”? The int 1 two-byte breakpoint wasn’t chosen because somebody said, “Hey, I want a two-byte breakpoint.” It was chosen because somebody said “I need a breakpoint, and oh well looks like this one is two bytes.” -Raymond]
  19. Anonymous says:

    I use a macro

    #define NOT_TESTED ASSERTE(0)

    … for same purpose. The more things change, the more they stay the same, eh?

  20. Anonymous says:

    Obligatory rocket science link.

  21. Anonymous says:

    @Goran: An enhanced version would be:

    #define NOT_TESTED ASSERTE(!"This code path has not been tested")

  22. Yuhong Bao says:

    [The int 1 two-byte breakpoint wasn’t chosen because somebody said, “Hey, I want a two-byte breakpoint.” It was chosen because somebody said “I need a breakpoint, and oh well looks like this one is two bytes.” -Raymond]

    And the funny thing is that later the 386 introduced a one byte version of INT 1 (0xF1), as I posted above. It was undocumented as it was originally intended for ICE debugging, but it works to generate an Int 1 when no ICE is attached.

    [I ask again, “And this is relevant how exactly?” Are you suggesting that Windows should have used an undocumented opcode that worked only on some processors? “Hey everybody, I’m going in and modifying all our source code and changing all INT 1 instructions to a new INT1 macro that takes advantage of an undocumented opcode that works only on some processors and which our debugging tools don’t understand. Please do not use the INT 1 instruction in the future; use the INT1 macro from now on. Why am I doing this? Um, let me get back to you.” Good luck with that. -Raymond]

Comments are closed.