Debugging Rotor with GDB

I'm posting a "getting started" style document that previous members of the Rotor team cooked up. It collects some notes on debugging under GDB (FreeBSD and MacOS). Enjoy.

Launching GDB

Debugging a new instance of an application

Run "gdb app_name", then at the "(gdb)" prompt, enter "run [arguments]" to spawn the new instance and pass along optional command-line arguments to it. For example:

gdb clix

(gdb) run ~/rotor/tests/bvt/short/hello.exe

This will launch a new clix process, passing along the path to a managed exe to run.

Attaching gdb to a running process

Run "gdb app_name", where app_name is the name of the running process. At the "(gdb)" prompt, type "attach pid", where "pid" is the process ID of the process to attach to. Specifying the app name on GDB's command line allows GDB to load symbols for the process.
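A minimal sketch of an attach session, assuming gdb was started from the shell as "gdb clix". The process ID 1234 is a placeholder, not a real value; "detach" is the standard GDB command for letting the process run on without the debugger.

```gdb
# 1234 is a placeholder: substitute the PID of the running clix process.
attach 1234
# The process is now stopped; look at where it is.
bt
# Release the process and let it continue without the debugger.
detach
```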

Setting Breakpoints

GDB cannot set breakpoints in .so files that are not currently loaded, though it does keep the .so's symbols loaded even if the .so unloads. So in order to set a breakpoint in a .so, you must wait until the .so is actually loaded. For example, to set breakpoints in libsscoree.so within a clix process, you must:

  • Set a breakpoint in clix, after it loads libsscoree.so, e.g. "break main.cpp:235" then "run ~/rotor/tests/bvt/short/hello.exe"
  • Once the breakpoint in clix hits, libsscoree.so has been loaded, so breakpoints in libsscoree.so can be set, e.g. "break _CorExeMain2"

Breakpoints may be set in the PAL once main() has been called.
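As a rough sketch, the two steps above could look like this at the (gdb) prompt. The line number, path, and symbol are the ones from the example; "info sharedlibrary" is added only as a sanity check that libsscoree.so really is loaded before the second breakpoint is set.

```gdb
# Stop in clix at a point after it has loaded libsscoree.so.
break main.cpp:235
run ~/rotor/tests/bvt/short/hello.exe
# Once stopped at the breakpoint: confirm the library is loaded.
info sharedlibrary
# Now the breakpoint inside libsscoree.so can be set.
break _CorExeMain2
continue
```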

Restarting the Debugging Session

Since GDB cannot set breakpoints in .so files that are not loaded, and restarting the process unloads the .so files, breakpoints set in .so files do not survive a restart.

So before restarting a debugging session, use "info break" to show the list of currently-set breakpoints, then use "dis n [ n...]" to disable all of the breakpoints set outside of the application itself. Multiple breakpoints can be disabled by one "dis" command - the list to disable is space-separated.

Once those breakpoints are disabled, the session can be restarted by typing "run [arguments]". If no arguments are specified, then gdb reuses the arguments from the previous "run" command, which saves you from retyping long arguments on each restart. Before restarting, GDB prompts for confirmation that it may kill the current session.
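A sketch of a restart, assuming breakpoints 2 and 3 are the ones that live in .so files (the numbers are placeholders; read the real ones from "info break"). Re-enabling them at the end is an assumption about the usual workflow, not something GDB does for you.

```gdb
# List all breakpoints together with their numbers.
info break
# Placeholder numbers: disable the breakpoints set inside .so files.
dis 2 3
# Restart with the arguments from the previous "run"; GDB asks to confirm.
run
# Later, once stopped at a point where the .so files are loaded again:
en 2 3
```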

Basic Commands

General Info

  • GDB uses Emacs-style line editing. Tab completes command names, Ctrl+a moves to start of line, Ctrl+e moves to end, etc.
  • "help" shows some basic help
  • "quit" exits

Breakpoints

  • "break sourcefile:linenum" - set a breakpoint at sourcefile:linenum. The command can be abbreviated to just "br". Note that the sourcefile:linenum tends to not work correctly for assembly files (*.s) and for included .cpp files (rotor_x86/*.cpp).
  • "break symbolname" - set a breakpoint on the symbol name. C++ class members are specified as classname::membername.
  • "break *address" - set a breakpoint at the specified address. The address defaults to decimal - use "0x" for hex, e.g. "break *0x1234"
  • "break" - sets a breakpoint on the current line. If this is used when the current stack frame is not the bottom frame, then it sets a breakpoint on the line following the call to the child function, so it can be used to break when the child returns.
  • "xbreak" - set a breakpoint on the return from the current function
  • "info break" - list all breakpoints
  • "delete n" - delete a breakpoint by number. More than one can be specified at once - use a space as the list separator
  • "disable n" - disable breakpoints by number. More than one can be specified at once. The abbreviation is "dis"
  • "enable n" - enable breakpoints by number. The abbreviation is "en"
  • "ignore n m" - alters breakpoint number 'n', changing it to have an ignore-count of 'm'.
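  • "condition n (expression)" - alters breakpoint number 'n', so it breaks only when the 'expression' evaluates to true. The parentheses around the expression are allowed but GDB does not require them.
  • "break location if (expression)" - combines "break" and "condition" into one statement. If the location is omitted, the breakpoint is set on the current line.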
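To show how the breakpoint commands above fit together, here is a small sketch. It assumes main.cpp:235 lies inside main() so that argc is in scope, and that GDB assigns the new breakpoint number 1; both are assumptions for illustration.

```gdb
# Break at the source line only when the program was given arguments.
break main.cpp:235 if (argc > 1)
# Assuming the breakpoint above became number 1: skip its next 3 hits.
ignore 1 3
# Replace the condition on breakpoint 1.
condition 1 (argc == 2)
# Check the result: conditions and ignore counts are listed here.
info break
# Remove the breakpoint when done.
delete 1
```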

Callstacks and Local Variables

  • "backtrace" - dump stack backtrace. Each stack frame is numbered, with 0 being the bottommost frame. Can be abbreviated "bt"
  • "frame n" - switch to stack frame 'n'. Can be abbreviated "fr n"
  • "info locals" - print the values of all local variables within the frame
  • "info frame" - print everything GDB knows about the stack frame - its arguments, values of saved registers, etc.
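A typical sequence for walking the stack with these commands might look like the following sketch. Frame 1 is just an example; any frame number shown by "bt" works.

```gdb
# Show the numbered stack frames; frame 0 is the function currently executing.
bt
# Switch to the caller of the current function.
fr 1
# Inspect that frame's local variables and saved state.
info locals
info frame
# Go back to the frame where execution stopped.
fr 0
```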

Examining Data

  • "x /nfl expression" - examine (i.e. print) the memory at the address given by the expression. 'n' is the number of items to dump (the default is 1), 'f' is a format specifier, and 'l' is the size of each item. The format specifiers are:
    x - hex
    d - decimal
    u - unsigned decimal
    t - binary
    f - float
    a - address (also looks up symbolic info)
    i - instruction
    c - char
    s - string

    The sizes are:
    b - byte
    h - halfword (2 bytes)
    w - word (4 bytes)
    g - 8 bytes

    'l' may be left blank. e.g. "x /8xw $esp" dumps 8 hex 32-bit integers from the stack. Other examples:

    "x /ni address" dumps 'n' instructions starting from address
    "x /nhc address" dumps a PAL Unicode string (see Useful Macros for another way to dump them.)

  • "printf" - works the same as C runtime printf, but omit the parentheses, e.g. 'printf "%s", argv[0]'

  • "ptype typename" - show the structure fields etc. for the typename

  • "whatis expression" - show the type of the expression, e.g. "whatis argc" in the stack frame for main() prints "type = int"
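The commands in this section could be combined as in the sketch below. It assumes the process is stopped in the stack frame for main() on x86, so that argc, argv, $esp, and $eip are all available; the counts 8 and 10 are arbitrary.

```gdb
# Dump 8 hex 32-bit words starting at the top of the stack.
x /8xw $esp
# Disassemble the next 10 instructions.
x /10i $eip
# Print argv[0] two ways: as a string via "x", and via printf.
x /s argv[0]
printf "%s\n", argv[0]
# Ask for the types of expressions.
whatis argc
whatis argv[0]
```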

Source-level Debugging

  • "step", "next" - single-step, and step over. Abbreviations are "s" and "n"
  • "finish" - execute the bottommost function on the stack, until it returns. It can be abbreviated "fin".
  • "continue" - continue running (ie. after a breakpoint). The abbreviation is "c".

Note that "step" and "next" may occasionally fail to step as expected, and will break with a message like:

0x28074f60 in _init () from /home/barrybo/rotor/build/librotor_pal.so

When this happens, enter "fin" once and you should return to the code you're debugging. This happens only when one function within the PAL calls another function within the PAL.

Assembly-level Debugging

GDB has assembly-level debugging, but the source-level debugging features tend to interfere. For example, GDB does not automatically print out the CPU registers when single-stepping at the instruction level. Some hints:

  • CPU register names are available symbolically, by using "$registername", e.g. "$eax" or "$esp"
  • Before doing assembly-level debugging, enter "display /i $eip", which instructs GDB to disassemble the instruction at $eip each time it prompts. Use "undisplay" to remove this.
  • The "disassemble" command disassembles assembly instructions. "disassemble symbol" or "disassemble address" will disassemble the entire function containing the symbol/address, which is often far more output than you want. There are two workarounds: use "disassemble startaddress endaddress" to disassemble a range (newer GDB releases expect a comma between the two addresses), or use "x /30i address" to disassemble 30 instructions (or whatever number you want). Note that disassembly is in AT&T style, not Intel style like the *.s source files use. Use "set disassembly-flavor intel" to switch, and "set disassembly-flavor att" to switch back.
  • "stepi" and "nexti" single-step and trace instructions. "nexti" does not work while debugging jitted code: use "tbreak *address" to set a temporary breakpoint, then 'c' to continue until the breakpoint is hit.
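Putting the hints above together, an instruction-level session might go roughly like this. The address 0x1234 is a placeholder taken from the "break *0x1234" example, not a real code address.

```gdb
# Show Intel-style disassembly and print the next instruction at every stop.
set disassembly-flavor intel
display /i $eip
# Step one instruction at a time.
stepi
stepi
# Look ahead at the next 30 instructions.
x /30i $eip
# In jitted code, use a temporary breakpoint instead of "nexti".
# 0x1234 is a placeholder: use the address of the instruction after the call.
tbreak *0x1234
c
# Clean up: remove the auto-display (GDB asks to confirm) and restore AT&T style.
undisplay
set disassembly-flavor att
```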

Threads

  • "info threads" - shows the list of all threads
  • "thread n" - switch to thread #n
  • "thread apply n command" - run "command" on thread #n. Use "thread apply all command" to run the command on every thread, e.g. "thread apply all bt" shows all callstacks.
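A short sketch of the thread commands in use. Thread number 2 is a placeholder; pick a number from the "info threads" listing.

```gdb
# List every thread with its GDB thread number.
info threads
# Switch to thread 2 (placeholder number) and look at its stack.
thread 2
bt
# Or dump the callstack of every thread in one go.
thread apply all bt
```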

Others

  • "info sharedlibrary" - shows all loaded shared libraries
  • "list" - show the source file starting at the current line
  • "call function" - invoke a function within the debuggee process, e.g. "call GetLastError()" will show the thread's lasterror value. GDB is not robust here and may crash both the debugger and debuggee if something goes wrong in a "call", so use it only when losing the debugging session is acceptable.
  • "p *(MethodDesc *)address", where the address is the 4 bytes stored directly in front of a jitted method, will print out the method name and other info
  • "setenv COMPlus_JitHalt=class::name" tells the JIT to insert a breakpoint at the start of the given method
  • "setenv COMPlus_JitTrace=1" generates a method trace
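The two COMPlus_ settings above are environment variables read by the debuggee. One way to set them without leaving the debugger, assuming GDB's standard "set environment" command, is sketched below. MyClass::Main is a hypothetical class and method name used only for illustration.

```gdb
# Hypothetical method name: replace MyClass::Main with the method to halt in.
set environment COMPlus_JitHalt = MyClass::Main
set environment COMPlus_JitTrace = 1
# Confirm what the debuggee will see.
show environment COMPlus_JitHalt
# The variables take effect for the next process GDB starts.
run ~/rotor/tests/bvt/short/hello.exe
```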

Useful Macros

Create or modify ~/.gdbinit to define some handy macros to make debugging easier:

define printw
call PAL_get_stderr()
call (void) PAL_fprintf($, "%S", $arg0)
echo \n
end
 
define du
printw $arg0
end
 
define pw
printw $arg0
end
 
define sos
call (void)SOS("$arg0")
echo \n
end

The printw, du, and pw macros are all aliases for one command, which prints a PAL Unicode string as text. PAL Unicode characters are 16 bits wide and are encoded differently from Unix Unicode characters, which are 32 bits wide. The macros print the string to the debuggee's stderr.
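Once the macros are in ~/.gdbinit, using them might look like the sketch below. Here wszPath is a hypothetical variable holding a pointer to a PAL Unicode string in the selected stack frame, and 16 is an arbitrary character count.

```gdb
# wszPath is hypothetical: any expression that evaluates to a PAL Unicode string works.
pw wszPath
# The same string without the macro, dumped as 16 halfword-sized characters.
x /16hc wszPath
```

Since the macros use "call", the caution about "call" from the Others section applies to them as well.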

The sos macro can be used to load and call a .so file which contains postmortem debugging tools for managed code, called SOS. For more information on this, see clr/src/vm/ceemain.cpp's SOS() function, and also the contents of clr/src/tools/sos.