Which is better, a sample or instrumentation based profiler?

One of my passions is teaching performance, and it is interesting to see how many students do not realize the difference between instrumentation based profilers and sample based profilers.  In fact, you might be wondering what the difference is right now.  If you are, keep reading.

There are a couple of classes of profilers and instrumentation:

Sample based - These are profilers that run without modifying the binaries they gather measurements on.  Typically time (CPU cycles) is used to determine when to take a sample.  A sample can be a stack trace, the current performance counters, or anything else for that matter.  Sampling can also be driven by events other than time, such as system calls, L2 cache misses, ...  When the preset amount of time expires or the set number of L2 cache misses occurs, a sample is taken.
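To make the idea concrete, here is a minimal sketch of a sampler in Python.  It is only an illustration and not how any of the tools named in this post are implemented.  It assumes CPython (sys._current_frames is CPython specific), it samples on a wall-clock interval rather than CPU cycles, and the names profile_by_sampling and busy_loop are made up for the example.  Note that the code being measured is not modified in any way.

```python
import collections
import sys
import threading
import time


def profile_by_sampling(work, interval=0.005):
    """Run work() while a helper thread samples the caller's current function."""
    target_id = threading.get_ident()   # thread whose stack we inspect
    counts = collections.Counter()      # function name -> number of samples
    stop = threading.Event()

    def sampler():
        # Event.wait returns False on timeout, so this loop takes one
        # sample per interval until stop is set.
        while not stop.wait(interval):
            frame = sys._current_frames().get(target_id)
            if frame is not None:
                counts[frame.f_code.co_name] += 1

    helper = threading.Thread(target=sampler, daemon=True)
    helper.start()
    try:
        work()
    finally:
        stop.set()
        helper.join()
    return counts


def busy_loop(seconds=0.3):
    """Hypothetical CPU bound workload: spin until the deadline passes."""
    deadline = time.perf_counter() + seconds
    total = 0
    while time.perf_counter() < deadline:
        total += 1
    return total


if __name__ == "__main__":
    samples = profile_by_sampling(busy_loop)
    for name, hits in samples.most_common():
        print(f"{name}: {hits} samples")
```

The function with the most samples is where the thread was most often found, which is why sampling is a natural fit for finding hot spots.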

Instrumentation based profilers - These profilers modify your binary, typically around function entry / exit, to gather call stacks.  Just like sample based profilers, other information can be collected as well.
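As a rough sketch of the same idea at the source level, the Python decorator below wraps a function so that code runs at entry and exit.  Real instrumenting profilers rewrite the binary instead.  The names instrumented, call_stats, wait_for_io and parse are hypothetical and exist only for this example.

```python
import functools
import time
from collections import defaultdict

# function name -> call count and total wall-clock seconds spent inside it
call_stats = defaultdict(lambda: {"calls": 0, "seconds": 0.0})


def instrumented(func):
    """Wrap func so that every entry and exit is recorded."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()          # entry probe
        try:
            return func(*args, **kwargs)
        finally:
            stats = call_stats[func.__name__]  # exit probe
            stats["calls"] += 1
            stats["seconds"] += time.perf_counter() - start
    return wrapper


@instrumented
def wait_for_io(seconds):
    """Stand-in for a blocking call such as a disk or network read."""
    time.sleep(seconds)


@instrumented
def parse(n):
    """Stand-in for a small amount of CPU work."""
    return sum(range(n))


if __name__ == "__main__":
    wait_for_io(0.05)
    parse(1000)
    for name, stats in call_stats.items():
        print(f"{name}: {stats['calls']} calls, {stats['seconds']:.4f} s")
```

Because the probes fire on every call, you get exact call counts and elapsed time, including time spent waiting, at the cost of extra overhead on each call.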

Global - Refers to tools that look at the overall performance of systems like the OS + application.

Code level - Refers to profilers that just look at a particular application.  This is what is provided with Visual Studio Team Suite Profilers.

When to use sample vs instrumentation based profilers:

Sample based - Good for CPU bound problems.

Instrumentation based profilers - Good for I/O, idle wait, memory, ...
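One way to see why the split falls this way is to compare wall-clock time with CPU time for a function that computes and a function that waits.  The sketch below is a Python illustration under my own assumptions; measure, cpu_bound and io_bound are made-up names, and time.sleep stands in for a real I/O wait.

```python
import time


def measure(func):
    """Return (wall_seconds, cpu_seconds) spent while running func()."""
    wall_start = time.perf_counter()   # wall-clock timer
    cpu_start = time.process_time()    # CPU time used by this process
    func()
    return (time.perf_counter() - wall_start,
            time.process_time() - cpu_start)


def cpu_bound():
    """Spin until roughly 0.2 seconds of CPU time have been consumed."""
    deadline = time.process_time() + 0.2
    while time.process_time() < deadline:
        pass


def io_bound():
    """Stand-in for an I/O or idle wait: the process uses almost no CPU."""
    time.sleep(0.2)


if __name__ == "__main__":
    for func in (cpu_bound, io_bound):
        wall, cpu = measure(func)
        print(f"{func.__name__}: wall={wall:.3f}s cpu={cpu:.3f}s")
```

The waiting function takes real time but burns almost no CPU, so a sampler driven by CPU cycles has very little to sample there, while entry / exit timing still sees the full wait.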

Which tools exist for Windows?

Sample based profilers - Visual Studio Team Suite Profilers (aka F1), CLRProfiler, PerfMon, ETW (see logman.exe to control), ...

Instrumentation based profilers - Visual Studio Team Suite Profilers (aka F1)

VTune from Intel and AMD CodeAnalyst are also good profilers.

Word of caution...

Before diving into Visual Studio Team Suite or other code level profilers, you should really start with a global performance tool to make sure the problem is in your code.  All too often people spend days profiling their code, only to find the performance issue somewhere else in the system due to an unexpected interaction.  Try tools like Perfmon and ETW (future post) first.

So which is better?

I think the answer really depends on the type of performance issue you are trying to solve.  If most of your problems are with CPU, then a sample based profiler is the way to go, but it is unlikely to reveal an I/O issue.  For I/O, idle wait, or memory problems, reach for an instrumentation based profiler instead.

Happy perf hunting...

  Tony