I was recently working on improving the performance of a Python-based application and thought it would be useful to profile my code. My intuition was telling me that the problem was in a certain area of the code, but I wanted hard numbers to back up this intuition before I went to the trouble of doing a rewrite. Unfortunately, neither the CLR Profiler nor the profiling support in Visual Studio is very effective when used with IronPython, as they weren't really designed for environments where code is dynamically compiled at runtime.
The way Tomáš’ profiler works is pretty straightforward. At the start of every method we compile, a snapshot is taken of the current time. Another is taken at the end of the method and the difference is stored for later retrieval. The only tricky thing about it is trying to update the statistics in a manner that’s both thread-safe and has minimal effect on performance.
I’ve modified it a bit from the original, though, and added some fun features. In particular, it now tracks the number of calls made to each method, and it keeps separate counters for the time spent exclusively inside the method itself versus the time spent inclusively in both the method and everything it calls.
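One way to account for inclusive versus exclusive time is to keep a per-thread stack of accumulators, so a method can subtract the time attributed to the profiled methods it calls. Here's a sketch of that bookkeeping under assumed names (`call_profiled`, `stats`); again, this is an illustration of the technique, not the DLR's code.

```python
import threading
import time

_tls = threading.local()
_lock = threading.Lock()
stats = {}  # name -> [inclusive_seconds, exclusive_seconds, calls]

def call_profiled(name, fn, *args, **kwargs):
    stack = getattr(_tls, 'stack', None)
    if stack is None:
        stack = _tls.stack = []
    stack.append(0.0)            # accumulates the callees' inclusive time
    start = time.time()
    try:
        return fn(*args, **kwargs)
    finally:
        inclusive = time.time() - start
        children = stack.pop()
        if stack:                # charge our inclusive time to our caller
            stack[-1] += inclusive
        with _lock:
            rec = stats.setdefault(name, [0.0, 0.0, 0])
            rec[0] += inclusive
            rec[1] += inclusive - children  # exclusive = inclusive minus callees
            rec[2] += 1
```

For a method that calls no other profiled methods, inclusive and exclusive time come out the same, which matches the leaf entries in the output below.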
Additionally, I’ve added counters to track calls made from Python code into statically-compiled .NET code.
Here’s a sample program profile.py that demonstrates the statistics we’re gathering by printing them on exit.
import atexit
import clr
from System.Threading import Thread

def delay(seconds):
    Thread.CurrentThread.Join(1000 * seconds)

def print_profile():
    for p in clr.GetProfilerData():
        print '%s\t%d\t%d\t%d' % (p.Name, p.InclusiveTime, p.ExclusiveTime, p.Calls)

atexit.register(print_profile)

delay(1)
The profiler isn’t enabled by default, as it does have a small impact on performance. You need to opt into it by running IronPython with the flag “-X:EnableProfiler”. If you use this flag with the above code, you’ll get the following output:
Name | InclusiveTime | ExclusiveTime | Calls
module profile: def delay(seconds) | 10721609 | 44 | 1
module profile: def print_profile() | 20 | 20 | 1
type Builtin: method: __import__(CodeContext, String, Object, Object, Object, Int32) | 9372193 | 9372193 | 4
module atexit: def _run_exitfuncs() | 268844 | 69133 | 1
module atexit: def register(func, targs, kargs) | 271181 | 4728 | 1
type Builtin: method: hasattr(CodeContext, Object, String) | 4336493 | 4336493 | 1
type List: method: append(Object) | 14677 | 14677 | 1
type Thread: method: get_CurrentThread() | 83 | 83 | 1
type Thread: method: Join(Int32) | 10077140 | 10077140 | 1
type List: method: pop() | 17810 | 17810 | 1
type ClrModule: method: GetProfilerData(CodeContext) | 0 | 0 | 1
These records are returned in the order that they were compiled, not the order that they were first called – except that methods which were never called are omitted entirely.
A record with the name “module site” means the top-level import of the site module.
Other records that start with the name “module” refer to Python code.
Records that start with “type” refer to .NET code that is called directly from IronPython.
Times are recorded in “ticks”, which are increments of 100ns. That means the 10077140 ticks spent in the Thread.Join method are actually 1.007714 seconds.
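The conversion is easy to check with a couple of lines of Python (the helper name is mine):

```python
TICKS_PER_SECOND = 10 ** 7  # one tick is 100 ns, so 10,000,000 ticks per second

def ticks_to_seconds(ticks):
    return ticks / float(TICKS_PER_SECOND)

# The Thread.Join entry from the output above:
print(ticks_to_seconds(10077140))  # roughly 1.007714 seconds
```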
Here’s what this profiler doesn’t do a good job of recording: the time required to parse Python code, compile it into expression trees, generate the IL, and JIT that IL. Some of this is accounted for in the time it takes to import an external module, but not in a way that provides visibility into the constituent parts.
We’d love to get your feedback on this experimental addition to IronPython. Happy profiling!