Processor Cache Misses are Reported as Execution

When refactoring a serial application into a parallel application, the minimum bar of acceptable performance is that the parallel application must run from beginning to end in less wall-clock time than the serial application. If it doesn’t run faster, there’s no point in doing the refactoring. I was working on making a certain application…


Concurrency Visualizer as a Microscope for Execution Dynamics

This is the picture that the Concurrency Visualizer team used on the title page of internal specs. It actually reveals how most of us think about our product: not as a profiler (though you can get a decent sample profile from it by clicking the green “Execution” category in the legend), and not even as a performance…


Adjusting Buffer Settings for Event Tracing for Windows (ETW)

We instrumented the Concurrency Visualizer within Visual Studio 2010’s profiler via Event Tracing for Windows (ETW), which depends on a number of buffers to cache data before writing it to disk. If the Concurrency Visualizer complains of lost kernel and/or user mode events during creation of a profile report, default settings for these…


Parallel Performance Case Study: Finding References to Parallel Extensions

Stephen Toub, Parallel Computing Platform

Visual Studio 2010 is quite a large application, comprising not only the entire integrated development environment (IDE) and all of the tools that make it up, but also the underlying runtimes and frameworks on which it runs, including the .NET Framework 4. When logic in one of these constituent components…


Overview of the Parallel Dwarfs project on Codeplex

The Parallel Motifs, or Parallel Dwarfs as they are sometimes called, are a collection of algorithm families that are important to parallel computing as they are known to be compute bound. More computational cycles – if applied judiciously – will result in faster execution on a given data set for many algorithm instances from each…


Linking Visualization to Application Phases

It is often necessary to divide many real-world applications into multiple distinct phases.  As a result, the search for performance problems can often be constrained to a particular phase of an application’s execution.  One of the most effective ways to narrow focus onto a specific region is to use the Scenario API on Code Gallery…


Oversubscription: a Classic Parallel Performance Problem

One of the most important things to pay attention to when tuning a multithreaded application is its performance pattern.  There is a set of common poor performance patterns that most developers of multithreaded applications will encounter.  These include, among other things, patterns such as oversubscription, serialization, lock convoys, and uneven workload distribution.  We have documented…


Learning to Write in Parallel

When I first joined the Concurrency Visualizer team, I thought, “Wouldn’t it be cool to make an application that could write messages on the Threads view?” I wasn’t alone; many of us had thought about it. It turns out to be not so hard. James wrote about the Threads view three weeks ago, so refresh…


Green Isn’t Always Good

One reason to use the Concurrency Visualizer is to maximally utilize system resources. To aid in this effort, it displays the execution of the program as green segments in its timeline. However, the Visualizer does not distinguish between the user’s work and any other work in the process, so seeing a lot of green doesn’t…