Processor Cache Misses are Reported as Execution

When refactoring a serial application into a parallel one, the minimum bar for acceptable performance is that the parallel application must run from beginning to end in less wall-clock time than the serial application. If it doesn’t run faster, there’s no point in doing the refactoring. I was working on making a certain application…

Visualizing Concurrency on Production Systems

As interesting as profiling applications on your development computer is, I’m sure you’ve wanted to see how applications behave when running on production systems. The ability to view remote traces with the Concurrency Visualizer first requires installing the Visual Studio 2010 profiler on the production system. The installation executable can be found on the…

Concurrency Visualizer as a Microscope for Execution Dynamics

This is the picture that the Concurrency Visualizer team used on the title page of internal specs. It actually reveals how most of us think about our product: not as a profiler (though you can get a decent sample profile from it by clicking the green “Execution” category in the legend), and not even as a performance…

The Concurrency Visualizer Debuts with the Launch of VS2010!

Visual Studio 2010 is finally here and those of us who worked on the Concurrency Visualizer are thrilled about its debut (in VS2010 Premium and Ultimate)!  We truly hope that our hard work pays off for you, enabling you to solve your toughest parallel performance problems.  I’d like to thank everyone who has been reading…

General-Purpose Computation on Graphics Hardware

Those of you following the various parallel computing blogs from our team, or who have played with Visual Studio 2010, have probably noticed a heavy focus on single-box parallelism (and mostly on client machines). For a future version of Visual Studio we are exploring stronger investment for the Windows HPC Server (for an…

Instantly Expanding Long Callstacks in the Concurrency Visualizer Reports

Did you notice that reports frequently show pretty long stacks? For example, in the picture below the stack is 12 frames deep, but sometimes it can be a hundred… and very frequently you need the deepest frame to see what is going on. Are you supposed to click “+” a hundred times? Happily, there is a shortcut…

Using the Concurrency Visualizer to Analyze MPI Communication Overheads

The Message Passing Interface (MPI) is a popular API for developing message-passing based parallel applications on clusters.  Microsoft has a Windows HPC Server product that includes an implementation of MPI, among other things (visit http://www.microsoft.com/hpc).  In this post, I’d like to demonstrate how the Visual Studio Concurrency Visualizer can be used to efficiently identify MPI…

Tuning a Parallel LINQ File Search Application

This post explores the performance issues that arise when using PLINQ to parallelize queries, and illustrates how the Concurrency Visualizer in Visual Studio 2010 can be a valuable tool in identifying performance bottlenecks and making efficient and profitable parallelization choices. The subject of this entry is a toy application that scans a set of files…

Tuning a Parallel Ray Tracer in F#

One of the samples that is included with the Parallel Programming Samples for .NET 4 is a simple Ray Tracer.  This ray tracer provides a nice visual way of seeing the benefits of .NET 4 parallelism features, as well as giving insights into the way work stealing happens under the hood.  The Problem This ray…

Adjusting Buffer Settings for Event Tracing for Windows (ETW)

We instrumented the Concurrency Visualizer within Visual Studio 2010’s profiler via Event Tracing for Windows (ETW), which depends on a number of buffers to cache data before writing it to disk. If the Concurrency Visualizer complains of lost kernel and/or user mode events during creation of a profile report, default settings for these…
