Multi-processor builds in Orcas

Hey again, everyone.  I’m Peter-Michael, a program manager with the VC++ compiler team.  Continuing with the Orcas feature parade, I’d like to talk about the compiler team’s contribution to the product.  Back in June, Jonathan Caves posted about the compiler team’s plans for Orcas.  To recap, the compiler team is using this product cycle to get a head start on our next-generation compiler, the fruits of which we plan on delivering to you beginning with Orcas+1.  In the meantime, a smaller subset of the development team is maintaining the current compiler until that glorious time comes when we can make the big switch over to our new technology.

In that time, we’ve fixed around 180 bugs in the compiler, 71 of those coming directly from MSDN connect feedback.  We’ve continued the standards conformance work that we undertook in VC 8 (as described by Andy Rich in his vcblog post) and fixed several outstanding bugs regarding the mixing of friend declarations with templates.  For example, the following code

template<class T> class A {
      friend void f(const A& a) { }
};

void bar(const A<int>& a) {
      f(a);
}

gave an error (cannot find declaration of f) in VC 8, even though instantiating A<int> declares f as a friend and argument-dependent lookup should then make f(a) a valid call.  In Orcas, we’ve resolved this and other friend-template and friend name-lookup issues.
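To see the lookup rule at work, here is a small variant of the example above that returns a value so the result can be checked.  Treat it as a sketch rather than a test case from our suite: the value_ member, the constructor, and the return type are additions made purely for illustration.

```cpp
// Illustrative variant of the friend example; value_, the constructor,
// and the non-void return type are additions not present in the original.
template<class T> class A {
    T value_;
public:
    explicit A(T v) : value_(v) {}

    // Defined inside the class, so each instantiation A<T> gets its own
    // non-template friend f. The function has no declaration at namespace
    // scope, so argument-dependent lookup is the only way to find it.
    friend T f(const A& a) { return a.value_; }
};

int bar(const A<int>& a) {
    // The argument has type A<int>, so lookup considers the friends
    // declared in A<int> and finds f there.
    return f(a);
}
```

Calling bar(A<int>(42)) on a conforming compiler goes through f and returns the stored value.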

In addition to maintenance work, we found time to implement one feature that we hope you’ll find useful.  Build times are always a major pain point for our customers, so we have done our best to provide technologies that reduce the cost of building large projects, such as the managed incremental build feature (formerly called asmmeta) that Curt Carpenter recently blogged about.  With the growing presence of multi-core computers in both the developer and consumer space, exploiting that added power can help reduce build times significantly.  To that end, we’ve added a new switch to the compiler, /MP, to enable multi-process building at the source file level.

/MP works by exploiting the fact that translation units (a source file coupled with its includes) can be compiled independently of each other (up to link time, when all object files need to be present).  Since we can compile each translation unit independently, we can parallelize the compilation by spawning multiple processes, each handling a batch of source files.  This is precisely what /MP does: when you issue /MPn to the compiler, it spawns up to n processes to handle the source files passed to it.  The compiler is smart enough to spawn only as many processes as necessary, so if you specify three source files on the command line and invoke the compiler with /MP4, the compiler spawns only three processes.  By default, /MP takes n to be the number of effective processors on the machine.
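The process-count rule described above can be sketched in a few lines of C++.  This is not the compiler's actual code: worker_count is a hypothetical helper, passing 0 is an assumed way to model a bare /MP with no number, and std::thread::hardware_concurrency stands in for "the number of effective processors".

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>

// Hypothetical helper modeling the rule described in the post; it is not
// part of the compiler. requested == 0 is an assumption used here to model
// a bare /MP with no explicit n.
std::size_t worker_count(std::size_t requested, std::size_t source_files) {
    if (requested == 0) {
        // Default: one process per effective processor.
        requested = std::thread::hardware_concurrency();
        // hardware_concurrency may return 0 when the count is unknown,
        // so fall back to a single process in that case.
        if (requested == 0) {
            requested = 1;
        }
    }
    // Never spawn more processes than there are source files to compile.
    return std::min(requested, source_files);
}
```

With this sketch, worker_count(4, 3) yields 3, matching the /MP4 example with three source files, and worker_count(2, 10) yields 2.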

The history of this switch is rather interesting.  It started as an intern project way back in Everett (VS 2003), but due to technical difficulties and resourcing issues, it never saw the light of day.  Some of our VC++ devs use the switch to build the compiler and see a 20-30% improvement in build time on a dual-proc machine.  Recently, we gave the switch to a customer to help with their build times, and they too saw a large performance increase, on the order of a 30% gain, when they used /MP together with our current project build parallelization technology.  After seeing this data and the positive feedback, we decided to sit down, polish the switch, and get it out in Orcas.

You can preview /MP and the rest of VC++’s feature set in the March Visual Studio CTP, available as a Virtual PC Image or self-extracting install.  We hope that you enjoy your decreased build times!

Peter-Michael Osera – VC++ Compiler Team