Parallel build scenarios and implementation ideas

The MSBuild Team has started thinking about adding multi-processor support to the MSBuild engine. Currently, MSBuild is single-threaded and does not take advantage of any opportunities for parallel processing during a build. However, most builds inherently have chunks of work that can be done concurrently. By parallelizing these independent chunks of work, we aim to reduce the total build time for all builds, large and small.

To enable parallel builds, we believe we need to support the following scenarios:

  1. Ability to build a large tree of projects in parallel – we call this the “build lab” scenario. In this scenario multiple projects are built concurrently.
  2. Ability to build a Visual Studio solution in parallel. Solutions in Visual Studio can contain multiple projects, and even for simple solutions we should be able to support parallel builds. In this scenario, the independent projects are built concurrently, while the dependent ones are serialized.
  3. Ability to build multiple files in a project concurrently. Some projects have lots of files that need to be operated on by a single tool one-by-one. In this scenario, the tool is invoked concurrently on all the files.
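
To make scenario (2) concrete, here is a minimal sketch in Python of building independent projects concurrently while serializing dependent ones. It is only an illustration of the scheduling idea, not how the MSBuild engine will do it. The names build_in_waves and build_project and the shape of the dependencies dictionary are hypothetical, invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def build_in_waves(dependencies, build_project, max_workers=4):
    """Build projects in dependency order, one "wave" at a time.

    dependencies:  dict mapping a project name to the set of project
                   names it depends on (hypothetical input format).
    build_project: callable that builds one project by name
                   (hypothetical stand-in for a real build step).
    Returns the list of waves; each wave is a sorted list of projects
    that were built concurrently.
    """
    remaining = {name: set(deps) for name, deps in dependencies.items()}
    done = set()
    waves = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while remaining:
            # A project is ready once everything it depends on is built.
            ready = sorted(n for n, deps in remaining.items() if deps <= done)
            if not ready:
                raise ValueError(
                    "dependency cycle or unknown dependency among: "
                    + ", ".join(sorted(remaining))
                )
            # Independent projects in this wave run concurrently;
            # list() waits for all of them and re-raises any failure.
            list(pool.map(build_project, ready))
            done.update(ready)
            for name in ready:
                del remaining[name]
            waves.append(ready)
    return waves
```

For a solution where Data depends on Core, and App depends on Data and Utils, this builds Core and Utils together, then Data, then App. Waiting for a whole wave to finish keeps the sketch short; a real scheduler could start each project as soon as its own dependencies are done.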

We currently believe that scenarios (1) and (2) would provide the “biggest bang for the buck” in build performance. However, MSBuild is a general build orchestration engine and we know some of our customers build things other than source code. For many of these customers, supporting (3) would offer them significant benefits. This is particularly the case if the build calls into older tools that don’t support multi-processor machines natively. As a result, we believe we also need to look into supporting the parallelization of tasks within a build.
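
Scenario (3) can be sketched the same way. The Python snippet below assumes a hypothetical run_tool callable that processes a single file, and fans it out over a list of files. It illustrates the idea only and does not represent any planned MSBuild construct.

```python
from concurrent.futures import ThreadPoolExecutor


def run_tool_on_files(run_tool, files, max_workers=4):
    """Invoke run_tool once per file, several files at a time.

    run_tool: callable taking one file name (hypothetical stand-in for
              launching a single-threaded command-line tool).
    files:    list of file names to process.
    Returns a dict mapping each file name to run_tool's result for it.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input files.
        results = list(pool.map(run_tool, files))
    return dict(zip(files, results))
```

In a real build, run_tool would typically start an external process for each file, so the worker threads would spend their time waiting on those processes while the operating system spreads them across the available processors.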

In addition to thinking about scenarios, we're working on implementation designs (as you can see from the above picture!). On the implementation side, the build mechanism we are considering is completely declarative. The MSBuild engine will not make any attempt to determine what can be parallelized. Instead, we will introduce parallelization “constructs” in the file format that allow a project or targets file author to indicate what can and cannot be parallelized.

For Visual Studio solutions, the parallelization will be automatic in a sense. The targets files we will ship will allow all independent projects in a solution to be built in parallel. As long as the dependencies are correctly expressed through project-to-project references, or through the “Project Dependencies” dialog, the build will parallelize correctly.

Now is your chance to give us feedback! What do your builds look like? How would the three parallelization scenarios above improve your build performance? What do you think about our plans to introduce new constructs? Let us know by replying to this post, or send us an email at msbuild@microsoft.com.

[ Author: Sumedh Kanetkar ]