Ask Learn
Preview
Please sign in to use this experience.
Sign inThis browser is no longer supported.
Upgrade to Microsoft Edge to take advantage of the latest features, security updates, and technical support.
Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
When we were developing the current version of MSBuild, we spent a lot of time analyzing builds to determine where our performance issues lay. The standard logging, even on diagnostic verbosity and with the performance summary enabled (/clp:PerformanceSummary=true on the MSBuild command line) doesn’t give us the kind of information we desired. What we were looking for was a quick way to visualize which projects depended on which other projects, how much time they were taking, and how MSBuild was allocating the work to its worker nodes. Enter the Detailed Build Summary.
Before looking at these graphs, it is important to understand how MSBuild performs its work. When a project is being built, all of its tasks are executed on a worker node. There is always at least one node, present in the MSBuild executable launched from the command line or inside the Visual Studio IDE. This is referred to as the in-proc node. There may be additional nodes created out-of-proc, and are called the out-of-proc or multi-proc nodes. In the case where MSBuild has more projects to build than currently existing nodes, it will create new worker nodes up to the limit specified by the /m switch.
Each node may have any number of projects assigned to it, but only one project at a time will be executing tasks (though there an exception with the Yield mechanism which I won’t get in to here.)
You can enable the Detailed Build Summary for any build by passing the /ds or /detailedsummary switch to MSBuild. This will cause MSBuild to compile and log this information at the end of the log file. For this blog I whipped up a little build tree, built using four MSBuild worker nodes (/m:4), and turned on the log
F:\Bugs\MultiProc>msbuild /tv:4.0 cir1.proj /m:4 /ds /t:ParallelBuild > foo.txt The end of foo.txt after the build… Detailed Build Summary ====================== ============================== Build Hierarchy (IDs represent configurations) ===================================================== Id : Exclusive Time Total Time Path (Targets) ----------------------------------------------------------------------------------------------------------------------------------- 0 : 0.036s 0.295s F:\Bugs\MultiProc\cir1.proj (ParallelBuild) | 1 : 0.047s 0.047s F:\Bugs\MultiProc\cir2.proj () | 3 : 0.209s 0.209s F:\Bugs\MultiProc\cir4.proj () | 2 : 0.210s 0.210s F:\Bugs\MultiProc\cir3.proj () | 4 : 0.212s 0.217s F:\Bugs\MultiProc\cir5.proj () | | 10 : 0.005s 0.005s F:\Bugs\MultiProc\cir8.proj () | . 11 : 0.003s 0.003s F:\Bugs\MultiProc\cir9.proj () | 5 : 0.001s 0.001s F:\Bugs\MultiProc\cir6.proj () . 6 : 0.003s 0.015s F:\Bugs\MultiProc\cir7.proj () | . 7 : 0.004s 0.012s F:\Bugs\MultiProc\cir5.proj () | | | 8 : 0.007s 0.007s F:\Bugs\MultiProc\cir8.proj () | | . 9 : 0.001s 0.001s F:\Bugs\MultiProc\cir9.proj () ============================== Node Utilization (IDs represent configurations) ==================================================== Timestamp: 1 2 3 4 Duration Cumulative ----------------------------------------------------------------------------------------------------------------------------------- 634032540333349050: 0 x x x 0.035s 0.035s 634032540333699050: 1 x x x 0.041s 0.076s 634032540334109050: | 4 2 3 0.006s 0.082s 634032540334169050: 5 | | | 0.001s 0.083s 634032540334179050: 6 | | | 0.002s 0.085s 634032540334199050: 7 | | | 0.002s 0.087s 634032540334219050: 8 | | | 0.007s 0.094s 634032540334289050: 9 | | | 0.001s 0.095s 634032540334299050: 7 | | | 0.002s 0.097s 634032540334319050: 6 | | | 0.001s 0.098s 634032540334329050: x | | | 0.188s 0.286s ### 634032540336209050: x x 10 11 0.003s 0.289s 634032540336239050: x x | x 0.002s 0.291s 634032540336259050: x 4 x x 0.003s 0.294s 634032540336289050: 0 x x x 0.002s 0.296s ----------------------------------------------------------------------------------------------------------------------------------- Utilization: 33.8 96.8 97.7 96.8 Average Utilization: 81.3 Build succeeded. 0 Warning(s) 0 Error(s) Time Elapsed 00:00:00.32 |
The data is split up into two sections – the Build Hierarchy and the Node Utilization. I’ll explain them in order.
The hierarchy section shows all of the projects that were built. Each of the columns contains the following information:
To the left and below each ID number is the tree layout. Each pipe ‘|’ symbol means that the request to the right of it is a dependency (that is, the request above depends on the request to the right directly.) The more pipes, the deeper the dependency tree. A period ‘.’ symbol means that the ID to the left is the last dependency request for the parent (which is the ID above the symbol in that column) which was actually built. So in the above, 0 is the root request and it depends on 1, 3, 2, 4, 5 and 6 directly. Request 6 depends on 7, which depends on 8 and 9. 6 can be said to indirectly depend on 8 and 9 through request 7.
The utilization section shows how MSBuild has allocated requests to build using the nodes available. The columns have the following meanings:
Using this information, we can see that request 0 is first assigned to node 1. It then immediately cedes control to request 1 (one of its dependencies). If we look at request 0, we can see it depends on many other requests. But they are not scheduled at this event because MSBuild creates nodes dynamically, so the other nodes weren’t available yet. The next event we see requests 2, 3 and 4 are all scheduled because those nodes have become available. During this period request 1 continues to execute on node 1. This proceeds for a while until we get down toward the bottom. We can see eventually we run out of work to schedule on node 1, leaving 2, 3 and 4 to keep executing. Request 4 depends on requests 10 and 11, and we can see the point at which node 2 suspends executing request 4 – this is where request 4 must wait on its dependencies. Once those have finished (building on nodes 3 and 4 in this case), request 4 resumes on node 2. By that point it is the last request which request 0 was waiting on. Once it is finished, request 0 resumes on node 1, and then finishes.
At the end, we display some utilization numbers, which tells how well we loaded the worker nodes. In a 100% perfectly parallelizable build, we would like to see 100% for all of those utilizations. In practice this rarely occurs because builds tend to have places where they are more serialized (if you have a common library which takes a long time to build, you will see this.) If you look at your graph and you see one node doing work while no others are and the total duration of that period is long, then that is an indication you have serialization in your build and it may be worth looking at whether that request really should be that long – can it be split up into smaller chunks and have other requests refer to it piecemeal? Can the project itself be made to build faster using better tools? Is the request doing something unexpected?
Another thing you can experiment with when trying to tune your builds is changing the multi-proc node count limit. For instance, sometimes setting the /m number to one more or one less than the number of actual cores you have will enhance scheduling. If you are performing C++ builds, you may also play with the /MP setting on the compiler which enables it to parallelize the processing of C++ files directly.
MSBuild currently uses some heuristics to determine when projects should be scheduled to build. This is especially important when there are more outstanding build requests to schedule than there are nodes to work on them. So even if you do manage to eliminate all serialization, the theoretical minimal build time might not be achieved because we lack the information to make the right decisions. This is an area we are currently working on, and we hope to bring you even more improved build times in the future.
Here in MSBuild we are very motivated to provide improved build analysis tools and intrinsically smarter build systems. The Detailed Build Summary diagnostic output in MSBuild 4 can provide some useful information about how MSBuild sees and builds your projects. Using it you can determine the actual project dependencies and relative build times of all of your projects. This information can then be used to better organize your projects for the purposes of more efficient builds. This functionality only scratches the surface of what is necessary for analyzing more complex builds, but rest assured we are hard at work bringing you those tools.
If you find this information useful, and especially if you decide to actually parse the output of this functionality, let us know. We certainly intend to improve the mechanism, but if it ends up being useful as-is we will want to try to keep it stable so that future releases don’t break your code.
Cliff Hudson
Visual Studio Platform
MSBuild Developer
Please sign in to use this experience.
Sign in