Massively Parallel Applications

When discussing the evolution of software, I have spent some time discussing genetic algorithms. When trying to think 10 or 20 years into the future, these algorithms provide us with a proven way to explore a large problem domain (caused by the combinatorial nature of possible solutions to a software problem) with much greater efficiency than an exhaustive search (which may be entirely unfeasible). I believe that this is going to grow to be more and more important over time, as the problems we are hoping to solve have less and less obvious solutions.

An area of much more immediate concern is how to mold software to the shift in focus from extremely fast processors to many processors.

A brief digression: Back in the 1990s, I received an email from a colleague who was working on an application that was having performance issues. This was a Visual Basic (6.0 at the time) developer who did not have academic training as a developer. This individual had heard that C++ enabled developing multi-threaded applications, and that multi-threaded applications were objectively “better” than the single-threaded applications that were possible in VB. So, this person contacted me for some “quick” information on how to port the application to C++ and make it multi-threaded, because this would certainly solve the problem quickly and effectively without having to spend all of that time revisiting the architecture and the algorithms used. Unfortunately, it really wasn’t that easy. The application was being run on a single processor machine (even dual processor machines were comparatively rare at the time), and the problem wasn’t that the thread was blocked – the problem was that it was solving the problem in a computationally intensive way. Adding multiple threads would only exacerbate the situation – in addition to a fully loaded CPU, multiple threads would add the resource pressure of context switching. I told this person to leave it as a single-threaded application and look for other ways to optimize, since in this scenario parallel processing was not going to offer any benefits.

This continued to be true for quite some time, and the applications we have on our desktops today certainly reflect that philosophy. I am writing this on a single-processor machine. At home, I have one computer with two virtual processors (hyperthreading), but the rest have a single processor. Since software is dictated in many ways by the hardware used by both the customer and the developer, massively parallel applications are comparatively rare.

But single processor machines are starting to approach a limit. My 3.6 GHz Pentium 4 has a gigantic heat sink sitting on top that whisks heat up using both copper and some kind of compressed fluid to an array of fins that a fan blows directly over the top of, and I still get a blinking processor temperature warning light from time to time. I already don’t have to invest a lot of money heating my office, because my computer does a good enough job of that on its own. I really don’t want processors running faster and hotter – I just want them running faster. That requires additional processors. We are already seeing dual core processors.

[Image: AMD Dual-Core Processor]

I think we can safely expect the trend to continue in this direction.

Again, this process takes its cue from biology. The human brain is able to solve incredibly complex problems. It is estimated that it can process on the order of 2 × 10^16 computations per second. It is able to compute at this level without generating extraordinary amounts of heat because it is massively parallel – the computation is distributed throughout the brain, and not run through a single “super” neuronal computation facility. The up side is that you can scratch your head without receiving third degree burns. The down side is that you have to invest in tools such as a stove or a microwave, rather than just popping food in your mouth and thinking really hard in order to cook it.

The massively parallel nature of the human brain allows it to do some very impressive pattern recognition tasks – tasks which we have yet to fully replicate on computer hardware because it does not support the same number of computations per second. However, in the next few decades, we will most likely reach this level of computation, and we will most likely achieve this by designing our applications to be massively parallel.

The fact remains, however, that creating multi-threaded applications is really hard to do well. In the 1.0 and 1.1 versions of the .NET Framework, you had the option of either creating threads on your own, or leveraging the managed thread pool (which you could do fairly easily using delegates, and not even have to be aware of the existence of the pool). The 2.0 version of the framework introduces the BackgroundWorker class to make it even easier.
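To make the thread pool idea concrete, here is a minimal sketch. It is written in Python rather than .NET, so it only illustrates the general pattern and not the Framework's own API. The function names `slow_square` and `run_in_pool` are made up for this example. The point is that the caller hands work to a pool and collects results without creating or managing threads directly.

```python
from concurrent.futures import ThreadPoolExecutor


def slow_square(n):
    """Hypothetical stand-in for a long-running unit of work."""
    total = 0
    for _ in range(n):
        total += n  # adds n to itself n times, i.e. n * n
    return total


def run_in_pool(values, workers=4):
    """Submit each value to a pool of worker threads and gather the results."""
    # The pool owns the threads; the caller never creates one directly.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the inputs.
        return list(pool.map(slow_square, values))


if __name__ == "__main__":
    print(run_in_pool([1, 2, 3, 4]))
```

One caveat about this sketch: in the standard CPython interpreter, threads mostly help when the work spends its time waiting rather than computing, which echoes the earlier point that adding threads does not automatically make a computationally intensive task faster.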

Side note: from a user experience point of view, if you are developing Windows Forms applications, you should be intimately familiar with the options for running on a separate thread, as blocking the primary UI thread to do some long-running work is really bad form.

There is still room to evolve the platform, however. We still have to publish best practices and educate people about deadlocks and race conditions. There is not yet a common platform for developing and parallelizing algorithms to distribute across hardware resources. (Microsoft Research is working on a project called Dryad whose goal is to do exactly that.)
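For readers who have not run into a race condition yet, the following is a small sketch of the classic case: several threads incrementing a shared counter. It is in Python for brevity, and the `Counter` class and `count_with_threads` function are hypothetical names chosen for this illustration. The increment is a read-modify-write sequence, so without the lock two threads could read the same old value and one update could be lost. Holding the lock around the increment removes that possibility.

```python
import threading


class Counter:
    """Hypothetical shared counter guarded by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The read, add, and write happen as one unit while the lock is held.
        with self._lock:
            self.value += 1


def count_with_threads(num_threads=4, increments=10000):
    """Have several threads increment one counter and return the final value."""
    counter = Counter()

    def worker():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every worker before reading the result
    return counter.value


if __name__ == "__main__":
    print(count_with_threads())
```

With the lock in place the final value always equals the number of threads times the number of increments. A single lock like this cannot deadlock on its own; deadlocks typically appear once code has to hold two or more locks at the same time and different threads acquire them in different orders.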

In the interim, I think it behooves every developer to begin considering the impact of the migration from one really fast processor core to many processor cores. In my opinion, this is going to be a hugely important skill in the coming years, and it is already relevant today.

Comments (3)

  1. richasingh says:

    I want to know what application will face WRP, UAC and session 0 issues on Vista!

  2. cjacks says:

    I’m not sure what you are looking for here. The application compatibility toolkit can help you spot these issues. The Setup Analysis tool and Windows Vista Compatibility Evaluator will spot WRP issues. Standard User Analyzer and the UAC Compatibility Evaluator will spot UAC issues. Application Verifier can help you identify Session 0 issues.