Hi, my name is Kurt Friedrich and I have been the Product Unit Manager for Clustering and High Availability for about 6 years. Previously, I worked at Digital Equipment on VAXclusters and also at Tandem on their clusters. I have always been a believer that computing services should never go down, and besides, I really enjoy the technical challenges of keeping multiple systems in perfect state synchronization. I joined Microsoft just in time for the final quality cycles for Server 2003, and since then we have been building the best release of clusters and NLB ever, Windows Server 2008. The team is excited and pleased with the large improvements we have made, and we sure hope our customers will be too.
The prior releases of clustering, what we commonly called “MSCS”, were targeted towards larger enterprise companies with a dedicated staff of specialists to install and operate MSCS clusters. While there have been some product issues over the prior releases, I think it is quite accurate to say that the Server 2003 R2 release has been an extremely solid platform that is delivering high availability to thousands of installations.
Because of the recent rapid growth of Windows Compute Clusters, calling our HA product “server clusters” was causing confusion. So with Server 2008, we are now calling HA clusters “WSFC” (Windows Server Failover Clustering).
Our primary goal for WSFC as part of the 2008 release was to extend the benefits of HA to a much broader market. By making hardware selection, software installation, and the operation of clusters considerably easier, we want any general system administrator, in any mid-size company, who is capable of installing Windows Server to be able to take advantage of WSFC with confidence. No specialized training should be required. This new approach, in conjunction with the many lower cost hardware options now available, should pave the way for mid-size companies to get the HA benefits that only large companies previously enjoyed. I encourage all companies to think about the total cost in lost productivity and customer dissatisfaction incurred from one hour of downtime on your systems. Generally, that amount is considerably more than the incremental cost of clustering.
Other team members will be adding to this blog with more details of the many improvements, but allow me to just introduce a few.
Clusters are now very easy to create. To create a 2-node cluster, we have gone from 23 screens down to 7 screens, and some of those are just read-only informational. What you actually have to do is enter the names of the systems you want to join into the cluster on one screen, enter a name for the cluster on another screen, and that is it. The cluster is created for you. It is just as simple to build a 16-node cluster. And yes, I did say 16, as we have increased the number of clusterable nodes from 8 to 16 in this release.
Once you have built the cluster, it is much easier to manage as well. Management is now done from an MMC snap-in that has the same look and feel as the many other MMC Server snap-ins, so you don’t have to learn something new for clusters. With previous releases of clustering, creating a clustered file server with file shares was a bit of a challenge. It required creating a “group”, several “resources” within the group (File Share IP Address, File Share Access Name, Shared Volume), assigning possible owners for all of these objects, specifying the dependency tree for these objects, bringing the group on-line, and then creating another “resource” (MSCS Share), and defining a path for it. So what is a “group”? A “resource”? Who should be possible owners? What is a dependency tree and what is the correct form for it?
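To give a feel for what that dependency tree amounted to, here is a minimal Python sketch. It is not cluster code and does not call any clustering API. The resource names are the ones listed above; the dependency edges between them and the `online_order` helper are assumptions made for illustration only. The idea is simply that a resource can only come online after everything it depends on is already online.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Resource names come from the paragraph above. The edges are an
# illustrative assumption, not an export from a real cluster:
# each key lists the resources that must be online before it.
DEPENDS_ON = {
    "File Share IP Address": set(),
    "Shared Volume": set(),
    "File Share Access Name": {"File Share IP Address"},
    "MSCS Share": {"File Share Access Name", "Shared Volume"},
}


def online_order(depends_on):
    """Return an order in which every resource follows its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


order = online_order(DEPENDS_ON)
for step, resource in enumerate(order, start=1):
    print(step, resource)
```

If the edges were specified wrongly, for example with two resources depending on each other, `static_order()` would raise a `CycleError`, which is roughly the kind of mistake an administrator had to avoid by hand.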
Did you really want to learn all that and specify that? We decided the answer is no! So with Windows Server 2008, here is what you do instead. You run the “I want to create a file server” wizard. It asks you for the name your users will use to access the file server, you select the disks you want to use from a list of unassigned disks, and you are done. That’s it! Then, using the normal File Explorer, you can right-click on any folder contained on those disks and, just like on a Vista system, make it a shared folder. Only now it is a highly available file share!
The last item I want to mention in any detail is the new model for selecting hardware configurations on which Microsoft will support clustering. With Windows Server 2003, systems are only supported by Microsoft if one of our partners has specified the complete configuration in great detail (down to the firmware version of the HBAs, for example), run an exhaustive, long-running set of tests on that exact configuration, and supplied proof of this to Microsoft, after which the configuration is published on a list of all such systems in the Windows Catalog. I won’t go into the difficulties that this has caused for both our partners and our customers, but the list is long. Here is one of the most serious. Suppose you bought a 2-node cluster in April 2007, and in November 2007 you decided you wanted to expand it to a 3-node cluster. However, the particular CPU system you ordered in April was no longer offered for sale, as your vendor had replaced it with a newer, faster model. Or maybe another vendor was offering a terrific sale price on a new CPU and you wanted to add one to your cluster. Well, these actions would often make that cluster “non-compliant”, as no such “mixed” 3-node system is listed in the Windows Catalog.
New in Windows Server 2008 is a Validation feature that is included in the new cluster management MMC snap-in. One of the key uses of this new tool is to pre-validate a hardware configuration. Before you create a cluster, you run this tool and specify which nodes you intend to cluster. The tool verifies a very long list of hardware and software requirements and configuration settings. This is very useful, since we have seen that about half of the cluster problem calls into Microsoft PSS were due to configuration issues. The new tool will zero in on problems and allow you to correct the situation quickly, often without requiring any support from your hardware supplier or Microsoft. Just as important, even if you have mixed hardware from different vendors, if the configuration passes validation, then Microsoft will offer software support for that cluster.
As I said, we will follow up with more notes about other features, such as how the quorum disk is no longer a single point of failure, new IPv6 support, and Multi-Site Cluster enhancements. We will also have a blog about the improvements in NLB, which include IPv6 support, usage of the new high performance TCP protocol stack, and the new NLB Manager.
We are pleased with this release and looking forward to even more improvements in subsequent releases. Certainly we will continue to focus on ease of use, and we are doing exciting new work in conjunction with the Hyper-V team. We are striving for more customer-focused designs, and to achieve this, we will be developing a stronger customer connection. We want to know how we can make a better failover cluster product for you!