Building High Performance, Highly Available SQL Servers on Azure


Editor’s note: The following post was written by SQL Server MVP Warner Chaves as part of our Technical Tuesday series.

The cloud is ready for SQL Server enterprise-grade workloads. This is not my opinion but a simple fact that comes from helping multiple clients in different industries move very intensive SQL Servers to Azure. A couple of years ago the guidance was to focus on moving your Development or Test servers but now a days with proper planning and configuration you can deploy a production SQL Server confidently and easily.

In this article I’m going to focus on the two pillars of a SQL Server deployment: performance and availability. Of course we want our SQL Servers to be fast but we also want them to be highly available and ready to serve applications any time of the day, any day of the year. And we want to do these things in a cost efficient way. I will discuss the different options Azure offers to achieve these goals.

Building for Performance
The exact virtual machine size that you’ll need depends on your SQL Server size and load, however there are some best practices and recommendations that apply to any VM that you want to optimize for performance.

First, for high performance workloads you want to be looking at VM sizes for Microsoft Azure Virtual Machines. In my opinion, for high performance workloads the options are either a DS machine or a G series machine. The pick between the two of them right now comes down to differences in the CPU models, the amount of RAM per core and the type of storage allowed as we’ll see below.

Compute and Memory
On the compute front, the DS machines come with Intel Xeon E5-2660 2.2 GHz processors that are 60% faster than the previous CPU models used on the A tier. The G ones however come equipped with more powerful Intel Xeon E5-2698 2.3 GHz processors for more compute demanding workloads. Most SQL Server workloads are more IO bound than CPU bound but if yours doesn’t follow this rule and is heavier on CPU then a G series model could be better.

The amount of RAM per core also changes between the DS and G series with the G series coming equipped with more RAM per core. For example, the following 3 machines all come with 4 cores but differ based on the RAM provided:
• DS3: 14GB of RAM.
• DS12: 28GB of RAM.
• G2: 56GB of RAM.

The more RAM the more expensive the machine is, so you need to make a choice that makes sense for the size of your databases and your workload. The pricing details for the VM sizes can be found on pricing page, here.

Storage
Both DS and G series come with a temporary SSD drive attached to the VM with a size dependent on the specific VM model you choose. The G series come with larger temporary SSD drives at a higher cost. Since this drive is temporary it should only be used for storing a database like Tempdb on SQL Server. Or if using SQL Server 2014 or up, you can deploy a Buffer Pool Extension file on this drive.

For permanent storage there are two options: Page Blob storage and the recently introduced SSD Premium Storage.  Page blob storage volumes provide performance of approximately 500 IOPS, up to 60 MB/sec and variable latency. Depending on your VM model you’ll be able to attach a variable number of volumes to it as well. For example, a D3 4 core machine allows attaching up to 8 of these volumes, whereas a 32 core G5 allows up to 64. Using Windows Server Storage Spaces you can also stripe these volumes to provide higher throughput to a single Windows disk. Page blob storage is paid by the amount used and the amount of IO transactions.

Premium storage is more recent and right now only available on the DS series VMs. This type of storage is SSD-based and can sustain higher IOPS and throughput with lower latency than the classic Page Blob storage. Premium volumes come in 3 different performance levels (July 2015):

The amount of these volumes that you can attach to a single VM goes from 2 on a DS1 all the way to 32 on a DS14. You can stripe these as well to present them as one disk, though keep in mind that specific VM sizes will have a limit to the amount of IOPS and MB/sec that they can go up to. You can see those limits here: https://azure.microsoft.com/en-us/documentation/articles/storage-premium-storage-preview-portal/.
Premium storage unfortunately is not yet available for all regions so this could be a big factor in your decision. At the time I’m writing this (July 2015) these are the regions where you can create a DS series machine: West US, East US 2, West Europe, South East Asia, Japan West. Microsoft is constantly adding more capabilities to each region so make sure to check the Azure portal for the latest information.
Also note that the published Best Practices from Microsoft for data disks is to have no caching for Page Blob disks and Read-Only caching for Premium Storage disks. Refer to this article for the full details: https://msdn.microsoft.com/en-us/library/azure/dn133149.aspx.
Putting all the information together, here are two example Virtual Machine configurations:

Configuring for Availability
SQL Server includes several High Availability and Disaster Recovery solutions right out-of-the-box that work well within Azure and provide different levels of resilience to suit different Recovery Time Objective (RTO) and Recovery Point Objective (RPO) requirements. These solutions are log shipping, database mirroring and AlwaysOn availability groups.

Regarding storage redundancy, locally redundant storage should be used so Azure will keep 3 copies of your Virtual Hard Disks. Geo-redundant storage should not be used for SQL Server because write-ordering is not guaranteed with this option. For geographic redundancy it’s recommended to use a SQL Server technology like the ones mentioned.

For new Enterprise grade deployments the best solution is to go with SQL Server 2014 Enterprise and AlwaysOn Availability Groups. For example, support for multi-subnet clusters in Windows Server 2012 and above means we can deploy two nodes to provide high availability on one Azure region and then another third node for disaster recovery on a second region.

One concept that is critical to understand in Azure is Availability Sets. An Availability Set is a logical grouping of virtual machines to maximize availability in the event of planned or unplanned downtime of the Azure physical host. Virtual machines inside an Availability Set are assigned and Update Domain and a Fault Domain and these govern the rules as to where the Virtual Machine is located in case of planned or unplanned maintenance. For example, if we have two SQL Servers in a Windows cluster, we can have them in different Update and Fault domains so that if planned or unplanned maintenance happens for one machine then the other one will not be affected and will be able to take over.

This is what the solution would look like: 


 
This is a 3 node Windows Server 2012 Failover Cluster called SQLCLUSTER running SQL Server 2014 Enterprise and using AlwaysOn Availability Groups to provide the redundancy and data synchronization capabilities. The Primary site has two nodes and a file-share witness that are part of the same Availability Set for fast local failover. There is also a third node that on a second Azure region that serves as the Disaster Recovery location. SQL1 replicates synchronously to SQL2 and provides automatic failover while SQL3 is being replicated to asynchronously and is able to do a manual failover if a disaster strikes.

Final thoughts
Azure has now matured to the point where critical SQL Server workloads can be designed and implemented on the platform with ease. Both performance and availability requirements can be met with the latest offerings like Premium Storage and SQL Server technologies like AlwaysOn Availability Groups.

The key to a successful deployment is in documenting the performance and availability requirements clearly and then comparing against the different virtual machine and configuration options mentioned in this article. If your organization is thinking of leveraging the cloud for efficiency and velocity, SQL Server can definitely go there and you can make sure that it does so without compromising any performance or availability.

About the author

Warner is a SQL Server MCM and SQL Server Principal Consultant at Pythian, a global Canada-based company specialized in data and infrastructure services. A brief stint in .NET programming led to his early DBA formation working for enterprise customers in Hewlett-Packard ITO organization. From there he transitioned to his current position at Pythian, managing multiple customers and instances in many versions and industries while leading a highly talented team of SQL Server DBAs.

Comments (7)

  1. Nathan Chase says:

    What's the difference in price/performance now in running SQL Server in a VM vs. utilizing SQL Azure Database (the cloud-based SQL)?

  2. Daniel says:

    Can you speak to cases where this makes sense over PaaS SQL offerings?  I have a hard time explaining to some customers how to make the decision between IaaS and PaaS for their SQL workloads.

  3. Ilya Geller says:

    SQL, Structured Query Language obtains and uses patterns from EXTERNAL queries and statistics on how often they are used; neither the queries, nor patterns, nor statistics have anything to do with data itself.

    I, however, discovered and patented how to structure any data without SQL, the queries - INTERNALLY: Language has its own INTERNAL parsing, indexing and statistics and can be structured INTERNALLY. (For more details please browse on my name ‘Ilya Geller’.)

    For instance, there are two sentences:

    a) 'Sam!’

    b) 'A loud ringing of one of the bells was followed by the appearance of a smart chambermaid in the upper sleeping gallery, who, after tapping at one of the doors, and receiving a request from within, called over the balustrades -'Sam!'.'

    Evidently, that the 'Sam' has different importance into both sentences, in regard to extra information in both. This distinction is reflected as the phrases, which contain 'Sam', weights: the first has 1, the second – 0.08; the greater weight signifies stronger emotional ‘acuteness’; where the weight refers to the frequency that a phrase occurs in relation to other phrases.

    SQL cannot produce the above statistics and has no my other novelties – SQL depicts data from outside, and no portrait can create an ideal representation of its object - SQL Industry standard is obsolete and out of business.

  4. Chris Marisic says:

    Needs more pictures (serious comment)

  5. warCHE says:

    @Nathan and @Daniel: the Azure SQL DB service is getting closer and closer to the on-the-box SQL Server but remember you're paying per database and not per instance. So you have to do the math and figure out if it makes sense for whatever amount of databases that you already have on your SQL instance. And feature parity is not 100% (yet) so there might be some show stoppers that you're using that won't allow usign the PaaS offering.

    Finally, if you do a test of running your db on the P3 performance level and it's still not good enough then you will need to build your own SQL Server anyway.

  6. warCHE says:

    Hi, this is the author. I just want to give a couple of quick updates for people that might find this article now (September 2015).

    1) Azure SQL DB has gone up to P11 service tier so there are more workloads that can be moved to PaaS.

    2) Elastic pools are out now on preview, again, this fits the "instance" model on-premises and worth investigating.

    3) There's a new GS series of machines now in Preview that mixes the CPU and RAM power of the G series with Premium storage. This wasn't out when I wrote the article.

    Cheers!

    Warner

Skip to main content