AlwaysOn Availability Groups: Comparing On-Premises Deployments and Azure VM Deployments

Author: Sanjay Mishra

Contributors: Piyush Ranjan, Steven Schneider

Reviewers: Cephas Lin, Lori Clark, Shep Sheppard, Prem Mehra, Juergen Thomas, Luis Carlos Vargas Herring, Alexei Khalyako

 

AlwaysOn Availability Groups provides an effective solution for high availability for SQL Server in on-premises deployments as well as for Windows Azure VM deployments. Whether deployed on-premises or in Windows Azure VM, AlwaysOn Availability Groups fundamentally works the same way. However, if you are familiar with on-premises deployments of AlwaysOn Availability Groups, you may find some differences while deploying in Azure VMs, due to the differences in the way clustering and networking infrastructure works in Azure. In this article we call out these differences.

Need for a Domain Controller

Whether deploying on-premises or in Azure, AlwaysOn AG requires a domain controller (for the Windows Cluster). The difference is that, for on-premises deployments, you very likely have domain controllers, Active Directory and DNS already deployed. If you are deploying AlwaysOn AG in Azure for the first time, or creating a new Azure environment for application, you will need to setup the domain controller, Active Directory, DNS, etc.

Creating the Windows Server Failover Cluster

A Windows Server Failover Cluster (WSFC) is the foundation for the AlwaysOn AG. Therefore, you need a WSFC whether you are deploying on-premises or in Azure. The difference is how to create the cluster. It is common to use the Failover Cluster Manager GUI to create the cluster for on-premises deployments. If you use the Failover Cluster Manager GUI to create the cluster in Azure, you may run into a few issues and the cluster may not come online. The reason is that Azure's DHCP assigns a duplicate IP to the cluster network name (CNN), and that can cause cluster communication issues. AlwaysOn AG doesn't use the CNN, therefore the workaround is to temporarily assign a link-local IP (e.g. 169.254.1.1) to the cluster network name during the cluster creation. These tasks have been wrapped in a Powershell script available for download (https://gallery.technet.microsoft.com/scriptcenter/Create-WSFC-Cluster-for-7c207d3a). Use this script to create the WSFC needed for AlwaysOn AG in Azure.

The [NT AUTHORITY\SYSTEM] Account

If you have used one of the gallery images for creating the Azure SQL Server VMs, you may find that the SQL Server login for [NT AUTHORITY\SYSTEM] doesn't exist, or the account lacks the necessary permissions to create the AlwaysOn Availability Group. Potentially, this may happen in on-premises deployments as well, depending upon your particular SQL Server install, however, we have seen this happen more often in Azure deployments. The solution is described in the KB article: https://support.microsoft.com/kb/2847723, and is integrated into the Tutorial (https://msdn.microsoft.com/en-us/library/jj870963.aspx).

The SQL Server Service Account

If you have used one of the gallery images for creating the Azure SQL Server VMs, you may find that the SQL Server service is running under a built-in account [NT Service\MSSQLSERVER]. You may like to change the SQL Server service account to an appropriate domain account. This is optional, however, be aware that if the SQL Server service account is a built-in account, you will need to use certificates for endpoint authentication (https://technet.microsoft.com/en-us/library/ff878308.aspx#Accounts).

Availability Group Listener

Availability Group Listener is implemented differently in Azure compared to on-premises. In Azure the AG Listener is implemented using the Cloud Service. The VMs participating in an AG are in a cloud service – the cloud service has a public name and a public IP address. The process for creating the AG Listener in Azure (Tutorial: https://msdn.microsoft.com/en-us/library/dn376546.aspx) is very different from how you create the AG Listener for on-premises deployments. Due to the differences in implementation, AG Listener in Azure has some limitations compared to on-premises:

Only One AG per Cloud Service, if using AG Listener

Since the AG Listener is implemented using the cloud service, you can have only one AG per set of VMs participating in a cloud service. For on-premises deployments, some customers deploy multiple AGs across the same set of machines.

Figure 1: Possible on-premises AG configuration whether using AG Listener or not

 

Figure 2: Windows Azure AG configuration when using AG Listener

Can't connect to the AG Listener from the AG secondary VMs

Because the AG Listener uses DirectServerReturn on the cloud service endpoint, you can't connect to the AG Listener from the VM(s) in the AG secondary role. In Figure 2, if Machine1 is AG primary and Machine2 is AG secondary, you will get a successful connection to the AG Listener from machine1:

C:\Machine1>hostname

Machine1

 

C:\Users\sqlsvc> sqlcmd -E -S Listener_1

1> select @@servername

2> go

 

------------------------------------------------

Machine1

 

But you will get a connection failure if you attempt to connect to the AG Listener from Machine2:

C:\Machine2>hostname

Machine2

 

C:\Machine2>sqlcmd -E -S Listener_1

Sqlcmd: Error: Microsoft ODBC Driver 11 for SQL Server: TCP Provider: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.

Sqlcmd: Error: Microsoft ODBC Driver 11 for SQL Server: Login timeout expired.

. . .

If using ACL on the AG Listener Endpoint, the AG Listener port number must be different from the SQL Server instance endpoint port number

Since the AG Listener uses a public endpoint on the cloud service, it is strongly recommended to use ACL (https://msdn.microsoft.com/en-us/library/windowsazure/dn376541.aspx) on the endpoint.

Using ACL has a side effect on how you specify port numbers for the SQL Server instances and for the AG Listener endpoint. For on-premises deployments, you could specify the same port number for the SQL Server instances participating in an AG, as well as for the AG Listener (https://blogs.msdn.com/b/sqlcat/archive/2014/02/03/alwayson-availability-groups-listener-named-instances-port-numbers-etc.aspx). However, when using ACL on an AG Listener in Azure, that is not an option. The port number for the SQL Server endpoint, and the port number used for the AG Listener must be different.