Availability Group Listener in Windows Azure Now Supported! (And Scripts for Cloud-Only Configuration)

If you haven't noticed, AlwaysOn Availability Groups in Windows Azure now supports availability group (AG) listeners. Configuring the AG listener in Windows Azure is a multi-step process that includes:

  1. Creating a load-balanced endpoint for the Windows Azure VMs that host AG replicas.
  2. Ensuring that hotfix KB2854082 is installed on all cluster nodes (including any node that does not host an AG replica).
  3. Opening the probe port in the cluster nodes that host AG replicas.
  4. Manually creating a client access point in the cluster and configuring the IP address parameters to use the cloud service IP address, the probe port, etc.
  5. Setting the listener port in SQL Server.

If you are looking for an easier way to configure the listener in Windows Azure, I've published a script at the Script Center. This script currently has limited applications, but hopefully I can expand the scenarios as time goes on – and if you shout in my ear. If your scenario fits all the requirements, then I hope this script can help simplify the process of listener creation. If you don't fit all the requirements, you may still be able to "scriptify" most of the steps. So just read on.

Now back to the requirements for this script, the biggest limitations are as follows:

  • All AG nodes are running in Windows Azure and in the same subnet – Simply put, if the same listener requires multiple IPs, you can't use this script. This means the script excludes to all multi-subnet scenarios such as hybrid IT.

  • All cluster VMs were created with PowerShell Remoting enabled – This part gets a little hairy, so get ready. If after GA (4/16) you created your cluster VMs using PowerShell, then they are all PowerShell Remoting enabled. If however, you created your VMs using the portal, you had a choice until very recently to enable PowerShell Remoting by means of a small check box. If you didn't check that box, I won't say you lose, but you definitely can't use this script, at least not without manually enabling PowerShell Remoting on your VMs and tweaking the script. My personal opinion is: not worth the trouble.

    Now, the Azure team make a small tweak on 7/16 that enables PowerShell Remoting for all portal-created VMs without giving you the option. So if you created your VM after 7/16, then you win!

So enough for the $winners, now for the -not($winners) – those who can't use the script because of the above limitations. I'd like to provide some PowerShell snippets that you can run and that hopefully can make things simpler as well. Mainly, there are three scripts: one to run on your local client from which you normally administer your Windows Azure deployment, one to run on all your cluster VMs, and one to run on the primary replica VM. Understand that these scripts are much more "quick and dirty" than foolproof, so don't expect the validation and error checking that you find in the downloadable script. Also, you should have created a working AG in Windows Azure before using these steps. So now, without further ado, here are steps:

  1. Windows Azure PowerShell June 2013 or later installed on the local client. Download at https://go.microsoft.com/?linkid=9811175&clcid=0x409.

  2. On your local client, copy and paste the following script into a Windows Azure PowerShell session to configure LB endpoints and DSR each AG node (not necessarily each WSFC node)

    # Define variables

    $AGNodes = "VM1","VM2","VM3" # all AG nodes should be included

    $ServiceName = "MyCloudService" # the name of the cloud service that contains the AG nodes

    $EndpointName = "MyEndpoint" # name of the endpoint

    $EndpointPort = "10000" # public and private port to use for the endpoint

     

    # Configure a load balanced endpoint for each node in $AGNodes, with direct server return enabled

    ForEach ($node in $AGNodes)

    {

       Get-AzureVM -ServiceName $ServiceName -Name $node | Add-AzureEndpoint -Name $EndpointName -Protocol "TCP" -PublicPort $EndpointPort -LocalPort $EndpointPort -LBSetName "$EndpointName-LB" -ProbePort 59999 -ProbeProtocol "TCP" -DirectServerReturn $true | Update-AzureVM

    }

  3. Connect to RDP session for each WSFC node and download hotfix KB2854082 to a local directory.

  4. In the RDP session for each WSFC node, copy and paste the following script into an elevated PowerShell session to install hotfix 2854082 and open the probe port in the firewall if the node is an availability group node. Be careful to run the script to completion on each node before moving onto the next node.

    # Define variables

    $ag = "AG1" #Availability group name

    $hotfixpath = "<localpath>" #Hotfix's .msu file path

    $listenerPort = "10000" # Listener port. Same as the endpoint port.

     

    Import-Module FailoverClusters

    # Stop the cluster service

    Stop-ClusterNode -Name $env:COMPUTERNAME

    # Run update

    & $hotfixpath /quiet

    # Wait for update to finish

    while ((Get-Process | where {$_.ProcessName -eq "wusa"}) -ne $null)

    {

       Write-Host "Waiting for update to complete..."

       Start-Sleep -Seconds 10

    }

    # Start the cluster service

    Start-ClusterNode -Name $env:COMPUTERNAME

    # Check if VM is an AG node and open the port port if true

    If (Get-ClusterOwnerNode -Group $ag | where {$_.OwnerNodes -contains $env:COMPUTERNAME})

    {

       netsh advfirewall firewall add rule name='Load Balance Probe (TCP-In)' localport=59999 dir=in action=allow protocol=TCP

       netsh advfirewall firewall add rule name='Availability Group Listener (TCP-In)' localport=$listenerPort dir=in action=allow protocol=TCP

    }  

  5. In the RDP session for the primary replica VM, copy and paste the following script into an elevated PowerShell session

    # Define variables

    $ag = "AG1" # The availability group name

    $serviceName = "MyCloudService" # The cloud service name

    $networkName = "Cluster Network 1" # The cluster network name, usually "Cluster Network 1" if the nodes are in the same subnet

    $listenerPort = "10000" # Listener port. Same as the endpoint port.

     

    $aglistener = $ag + "Listener"

    $agendpoint = (Resolve-DnsName -Name "$serviceName.cloudapp.net").IPAddress

    Import-Module FailoverClusters

    # Add IP address resource for the listener to AG resource group. The probe port is set so the AG owner node will respond to probes from Windows Azure.

    Add-ClusterResource "IP Address $agendpoint" -ResourceType "IP Address" -Group $ag | Set-ClusterParameter -Multiple @{"Address"="$agendpoint";"ProbePort"="59999";SubnetMask="255.255.255.255";"Network"="$networkName";"OverrideAddressMatch"=1;"EnableDhcp"=0}

    # Add Network Name resource for the listener to AG resource group

    Add-ClusterResource -Name $aglistener -ResourceType "Network Name" -Group $ag | Set-ClusterParameter -Multiple @{"Name"=$aglistener;"DnsName"=$aglistener}

    # Set dependency for the Network Name resource on the IP address resource

    Get-ClusterResource -Name $aglistener | Set-ClusterResourceDependency "[IP Address $agendpoint]"

    # Start the listener resource

    Start-ClusterResource -Name $aglistener

    # Set dependency for the AG resource group on the listener's network name

    Get-ClusterResource -Name $ag | Set-ClusterResourceDependency "[$aglistener]"

    # Change port number on the listener to 1433

    Set-SqlAvailabilityGroupListener -Path SQLSERVER:\SQL\$env:COMPUTERNAME\DEFAULT\AvailabilityGroups\$ag\AvailabilityGroupListeners\$aglistener -Port $listenerPort

  6. Test connection to listener from a domain-joined VM that is not in the same cloud service (DSR not supported from within the same cloud service). Use a longer login timeout since network messages are traversing the VM's public endpoint. You can use sqlcmd or SSMS.

    sqlcmd –S "<ListenerName>,<EndpointPort>" -d "<DatabaseName>" -Q "select @@servername, db_name()" -l 15

  7. Fail over the AG and test the listener connection again. The query above should succeed and return a different server name.

The main HADR topic for SQL Server in Windows Azure: High Availability and Disaster Recovery for SQL Server in Windows Azure Virtual Machines, will be updated in the near future and will include a GUI tutorial on how to configure this.