AlwaysON - HADRON Learning Series: Maximum Failovers Within Specified Period

I can't take credit for all of this content, as much of the investigation was done by Curt Mathews (SQL Server Escalation Engineer).

We are finding that folks want to test the failover abilities of AlwaysON, but after a single failover it no longer seems to work. This is because of the default cluster policy of "Maximum specified failovers in the specified period." That is a lengthy way of saying: avoid a ping-pong effect of the availability group moving back and forth between nodes.

For a new availability group, the default is 1 failover within a 6-hour period. This may be exactly what you want in a production environment to avoid a ping-pong effect of failovers, but it can be disconcerting when you break the seal on AlwaysON, fail over once, then try a second failover and it does not work. You may need to adjust the values to better suit your testing or production requirements.
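To make the behavior concrete, here is a minimal Python sketch of a "maximum failovers in a period" rule. It is only a toy model for reasoning about the policy, not how the cluster service actually implements it. The class name FailoverPolicy, its method try_failover, and the timestamps are all hypothetical; the defaults mirror the 1 failover per 6 hours described above.

```python
from collections import deque
from datetime import datetime, timedelta


class FailoverPolicy:
    """Toy model of a 'maximum failovers in a period' rule (hypothetical class)."""

    def __init__(self, max_failovers=1, period=timedelta(hours=6)):
        self.max_failovers = max_failovers
        self.period = period
        self._history = deque()  # timestamps of failovers that were allowed

    def try_failover(self, now):
        # Forget failovers that have aged out of the sliding window.
        while self._history and now - self._history[0] >= self.period:
            self._history.popleft()
        # Refuse when the window already holds the maximum number of failovers.
        if len(self._history) >= self.max_failovers:
            return False
        self._history.append(now)
        return True


start = datetime(2024, 1, 1, 9, 0)  # arbitrary illustrative start time
policy = FailoverPolicy()           # defaults: 1 failover per 6 hours
results = [
    policy.try_failover(start),                          # first failover
    policy.try_failover(start + timedelta(minutes=10)),  # second, too soon
    policy.try_failover(start + timedelta(hours=7)),     # window has expired
]
print(results)  # [True, False, True]
```

In this model, the second attempt is refused because the single allowed failover was already used 10 minutes earlier. Raising max_failovers or shortening period, which is the analogue of adjusting the policy values for testing, lets the back-to-back failover through.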

image

Using cluster.exe log /g, you can generate the cluster log and review the failover activity. Shown below is an example of the cluster log message indicating that the failover threshold was exceeded and that failover won't be attempted at this time.

image

The following is a mock-up of what the Windows Event Log may provide in future operating system releases.

image

Bob Dorr - Principal SQL Server Escalation Engineer