AlwaysOn Availability Groups Generate Network Traffic with No User Activity

 

Author: Kevin Cox

Contributors: Min He, Steve Lindell

Reviewers: Sanjay Mishra, Juergen Thomas, Jimmy May

 

An availability group requires pings and status checks across the different servers involved, even when the databases are idle. This traffic accounts for approximately 500 bytes per database in the group. The PerfMon counter that tracks the activity is “Bytes Received from Replica/sec”. Books Online was recently updated with a high-level description; the purpose of this blog post is to provide a bit more detail.

Books Online: https://msdn.microsoft.com/en-us/library/ff878472.aspx

Counter Name: Bytes Received from Replica/sec

Description: Number of bytes received from the availability replica per second. Pings and status updates will generate network traffic even on databases with no user updates.
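PerfMon shows this counter as a per-second rate. If you instead read the raw value from the sys.dm_os_performance_counters DMV, "/sec" counters are exposed there as cumulative totals, so the rate has to be derived from two samples. The following is a minimal Python sketch of that calculation only; the function name and the sample values are hypothetical, and collecting the samples from the DMV is left out.

```python
def rate_from_samples(first_value: int, second_value: int, elapsed_seconds: float) -> float:
    """Derive a per-second rate from two cumulative counter samples.

    first_value and second_value are assumed to be raw cumulative readings
    of "Bytes Received from Replica/sec" taken elapsed_seconds apart.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    if second_value < first_value:
        # A cumulative counter should not decrease between samples;
        # a drop usually means the counter was reset in between.
        raise ValueError("second sample is smaller than the first")
    return (second_value - first_value) / elapsed_seconds


# Hypothetical samples taken 10 seconds apart on an idle system.
idle_rate = rate_from_samples(1_000_000, 1_080_000, 10)
print(f"{idle_rate:.0f} bytes/sec")
```

Sampling over a longer interval smooths out bursts, which helps when trying to establish an idle baseline.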

 

The key to understanding this counter is that the traffic is per database. On one customer project with 40 databases in one availability group, this counter showed about 8 KB/sec when there was no user traffic. This traffic consists of bidirectional pings and status checks.
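Because the traffic scales with the number of databases, a quick calculation can help decide whether an observed value is plausibly idle background traffic. The sketch below is only an illustration: the function names are hypothetical, and it treats the customer figure as roughly 8,000 bytes per second, which is an assumption about what "8k" means.

```python
def per_database_rate(total_bytes_per_sec: float, database_count: int) -> float:
    """Split an observed idle rate evenly across the databases in the group."""
    if database_count <= 0:
        raise ValueError("database_count must be positive")
    return total_bytes_per_sec / database_count


def expected_idle_rate(database_count: int, per_database_bytes_per_sec: float) -> float:
    """Project the idle rate for a group, given a measured per-database rate."""
    if database_count < 0:
        raise ValueError("database_count must not be negative")
    return database_count * per_database_bytes_per_sec


# Customer example from above: 40 databases, about 8,000 bytes/sec (assumed).
observed_per_db = per_database_rate(8_000, 40)

# Projecting the same per-database rate onto a hypothetical 100-database group.
projected = expected_idle_rate(100, observed_per_db)
print(observed_per_db, projected)
```

Measuring your own per-database rate on an idle system is more reliable than reusing these numbers, since the exact figure may vary between environments.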

What does the primary do with the data it gets back from pings and status checks? It displays the information on the Always On Dashboard. There are two main sources for the dashboard data:

  1. sys.dm_hadr_database_replica_states. This runs every second unless the primary is too busy, in which case it runs as soon as it can. It is used to display the information from the latest status checks, but it does not drive any network traffic.
  2. sp_server_diagnostics. This is called by the Availability Group (AG) Resource DLL, which runs only on the AG primary node and connects only to the local instance. It does not contribute to the network traffic mentioned above, but it is useful to know that it runs periodically according to the HEALTH_CHECK_TIMEOUT setting. All the availability groups on one instance share a single resource DLL.

Hopefully, with this information you will be able to understand why the Bytes Received from Replica/sec counter is showing activity even when there is no user activity.