Some factors to be considered while setting up DB mirroring
Database Mirroring is a software solution for increasing database availability. Database mirroring offers a substantial improvement in availability over the level previously possible using Microsoft SQL Server and provides an easy-to-manage alternative or supplement to failover clustering or log shipping.
However, it is only supported with SQL Server 2005 SP1. If you are using SQL 2005 without SP1, you should not implement database mirroring on a production server http://support.microsoft.com/?kbid=907741
You need to consider the following five factors when setting up database mirroring in SQL Server environment. These factors not only affect the overall performance of the systems involved in database mirroring but also limit the number of databases that can be mirrored on the servers.
- Amount of RAM available on the principal and mirror servers (RAM).
- Processing power of the principal and mirror servers (CPU).
- Bandwidth available for the I/O subsystem on the principal and mirror servers
- Network bandwidth between the two servers.
- Volume of transactions on the principal database that generate transaction log records (i.e. transactions that change the database in any way).
Each of the above issues is discussed in detail below:
- RAM: For further information on how the available physical memory affects the performance of SQL Server refer to the following articles in Microsoft Knowledge Base.
- CPU: Database mirroring is limited by the number of threads available in SQL Server which in turn is directly dependent on the computing power of the machine. By default in SQL Server 2005 and above, the “max worker threads” is 0 which means a dynamic value. This value is automatically adjusted by SQL Server when the computing power changes. SQL Server uses the following formula when calculating max number of worker threads
For 64 bit operating system:
- Total available logical CPU’s <= 4 : max worker threads = 512
Total available logical CPU’s > 4 : max worker threads = 512 + ((logical CPUS’s - 4) * 16)
You can determine the amount of worker threads available on your SQL Server using the sys.dm_os_sys_info (column max_workers_count) dynamic management view (DMV).
You can determine the amount of worker threads used on your SQL Server using the sys.dm_os_schedulers (sum of rows in column current_workers_count) dynamic management view (DMV).
Database mirroring has the following requirements for worker threads
- Principal server: 1 global thread and 2 threads per each of the mirrored databases.
- Mirror server:
- 64 bit architectures: 1 global thread, 2 threads per mirrored databases and one additional thread for each mirrored database for every 4 processor cores.
- On a 4 CPU server it requires 1 global thread+ (2+ 1) threads /mirrored db.
On a 6-8 CPU server it requires 1 global thread + (2+2) threads /mirrored db.
- Witness Server: 2 global threads.
In summary database mirroring requires 2 threads on the principal server and at least 3 to 4 worker threads on the mirror server for each of the mirrored database in addition to 1 global thread on each of them. It is important to note that these calculations only reflect the number of worker threads required for administering database mirroring and additional threads are required for the core database server activity. If you put everything discussed above into perspective, on an x64 system with 32 CPUs we have:
- Max worker threads = 512+ (32 - 4) * 16 = 960
- Max number of mirror databases (The following calculation assumes that you mirror all the databases on one primary and one mirror server. Since the number of thread needed on mirror server are greater than those on the primary, we need to use the mirror server values in our calculations).
- # of threads required per mirrored database = (2+32/4) = 10 threads per database
- Max number of databases that can be mirrored = 960/10 = 96
However the full amount of 960 threads are not available to perform mirroring activities as some threads are assigned to core database activity on the server. From the above calculations it should be clear that
- The more activity you have on the server, the fewer will be the number of databases you are able to mirror.
- The more the number of mirror databases, the fewer will be the number of worker threads that are available to service core database requests and hence the lower will be the overall performance of the system.
As you can see this limit will even be lower on 32 bit systems, since the total number of worker threads available on the same kind of system is just 480.
- HDD IO: If transactions are not hardened on the log drive on the mirror fast enough and you are using high safety, the principal might have to wait for the mirror to acknowledge hardening of log records before transactions can commit, resulting in degraded performance. Also if you decide to mirror a database between two SQL Server instances on the same machine the amount of writes will double thus a disk queue might form (the ideal disk queue length is 0), SQL Server will have to wait for I/Os to complete before transactions can commit, and in extreme cases you might also notice events related to this issue being recorded inside SQL Server errorlogs :
2008-04-22 16:30:02.140 spid6s SQL Server has encountered 2 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [F:\sql data files\xxx.MDF] in database [xxx] (5).
You can determine the amount disk I/O bottlenecks on your SQL Server in the DMV sys.dm_os_schedulers (column pending_disk_io_count).
Tuning the I/O subsystem to allow increased throughput can help alleviate this (e.g. by using a RAID level that offers higher performance, separating databases onto separate RAID arrays), depending on the sophistication of the I/O subsystem. Also makesure that our system is not under memory pressure as this will generate added /O pressure. Also make sure that your system is not under memory pressure as his will generate added IO pressure.
- NETWORK: In order to decide whether the amount of activity is pushing the limits of network bandwidth you need to setup mirroring and evaluate the network bandwidth needs with different scenarios (high safety, high performance) and load. The following Performance Monitor counters can be used in order to establish if the bandwidth is enough or fully utilized:
- Network Interface Object.
- SQLServer:DatabaseMirroring Log Bytes Sent/sec counter on the principal database. This is the rate of log being sent.
- SQLServer:DatabaseMirroring Log Compressed Bytes Sent/sec counter on the principal database (New in 2008). This is compressed size bytes sent and is a subset of the Log Bytes Sent/sec counter.
- SQLServer:DatabaseMirroring Log Send Queue KB counter on the principal database. This is the amount of log yet to be sent to the mirror.
- SQLServer:DatabaseMirroring Log Harden Time (ms) counter on the mirror database (New in 2008). This is the time to write the received log to disk.
- SQLServer:DatabaseMirroring Log Bytes Received/sec counter on the mirror database. This is the rate of log beingreceived.
- SQLServer:DatabaseMirroring Log Compressed Bytes Rcvd/sec counter on the mirror database (New in 2008). This is compressed size bytes received and is a subset of the Log Bytes Received/sec counter.
- SQLServer:DatabaseMirroring Redo Bytes/sec counter on the mirror database. This is the rate that log is being redone.
- SQLServer:DatabaseMirroring Redo Queue KB counter on the mirror database. This is the amount of log that is yet to be redone
- SQLServer:DatabaseMirroring Send/Receive Ack Time counter on the mirror database.
- SQLServer:Databases Transactions/sec counter on the mirror database.
You can also find some more information on some of these performance counters in SQL Server 2005 technical article on database mirroring.
If the implementation of mirroring for testing purposes is not an option you can use one of the following methods to get an idea about the performance of your network.
- Use a third party tool like Iperf to evaluate your network's performance. The instructions to simulate database mirroring performance with this tool are as follows:
- To get help with this tool run:iperf --help
- On the server (principal) run:iperf -s -p5001 -fKB -w6553
- On the client (mirror) run:iperf -c <hostname or ip address of server> -d -t60 -w65536
- Verify that there is sufficient headroom (less than 66 percent utilized, as a rule) on both the CPU and DISK IO on the production server, and the same levels can be supported by the eventual mirror database server.
- Check the network latency using the ping command and explore ways to reduce the same when possible.
- Periodically check the value of is_send_flow_controlled column in the sys.dm_db_mirroring_connection dynamic management view (DMV). If you notice that this value is 1 most of the time, then it is a very good indication that the network latency is affecting mirroring throughput.
To mitigate the impact of network latency on the database mirroring throughput, you should consider the following:
- Gigabit network cards
- Dedicated NICs for mirroring.
- Using the 'Log stream compression' feature in SQL Server 2008 environments.(which is enabled by default). This will cause the mirroring traffic to be compressed before being sent over the network. Though this reduces networkbandwidth requirements for mirroring it will cause the CPU utilization to increase as it will spend time in compression and decompression, and possibly a heavier workload on the principal.
- DATABASE ACTIVITY: The amount of activity on the SQL Server affects the following:
- Number of worker threads that are currently in use (CPU).
- Network bandwidth (NETWORK).
- Amount of writes/reads on the storage media (HDDIO).
- Available memory on the system.(RAM)
As you can see the database activity affects all the core components that the database mirroring depends on and hence it is imperative that you factor this in to your planning when setting up database mirroring in your environment. It is further recommended that you minimize index maintenance activity on the databases that are mirrored by spreading the maintenance tasks over a number of days since it typically generates a backlog of log send queue on the principal database.
A quorum is the minimal relationship among all the connected servers required by a synchronous database mirroring session. Because at least two servers are required for a quorum, when the witness is set the principal server must form a quorum with at least one
other server to keep the database in service, regardless of the safety setting. Normally, if a witness is set, then the safety level is set to FULL as well.
The witness server assists the principal or mirror in forming a quorum. If a witness server is present, a loss of either the principal database or the mirror database leaves two servers to form a quorum. If the principal cannot see the mirror server, but it can form a quorum with the witness, it can keep its database in service. Similarly, if the mirror and witness servers cannot see the principal server, and the mirror server can form a quorum with the witness, the mirror can take on the role of a fw principal server.
The witness is not considered a single point of failure in a database mirroring session, because if the witness server fails, the principal and mirror continue to form a quorum
The way database mirroring figures out whether to initiate a failover is to have all SQL Server instances within a mirroring session send out 'pings' on each of its open mirroring connections every second. If an instance does not receive a ping from one of the other instances in the session within a defined time-out period (called the partner time-out), it considers the other instance as unavailable and takes the appropriate action. The default time-out period is 10 ten seconds.
As you know the default time-out is 10 seconds and the amount of latency between the servers in your case is less than a second you should not be seeing any false fail-over unless there is a network issue or the network latency is more than 10 seconds or if the server does not respond to ping request for more than 10 seconds.
Also, there is an option to increase the time-out value to any value higher than 10 seconds and to lower to any value less than 10 seconds limited to 5 seconds as the lowest. Below command is used to change the time-out value
ALTER DATABASE DBNAME SET PARTNER TIMEOUT 10;
Do not set the time-out value too low, otherwise you run the risk of triggering failovers because of false-failures - especially on heavily-loaded systems that may not be able to
respond within the time-out period.
Setting up Database Mirroring: http://msdn.microsoft.com/en-us/library/ms190941(v=SQL.90).aspx
Prerequisites and Recommendations for Database Mirroring : http://msdn.microsoft.com/en-us/library/ms366349(v=SQL.90).aspx
Troubleshooting Database Mirroring Setup: http://msdn.microsoft.com/en-us/library/ms189127(v=SQL.90).aspx
Database Mirroring Endpoint : http://msdn.microsoft.com/en-us/library/ms179511(v=SQL.90).aspx
Managing Metadata When Making a Database Available on Another Server Instance : http://msdn.microsoft.com/en-us/library/ms187580(v=SQL.90).aspx
Database Mirroring Best Practices and Performance Considerations : http://technet.microsoft.com/en-us/library/cc917681.aspx
Implementing Database Mirroring in SQL Server 2005 across domains: http://www.mssqltips.com/tip.asp?tip=1705
Database Mirroring and Failover Clustering: http://technet.microsoft.com/en-us/library/ms191309.aspx
Database Mirroring in SQL Server 2005: http://technet.microsoft.com/en-us/library/cc917680.aspx
Database Mirroring FAQ: http://technet.microsoft.com/en-us/cc984166.aspx
Database Mirroring Best Practices and Performance Considerations: http://technet.microsoft.com/en-us/library/cc917681.aspx
Although, there is a lot of information on Database Mirroring out there, but the following links provide the best information in the most consolidated format. Microsoft has published a great whitepaper on Database Mirroring which starts from implementation basics and end up with best practices recommendations.: http://www.microsoft.com/technet/prodtechnol/sql/2005/dbmirror.mspx#EGAA
For other useful articles on database mirroring, its implementation, maintenance and removal, refer to this best source of information at MSDN:http://msdn2.microsoft.com/en-US/library/ms177412.aspx
There is no official recommendation from Microsoft on maximum allowed latency for database mirroring, however there were some tests performed providing some guidance on the network configuration and latency. Please refer to below article - section: Network Configuration Best Practices which has some very good explanation with examples: http://technet.microsoft.com/en-us/library/cc917681.aspx#EGAA
You can monitor the database mirroring latency using the perfmon counters. Please refer below article which provides information about the perfmon counters available along with the description: http://msdn.microsoft.com/en-us/library/ms408393.aspx