Merge Replication Capacity Planning


 

Merge Replication Capacity Planning

Chris Skorlinski
Microsoft SQL Server Escalation Services

I was recently asked for recommendations for Merge Replication capacity planning project.  While I spend most of my time tuning and troubleshooting Replication environments, I decided to post some thoughts on about stress testing and capacity planning.  I hope this posting helps as you are designing your Merge Replication environment. 

Background Reading

Merge Replication Performance Tuning and Optimization (bit out dated, but still a good read)
http://technet.microsoft.com/en-us/library/cc966386.aspx

Diagnosing and Troubleshooting Slow Partitioned Merge Processes (better article)
http://technet.microsoft.com/en-us/library/cc966387.aspx

Merge Replication Performance Improvements in SQL Server 2005
http://technet.microsoft.com/en-us/library/cc966388.aspx

Merge Replication Performance and Scalability Cheat Sheet  (great list of best practices)
http://robtiffany.com/tag/merge-replication
http://robtiffany.com/sql-server-compact/meap-mobile-merge-replication-performance-and-scalability-cheat-sheet

Collection of Merge Replication Best Practices (my posting)
http://blogs.msdn.com/b/repltalk/archive/2010/02/23/collection-of-merge-replication-best-practices.aspx

Okay, now we got some of the background reading out of the way.  Let me say I’ve see a wide range of Merge topologies.  Everything from 15,000 users load balanced across multiple servers to 1 subscriber as a disaster recovery site.  I seen simple designs with little or no filtering supporting 1000 of users to very complex filtering/join design barely able to hand 10 subscribers.  Thankfully Merge Replication allows you to scale out through alternate Publishers (sync partners) as business needs grow. While this doesn’t correct overly complex designs, it does provide relief.

Merge Replication fits a very specific need, that is updates being performance across multiple SQL Servers then “blended” together into one comprehensive view of the data.  Key point is Merge Replication uses queries to pull the changed data and update the SQL servers.  If these queries are overly complex, or moving millions of rows of data, performance will suffer.  Merge can only work as fast as SQL can query the changed data.  The complexity comes when the static or dynamic filtering logic is so complex Merge is having to analyze 5-level deep parent-child relationships just to determine who gets what data. This is why “filtering” is mentioned in very “best practices” posting.

So, great, how does this help me with capacity planning?  Keep filter logic as simple as possible to reduce logic overhead.  Then develop a test strategy based on realistic data volumes and simulate simultaneous users synchronizing their changes.  If your Publisher can’t keep up, look to limit the number of concurrent users.  (see below).  This may cause some users to wait for a brief time, but you’ll get a realistic synchronization times.

Merge Processes (http://technet.microsoft.com/en-us/library/ms178668(SQL.90).aspx)
Limit concurrent processes

Determines whether to limit the number of Merge Agents that can run at the same time. This is typically used if a publication has a large number of push subscriptions that might be synchronizing simultaneously.

Maximum concurrent processes

The maximum number of Merge Agents that can run at the same time (requires Limit concurrent processes). If the number of agents synchronizing exceeds the maximum, agents are put in a queue until the number drops below the maximum.

You can also query the Merge Replication history tracking tables in the Distribution database to get a quantitative measurement of how long each synchronization took to complete and how much data was moved (Ins/Upd/Del) for each published tables.  Again, push the simulation until you see a bottleneck.  My last customer has 900 users synchronizing against 3 publishers moving about 5000 data changes per user synchronizing twice a day.  The multiple publishers provided both load balancing and redundancy should one publisher be offline for maintenance.

Tracking Merge Agent Performance using Distribution database system tables
http://blogs.msdn.com/b/repltalk/archive/2010/03/16/tracking-merge-agent-performance-using-distribution-database-system-tables.aspx

Can you get 300 users synchronizing 5000 data changes every hour on 1 publisher?  Sorry, if you read this far expecting some magic formula to plug in # users, # tables, # updates, # CPUs, amount of RAM, etc, because I don’t have any. So much depends on your specific environment such as CPU, Disk, Memory, Network (all the classic bottlenecks), in addition to filter logic and data volumes.  You’ll only know your particular environment works by developing a test plan then push your Replication topology until bottleneck occurs. 


Comments (3)

  1. Mike DeLitta says:

    I am running SQL Server 2005 SP3 on a single Windows Server 2003 Enterprise computer running a merge replication single database.  It functions as the Publisher and Distributor.  We have 65 remote users with push subscriptions.  I think we are starting to see a slowdown.  The computer is getting old.  We have a new computer we want to put into production, but we do not want to decommission the current one – as we do not want to push 65 subscriptions.  Do you have any reading we can do that we could somehow (if it is possible) to share the workload between the two computers?  Maybe half sync to one and the other half sync to the other.  Is is possible to push subscriptions from two different computers for the same database with merge replication?

    Thanks in advance.

    Mike.DeLitta@AxiomLLC.com

  2. Very Impressive Posting says:

    Hi Chris Skorlinski ,

    We are looking for the same expert for scaling & Capacity calibration of our production server, we are using Merge Replication with Filter Publications And Datapartions, we are currently facing heavy deadlocks while we implementing the Merge in new subscribers(i.e. Adding new subscribers to the production box). We presently were handling 80 subscribers with 2 hrs sync timing placing across the nation with different time zones.

    So we are interested looking forward for your help in scaling our system. Please let me know we are open to talk our MS officer how always helps us in raising the support case with MS.

    Please drop me mail or please call me to my number if you are willing to look into this issue. we are willing to raise the support call with MS.

    We appreciate your support!!!

    My mail id is krishnaj1@michaels.com

    Contact # 214 208 8944

    Regards

    Jayakrishna

  3. Dinesh Vishe says:

    Nice