Azure Feature – SQL Data Sync


This article is pulled from my website: http://42base13.net/ and article can also be read here: http://42base13.net/azure-feature-sql-data-sync/

Introduction

I have been using Azure for many years and have a cloud service that supports my global prayer app named Thoughts and Prayers (WP, W8, iOS, Android).  I created this years ago before Azure Mobile Services were created and made my own WCF service to use across all platforms.  With the new Universal apps, WCF support is not available on the Windows Phone 8.1 apps.  So I need to move over to Azure Mobile Services to support an update of the app on Windows Phone.  The biggest issue with this is that my original service and database are located in the North Central US region of Azure.  This region does not have Azure Mobile Services, so I have to move things over to one of the other regions to support it and I had to find a way to move my LIVE database and not affect the 22k+ users of the apps.  The answer for this is the new SQL Data Sync feature that is in preview on Azure.

Azure SQL Data Sync

Taking a look at this new feature in Azure, it allows you to define a Sync group and allows you to have data sync to other SQL Azure instances or even to on-premises SQL Server databases using a Sync Agent.  Since I was needing to move my database from the North Central to East region, I did not need to use the SQL Agent and created Sync Group to do what was needed.

So going into the SQL Database section of the older Azure portal, you will see at the bottom of the screen to Add Sync.

syncgroup

In the Sync Group you will designate one database as a Hub or main database and then the others that get added will sync to the Hub.  Start creating the group by giving it a name and a region that the Data Sync will live in.

syncgroup1

The next step will be where you define the Hub or main database.  You will need to add a username and password for the Hub database.  You also have to select how conflicts will be resolved.  You can choose one of the following:

Hub Wins – any change written to the Hub database is written to the reference databases, overwriting changes in the same reference database record.

Client Wins – changes written to the hub are overwritten by changes in reference databases, so the last change written to the hub is the one kept and propagated to the other databases.

syncgroup2

The final page of the Data Sync wizard lets you add a reference database.  This will be the database that you can sync to or from.  Pick the database from the drop down.  This means that it needs to be a database on this account, or it could be a Sync Agent for an on-premises database.  The usual username and password for this database will be stored as well as the Sync Direction.  For the Sync Direction, you can be from the following:

Bi-Directional – changes from either the Hub or this database are synced to the other one.

Sync to the Hub – all changes from this database will be synced to the Hub.

Sync from the Hub – all changes from the Hub will be synced to this database.

The Data Sync system will use the Conflict Resolution setting from the previous page to determine what data wins and conflict.  This means that you need to be careful about the possibility that data could get lost.

syncgroup3

Now we have the Data Sync all setup, on to configuring how often it should perform the sync operation and what should sync.  So right now we have the service setup but it is not setup to be run automatically or on a schedule.

Configuring the Sync

You will now see the Sync (preview) on the top menu of the SQL Database section in Azure.  From this menu option, you can now configure the Sync feature to tell it how often and what fields to sync.

The first section is the references.  This shows the main Hub database as well as all of the references.  On the bottom you will find a button to add another reference database to this sync.  For this example, I will keep it as a single database to sync with.  This display also shows the direction of the data flow in the sync.

syncconfig1

The next menu option for the Sync feature is Configure.  This will allow you to set the feature to perform Automatic Sync operations and allow you to set the frequency.  You can setup the frequency to be in minutes, hours, days, or months.  Keep in mind that the sync operation is going to take some time, especially if the database is larger.  Setting the frequency to be 1 minute might be too often, but this is part of a business decision for your data.

syncconfig2

The next step is to setup the Sync Rules.  This is where you will determine what fields will be synced, and what fields will not be synced.  This shows up as a tree structure and has checkboxes for each field to select them.  I wanted to show my data fields because there is a field that is not supported in the data sync.  It seems that the automatic timestamp field for the records are not available for the sync since it is an automated field.  This is a mobile services database and the version field is part of how those are configured.

syncconfig3

The last two menu items are the properties and logs for the data sync.  These are not needed to be setup for the sync to start working.  At the bottom of the screen in this configuration, there are also two more buttons, Sync and Stop.  The Sync will perform a sync operation right now and the Stop will stop one in progress.

Conclusion

The new Data Sync feature is a nice way to let you make sure that copies of your database are updated on a schedule.  Data is important for your business and for your customers.  There are a few things to keep in mind, the databases that you see are the ones from your own account.  It does looks like it will support going to a different subscription under a single account, but you cannot sync to a totally different account.  The data sync is not instant and it will take time to perform the operation, but this new preview feature is a great way to move a database, backup, and syncing with other Azure and on-premises databases.  I am using it for my personal projects and it is something that has a lot of potential for many different databases.

 

Comments (4)

  1. KP1111 says:

    I thought Azure Data Sync was coming to an end or is this a new version?

  2. dlcarp says:

    I set up a sync between an Azure Database and a SQL Server database. The Azure db is an exact copy of the SQL Server db. Each database contains 6 tables. The table with the largest number of records (400,000) has very few columns. After initial setup of the sync group I clicked the "sync" button and after 15 hours I cancelled the sync. I thought that I had set something up incorrectly so I blew away everything and started over only to experience the same results. Has anyone else run into the inordinate amount of time it takes to do a simple sync of two databases?

  3. Alan Parr says:

    @KP1111 – I think you might be thinking of SQL Federations, that was discotinued last year.

    @dlcarp – When creating the sync group for the first time, I've found it best to have one of the databases be empty with all the necessary tables, just no data in them. Then when sync starts, it can just insert everything as a new record from one database to the other.

    If both databases have data, it has to decide for each row whether or not it is a conflict. This takes a considerable amount of time, I have seen it take days for databases not much larger than yours.

  4. Michael Oberhardt says:

    Just a bit disappointing to see this still in a preview format after so many years, and fundamentally no better.

    The inability to check progress of the sync is unforgivable. You have no idea what is going on, where it is up to, etc. It isn't like sql replication where you have decent visibility, not to mention realistic filtering options. All you see is a useless "Progressing…" status.

    If you are looking at a better way of getting SQL server syncing to azure, try SQL Server transactional replication. The subscriber database in SQL Azure has to be treated as read only, but it is there. Makes it easier to get the database there and syncing, and then eventually cutover to it as the primary database when desired.