SOA Optimistic Data Synchronization considered harmful

Let’s say that you have two systems: Adipose and BellyFat.  They both need the same information.  Adipose handles customer transactions, so it needs information about customers.  BellyFat handles the long-term management of customer information, like what products they have purchased and what rights they own.  It also needs information about customers.

How do we keep the customer information in these two systems in sync?  SOA offers three answers: Call-and-Response, Optimistic Data Sync, and Pessimistic Data Sync.  Each is described below and sketched briefly in code after the list.

  • Call-and-Response says: One system holds the data.  The other does not.  The system that does not hold the data calls the one that does, in real time, and gets the data back quickly and efficiently.
     
  • Optimistic Data Sync says: Both systems keep a copy of the data.  If an event happens in either system, drop it on the bus.  The other system will get the event, interpret it correctly, and update its internal store to reflect the change.
     
  • Pessimistic Data Sync says: One system masters the data, but the other system keeps a local copy.  If an event happens in either system, drop it on the bus.  The other system will get the event, interpret it as best it can, and update its internal store according to its own business rules.  On a periodic basis, the ENTIRE data structure will be copied from the master to overwrite the data in the local copies (data refresh).
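
To make the trade-offs easier to weigh, here is a minimal sketch of the three shapes in TypeScript.  The Customer record, the interface names, and the method signatures are all invented for this post; they are not taken from any particular platform or bus product.

```typescript
// Hypothetical sketch only: the types and method names below are illustrative.

interface Customer {
  id: string;
  name: string;
  email: string;
}

// Call-and-Response: one system holds the data; the other asks for it in real time.
interface CustomerProvider {
  getCustomer(id: string): Promise<Customer>;
}

// Optimistic Data Sync: both systems keep a copy and trade change events on a bus.
interface CustomerChanged {
  customerId: string;
  changedFields: Partial<Customer>;
}

interface CustomerEventBus {
  publish(event: CustomerChanged): void;
  subscribe(handler: (event: CustomerChanged) => void): void;
}

// Pessimistic Data Sync: the same events flow, but one system is the master and
// periodically exports a full snapshot that overwrites every local copy.
interface CustomerRefresh {
  exportAll(): Promise<Customer[]>;                      // run against the master
  overwriteLocalStore(rows: Customer[]): Promise<void>;  // run against each copy
}
```

The point of the sketch is simply that Call-and-Response is an interface you call, while the other two are events you publish, with or without a periodic full refresh standing behind them.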

Each of these three has its own strengths and weaknesses.  One of them, Optimistic Data Sync, is so bad, however, that I’d like to call it out for special ridicule.

Call and Response

Advantages:
  • Any data entity exists once
  • Less duplication of data, lower physical storage
  • One view of the “truth”
  • Diminishes need for ETL capabilities
  • Easy for software developers who are familiar with relational database design to understand
  • Consistent inter-system data models not required

Disadvantages:
  • Constrains architecture: provider and consumer must be “close”
  • Requires highly available data providers
  • Cumulative negative impacts on overall ecosystem reliability
  • Requires highly scalable messaging infrastructure
  • Fosters point-to-point messaging
  • Leads to rapid increases in complexity and management cost

Optimistic Data Sync

Advantages:
  • Allows multiple systems to “master” a data entity
  • Diminishes need for ETL capabilities
  • Encourages loose coupling between systems
  • Supports systems that are geographically far apart

Disadvantages:
  • Requires highly scalable messaging infrastructure
  • Requires highly reliable messaging infrastructure
  • Assumes that data updates are always consistently applied
  • Data gradually gets out of sync, with no recourse to get it right
  • Consistent data models are a necessity

Pessimistic Data Sync

Advantages:
  • Does not require expensive messaging infrastructure
  • The amount of “correctness” between systems can be carefully managed
  • Encourages loose coupling between systems
  • Consistent inter-system data models not required

Disadvantages:
  • Requires system architects to agree that one system is the master
  • Data gradually gets out of sync, but an administrator can use the refresh mechanism to resync
  • Requires ETL capabilities

You will choose one over the others depending on your tolerance for the “disadvantages” and your preference for the “advantages” of each method.  However, and this is the focus of this blog post, one of these three is really not workable in the long run: Optimistic Data Synchronization.

The reason for my enmity toward this approach is simple: it rests on a poorly considered assumption.  That assumption is that it is fairly easy for two systems to stay in sync simply by keeping each other abreast of every event that has occurred in either one.

The problem with this assumption is that it is NOT easy for two systems to stay in sync.  If the two systems don’t share an IDENTICAL data model, then each system has to interpret the messages from the other.  The rules of that interpretation have to be coded in each system, and that code must stay perfectly in sync.  Plus, there can be no errors in interpretation, and no variations in the way that a change propagates through the recipient’s system.  There can be no variations in the event model between the systems.  No bugs, either.  Sure, if we ever reach the point where no humans are involved in writing computer applications, then this might make sense.
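
To make the interpretation problem concrete, here is a deliberately tiny, hypothetical example (the event shape and both sets of local rules are invented for illustration).  Nothing goes wrong on the bus, both handlers run exactly as written, and the two stores still end up disagreeing.

```typescript
// Toy illustration: two systems apply the "same" event with different local rules.

interface CustomerNameChanged {
  customerId: string;
  fullName: string; // published as a single string by the sending system
}

// Adipose's rule: store the name exactly as it arrives.
const adiposeStore = new Map<string, { fullName: string }>();
function applyInAdipose(e: CustomerNameChanged): void {
  adiposeStore.set(e.customerId, { fullName: e.fullName });
}

// BellyFat's rule: split into first and last name; anything extra is dropped.
const bellyFatStore = new Map<string, { firstName: string; lastName: string }>();
function applyInBellyFat(e: CustomerNameChanged): void {
  const [firstName = "", lastName = ""] = e.fullName.trim().split(/\s+/);
  bellyFatStore.set(e.customerId, { firstName, lastName });
}

// Both systems process the event "correctly" by their own rules.
const event: CustomerNameChanged = { customerId: "42", fullName: "Mary Jane Watson" };
applyInAdipose(event);
applyInBellyFat(event);

console.log(adiposeStore.get("42"));  // { fullName: "Mary Jane Watson" }
console.log(bellyFatStore.get("42")); // { firstName: "Mary", lastName: "Jane" } (Watson is gone)
```

Multiply this by every field, every consumer, and every release of either system, and the copies drift apart without anyone noticing.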

Not long ago, I used to think that Optimistic Data Sync was a good idea, and that SOA developers should assume that their data exists outside their systems.  Heck, I used to think that Call-and-Response was a good idea too.  Both are bad in practice, with Optimistic Data Sync being by far the worst.  There are just too many common scenarios (like one system going down for a period of time, and coming back up after missing some messages) that drive down the overall integrity of data managed in this way.
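
That outage scenario is easy to demonstrate.  Below is a toy simulation (all names and numbers invented) in which the subscribing system is offline for a single event: under Optimistic Data Sync the copy is silently wrong from then on, while the Pessimistic refresh gives an administrator a way to repair it.

```typescript
// Toy simulation of the "missed messages" failure mode.

type BalanceAdjusted = { customerId: string; delta: number };

const master = new Map<string, number>(); // system where the changes originate
const copy = new Map<string, number>();   // system that mirrors them via events
let copyIsDown = false;

function publish(e: BalanceAdjusted): void {
  master.set(e.customerId, (master.get(e.customerId) ?? 0) + e.delta);
  if (!copyIsDown) {
    copy.set(e.customerId, (copy.get(e.customerId) ?? 0) + e.delta);
  }
  // While the copy is down, the event is simply lost from its point of view.
}

publish({ customerId: "42", delta: 100 });
copyIsDown = true;                         // outage begins
publish({ customerId: "42", delta: -30 }); // this event is never seen by the copy
copyIsDown = false;                        // the system comes back up
publish({ customerId: "42", delta: 10 });

console.log(master.get("42")); // 80
console.log(copy.get("42"));   // 110 (silently wrong, and nothing will ever fix it)

// Pessimistic Data Sync adds the recourse that the optimistic model lacks:
function refreshFromMaster(): void {
  copy.clear();
  for (const [id, value] of master) copy.set(id, value);
}
refreshFromMaster();
console.log(copy.get("42"));   // 80 again
```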

While I’d go so far as to say that Pessimistic Data Sync and Call-and-Response are reasonable patterns for a SOA architect to select, Optimistic Data Sync should be considered an anti-pattern and avoided as much as humanly possible.