Service Boundaries, Business Services and Data


One of the core tenets of SOA is that services have clearly defined boundaries. A service explicitly chooses to expose certain (business) functions outside its boundary. Another tenet of SOA is that a service, within its boundary, owns, encapsulates and protects its private data. But there are not many guidelines out there that talk about what should be exposed and how the boundaries should be drawn.

 


Keep an eye on ‘data cohesion’ when you are designing the service boundary. I will give you an example. A while back, I was involved in architecting a product that had many different modules, such as sales, finance, inventory, CRM and billing. Our natural instinct was to define each of these modules as a ‘Service’. So, we went ahead and defined a Sales service, a CRM service, a Finance service and so on. We defined the entities required by each of these apps and designed the database schema for each. Following SOA’s design principles, we also defined the contracts for the services and exposed their functionality. Finally, we used web services as an implementation mechanism to expose the services. At the end of the day, we thought we had a pretty good design.


 


But things started falling apart the moment we started integrating the different services. The main problem we faced was that the ‘Customer’ entity is required by all the systems. This entity can be updated by all the services. So, we had to implement a nasty ‘merge replication’ like infrastructure over web-services in order to keep the customer data in sync. To avoid building merge-replication infrastructure across service boundaries, we decided to remove the so-called common entities (customer, product etc.) and create a separate module or service for them, conveniently named the common module. The idea here is that the common module provides services that abstract the common entities and expose operations for CRUD behavior (I know… CRUD is a bad word). There is another way to do this as well: for each entity, designate a service as its owner (a.k.a. system of record). You can think of the common module as a system of record for the common entities. So, in theory, both approaches are equivalent.
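
To make the system-of-record idea concrete, here is a minimal Python sketch. It is an illustration only, not our product's code: the class and method names (CommonModule, SalesService, BillingService, rename_customer, invoice_header) are hypothetical, and an in-memory dictionary stands in for the real customer store.

```python
class CommonModule:
    """Hypothetical system of record for the shared Customer entity."""

    def __init__(self):
        # Assumption: a dict stands in for the real customer table.
        self._customers = {}

    def create_customer(self, customer_id, name):
        self._customers[customer_id] = {"id": customer_id, "name": name}

    def update_customer(self, customer_id, **changes):
        self._customers[customer_id].update(changes)

    def get_customer(self, customer_id):
        # Return a copy so callers cannot change the owned record directly.
        return dict(self._customers[customer_id])


class SalesService:
    """Keeps no customer copy of its own; writes go through the owner."""

    def __init__(self, common):
        self._common = common

    def rename_customer(self, customer_id, new_name):
        self._common.update_customer(customer_id, name=new_name)


class BillingService:
    """Reads the customer from the owner each time it needs it."""

    def __init__(self, common):
        self._common = common

    def invoice_header(self, customer_id):
        customer = self._common.get_customer(customer_id)
        return "Invoice for " + customer["name"]
```

Because there is exactly one writable copy of each customer, an update made through the Sales service is visible to Billing on its next read, and nothing has to be merged or replicated.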


 


So, we went down the path of building a common module (system of record) for the common entities. Things worked for a while, until we had to implement a scenario that required a join between the product entity and a related entity owned by the order management service. Mind you, this is not a business intelligence scenario. It is a simple search screen that should go to the database, run the search and show the results for the end user to pick an item. Performance goes down the tubes if you implement a join between entities owned by different services, because the join has to happen outside the database. So you have to build EII (Enterprise Information Integration) like solutions, which will most probably replicate the related entities so that joins can be done efficiently in the database. And we all know that building an EII solution is a costly affair.
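
The difference between the two kinds of join can be sketched in Python with the standard sqlite3 module. Everything here is assumed for illustration: the table names (product, order_line), the sample rows and the function names are hypothetical, and both "services" are simulated as queries against one in-memory database rather than real web service calls.

```python
import sqlite3


def build_shared_db():
    """Create an in-memory database holding both hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE order_line (
            id INTEGER PRIMARY KEY,
            product_id INTEGER NOT NULL REFERENCES product(id),
            quantity INTEGER NOT NULL
        );
        INSERT INTO product VALUES (1, 'Widget'), (2, 'Gadget'), (3, 'Gizmo');
        INSERT INTO order_line VALUES (10, 1, 5), (11, 3, 2), (12, 1, 1);
        """
    )
    return conn


def search_in_database(conn, name_prefix):
    """Shared database: one query, and the database performs the join."""
    rows = conn.execute(
        """
        SELECT p.name, o.quantity
        FROM product AS p
        JOIN order_line AS o ON o.product_id = p.id
        WHERE p.name LIKE ?
        ORDER BY o.id
        """,
        (name_prefix + "%",),
    )
    return rows.fetchall()


def search_across_services(conn, name_prefix):
    """Separate owners: two calls, and the caller joins the results itself."""
    # Call 1 (simulated): the product owner returns the matching products.
    products = conn.execute(
        "SELECT id, name FROM product WHERE name LIKE ?",
        (name_prefix + "%",),
    ).fetchall()
    names = dict(products)
    # Call 2 (simulated): order management returns all of its order lines,
    # because it knows nothing about product names.
    lines = conn.execute(
        "SELECT product_id, quantity FROM order_line ORDER BY id"
    ).fetchall()
    # The join happens here, in application code, over everything returned.
    return [(names[pid], qty) for pid, qty in lines if pid in names]
```

Both functions return the same rows. The second one, however, pulls every order line to the caller before filtering, and that transfer is the cost that grows with the data when the entities live behind different services.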


 


So, what is the solution? Here is a practical one: “Try to share the database between related services.”


 


But ‘wait a minute’, you may ask, ‘you are saying something heretical. Services are not supposed to share data between them’. Well, you are right. But remember the problems created by simply adhering to the core tenets of services. One of the problems with the core tenets is that the boundary tenet doesn’t talk about data; it only talks about what should or should not be exposed. Since it doesn’t talk about data, it is very easy to draw service boundaries that create data-related problems later on.


 


If we still want to remain purists, we have two choices: 1) re-interpret or relax the service boundary rule, or 2) define a new type of service called a business service. I believe option 2 is probably the best path.


 


Think of a business service as a software system that automates one or more business capabilities. A business service may expose one or more services, each corresponding to a business capability. But all those services share a logical data model. In other words, the boundary is defined at the business service level and not at the individual service level. This is an important lesson we learnt.
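
Here is a minimal Python sketch of that shape, using the standard sqlite3 module. The names (ErpBusinessService, InventoryService, OrderService, place_order, stock_of) and the two-table schema are hypothetical, chosen only to show two exposed services sitting on one shared data model inside a single boundary.

```python
import sqlite3


class InventoryService:
    """Exposed service for the inventory capability."""

    def __init__(self, conn):
        self._conn = conn

    def stock_of(self, product_id):
        row = self._conn.execute(
            "SELECT stock FROM product WHERE id = ?", (product_id,)
        ).fetchone()
        return row[0]


class OrderService:
    """Exposed service for the order management capability."""

    def __init__(self, conn):
        self._conn = conn

    def place_order(self, product_id, quantity):
        # One local transaction covers both tables, because both live in
        # the shared data model. It rolls back if an exception is raised.
        with self._conn:
            cur = self._conn.execute(
                "UPDATE product SET stock = stock - ? "
                "WHERE id = ? AND stock >= ?",
                (quantity, product_id, quantity),
            )
            if cur.rowcount != 1:
                raise ValueError("insufficient stock or unknown product")
            cur = self._conn.execute(
                "INSERT INTO customer_order (product_id, quantity) "
                "VALUES (?, ?)",
                (product_id, quantity),
            )
            return cur.lastrowid


class ErpBusinessService:
    """The boundary: owns the database, exposes the individual services."""

    def __init__(self):
        # Assumption: an in-memory SQLite database stands in for the
        # shared logical data model.
        self._conn = sqlite3.connect(":memory:")
        self._conn.executescript(
            """
            CREATE TABLE product (
                id INTEGER PRIMARY KEY,
                name TEXT NOT NULL,
                stock INTEGER NOT NULL
            );
            CREATE TABLE customer_order (
                id INTEGER PRIMARY KEY,
                product_id INTEGER NOT NULL REFERENCES product(id),
                quantity INTEGER NOT NULL
            );
            INSERT INTO product VALUES (1, 'Widget', 10);
            """
        )
        self.inventory = InventoryService(self._conn)
        self.orders = OrderService(self._conn)
```

Callers only ever see erp.inventory and erp.orders. The database connection never leaves the business service, so the sharing stays an implementation detail behind the boundary.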


 


For example, think of your favorite ERP system (SAP, PeopleSoft etc.) as a business service that exposes one or more services: one for HR management, one for Inventory, one for Order management, one for Accounting and so on. But they all share a single logical data model. Ever wondered why big ERP systems such as SAP have a single database that contains the tables required by all the modules? I tend to think this is the reason why they did it. But I may be wrong.


 


So, here is a summary of the key practical lessons:


 



  1. Define boundary at the business service level.
  2. A business service may contain one or more services, but they all share data.
  3. If you have legacy systems that you have to integrate, then you can’t do much. Business services and their boundaries have been defined for you.

 

Comments (16)

  1. Marcus says:

    Great post… I’ve been looking for direction on this for some time now.

  2. P Reddy says:

    Excellent post,

    short and sweet, and touches on many aspects. Especially for an aspiring architect like myself, this is an excellent source of information. Keep it up.

  3. Thanks for sharing some real world experience.

    I don’t see anything heretical about sharing a database between services. As you correctly point out, explicit boundaries are about what the service exposes, not how the service is implemented. In WSE 2.0 and Indigo services can share a process, and I don’t think anyone is advocating that services within a process boundary should not share state within the process so long as that implementation detail is not exposed to service clients. Sharing a database seems no different. Am I missing something?

  4. Dan Mork says:

    [RamKoth] "Finally, we used web services as an implementation mechanism to expose the services."

    [RamKoth] "This entity can be updated by all the services. So, we had to implement a nasty ‘merge replication’ like infrastructure over web-services in order to keep the customer data in sync."

    [DanMork] What was your SOAP stack? I’m assuming ASMX. If Indigo were available, how do you think it would have affected this scenario?

    Cheers,

    ~Mork

  5. ramkoth says:

    Mork – I don’t think the design is anything to do with a particular SOAP stack. It is to do with core tenets of SOA.

    Stuart – Boundary tenet combined with Autonomous tenet is what creates the issue. I agree that boundary tenet doesn’t imply anything about data, but the ‘autonomous’ tenet does imply that the services don’t share data.

  6. Manz says:

    Ram,

    I just wanted to restate a scenario to understand your point on service boundaries and data sharing better.

    Let's say that I have a CRM service that knows details about the customers (name, address etc.) and an order management service that does order fulfillment etc. However, the order management service needs to talk to the CRM system to find out the address and other details about the customer.

    Assuming I need to display the list of orders for a customer from a given region, I would need to get data from both the services. Now there are two ways of getting this done

    1. Without direct CRUD – that is

    – Ask the CRM service for all the customers in the region

    – Ask the Order Management service for all the orders for the given list of customers

    2. With direct CRUD

    – Build a new service that exposes a "GetOrdersByRegion" interface

    – Give it read access to required tables in the CRM and Order management database

    – Do a direct DB read and get the result.

    I understand that to achieve the required QoS one would need to go with the second option. However, to me the new service looks like a "Friendly Service" that the CRM and Order Management systems have to trust, which leads to tight coupling. One of the goals of SOA is flexibility through localizing the impact of change. In the above scenario, if I end up replacing my CRM service with a product, it could potentially have an impact on my "Friendly Services". Agreed, the impact will be limited to the "Friendly Services".

    Please comment… is this obvious and given or am I missing something here?

    Also in continuation of the same in a larger context – would SOA impact datawarehousing architectures? How?

    Thanks,

    Manz

  7. Ramkumar says:

    Manz,

    Your scenario fits perfectly into what I am describing.

    Comments on your approaches.

    1) This is essentially STP (Straight through processing). The performance will suffer if the result set is huge. You are trying to do a cross join of customer and order entity across services.

    2) This is what we call a ‘hack’. I have seen different customers use this approach. But I can assure you that you will rewrite it once you upgrade or change the legacy application, which will happen sooner or later.

    Comparison to datawarehouse and replication model:

    This comparison is inevitable. But there are a few differences that one should remember.

    1) A data warehouse stores historical information, whereas an entity aggregation (EA) service doesn’t.

    2) Staleness is tolerated in a data warehouse, whereas it is not tolerated in an entity aggregation scenario.

    3) A data warehouse is read-only, whereas an EA service is read/write.

    4) Analytical vs Operational aspects.

    Some of the low-level mechanisms, such as ETL, can be used between an EA service and a data warehouse though.

    Thanks

    Ram