One of the core tenets of SOA is that services have a clearly defined boundary. Services explicitly choose to expose certain (business) functions outside the boundary. Another tenet of SOA is that a service - within its boundary - owns, encapsulates and protects its private data. But there are not many guidelines out there that talk about what should be exposed and how the boundaries should be drawn.
Keep an eye on 'data cohesion' when you are designing the service boundary. I will give you an example. A while back, I was involved in architecting a product that had many different modules - sales, finance, inventory, CRM, billing etc. Our natural instinct was to define each of these modules as a 'service'. So, we went ahead and defined a Sales service, a CRM service, a Finance service and so on. We defined the entities required by each of these apps and designed the database schema for each. Also, following SOA's design principles, we defined the contracts for the services and exposed their functionality. Finally, we used web services as an implementation mechanism to expose the services. At the end of the day, we thought we had a pretty good design.
But things started falling apart the moment we started integrating the different services. The main problem we faced was that the 'Customer' entity was required by all the systems, and it could be updated by all the services. So, we had to implement a nasty 'merge replication'-like infrastructure over web services in order to keep the customer data in sync. To avoid building merge-replication infrastructure across service boundaries, we decided to remove the so-called common entities (customer, product etc) and create a separate module or service for them - conveniently named the common module. The idea here is that the common module provides services that abstract the common entities and expose operations for CRUD behavior (I know…CRUD is a bad word). There is another way to do that as well. For each entity, designate one service as its owner (a.k.a. system of record). You can think of the common module as a system of record for the common entities. So, in theory, the two approaches are equivalent.
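Here is a minimal Python sketch of the system-of-record idea. Everything in it (the `CommonModule`, `SalesService` and `BillingService` classes and their methods) is hypothetical and invented for illustration; our real services were exposed over web services, not as in-process objects.

```python
class CommonModule:
    """Hypothetical system of record for the Customer entity."""

    def __init__(self):
        self._customers = {}

    def create_customer(self, customer_id, name):
        self._customers[customer_id] = {"id": customer_id, "name": name}

    def get_customer(self, customer_id):
        # Hand out a copy so callers cannot change the stored record directly.
        return dict(self._customers[customer_id])

    def update_customer(self, customer_id, **changes):
        self._customers[customer_id].update(changes)


class SalesService:
    """Keeps no customer copy of its own; every change goes to the owner."""

    def __init__(self, common):
        self._common = common

    def rename_customer(self, customer_id, new_name):
        self._common.update_customer(customer_id, name=new_name)


class BillingService:
    """Reads customer data from the owner each time it needs it."""

    def __init__(self, common):
        self._common = common

    def invoice_header(self, customer_id):
        customer = self._common.get_customer(customer_id)
        return f"Invoice for {customer['name']}"
```

Because there is only one copy of each customer, an update made through the sales service is immediately visible to billing, and there is nothing to merge or replicate.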
So, we went down the path of building a common module (system of record) for common entities. Things worked for a while, until we had to implement a scenario that required a join between the product entity and a related entity owned by the order management service. Mind you, this is really not a business intelligence scenario. It is a simple search screen that should go to the database, do the search and show the results for the end user to pick an item. Performance goes down the tubes if you implement a join between entities owned by different services, because the join has to be done through service calls instead of inside the database. So, you have to build EII (Enterprise Information Integration)-like solutions, which will most probably replicate the related entities so that joins can be done efficiently in the database. And we all know that building an EII solution is a costly affair.
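To make the cost concrete, here is a minimal Python sketch of what that search looks like when the two entities live behind different services. The two in-memory SQLite databases, the table and column names, and the `search_across_services` function are all assumptions made up for illustration; in our system each lookup would have been a web-service call rather than a local query.

```python
import sqlite3

# Hypothetical setup: the common module and the order management service
# each own a separate database, so plain SQL cannot join across them.
common_db = sqlite3.connect(":memory:")
orders_db = sqlite3.connect(":memory:")

common_db.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT)")
common_db.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [(1, "Widget"), (2, "Gadget"), (3, "Gizmo")],
)

orders_db.execute(
    "CREATE TABLE order_line (order_id INTEGER, product_id INTEGER, qty INTEGER)"
)
orders_db.executemany(
    "INSERT INTO order_line VALUES (?, ?, ?)",
    [(100, 1, 5), (101, 3, 2), (102, 1, 1)],
)


def search_across_services(name_prefix):
    """Join in application code: ask one service, then the other, then merge."""
    products = common_db.execute(
        "SELECT id, name FROM product WHERE name LIKE ? ORDER BY id",
        (name_prefix + "%",),
    ).fetchall()

    results = []
    for product_id, name in products:
        # One round trip per matching product. Over web services this is a
        # remote call each time, which is where the performance goes.
        lines = orders_db.execute(
            "SELECT order_id, qty FROM order_line "
            "WHERE product_id = ? ORDER BY order_id",
            (product_id,),
        ).fetchall()
        results.extend((name, order_id, qty) for order_id, qty in lines)
    return results
```

The number of calls to the second service grows with the number of products that match the search, which is tolerable for three rows and painful for a real catalog.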
So, what is the solution? Here is a practical one. "Try to share the database between related services."
But 'wait a minute', you may ask - 'you are saying something heretical. Services are not supposed to share data between them'. Well, you are right. But remember the problems created by simply adhering to the core tenets of services. One problem with the core tenets is that the boundary tenet doesn't talk about data; it only talks about what should or should not be exposed. Since it doesn't talk about data, it is very easy to draw service boundaries that create data-related problems later on.
If we still want to remain purists, we have two choices: 1) re-interpret or relax the service boundary rule, or 2) define a new type of service called a business service. I believe option 2 is probably the best path.
Think of a business service as a software system that automates one or more business capabilities. A business service may expose one or more services, each corresponding to a business capability. But all those services share a logical data model. In other words, the boundary is defined at the business service level and not at the individual service level. This is an important lesson we learnt.
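A minimal Python sketch of that shape, assuming a single in-memory SQLite database as the shared logical data model. The `BusinessService`, `CatalogService` and `OrderService` names, the tables and the methods are hypothetical and exist only to illustrate the idea.

```python
import sqlite3


class CatalogService:
    """Exposes the product capability; works on the shared data model."""

    def __init__(self, db):
        self.db = db

    def add_product(self, product_id, name):
        self.db.execute("INSERT INTO product VALUES (?, ?)", (product_id, name))


class OrderService:
    """Exposes the order capability; works on the same shared data model."""

    def __init__(self, db):
        self.db = db

    def add_line(self, order_id, product_id, qty):
        self.db.execute(
            "INSERT INTO order_line VALUES (?, ?, ?)", (order_id, product_id, qty)
        )

    def search_by_product_name(self, prefix):
        # The join runs inside the database because both tables live there.
        return self.db.execute(
            "SELECT p.name, o.order_id, o.qty "
            "FROM product p JOIN order_line o ON o.product_id = p.id "
            "WHERE p.name LIKE ? ORDER BY o.order_id",
            (prefix + "%",),
        ).fetchall()


class BusinessService:
    """The boundary: owns the data model and the services built on it."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.executescript(
            """
            CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
            CREATE TABLE order_line (order_id INTEGER, product_id INTEGER, qty INTEGER);
            """
        )
        self.catalog = CatalogService(self.db)
        self.orders = OrderService(self.db)
```

Callers outside the business service still see only the exposed operations, for example `erp = BusinessService()` followed by `erp.orders.search_by_product_name("W")`. The database itself stays private to the business service, so the encapsulation tenet holds at that level.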
For example, think of your favorite ERP system (SAP, PeopleSoft etc) as a business service that exposes one or more services - one for HR management, one for Inventory, one for Order management, one for Accounting etc. But they all share a single logical data model. Ever wondered why big ERP systems such as SAP have a single database that contains the tables required by all the modules? I tend to think this is the reason why they did it. But I may be wrong.
So, here is a summary of the key practical lessons:
- Define boundary at the business service level.
- A business service may contain one or more services, but they all share data.
- If you have legacy systems that you have to integrate, then you can't do much. Business services and their boundaries have been defined for you.