The Value of Intermediation, part two

I got a particularly valuable response to my previous post on the Value of Intermediation in SOA.  In that response, John suggested that my post should be called the benefits of intermediation, but that to call it the 'value' of intermediation, I would have to consider the costs.  He then went on to outline three costs.  I'd like to address all three in this post.

Cost concern 1: Messages must contain contextual information

To quote John:

Requiring intermediation encourages messages to carry more contextual information. Because the sender can't make any assumptions about the receiver, the messages need to be explicit and self-contained. That doesn't sound too bad, but, given that you often don't know anything about receiving systems (they may not even exist yet), it is easy for messages to become bloated.

This is altogether a fair statement.  In order for intermediation to be a strategy that can be successfully employed, the messages must be explicit and self-contained.  They must, in fact, be canonical.  Now, I don't suggest that every message on every endpoint must be canonical, so it is fair to say that I agree with this statement.  That said, if we are actually using SOA to compose an application out of services, and not just as a pattern for building an application, then we want each of our services to be independent of one another, and to be composable separately from the rest.

To achieve this is no small feat.  Certainly, an application developer using a SOA pattern may wish to create a small set of services that are very specific to that app, and are not intended for reuse.  Clearly, in this case, using a canonical schema is probably a limitation.  I would not expect that these services have any need to be intermediated, and context can be assumed.

However, the strength of SOA is not in the one-off local service that is not designed for leverage.  The strength of SOA is in leveraging services.  So, let's look at that case.  What is needed in a service to make it truly reusable?  I guess I'm not asking "What is the minimum requirement to call something a service?" but rather I'm asking "What is the minimum requirement to call a service composable and therefore reusable?"

I would suggest that a service, when composable, (a) provides information in a manner that can be readily reused without requiring multiple calls to other services for interpretation, and/or (b) provides functionality that can be executed to produce specific semantics in an encapsulated manner without requiring references to multiple other services and without unknown or undesirable side effects. 

This is not easy.  I have blogged about this before.  A truly 'enterprise' service, one that is composable and reusable, and therefore valuable to an enterprise in an agile manner, has a high bar to meet. In order to meet it, specific semantics must be considered and specific context must be included.  The price of making this service valuable is that the message going in and out of the service is partially denormalized and mapped to a special integration model called a Canonical Data Model. 

I argue, passionately, that a service that does not leverage a Canonical data model (explicit or implicit) is not useful outside a small handful of very specific situations.  Such a limited service provides some small benefits, but probably not more than a COM+ component would, and certainly not enough to justify all this interest in SOA.  We get no business agility from this kind of service.  Therefore, you can create it, but as an Enterprise Architect, I will not use it, nor will I look kindly on another application that does.  Poor integration is just barely a half step up from no integration.  Some would argue it is a step down.

Given that, the necessary increase in size of a single message in order to represent the independent canonical business document that we will pass in an enterprise EAI scenario is a small price to pay to achieve business agility.
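Here is a rough sketch of that trade-off in Python.  Everything in it is an assumption made for illustration: the two message shapes, the field names, and the sample values are hypothetical and do not come from any real canonical schema.  It only shows that the self-contained message is larger on the wire, and that the extra bytes are exactly the context a receiver would otherwise have to fetch from somewhere else.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TerseVacationRequest:
    # Only meaningful to a receiver that already shares the sender's context:
    # it must know what "emp" refers to and look the employee up elsewhere.
    emp: int
    start: str
    end: str


@dataclass(frozen=True)
class CanonicalVacationRequest:
    # Partially denormalized: carries enough context that a receiver
    # (even one that does not exist yet) can act on it without extra lookups.
    document_type: str
    employee_id: str
    employee_name: str
    branch: str
    start_date: str  # ISO 8601 date
    end_date: str    # ISO 8601 date
    hours_requested: int


def to_wire(message) -> str:
    """Serialize a message dataclass to JSON, as it might travel between services."""
    return json.dumps(asdict(message), sort_keys=True)


# Sample values are made up for the example.
terse = TerseVacationRequest(emp=4417, start="08-04", end="08-08")
canonical = CanonicalVacationRequest(
    document_type="VacationRequest",
    employee_id="4417",
    employee_name="Joe Example",
    branch="Main Street",
    start_date="2025-08-04",
    end_date="2025-08-08",
    hours_requested=40,
)

extra_bytes = len(to_wire(canonical)) - len(to_wire(terse))
print(f"canonical message is {extra_bytes} bytes larger")
```

The terse message is cheaper, but only a receiver that already knows who employee 4417 is can use it.  The canonical one can be handed to any service, including one nobody has written yet.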

Cost myth 2: Intermediation complicates message exchange patterns

Intermediation greatly complicates any message exchange pattern other than request-response and pub-sub. In those situations, it's usually best to have a protocol for other systems that intervene. Imagine dancers doing the tango. If you're going to cut in, it's considered polite to ask permission.

I'm not sure that the analogy is a good one.  Let me give you a different analogy, one that is more similar to actual business messages.

Joe works in a very small bank in a small town in Iowa (perhaps the same small town I once lived in).  The bank has a single branch in a single building in the downtown area, and has not changed much since it was founded in the 1940s.  Nearly all business is conducted using paper forms and slips.  There is a single old AS/400 that runs some terminals, but that's the extent of the computerization.  Joe's job is the same as his father's was.

Joe decides to take a vacation in August to take his kids to Disney World.  He goes to his boss and mentions the vacation.  His boss asks him to fill out a form called a vacation request form.  On that multi-part form, Joe types his name, the dates of the vacation, and his payroll id number (from his paycheck stub) using a typewriter.  He hands the form to his boss.

His boss checks the calendar and decides that it would be OK for Joe to take his vacation at that time.  His boss then tears off the back sheet and adds it to his file.  He writes Joe's time on his 'people calendar' and gives the form to the branch manager.  He also tells Joe that his vacation is approved unless he hears otherwise.

The branch manager tears off another copy and adds it, with his notes, to his file under the heading of August.  The branch manager is responsible for scheduling the staff hours each week, and he will look up Joe's form when he is scheduling work hours for the month of August.  Joe's job will be covered.  The branch manager hands the final page to the accounting clerk.

The accounting clerk checks Joe's employee record to make sure that he will have the right amount of vacation hours in August.  She notes his record to show the intent to take time off, so that when she figures payroll for August, she can deduct the vacation hours.  If he doesn't have the right amount of hours built up, she will tell Joe and his manager.  But he has the hours, so she says nothing. 

Now, the interesting thing is that all this activity goes on as a result of Joe handing in a form.  The form is the message.  Joe is neither consulted nor informed about what steps will happen to his message.  He is informed that it is approved (status changed) before the entire transaction is complete.  If the bank president were in the branch manager's office, and he saw the form lying on the desk, he could look at it, and Joe would be unaware.  The bank president has the right.  Certainly if a customer were to see the form, that would be bad.  But it is up to the branch manager to ensure that the form is suitably secure.  Joe has to trust him.

This process is neither pub-sub nor request-response.  It is a long-running orchestration with compensating events.  The process is NOT complicated by the ability to intermediate. 

In fact, I would argue that nearly all valuable long-running transactions MUST have the ability to intermediate in order to allow them to be composed, and recomposed, and orchestrated.
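The bank story can be sketched in a few lines of Python.  This is only an illustration under assumptions of my own: the class, the function names, the note text, and the hour balances are all hypothetical.  The point it tries to show is that the sender's code (hand_in) does not change when the route is recomposed, and that a later step can raise a compensating event after an earlier approval.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class VacationForm:
    employee: str
    month: str
    hours_requested: int
    hours_available: int
    notes: List[str] = field(default_factory=list)


Step = Callable[[VacationForm], None]


def supervisor(form: VacationForm) -> None:
    form.notes.append("supervisor: calendar checked, approved unless told otherwise")


def branch_manager(form: VacationForm) -> None:
    form.notes.append(f"branch manager: filed under {form.month} for scheduling")


def accounting_clerk(form: VacationForm) -> None:
    if form.hours_requested > form.hours_available:
        # Compensating event: the earlier approval has to be walked back.
        form.notes.append("accounting: not enough hours, notify employee and supervisor")
    else:
        form.notes.append("accounting: hours noted for payroll")


def bank_examiner(form: VacationForm) -> None:
    # A new intermediary, added later without the sender's knowledge.
    form.notes.append("examiner: copy retained for audit")


def hand_in(form: VacationForm, route: List[Step]) -> VacationForm:
    """The sender's only action. It neither knows nor controls the route."""
    for step in route:
        step(form)
    return form


original_route = [supervisor, branch_manager, accounting_clerk]
recomposed_route = [supervisor, bank_examiner, branch_manager, accounting_clerk]

first = hand_in(VacationForm("Joe", "August", 40, 80), original_route)
second = hand_in(VacationForm("Joe", "August", 40, 80), recomposed_route)
short = hand_in(VacationForm("Joe", "August", 40, 8), original_route)
```

Joe's side of the exchange is identical in all three runs.  Only the route, which Joe never sees, is different.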

Cost myth 3: Semantic bleed

John's last quote goes like this: 

Finally, intermediation requires the intermediating system to understand at least some of the semantics of the messages it intercepts. Otherwise, it risks breaking a functioning system. Unfortunately, those semantics are never contained completely within the messages themselves.

That depends entirely on what it is doing to the message.  If intermediation means making a copy of the transaction and adding further side effects, then we would have to be reasonably certain that the services are isolated from one another in order to be comfortable that we didn't break a functioning system.  Adding an audit ability, so that a bank examiner could see all of the files of the branch manager, does not risk breaking the system. 

On the other hand, if the bank president were to steal the vacation request off the branch manager's desk and shove it in his pocket, then that would break part of the system.  So, yes, some types of intermediation do require that the system that gets the message or delays the delivery of the message does so in a manner that is unlikely to break the semantics.
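The difference between the bank examiner and the light-fingered bank president can be sketched as two kinds of intermediary.  This is a minimal Python illustration, not a description of any real messaging product: the deliver function, the convention that returning None means "message consumed", and all of the names are assumptions made for the example.

```python
import copy
from typing import Callable, Dict, List, Optional

Message = Dict[str, str]
Intermediary = Callable[[Message], Optional[Message]]


def make_audit_tap(audit_log: List[Message]) -> Intermediary:
    """Passive intermediary: keeps a copy and forwards the original untouched."""
    def tap(message: Message) -> Optional[Message]:
        audit_log.append(copy.deepcopy(message))
        return message
    return tap


def pocket_the_form(message: Message) -> Optional[Message]:
    """Consuming intermediary: the message never reaches the next hop."""
    return None


def deliver(
    message: Message,
    intermediaries: List[Intermediary],
    receiver: Callable[[Message], None],
) -> bool:
    """Pass the message through each intermediary, then to the receiver.

    Returns False if some intermediary swallowed the message along the way.
    """
    current = message
    for intermediary in intermediaries:
        forwarded = intermediary(current)
        if forwarded is None:
            return False
        current = forwarded
    receiver(current)
    return True


audit_log: List[Message] = []
inbox: List[Message] = []
form = {"type": "VacationRequest", "employee": "Joe", "month": "August"}

audited = deliver(form, [make_audit_tap(audit_log)], inbox.append)
stolen = deliver(form, [pocket_the_form], inbox.append)
```

The audit tap needs to know nothing about what a vacation request means in order to be safe.  The second intermediary changes the outcome of the process, so whoever adds it had better understand the semantics.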

It is interesting that the Service Component Architecture model (www.osoa.org) has a mechanism for declaring the composition of services from other services, and that mechanism doesn't attempt to capture this aspect.  Why do you suppose that is?  Perhaps it is because services, correctly declared and correctly composed, would have well understood semantics.  It should be fairly obvious when a change to the workflow is likely to cause a break. 

So I would not call this one a cost.  There is no cost here that is not already included in any software development project.  This is a risk, not a cost.  It is a risk that occurs in any project.  In a well-defined SOA infrastructure, the risk is lower than if we were attempting to compose a manual business process by entering data in three different computer systems (armchair integration).

Conclusion:

The value of intermediation comes from the flexibility it provides.  That flexibility only works if the messages are normally passed using some basic constraints: canonical, simple, well understood.

In conclusion, I don't say that intermediation is a requirement of a service oriented architecture.  But I do say that intermediation is a requirement of a service oriented architecture designed to deliver composability, and therefore, business agility.

That's the one I care about.