So they're hard, but what if I need them...

12/20/2007

In my last post I started a survey of the problems with building data-centric web services. When we left our hero (you, my intrepid Entity Framework programmer), things were looking pretty bleak. So far we’ve talked about the challenges. Now let’s talk about some possible approaches for dealing with them.

1) The traditional SOA / DTO approach. This is decidedly more work, but it is also the most flexible option and for many users the best one. It allows you to impose control at the web service juncture, and it makes true interoperability possible. With some patterns this isn’t as hard as it sounds at first, but it is more work: you have to maintain the objects which represent your contract as well as the objects in your data model. The contract objects can be thin wrappers that don’t re-implement your business logic. They just form a tree of the separate entities with the required granularity (a tree, unlike a graph, serializes nicely), and they constrain the operations to those you want to allow over the web service.
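To make the "thin wrapper" idea concrete, here is a minimal sketch in Python rather than .NET. Every name in it (`Customer`, `Order`, `CustomerDto`, `OrderDto`, `to_customer_dto`) is made up for illustration and is not part of the Entity Framework. It assumes a tiny two-entity model whose back-reference makes it a graph, and shows contract objects that flatten it into a tree.

```python
from dataclasses import dataclass, field, asdict
from typing import List
import json


# Hypothetical data-model entities. Order points back at Customer,
# so together they form a graph with a cycle.
class Customer:
    def __init__(self, customer_id, name, credit_limit):
        self.customer_id = customer_id
        self.name = name
        self.credit_limit = credit_limit  # internal detail, not in the contract
        self.orders = []


class Order:
    def __init__(self, order_id, customer, total):
        self.order_id = order_id
        self.customer = customer  # back-reference: this is what makes it a graph
        self.total = total
        customer.orders.append(self)


# Contract objects: thin, no business logic, strictly a tree.
@dataclass
class OrderDto:
    order_id: int
    total: float


@dataclass
class CustomerDto:
    customer_id: int
    name: str
    orders: List[OrderDto] = field(default_factory=list)


def to_customer_dto(customer):
    """Project the entity graph onto the tree the contract exposes."""
    return CustomerDto(
        customer_id=customer.customer_id,
        name=customer.name,
        orders=[OrderDto(o.order_id, o.total) for o in customer.orders],
    )


alice = Customer(1, "Alice", credit_limit=5000)
Order(10, alice, 25.0)
Order(11, alice, 40.0)

# The tree serializes without any special graph handling.
payload = json.dumps(asdict(to_customer_dto(alice)))
```

The DTOs carry only what the contract promises, so the back-reference and the internal `credit_limit` never reach the wire.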

At another extreme you can:

2) Constrain your scenarios and use standardized operations. This is the approach that the REST community takes, and it’s what we’re doing with Project Astoria (aka ADO.NET Data Services). The idea here is that you can create a set of standard operations which is not as broad as the full-featured direct data access APIs but can still operate over arbitrary data that you define in your model. Then you create a few general-purpose extension/control points where you can enforce security and filter the operations. With this approach the basic work of creating the web service can all be automatic. The result is less service-oriented in the classic sense, because the operations really are data access operations rather than messages which describe a higher-level operation, but it’s much less work and will address a number of scenarios. You can, of course, extend this model with some more targeted traditional service operations.
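The shape of this approach can be sketched in a few lines. This is not the Astoria API. `DataService`, `set_query_filter`, `set_change_check` and the in-memory "Orders" set are all assumptions invented for the example. The point is only that the same small set of operations works over any entity set, and that policy lives in a couple of control points.

```python
class DataService:
    """Hypothetical generic service: the same few operations for every entity set."""

    def __init__(self, entity_sets):
        self._sets = entity_sets      # set name -> {key: entity dict}
        self._query_filters = {}      # set name -> predicate(entity, user)
        self._change_checks = {}      # set name -> check(operation, entity, user)

    # Control points: the only places where set-specific policy is plugged in.
    def set_query_filter(self, set_name, predicate):
        self._query_filters[set_name] = predicate

    def set_change_check(self, set_name, check):
        self._change_checks[set_name] = check

    # Standard operations: identical for every entity set.
    def query(self, set_name, user):
        allow = self._query_filters.get(set_name, lambda entity, user: True)
        return [dict(e) for e in self._sets[set_name].values() if allow(e, user)]

    def upsert(self, set_name, key, entity, user):
        self._run_check(set_name, "upsert", entity, user)
        self._sets[set_name][key] = dict(entity)

    def delete(self, set_name, key, user):
        self._run_check(set_name, "delete", self._sets[set_name][key], user)
        del self._sets[set_name][key]

    def _run_check(self, set_name, operation, entity, user):
        check = self._change_checks.get(set_name)
        if check is not None:
            check(operation, entity, user)  # raises to reject the operation


def only_own_orders(order, user):
    return order["owner"] == user


def guard_orders(operation, order, user):
    if operation == "delete":
        raise PermissionError("orders cannot be deleted through the service")
    if order["owner"] != user:
        raise PermissionError("cannot modify another user's order")


service = DataService({"Orders": {10: {"owner": "alice", "total": 25.0},
                                  11: {"owner": "bob", "total": 40.0}}})
service.set_query_filter("Orders", only_own_orders)
service.set_change_check("Orders", guard_orders)

alice_orders = service.query("Orders", "alice")
service.upsert("Orders", 12, {"owner": "alice", "total": 5.0}, "alice")
try:
    service.delete("Orders", 10, "alice")
    delete_rejected = False
except PermissionError:
    delete_rejected = True
```

Notice that nothing in `DataService` knows what an order is. That is what makes the service cheap to create, and also why the operations read as data access rather than as higher-level messages.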

There are a number of other possible compromises/stopping points between the above two extremes. A few notable examples:

3) With graph serialization you could roll your own SOA system without separate DTOs. If you do have general-purpose graph serialization, then you automatically solve one of the problems in approach #1: you no longer have to reshape the graph into a tree before it can go over the wire. But as I mentioned in the previous post, it doesn’t take much time on this path before you begin to realize that you either still need to write DTOs to deal with change tracking and concurrency issues, or you have to constrain your scenarios in various ways to make it easier to automatically determine the intent of the operations. For example, you might decide that “last write wins” is your policy and drop optimistic concurrency checks. This means that you can just send the graph to the client, and when it is modified the client can send it back without having to include the original values as well. A second constraint comes from the question of what kinds of operations you can do in one round trip: Can you modify two entities that are not connected by a single graph? Can you unhook an object from one graph and relate it to another? What are the transaction semantics? All of these are subtleties that are addressed in a clear, uniform way by approach 1 or 2 above, but they will begin to haunt any sophisticated application that starts to roll its own. My point is that just handling graphs truly isn’t enough.
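The trade-off between “last write wins” and optimistic concurrency is easy to see in a small sketch. The code below is illustrative only. The in-memory `store`, the two save functions and `ConcurrencyError` are hypothetical names, and a real system would compare a version stamp or the original column values inside the database rather than whole dictionaries.

```python
class ConcurrencyError(Exception):
    pass


def save_last_write_wins(store, key, modified):
    # No original values needed: whatever arrives last replaces the row.
    store[key] = dict(modified)


def save_with_check(store, key, original, modified):
    # Optimistic concurrency: the client must send back what it originally read.
    if store[key] != original:
        raise ConcurrencyError(f"row {key} changed since it was read")
    store[key] = dict(modified)


row = {"name": "Alice", "city": "Seattle"}

# Two clients read the same row, then each changes a different field.
read_a, read_b = dict(row), dict(row)
edit_a = dict(read_a, city="Redmond")
edit_b = dict(read_b, name="Alicia")

# Policy 1: last write wins. Client A's change is silently lost.
lww_store = {1: dict(row)}
save_last_write_wins(lww_store, 1, edit_a)
save_last_write_wins(lww_store, 1, edit_b)

# Policy 2: optimistic concurrency. Client B's stale write is rejected.
checked_store = {1: dict(row)}
save_with_check(checked_store, 1, read_a, edit_a)
try:
    save_with_check(checked_store, 1, read_b, edit_b)
    conflict_detected = False
except ConcurrencyError:
    conflict_detected = True
```

The second policy is the safer one, but it only works because the original values made the round trip. That is exactly the extra information a bare serialized graph does not carry.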

4) General-purpose container object. This container will hold an entity graph, serialize over the web service, recreate the graph on the other side, perform change tracking on the client, and replay those changes back on the mid-tier in a form that enables persistence with concurrency checks, etc. If you stare at this description for a moment, you will realize that this is what the DataSet does (particularly the typed DataSet). While this is appealing because it is quite simple to use and very flexible, it also introduces serious problems when it comes to interoperability and to maintaining the abstraction which the web service represents. If you are not careful, you end up allowing pretty much any operation through your web service, which was in part supposed to constrain the set of allowed operations.

Some have gone so far as to declare this truly evil. I would moderate that a bit and say that there may well be a time and a purpose for this approach, but I am concerned that the ease of use will entice well-meaning developers down a path from which it is hard to return and which will cause them pain in the end. Nevertheless, I have spent the last week working almost full-time on a sample that provides this sort of functionality over the Entity Framework. It’s going to take a little while to clean it up and get it into a form where I can share it, but I found it a useful exercise for exploring the space, and it may be helpful to some other folks. I can even see the possibility of us formalizing something around this for a future release of the Entity Framework, but that’s yet to be determined.
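For readers who want a feel for what a container like the one in approach 4 involves, here is a rough sketch. It is not the sample mentioned above and it is not how the DataSet is implemented. `TrackedContainer`, `replay` and the dictionary-based `store` are hypothetical, and the sketch tracks flat entities keyed by string rather than a real graph.

```python
import copy
import json


class TrackedContainer:
    """Hypothetical container: carries entities plus their original values."""

    def __init__(self, entities):
        self.original = copy.deepcopy(entities)  # snapshot used for concurrency checks
        self.current = copy.deepcopy(entities)   # the copy the client edits freely

    def to_wire(self):
        return json.dumps({"original": self.original, "current": self.current})

    @classmethod
    def from_wire(cls, text):
        data = json.loads(text)
        box = cls(data["original"])
        box.current = data["current"]
        return box

    def changes(self):
        """Derive (operation, key, before, after) by diffing the two copies."""
        for key in sorted(set(self.original) | set(self.current)):
            before, after = self.original.get(key), self.current.get(key)
            if before is None:
                yield ("insert", key, None, after)
            elif after is None:
                yield ("delete", key, before, None)
            elif before != after:
                yield ("update", key, before, after)


def replay(container, store):
    """Mid-tier side: verify every original value first, then apply the changes."""
    pending = list(container.changes())
    for _, key, before, _ in pending:
        if store.get(key) != before:
            raise RuntimeError(f"concurrency conflict on entity {key}")
    for op, key, _, after in pending:
        if op == "delete":
            del store[key]
        else:
            store[key] = after


store = {"1": {"name": "Alice"}, "2": {"name": "Bob"}}

# The mid-tier ships the container to the client...
client = TrackedContainer.from_wire(TrackedContainer(store).to_wire())

# ...the client does whatever it likes...
client.current["1"]["name"] = "Alicia"
del client.current["2"]
client.current["3"] = {"name": "Carol"}

# ...and the mid-tier replays the result.
replay(TrackedContainer.from_wire(client.to_wire()), store)
```

Even this toy version shows the concern: `replay` applies any insert, update or delete the client chose to make, so nothing about the service boundary constrains the operations unless you add those checks yourself.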

Another idea which has been suggested seems to me to have the potential of being the very best approach yet, but it will take some work to flesh out, and I don’t know of anyone who has put this into practice yet. Maybe we’ll be able to experiment some with this approach for future releases of the EF:

5) Automatically generate DTOs from a declarative description of the contract. The idea is to annotate a conceptual model or maybe separately describe interesting sub-graphs of appropriate granularity and service-oriented operations which constrain what manipulations of those sub-graphs are allowed. In theory this could achieve the advantages of approach #1 above while removing the drudgery of maintaining the DTOs manually and maybe even produce a contract description which could be leveraged for other purposes.
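Since nobody has fleshed this out yet, the following is only a guess at what the mechanics might look like, written as a Python sketch. The `CONTRACT` format, `build_dtos` and `project` are invented for this example. It assumes the contract simply lists which fields and child collections each DTO exposes, and it leaves out the service-oriented operations entirely.

```python
from dataclasses import make_dataclass, asdict, field
from types import SimpleNamespace

# Hypothetical declarative contract: which fields and child collections
# of the model are exposed, and at what granularity.
CONTRACT = {
    "OrderDto": {"fields": ["order_id", "total"]},
    "CustomerDto": {
        "fields": ["customer_id", "name"],
        "children": {"orders": "OrderDto"},
    },
}


def build_dtos(contract):
    """Generate one DTO class per contract entry instead of writing it by hand."""
    dto_types = {}
    for name, spec in contract.items():
        columns = [(f, object) for f in spec["fields"]]
        children = [(c, list, field(default_factory=list))
                    for c in spec.get("children", {})]
        dto_types[name] = make_dataclass(name, columns + children)
    return dto_types


def project(entity, dto_name, contract, dto_types):
    """Copy only the contracted fields from an entity into its generated DTO."""
    spec = contract[dto_name]
    values = {f: getattr(entity, f) for f in spec["fields"]}
    for attr, child_name in spec.get("children", {}).items():
        values[attr] = [project(child, child_name, contract, dto_types)
                        for child in getattr(entity, attr)]
    return dto_types[dto_name](**values)


# Stand-ins for model entities, with extra fields the contract does not mention.
order = SimpleNamespace(order_id=10, total=25.0, internal_note="rush")
customer = SimpleNamespace(customer_id=1, name="Alice",
                           credit_limit=5000, orders=[order])

dto_types = build_dtos(CONTRACT)
result = asdict(project(customer, "CustomerDto", CONTRACT, dto_types))
```

The contract description is the only thing maintained by hand here. The DTO classes and the projection both fall out of it, which is the drudgery this approach hopes to remove.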

At any rate, my overall point is that there are subtleties here. There are multiple approaches and no one approach will be right for every situation. In general I’d encourage you to avoid approach #3, and I don’t think we fully understand approach #5 yet, but we’ll be working on it for future releases. You can use Astoria today for approach #2, which just leaves approaches 1 & 4. In future posts I’ll work on putting together some examples for each of them. In the meantime, if you’ve got other approaches to suggest, I’d love to hear about them.

- Danny
