Application models for Astoria

Article
05/04/2007

When you combine technologies such as Silverlight or AJAX-style applications with Astoria there is an opportunity for building great interactive data-driven applications. However, this combination also results in a new ways of organizing the various pieces that make up the application, so the question on what is the right application architecture comes up.

I’ll tackle the question from the middle-tier/data service perspective, and how the interaction with the presentation tier works. I won’t touch on aspects related to building and managing the user-interface elements of the presentation tier.

One extreme: the "pure data" model

There is a class of applications that are just about data, with relatively or no semantics on top of it; or more likely, where semantics is deeply embedded into the data and doesn’t need a layer of behavior on top to surface or enforce it.

For those applications you can imagine pointing the Astoria service to the database, prepare a nice EDM schema that models types and associations appropriately, and leave the URI space open so client agents can freely use any part of it. This is great when the high-order bit is to share data with people or systems. The data in the store is readily available and with easy access; many tools can deal with HTTP and URIs; Astoria-aware tools and widgets can even make it trivial to display and manipulate data.

You would still setup policies for authorization and such, of course, but I see that as somewhat orthogonal to the semantics/behaviors that may go with the data. From the authorization perspective you would say who can read, who can write, when do you need to be authenticated to even see something and so on. It does happen often that some security aspects are derived from semantics (e.g. "my customers" implies that some components knows how to find customers that are assigned to me, and that is typically captured as data as well), in which case the "pure data" approach won't work in that particular scenario.

I’m not sure if there is a lot of applications that fall in this category, but I'm pretty sure its > 0.

The other extreme: the "pure RPC" model

Elaborate applications, such as those designed to support the operations of businesses out there, need more than "just data" pretty much always, at least for large portions of the data they operate on.

A way of approaching an application with these requirements is to build the core business logic on top of the data, and then provide an interface or façade that clients and other applications can consume. This is nothing new, and has evolved in what is currently called Service Oriented Architecture or SOA.

One of the key aspects of SOA is that contracts are clearly defined, and they are described in terms of operations; of course that the data used by those operations needs to be described as well, but the focus is on operations.

I think that there is a space where this is the right way of creating apps. It does introduce a bit of complexity, but you get in return the ability to build composite applications based on the metadata that describes individual components.

We (Microsoft) have a solid offering in this space through the Microsoft Windows Communication Foundation (WCF) stack. WCF handles the runtime and design-time aspects of this, supports various services and related technologies, and it even plays well with both the SOAP and the HTTP-only (should I say REST?) ways of doing things.

The main drawback of this approach, though, is that it's very hard to build generic frameworks/UI components on top. For example, if you had a "grid" widget, how would the widget be able to do paging and sorting (when the user clicks on a column heading) independently of the service operation being called? I am assuming, of course, that bringing all the data to the client and perform the operation there is not an option (as is the case in any reasonable sized data-set). I think there is an opportunity there.

A hybrid approach

Now that we went through the extremes, let me describe a hybrid approach that looks really promising in my opinion.

Typically in business applications you can find a mix when it comes to the data they consume and its requirements for business logic and enforcement of semantics on top of the raw data. I think that often in these applications you can find a part of the data that is still "plain", that doesn't need added business rules, it is just what it is; you also find the other part that does need business logic on top to make sense, or to say consistent, or both.

The relative sizes of the "pure data" and "data + behavior" parts depends on the nature of the application. I would expect, for example, that a public web site that is designed to share information will lean towards the "pure data" side, where as an internal business application would go the other way.

As I discussed in the Astoria Overview document, one of the key elements of the Astoria services is the use of a uniform URI format regardless of the particular data service you're hitting. This enables tools and frameworks to be built so that they leverage that.

These applications could be built by using two Astoria elements: flat access and "data aware" service operations.

For the parts of the data that are "just data", you could enable direct access to the data through regular Astoria URIs. This helps with development productivity, allows developers to use common widgets and libraries and reduces how much redundant code you write and maintain.

For the parts of the data that need business logic, you create the special form of WCF service operation that Astoria introduces support for, where you don't just return the final data that you want. Instead, you return "query objects" that represent the data you'd like to return. Since what is returned is a non-executed-yet query, the Astoria runtime can still compose query operations on top, such as sorting and paging, and then executing it in order to obtain the data to be sent to the client.

Let me use a brief example to illustrate this point. If you have a service operation that returns Customer entities for a given city, you would use a URI like this one to invoke it:

https://host/vdir/northwind.svc/CustomersByCity?city=London

Now, if this is a data aware service operation, the URI options that UI controls and such use to do their work would still work, so when the user clicks on the “CompanyName” column to sort it, the UI control could simply say:

https://host/vdir/northwind.svc/CustomersByCity?city=London&$orderby=CompanyName

in similar ways, using control arguments such as $skip and $top, the grid widget could support paging over any arbitrary data aware operation.

This offers a great middle-ground between plain open access to the store through URIs and fixed RPCs that do not support query composition. The service operation is still written as code in the middle tier that is controlled by the developer; that code can perform validations on arguments, adjust arguments based on the user or the context, inject predicates to queries based on various conditions and so on. In other words, they are the spot where you can invoke your business logic.

I focused on data retrieval so far. The hybrid approach can be used for updates as well. In those cases where it makes sense to allow direct updates to the data you can let the standard HTTP POST/PUT/DELETE operations do its work; for the other cases, where you do not want just an update but a more complex operation that may result in one or more updates, you can use a service operation.

A complementary tool: interceptors

A broad subset of the cases where you want custom business logic during updates, the requirement is to be able to perform validation that goes beyond the declarative validation that can be enforced by constraints in the EDM model or in the underlying database schema.

For those cases you can still use regular HTTP-style updates with POST/PUT/DELETE verbs, and register code to run in the server whenever an entity is being created, updated or deleted. In Astoria you can define an "interceptor" and bind it to a given entity-set and a direction (reading or writing); the interceptor is a regular .NET method defined in the service class that runs in the middle-tier (where the Astoria service runs). This method can do pretty much anything it wants (Astoria won't limit what it can do, although the runtime environment may). Astoria will even provide an already-established session to the database wrapped in an ADO.NET typed object context to provide easy access to the database in case you need to perform lookups in order to validate the operation.

BTW - if anybody can think of a better name than "interceptors", it'll be happy to take it :)

For more details and an example of this you can take a look at the "Intercepting Entities on Retrieval and Update" section of the Using Astoria document.

Summary

So, summing it up, I think that there is a good middle-ground between flat open access to data and pure RPC. I think that this middle-ground can enable frameworks and tools to help more and make it easier to build applications.

-pablo

Application models for Astoria

Additional resources