Serializing the WCF Data Service Client State

With the release of Windows Phone 7, developers using the WCF Data Services client often need to preserve the state of their data context and later resume from that saved state. This comes up when an app is “tombstoned”: the phone framework notifies the app that it is being shut down, and later brings it back online. To support this, we have been designing a set of APIs that make saving and restoring the client state easy. The already released CTP contains a “preview” of what we think the API should look like. Since then we’ve gone through a number of design iterations while incorporating feedback from the community. Here I’ll talk about some of the issues we are trying to address, and give some general guidelines for making your serialization experience better.

Data Service State

When I talk about the “state of the client”, I mean one DataServiceContext instance containing a graph of entities, plus a set of root-level collections (which may or may not be DataServiceCollections) that serve as “entry points” into the graph. Consider a typical master-detail page bound to a query of Customers that expands each customer’s Orders. In this scenario there is only one root-level collection, containing the customers. To capture the state of this client you must consider both the context and the collection, even though both eventually lead to the same set of entities. Hence the API we added in the CTP release requires the caller to pass in both. In this iteration we kept the same API (with a bit of renaming and refactoring around the return type), and I think it works best for the scenario we are going after. More details on that below. The new API captures the semantics of “serialization” and returns a string representing the serialized state. You can save this string to isolated storage or to the application state, depending on how large your context is (larger states should be saved to isolated storage). These APIs are:

String DataServiceState.Serialize(DataServiceContext context, Dictionary<String, Object> rootCollections);
DataServiceState DataServiceState.Deserialize(String serializedState);

Serialization of entity graphs, or whatever you might call it, is the most important scenario developers face today that the CTP did not address properly. The essence of the problem is universal and not limited to the data service world: how do we represent references to the same object instance from multiple edges (including from the object itself)? If you’ve tried our CTP library and used the DataServiceState class, then you’ve probably run into this problem, quite possibly without knowing it. The symptoms are not trivial to diagnose. You will most likely get an exception during deserialization complaining that “An entity with the same identity has already been tracked”, or worse, you could make changes to your entities through the binding collections and then be unable to save them (SaveChanges does nothing). Both are caused by references to the same entity instance being deserialized into multiple copies.

Fortunately, the DataContractSerializer (used by NetCF to “tombstone” the application state) already has an API for dealing with references. The way to use it on NetCF is to mark ALL entity types in the graph with DataContractAttribute and set “IsReference” to true in the attribute declaration. In fact you can do this today with the CTP library and it will handle graphs nicely, although doing it by hand is a pain: if you mark a class with DataContract, you must also mark its members with DataMember, since data contracts use an opt-in approach. Because of this, we are also updating our code-gen assembly to emit these attributes for you automatically.

This does mean that you now absolutely have to bundle everything related to your graph into one package and serialize it in a single call. If you break this rule, the graph is broken and the multi-instance problem is back. For example, suppose one of your classes implements IXmlSerializable and happens to have links back into the graph (a navigation property, for instance). When DataContractSerializer encounters such an object it calls its WriteXml method, at which point the graph is broken and references to entities that have already been written out can no longer be preserved. This is a restriction one must live with: if you want to use IXmlSerializable on one of the entities, then you must use it on ALL entities and do reference resolution on your own.
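To make the effect of IsReference concrete, here is a minimal sketch that round-trips a small graph through DataContractSerializer and checks that the back references still point at a single instance. The Customer and Order classes are hypothetical stand-ins for generated entity types, and the sketch uses only the serializer itself, not the data service client.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical entity types; real ones would come from code-gen.
[DataContract(IsReference = true)]
public class Customer
{
    [DataMember] public int ID { get; set; }
    [DataMember] public string Name { get; set; }
    [DataMember] public List<Order> Orders { get; set; }
}

[DataContract(IsReference = true)]
public class Order
{
    [DataMember] public int ID { get; set; }
    // Back reference into the graph: the case that breaks without IsReference.
    [DataMember] public Customer Customer { get; set; }
}

public static class ReferenceDemo
{
    // Serializes the whole graph in ONE call, then reads it back.
    public static Customer RoundTrip(Customer source)
    {
        var serializer = new DataContractSerializer(typeof(Customer));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, source);
            stream.Position = 0;
            return (Customer)serializer.ReadObject(stream);
        }
    }

    public static void Main()
    {
        var customer = new Customer { ID = 1, Name = "Contoso", Orders = new List<Order>() };
        customer.Orders.Add(new Order { ID = 10, Customer = customer });
        customer.Orders.Add(new Order { ID = 11, Customer = customer });

        Customer copy = RoundTrip(customer);

        // Both orders should point back at the one deserialized customer.
        Console.WriteLine(ReferenceEquals(copy, copy.Orders[0].Customer));
        Console.WriteLine(ReferenceEquals(copy.Orders[0].Customer, copy.Orders[1].Customer));
    }
}
```

With IsReference left at its default of false, the same graph cannot be written at all because of the cycle, and a non-cyclic graph with shared instances would come back as separate copies, which is the multi-instance problem described above.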

Projections and Server Driven Paging

Projection comes into play when you want to save a DataServiceQueryContinuation instance. The issue is that the additional “client-side” information captured by the query continuation is not easily serializable, yet it is required to deserialize the payloads returned from executing the continuation. For example, if you have a projection that looks like:

from c in context.Customers
select new CustomerDisplay
{
    Name = c.FirstName + " " + c.LastName,
    Phone = c.Telephone == null ? "-" : c.Telephone
}

This is an example of what we call a non-entity projection, meaning that an entity type is being projected into a complex type. Here the information sent to the server is captured by the URI:

service.svc/Customers?$select=FirstName,LastName,Telephone

As you can see, the URI itself is missing the information necessary to construct an instance of the CustomerDisplay class. This missing information is referred to as the “Projection Plan” or “Materialization Plan”, and takes the form of an expression tree. It is built from the original query expression tree and transformed into a set of materialization calls. Expression trees are in general not serializable, so we must discard the plan during serialization. Because of this, serializing a DataServiceQueryContinuation for a query whose projection introduces additional client-side information is not supported.

The other category of projections is what we refer to as “entity projections”: they only narrow down a type rather than transform it. For example:

from c in context.Customers
select new NarrowCustomer
{
    ID = c.ID,
    Name = c.Name,
    Orders = new DataServiceCollection<NarrowOrder>(
        from o in c.Orders
        select new NarrowOrder
        {
            ID = o.ID,
            Customer = o.Customer
        })
}

The above query can be materialized by just executing the resulting URI directly, since the narrowing semantics are captured by the $select query option. This is a supported scenario. However, there is one more complication. If you examine the output URI for this query, you’ll discover that the back reference to o.Customer is missing (the client didn’t output $select=Orders/Customer). This is an unfortunate side effect for this round. We’ll definitely be looking at how to make the experience better.

As a side note, I advise adopting the principle that everything you do on the context should be able to travel across the wire to the server, either in the URI or in the payload. As of today a transforming projection cannot be expressed as a URI and carried out on the server side, so in this case I advise bringing whatever information you need down to the client through entity projections first, and then transforming the result using LINQ to Objects.
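As a rough illustration of that advice, the sketch below applies the CustomerDisplay transformation in memory. The NarrowCustomer type and its properties are hypothetical, and a plain list stands in for the results that an entity projection query would have materialized on the client.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical narrowed entity: only the properties a $select would bring down.
public class NarrowCustomer
{
    public int ID { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Telephone { get; set; }
}

// Client-only display shape; it never appears in a URI or a payload.
public class CustomerDisplay
{
    public string Name { get; set; }
    public string Phone { get; set; }
}

public static class ProjectionSketch
{
    // Runs entirely in memory (LINQ to Objects), so no materialization plan
    // has to survive serialization of the context.
    public static List<CustomerDisplay> ToDisplay(IEnumerable<NarrowCustomer> customers)
    {
        return customers
            .Select(c => new CustomerDisplay
            {
                Name = c.FirstName + " " + c.LastName,
                Phone = c.Telephone ?? "-"
            })
            .ToList();
    }

    public static void Main()
    {
        // Stand-in for entities already downloaded through an entity projection.
        var downloaded = new List<NarrowCustomer>
        {
            new NarrowCustomer { ID = 1, FirstName = "Ada", LastName = "Lovelace", Telephone = "555-0100" },
            new NarrowCustomer { ID = 2, FirstName = "Alan", LastName = "Turing", Telephone = null }
        };

        foreach (CustomerDisplay d in ToDisplay(downloaded))
        {
            Console.WriteLine(d.Name + " / " + d.Phone);
        }
        // Ada Lovelace / 555-0100
        // Alan Turing / -
    }
}
```

The context only ever tracks NarrowCustomer entities here, so what gets serialized is still fully described by the URI, and the display objects can be rebuilt after the state is restored.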

Multithreading

Note that whenever you call Serialize, we take a snapshot of the context at that instant. This means that if you are in the middle of an asynchronous operation, the context will likely be in a transient state, and you might later have problems recovering from it. The general advice is to save at certain “check points” where the state is known to you. For example, the WP7 development guidelines advise saving state whenever you navigate away from a page. This is one solution out of many. Another possible check point is after all async operations complete. This also means that you definitely do not want to wait until the deactivating event to take the snapshot, since that event is triggered by the user and could happen at any time.
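One possible way to enforce such check points is to count in-flight operations and only take a snapshot when the count is zero. The CheckpointTracker class below is a hypothetical helper, not part of the client library. The snapshot delegate stands in for a call to Serialize, and the sketch assumes every method is called on the same (UI) thread, so it does no locking.

```csharp
using System;

// Hypothetical helper; not part of the data service client library.
// Assumes all calls happen on one thread, so no synchronization is done.
public sealed class CheckpointTracker
{
    private readonly Func<string> takeSnapshot;
    private int pendingOperations;

    public CheckpointTracker(Func<string> takeSnapshot)
    {
        if (takeSnapshot == null) throw new ArgumentNullException("takeSnapshot");
        this.takeSnapshot = takeSnapshot;
    }

    // Most recent snapshot taken at a steady state, or null if none yet.
    public string LastSnapshot { get; private set; }

    // Call just before starting an asynchronous operation on the context.
    public void OperationStarted()
    {
        pendingOperations++;
    }

    // Call from the completion callback; snapshots once nothing is in flight.
    public void OperationCompleted()
    {
        if (pendingOperations > 0) pendingOperations--;
        TryCheckpoint();
    }

    // Call at other known-good points, such as navigating away from a page.
    // Returns false and keeps the previous snapshot while work is in flight.
    public bool TryCheckpoint()
    {
        if (pendingOperations != 0) return false;
        LastSnapshot = takeSnapshot();
        return true;
    }
}

public static class CheckpointDemo
{
    public static void Main()
    {
        int version = 0;
        // Stand-in for serializing the context and its root collections.
        var tracker = new CheckpointTracker(() => "state v" + (++version));

        tracker.OperationStarted();
        Console.WriteLine(tracker.TryCheckpoint());  // False: an operation is in flight
        tracker.OperationCompleted();                // steady state again, snapshot taken
        Console.WriteLine(tracker.LastSnapshot);     // state v1
    }
}
```

With something like this in place, the deactivation handler only has to persist LastSnapshot, which was captured at a known steady state, instead of serializing a context that may be mid-operation.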

Guidance on Tombstoning the DataServiceContext

The following are key take-away points to help you better support tombstoning in your data service app:

1. Mark all entity types as [DataContract(IsReference=true)], and mark their member properties as [DataMember].

2. Do not use transforming projections. Instead, do the transformation in memory on the results of entity projections.

3. Call DataServiceState.Serialize (DataServiceState.Save in the CTP) when you know the context is in a steady state, i.e., not in the middle of an asynchronous operation.

4. In the deactivation event, save the serialized string to IsolatedStorage if your context is large (100+ entities). Save to application state if your context is small (faster).