Custom paging provider

The OData protocol supports a feature called server-driven paging. It limits the amount of data a client can retrieve with a single request while still giving the client a way to get all the data (across multiple requests). A more detailed explanation is available, for example, in this blog post.

WCF Data Services implements this in two ways: the easy-to-use built-in server-driven paging and the more complicated but more powerful custom paging. The built-in implementation simply puts a limit on the number of results returned for a specified entity set. How to set it up and use it is described, for example, here or here.

But the protocol allows the server to split the results into pages in any way it likes, not just by number of results. In fact, clients don’t know how the server decided where to split the results. So you may want to limit the time it takes to execute the query, or the amount of memory consumed, or maybe even restrict the query to a subset of nodes (in a distributed system) and return a page per node, and so on. To support these scenarios, WCF Data Services allows a custom implementation of server-driven paging. It is done by implementing the IDataServicePagingProvider interface. Let’s take a look at how that might work.

Prerequisites

Custom paging can only be implemented if our service already implements custom metadata and query providers (IDataServiceMetadataProvider and IDataServiceQueryProvider). For samples on these take a look at this great series of posts.

The custom query provider will also have to implement a custom IQueryable. We typically already have this when we implement custom metadata and query providers, although in general it’s not a requirement. For custom paging it absolutely is a requirement, as there’s no practical way to tweak the existing IQueryable implementations (like LINQ to EF, LINQ to SQL, LINQ to Objects, …) for this purpose. On the other hand, there’s nothing preventing us from implementing the IQueryable as a layer over an existing implementation.

Finally, we implement the IDataServicePagingProvider interface and return its implementation from our IServiceProvider. The easiest way to explain how to implement the interface’s two methods is to describe the steps WCF Data Services takes to process paged queries.
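To make the shape of the two methods concrete before walking through the steps, here is a minimal language-neutral sketch in Python. The real interface is a .NET one; the names PagingProvider and NeverPagingProvider are hypothetical stand-ins, and the parameter lists mirror only what this post describes (an enumerator in, an array of primitive values out; a query plus a token in, nothing out).

```python
from abc import ABC, abstractmethod
from typing import Any, Iterator, List, Optional


class PagingProvider(ABC):
    """Hypothetical stand-in for the shape of IDataServicePagingProvider."""

    @abstractmethod
    def get_continuation_token(self, enumerator: Iterator[Any]) -> Optional[List[Any]]:
        """Return primitive values describing where the page ended,
        or None when there is no next page."""

    @abstractmethod
    def set_continuation_token(self, query: Any, token: List[Any]) -> None:
        """Attach a token from an earlier page to the query; returns nothing."""


class NeverPagingProvider(PagingProvider):
    """Trivial implementation: every result set fits in a single page."""

    def get_continuation_token(self, enumerator):
        return None  # no token means no next link

    def set_continuation_token(self, query, token):
        # This provider never hands out tokens, so none should come back.
        raise ValueError("unexpected continuation token")
```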

How does it work

Let’s walk through the processing of a simple query to /Customers.

1 – Request for the entity set

Client sends a request to the service for /Customers.

2 – Query processing

On the server the query gets processed and is eventually passed to the custom IQueryable implementation. This implementation must know that the service should return paged results for the Customers set, and it should produce a query for the first page of the results.

This part depends on the paging behavior we need to implement. We may modify the query to prepare the underlying storage for the paging. We may also need to inject some filter or limit into the query. For example, the built-in paging implementation adds sorting by key properties to the query so that it gets a stable order of results, and then requests only the first page of results (by taking the first n results).
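The "sort by key, then take the first n" approach can be sketched as follows. This is only an illustration over an in-memory list, not the built-in implementation itself; first_page, the id key and the sample rows are all made up for the example.

```python
def first_page(rows, key, page_size):
    """Order rows by their key so the order is stable, then keep the first page."""
    ordered = sorted(rows, key=lambda row: row[key])
    return ordered[:page_size]


# Hypothetical sample data; the storage order is deliberately not the key order.
customers = [
    {"id": 3, "name": "Cecil"},
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob"},
]

page = first_page(customers, "id", 2)
```

Without the sort, two requests could see the rows in different orders, and a later "continue after row X" would skip or repeat results.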

3 – Enumeration of results

When the query executes, IQueryable.GetEnumerator gets called. Since we have a custom IQueryable implementation, it is our code that gets called here. We need to return a custom enumerator which will only return the first page of results. The end of the page is marked by simply reporting no more results from the enumeration.

Depending on how we choose to limit the page of results, this may be a trivial operation (if the limit was built into the query by our IQueryable) or it may do the real work. For example, here we could measure the time it takes to compute the results and decide to end the page if it takes too long. Note that it’s perfectly fine to return no results (clients will expect this).

No matter which approach we take, the important part is that the enumeration must somehow track where it ended in the bigger result set (the non-paged one with all the results).
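A minimal sketch of such an enumerator, assuming the full result set is an in-memory list and the page limit is a simple row count. PageEnumerator and its position and has_more members are hypothetical names chosen for this example.

```python
class PageEnumerator:
    """Yields one page of rows and remembers how far it got in the full list."""

    def __init__(self, rows, start, page_size):
        self._rows = rows
        self._end = min(start + page_size, len(rows))
        self.position = start  # index of the next row in the non-paged result set

    def __iter__(self):
        return self

    def __next__(self):
        if self.position >= self._end:
            # End of the page: simply report that there are no more results.
            raise StopIteration
        row = self._rows[self.position]
        self.position += 1
        return row

    @property
    def has_more(self):
        """True when rows remain beyond the position reached so far."""
        return self.position < len(self._rows)


enumerator = PageEnumerator(["a", "b", "c", "d", "e"], start=0, page_size=2)
first = list(enumerator)
```

After the page is consumed, enumerator.position holds exactly the information the next step needs to build a continuation token.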

4 – Getting the continuation token

Once the IEnumerator returns no more results, WCF Data Services will call IDataServicePagingProvider.GetContinuationToken and pass in the enumerator. Now we need to extract from the enumerator the information about how far we got through the results, serialize it as an array of primitive values, and return it.

GetContinuationToken returns an array of objects. These need to be primitive values (as recognized by OData), and they should be reasonably small since all of them will be written into the URL.
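As a rough sketch of this step, the function below pulls the position out of an enumerator and returns it as a list of primitive values. Everything here is an assumption made for illustration: the PageState class stands in for our custom enumerator, and the tuple of Python types only approximates the set of OData primitive types.

```python
from dataclasses import dataclass

# Rough Python approximation of "primitive values"; the real set is defined by OData.
PRIMITIVE_TYPES = (bool, int, float, str)


@dataclass
class PageState:
    """Hypothetical stand-in for the state our custom enumerator exposes."""
    position: int   # how far we got in the non-paged result set
    has_more: bool  # whether anything is left after this page


def get_continuation_token(enumerator):
    """Return a small list of primitive values, or None when no next page exists."""
    if not enumerator.has_more:
        return None
    token = [enumerator.position]
    for value in token:
        if not isinstance(value, PRIMITIVE_TYPES):
            raise TypeError("continuation token values must be primitive")
    return token
```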

The continuation token must fulfill several requirements:

- It must not be temporary. Clients can, in theory, hold on to it indefinitely. We may choose to support only a certain lifetime, but it needs to be rather long (a typical user session or something similar). Ideally, continuation tokens have an indefinite lifetime.

- It must be persistent, which means it should not rely on any transient data on the server side, for example a table stored in memory on the server. There’s no guarantee that the client will not ask for the next page after the server process has been restarted, or that the next request will not be processed by a different server in the farm, and so on.

- It must “make progress”. If we don’t return any results for a given page, the continuation token should contain enough information so that next time we can pick up from where we ended without trying to compute the entire page again. Otherwise this may lead to an endless loop. So, for example, if the page is limited by the amount of time it took to process the query, the continuation token must somehow remember how far the query processing got even if it didn’t return any results.

- It should be reusable. There’s nothing preventing the client from asking for the same next page multiple times.
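The requirements above can be illustrated with a small sketch in which the page is limited by how many rows the server is willing to examine rather than by how many it returns. The function scan_page and its budget parameter are hypothetical; the row budget is a deterministic stand-in for a time limit.

```python
def scan_page(rows, predicate, token, budget):
    """Examine at most `budget` rows, starting where the token says we stopped.

    The token depends only on its own values (no server-side state), and it
    always advances, even when no examined row matched the predicate.
    """
    start = token[0] if token else 0
    end = min(start + budget, len(rows))
    results = [row for row in rows[start:end] if predicate(row)]
    next_token = [end] if end < len(rows) else None
    return results, next_token


rows = list(range(10))


def is_large(row):
    return row >= 8


page1, token1 = scan_page(rows, is_large, None, budget=3)    # empty page, token [3]
page2, token2 = scan_page(rows, is_large, token1, budget=3)  # empty page, token [6]
page3, token3 = scan_page(rows, is_large, token2, budget=3)  # first matching row
page4, token4 = scan_page(rows, is_large, token3, budget=3)  # last page, no token
```

The first two pages are empty, yet each token moves the scan forward, so the client eventually reaches the matching rows instead of looping forever. Because the token is just a position, asking for the same page twice gives the same answer.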

5 – Response to the client

The service serializes the returned continuation token into the $skiptoken query option and returns it as part of the next link to the client. Let’s say that we produce the next link as /Customers?$skiptoken=1234.
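The service does this serialization for us, so the following is only a sketch of what ends up in the URL. The formatting rules are simplified assumptions (strings in single quotes, values separated by commas), and next_link and format_value are hypothetical helpers, not part of any library.

```python
from urllib.parse import quote


def format_value(value):
    """Render one token value as a URL literal (simplified assumption)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def next_link(path, token):
    """Build a next link carrying the token in the $skiptoken query option."""
    literal = ",".join(format_value(value) for value in token)
    return path + "?$skiptoken=" + quote(literal, safe="',")


link = next_link("/Customers", [1234])
```

The sketch also shows why the token values should stay small: every one of them is written into the link the client has to send back.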

 

some time later…

6 – Client requests the next page

The client sends a request for the next page using the next link it got. So for example /Customers?$skiptoken=1234.

7 – Processing the next page query

The service processes the query as usual, building our custom IQueryable along the way.

8 – Applying the continuation token

At one point during the query processing the service will call the IDataServicePagingProvider.SetContinuationToken. It will pass in the IQueryable it built so far and the continuation token. The continuation token is deserialized from the URL and will contain the values returned by some previous call to GetContinuationToken.

The method needs to “apply” the continuation token to the query. The goal is to modify the query so that it continues from where it left off, as described by the continuation token, and returns the next page of results.

A couple of notes on implementing this:

- SetContinuationToken doesn’t return a new IQueryable; the implementation is expected to modify the existing IQueryable.

- Because the IQueryable might be used for further query construction, the best approach is to just attach the continuation token to the IQueryable at this point without applying it to the query, and then apply it right before the query gets executed, once we are sure we have the entire query.
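The two notes above can be sketched like this. PagedQuery is a hypothetical stand-in for our custom IQueryable, the rows are plain comparable keys, and the token is assumed to hold the last key returned on the previous page.

```python
class PagedQuery:
    """Stand-in for a custom query object that supports deferred paging."""

    def __init__(self, rows, page_size):
        self._rows = sorted(rows)       # stable order by key
        self._page_size = page_size
        self._filters = []
        self.continuation_token = None  # only stored until execution

    def where(self, predicate):
        """Further query construction may still happen after the token is set."""
        self._filters.append(predicate)
        return self

    def execute(self):
        rows = [r for r in self._rows if all(f(r) for f in self._filters)]
        # The token is applied only now, when the whole query is known.
        if self.continuation_token is not None:
            last_key = self.continuation_token[0]
            rows = [r for r in rows if r > last_key]
        return rows[: self._page_size]


def set_continuation_token(query, token):
    """Modify the existing query in place; nothing is returned."""
    query.continuation_token = list(token)


query = PagedQuery(range(1, 11), page_size=3)
set_continuation_token(query, [4])   # the previous page ended at key 4
query.where(lambda r: r % 2 == 0)    # the filter is added after the token
result = query.execute()
```

Even though the filter is added after the token was set, the page still continues after key 4, because the token is only applied inside execute.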

9 – Enumeration of next page results

The service executes the query and enumerates its results. This should again be limited to one page of results, and the enumerator should stop once it hits the page limit.

As in #4 above, GetContinuationToken is called on the enumerator. If it returns a continuation token, the process repeats from #5. If the method returns null (no continuation token), the service will not write a next link into the results, and the client will conclude that it has received all the results for the query.
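Seen from the client, the whole exchange is a loop that ends when no continuation token comes back. The sketch below simulates both sides in memory; serve and fetch_all are hypothetical names, and the token is assumed to be a simple row offset.

```python
def serve(rows, token, page_size):
    """Simulated service: return one page plus the token for the next one."""
    start = token[0] if token else 0
    page = rows[start:start + page_size]
    end = start + len(page)
    next_token = [end] if end < len(rows) else None  # None means no next link
    return page, next_token


def fetch_all(rows, page_size):
    """Simulated client: follow next links until none is returned."""
    results, token, requests = [], None, 0
    while True:
        page, token = serve(rows, token, page_size)
        results.extend(page)
        requests += 1
        if token is None:
            break
    return results, requests


everything, request_count = fetch_all(list(range(7)), page_size=3)
```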

Additional considerations

The query might be more complicated than just a simple /Customers. It may contain filters, sorting, projections and so on. WCF Data Services will include all of these query options in the next link automatically, so the continuation token doesn’t have to remember them.

The paged entity set might not be the top-most entity set of the results for a given query. For example, a query like /Regions?$expand=Customers will also return a result set for Customers. It is the responsibility of the IQueryable to recognize this and apply the paging limits to the expansions as well. And as above, once it returns the enumerator for the expanded set and that enumerator returns no more results, the service will call GetContinuationToken on it to get the continuation token for the inner set.

So GetContinuationToken might get called multiple times while processing a single query: once for the top-most result set and once for each expanded result set. SetContinuationToken is only ever called once for a query, and only if the query contained the $skiptoken query option.
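A small sketch of the expansion case, assuming in-memory data and a row-count page limit. The helper names and the sample regions are made up; token_calls counts how often a token is requested, mirroring the repeated GetContinuationToken calls.

```python
def page_of(rows, page_size):
    """Return the first page of rows and a token if anything was left out."""
    page = rows[:page_size]
    token = [len(page)] if len(rows) > len(page) else None
    return page, token


def regions_with_customers(regions, page_size):
    """Page the top-level set and every expanded Customers set separately."""
    token_calls = 0
    names, top_token = page_of(sorted(regions), page_size)
    token_calls += 1  # one token request for the top-most result set
    expanded = []
    for name in names:
        customers, inner_token = page_of(regions[name], page_size)
        token_calls += 1  # one more for each expanded result set
        expanded.append({"region": name, "customers": customers, "next": inner_token})
    return expanded, top_token, token_calls


# Hypothetical sample data.
regions = {"East": ["a", "b", "c"], "West": ["d"]}
expanded, top_token, token_calls = regions_with_customers(regions, page_size=2)
```

Here the top-level set fits into one page, while the customers of East do not, so only that inner set gets a continuation token.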

Once IDataServicePagingProvider is implemented, the built-in paging provider is disabled, and calls to SetEntitySetPageSize will have no effect. So it’s either all custom or all built-in.

If we don’t want to implement paging for a particular entity set, we just return null from GetContinuationToken when it is called on the enumeration of that entity set.
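Opting a set out of paging can be sketched like this. The PAGED_SETS collection and the ResultEnumerator class are hypothetical; the only point is that returning no token for a set means no next link is produced for it.

```python
from dataclasses import dataclass

# Hypothetical configuration: only these entity sets are paged.
PAGED_SETS = {"Customers"}


@dataclass
class ResultEnumerator:
    """Stand-in for the state our custom enumerator exposes."""
    set_name: str
    position: int
    total: int


def get_continuation_token(enumerator):
    if enumerator.set_name not in PAGED_SETS:
        return None  # this set is never paged
    if enumerator.position >= enumerator.total:
        return None  # paged set, but this was the last page
    return [enumerator.position]
```

The enumerator for a non-paged set must of course return all of its results, since the client will not be given a link to ask for more.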

All in all, implementing a custom paging provider is a complex task, but it opens up some very interesting and powerful techniques for the service.