Paging in SSDS & Parallel Queries

Tim Jarvis raised a good point on my post about cross-container queries: how do you handle paging? SSDS currently supports a very simple paging pattern based on the entityId. By design, a query returns at most 500 entities, always in entityId order. So, you can get the next 500 entities with a query like:

from e in entities where e["FieldName"] == "Field Value" && e.Id > "LastId" select e

You can then iterate over the returned list of entities and check how many you got: fewer than 500 means there are no more, while exactly 500 means you should query again, repeating until a query returns fewer than 500. In each iteration you update LastId to the Id of the last entity returned, of course.
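The loop described above can be sketched in a few lines of Python. This is only an illustration: `make_fake_container` and `run_query` are hypothetical stand-ins for a real SSDS call (they are not part of any SSDS API), and the sketch assumes ids compare as plain strings, so that an empty string sorts before every real id.

```python
from typing import Callable, Dict, Iterator, List

PAGE_SIZE = 500  # maximum entities per query, as described above

Entity = Dict[str, str]


def make_fake_container(ids: List[str]) -> Callable[[str], List[Entity]]:
    """Hypothetical in-memory stand-in for an SSDS container.

    The returned function mimics a query with `e.Id > last_id`: it gives back
    at most PAGE_SIZE entities, ordered by Id.
    """
    ordered = sorted(ids)

    def run_query(last_id: str) -> List[Entity]:
        matches = [i for i in ordered if i > last_id]
        return [{"Id": i} for i in matches[:PAGE_SIZE]]

    return run_query


def fetch_all(run_query: Callable[[str], List[Entity]]) -> Iterator[Entity]:
    """Yield every matching entity, one page at a time."""
    last_id = ""  # assumption: the empty string sorts before any real id
    while True:
        page = run_query(last_id)
        yield from page
        if len(page) < PAGE_SIZE:
            return  # a short page means there is nothing left
        last_id = page[-1]["Id"]  # resume after the last entity seen


container = make_fake_container([f"{n:05d}" for n in range(1200)])
entities = list(fetch_all(container))
```

Note that when the total is an exact multiple of 500, the loop issues one extra query that comes back empty. That is the price of not knowing the total in advance.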

Now, back to parallel queries: you could add an extra parameter specifying the maximum total number of entities you expect as a result, monitor the number retrieved by each thread, and cancel any outstanding (parallel) queries once the limit is reached.
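Here is one way that idea might look, as a rough Python sketch rather than a definitive implementation. The names `query_page` and `parallel_query` are hypothetical, each container is faked as a sorted list of ids, and "cancel" is cooperative: a shared flag stops workers before their next page, but a request already in flight is not aborted.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import List

PAGE_SIZE = 500  # maximum entities per query


def query_page(container: List[str], last_id: str) -> List[str]:
    """Hypothetical stand-in for one SSDS query against one container."""
    return [i for i in container if i > last_id][:PAGE_SIZE]


def parallel_query(containers: List[List[str]], max_total: int) -> List[str]:
    """Page through all containers in parallel, stopping at max_total."""
    results: List[str] = []
    lock = threading.Lock()
    stop = threading.Event()  # set once the limit has been reached

    def worker(container: List[str]) -> None:
        last_id = ""
        while not stop.is_set():
            page = query_page(container, last_id)
            with lock:
                room = max_total - len(results)
                results.extend(page[:room])  # never exceed the limit
                if len(results) >= max_total:
                    stop.set()  # tell the other workers to stop
            if len(page) < PAGE_SIZE:
                return  # this container is exhausted
            last_id = page[-1]

    with ThreadPoolExecutor(max_workers=max(1, len(containers))) as pool:
        # list() forces completion and re-raises any worker exception
        list(pool.map(worker, containers))
    return results


containers = [
    [f"{name}-{n:05d}" for n in range(1200)] for name in ("a", "b", "c")
]
limited = parallel_query(containers, max_total=1000)
```

The order of the combined results depends on thread scheduling, so this only fits cases where any `max_total` matching entities will do.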

If you want to implement a "get me the next 1000" across containers, then of course you need a way to keep track of the LastId for each container. You would probably return this list as a "query state object" that the caller passes back to the method on the next call.
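A minimal sketch of such a query state object, assuming it is nothing more than a dictionary mapping container name to LastId. The function names and the in-memory containers are hypothetical, and the containers are visited one after another to keep the example short.

```python
from typing import Dict, List, Tuple

PAGE_SIZE = 500  # maximum entities per query

QueryState = Dict[str, str]  # container name -> last Id handed to the caller


def query_page(ids: List[str], last_id: str) -> List[str]:
    """Hypothetical stand-in for one SSDS query against one container."""
    return [i for i in ids if i > last_id][:PAGE_SIZE]


def next_batch(
    containers: Dict[str, List[str]], state: QueryState, count: int
) -> Tuple[List[str], QueryState]:
    """Return up to `count` more entities plus the state for the next call."""
    new_state = dict(state)  # leave the caller's state object untouched
    batch: List[str] = []
    for name, ids in containers.items():
        while len(batch) < count:
            page = query_page(ids, new_state.get(name, ""))
            taken = page[: count - len(batch)]
            batch.extend(taken)
            if taken:
                # advance only past entities actually handed to the caller
                new_state[name] = taken[-1]
            if len(page) < PAGE_SIZE:
                break  # this container has nothing more for now
        if len(batch) >= count:
            break
    return batch, new_state


containers = {
    "a": [f"a-{n:05d}" for n in range(700)],
    "b": [f"b-{n:05d}" for n in range(700)],
}
first, state = next_batch(containers, {}, 1000)
second, state = next_batch(containers, state, 1000)
third, state = next_batch(containers, state, 1000)
```

Because the state records only ids the caller has actually received, a page that was fetched but only partly consumed is simply queried again on the next call.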

In any case, in my experience, in most interactive search scenarios a result set larger than 2 or 3 pages means you are not filtering adequately, and maybe another approach is better: "hey, we've found lots of entities matching those search criteria, maybe you should refine them?"