HealthVault: Batching up queries


When I first started using the HealthVault SDK, I wrote some code like this, based on what I had seen before:


HealthRecordSearcher searcher = PersonInfo.SelectedRecord.CreateSearcher();
HealthRecordFilter filter = new HealthRecordFilter(Height.TypeId);
searcher.Filters.Add(filter);


HealthRecordItemCollection items = searcher.GetMatchingItems()[0];


So, what’s up with indexing into the result from GetMatchingItems()? Why isn’t it simpler?


The answer is that multiple queries can be batched into a single request; you add one filter per query to the searcher, and they all execute in one round trip. So, if we want to, we can write the following:


HealthRecordSearcher searcher = PersonInfo.SelectedRecord.CreateSearcher();

HealthRecordFilter filterHeight = new HealthRecordFilter(Height.TypeId);
searcher.Filters.Add(filterHeight);


HealthRecordFilter filterWeight = new HealthRecordFilter(Weight.TypeId);
searcher.Filters.Add(filterWeight);
 


ReadOnlyCollection<HealthRecordItemCollection> results = searcher.GetMatchingItems();


HealthRecordItemCollection heightItems = results[0];
HealthRecordItemCollection weightItems = results[1];


Based on a partner question today, I got interested in the performance advantages of batching queries. So, I wrote a short test application that compared fetching 32 single Height values either serially or in batches.
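Here's roughly what that test looked like. This is a sketch, not the exact test application: it assumes the HealthVault .NET SDK and an already-authenticated HealthRecordInfo, and the method and variable names are mine.

```csharp
using System;
using System.Diagnostics;
using Microsoft.Health;
using Microsoft.Health.ItemTypes;

static class BatchTimer
{
    // Fetch totalItems Height values, batchSize filters per request,
    // and return the elapsed time in seconds. With batchSize == 1 every
    // item is a separate round trip; with batchSize == totalItems the
    // whole fetch is a single request.
    static double TimeFetch(HealthRecordInfo record, int batchSize, int totalItems)
    {
        Stopwatch watch = Stopwatch.StartNew();

        for (int i = 0; i < totalItems; i += batchSize)
        {
            HealthRecordSearcher searcher = record.CreateSearcher();
            for (int j = 0; j < batchSize; j++)
            {
                searcher.Filters.Add(new HealthRecordFilter(Height.TypeId));
            }
            searcher.GetMatchingItems();   // one round trip per batch
        }

        watch.Stop();
        return watch.Elapsed.TotalSeconds;
    }
}
```

The numbers you see will of course vary with your network and server load.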


Here’s what I saw:


Batch Size    Time in seconds
     1             0.98
     2             0.51
     4             0.28
     8             0.16
    16             0.10
    32             0.08

This is a pretty impressive result: if you need to fetch 4 different items, it's nearly 4 times faster to batch the fetches together than to do them independently. Why is the speedup so big?


Well, to do a fetch, the following things have to happen:



  1. The request is created on the web server

  2. It is transmitted across the net to HealthVault servers

  3. The request is decoded, executed, and a response is created

  4. It is transmitted back to the web server

  5. The web server unpackages it

When a filter returns small amounts of data, steps 1, 3, and 5 are pretty fast, but steps 2 and 4 involve network latency, which dominates the elapsed time. So, the batching eliminates those chunks of time, and we get a nice speedup.
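To put rough numbers on that, we can model the total time as (number of requests) × (round-trip latency) plus a fixed processing overhead. The first table fits well with about 29 ms of latency per round trip and about 50 ms of overhead; both constants are my estimates fit to the data above, not measured values.

```csharp
using System;

class LatencyModel
{
    static void Main()
    {
        // Estimated by fitting the first table above (assumptions, not measurements):
        const double latencySeconds = 0.029;  // network round trip per request
        const double overheadSeconds = 0.05;  // fixed create/decode/unpack time
        const int totalItems = 32;

        foreach (int batchSize in new[] { 1, 2, 4, 8, 16, 32 })
        {
            double requests = (double)totalItems / batchSize;
            double predicted = requests * latencySeconds + overheadSeconds;
            Console.WriteLine("batch {0,2}: predicted {1:F2} s", batchSize, predicted);
        }
    }
}
```

This predicts 0.98, 0.51, 0.28, 0.17, 0.11, and 0.08 seconds, quite close to the measured values, which is consistent with latency being the dominant cost.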


We would therefore expect that as we fetch more data in each request, batching would be less useful, because transfer and processing time start to crowd out the latency. Here is some data for the same test, but with each query fetching 16 items:


Batch Size    Time in seconds
     1             1.40
     2             0.91
     4             0.66
     8             0.49
    16             0.42
    32             0.39


Which is pretty much what you would expect: with more data in each request, the latency saved by batching is a smaller fraction of the total, so the speedup shrinks.