More on parallel queries across containers in SSDS

Our current implementation of cross-Container queries follows a very common pattern and roughly looks like this (no exception handling shown for simplicity):

public List<T> CrossContainerSearch( string[] containerIds, 
                                     string query, 
                                     SearchDelegate searchDelegate )
{
    // One event per container; each worker signals its own event when done.
    ManualResetEvent[] events = new ManualResetEvent[ containerIds.Length ];
    Results<T> results = new Results<T>();
    for( int x = 0; x < containerIds.Length; x++ )
    {
        // Package everything the worker needs into a State object.
        State s = new State();
        events[ x ] = s.Event = new ManualResetEvent( false );
        s.Query = query;
        s.Results = results;
        s.ContainerId = containerIds[ x ];
        s.SearchDelegate = searchDelegate;
        ThreadPool.QueueUserWorkItem( new WaitCallback( Worker ), s );
    }

    // Block until every worker has signaled its event.
    WaitHandle.WaitAll( events );
    return results;
}

 

The State object is used to pass parameters to the worker (e.g. the query to execute, the delegate that the worker method will call to actually perform the search, the event to signal when the query is done, etc.).

The search workers are queued in the ThreadPool; eventually a pool thread picks each one up and executes it. WaitHandle.WaitAll simply blocks until every scheduled callback has signaled its event, so when it unblocks, all queries have completed. This is equivalent to a call to the Win32 WaitForMultipleObjects API.

The worker method looks like this:

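void Worker( object s ) 
{ 
   // Recover the state packaged by CrossContainerSearch.
   State state = (State) s; 
   // Assumes Results.AddRange is thread-safe; several workers call it concurrently.
   state.Results.AddRange( state.SearchDelegate( state.ContainerId, state.Query ) ); 
   state.Event.Set(); 
} 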

First the worker recovers the state, then it calls the search delegate with the container id and the query, stores the results, and finally sets the event to signal completion.
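
For readers who want to run the pattern end to end, here is a minimal self-contained sketch of it. It is not the actual SSDS code: the State fields, the non-generic SearchDelegate signature, the CrossContainerDemo class and the FakeSearch stand-in are all assumptions made for illustration, and a plain List<string> guarded by a lock stands in for the Results type.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Assumed delegate shape: run one query against one container.
public delegate List<string> SearchDelegate(string containerId, string query);

// Hypothetical State type: one instance per queued query.
public class State
{
    public string ContainerId;
    public string Query;
    public SearchDelegate SearchDelegate;
    public List<string> Results;    // shared by all workers
    public ManualResetEvent Event;  // signaled when this query is done
}

public static class CrossContainerDemo
{
    static void Worker(object s)
    {
        State state = (State)s;
        List<string> found = state.SearchDelegate(state.ContainerId, state.Query);
        lock (state.Results)        // List<T> is not thread-safe
        {
            state.Results.AddRange(found);
        }
        state.Event.Set();
    }

    public static List<string> CrossContainerSearch(string[] containerIds,
                                                    string query,
                                                    SearchDelegate searchDelegate)
    {
        var events = new ManualResetEvent[containerIds.Length];
        var results = new List<string>();
        for (int x = 0; x < containerIds.Length; x++)
        {
            events[x] = new ManualResetEvent(false);
            var s = new State
            {
                ContainerId = containerIds[x],
                Query = query,
                SearchDelegate = searchDelegate,
                Results = results,
                Event = events[x]
            };
            ThreadPool.QueueUserWorkItem(new WaitCallback(Worker), s);
        }

        WaitHandle.WaitAll(events); // blocks until every worker has called Set()
        return results;
    }

    // Stand-in for a real container query (illustration only).
    public static List<string> FakeSearch(string containerId, string query)
    {
        return new List<string> { containerId + ":" + query };
    }
}
```

Calling CrossContainerDemo.CrossContainerSearch( new[] { "a", "b", "c" }, "q", CrossContainerDemo.FakeSearch ) returns one entry per container, in whatever order the workers happened to finish. Keep in mind that WaitHandle.WaitAll accepts a limited number of handles (64 on Windows), so this shape only works for a modest number of containers per call.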

My friend Arvindra suggested using CCR (Concurrency and Coordination Runtime) for this. I'll definitely take a look at that.

In the meantime, I re-implemented the same code using the .NET Parallel Extensions, which simplify things greatly. A lot of plumbing goes away, and the code is much easier to read. It roughly looks like this:

 

public List<T> CrossContainerSearch( string[] containerIds, 
                                     string query, 
                                     SearchDelegate searchDelegate )
{
    Results<T> results = new Results<T>();
    // Parallel.For schedules the iterations and returns once all have completed.
    Parallel.For( 0, containerIds.Length, i =>
    {
        results.AddRange( searchDelegate( containerIds[ i ], query ) );
    } );

    return results;
}

 

Notice that the scheduling, join, state management, etc. are all handled by Parallel.For. Hard to beat in simplicity, isn't it? If you still want deeper control over task scheduling, then the Task & TaskCoordinator types are your friends.
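
One thing Parallel.For does not handle is synchronizing access to the shared results collection: the loop body runs on several threads at once, and a plain List<T> is not safe for concurrent AddRange calls. Below is a minimal runnable sketch of the same method with an explicit lock. It assumes the Parallel.For overload in System.Threading.Tasks as shipped with .NET 4 (the API in the preview bits may differ), and the SearchDelegate signature, the ParallelSearchDemo class and the gate object are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Assumed delegate shape: run one query against one container.
public delegate List<string> SearchDelegate(string containerId, string query);

public static class ParallelSearchDemo
{
    public static List<string> CrossContainerSearch(string[] containerIds,
                                                    string query,
                                                    SearchDelegate searchDelegate)
    {
        var results = new List<string>();
        var gate = new object();    // guards the shared list

        // Parallel.For returns only after every iteration has completed,
        // so no explicit events or WaitAll call are needed.
        Parallel.For(0, containerIds.Length, i =>
        {
            // Run the (potentially slow) query outside the lock.
            List<string> found = searchDelegate(containerIds[i], query);
            lock (gate)
            {
                results.AddRange(found);
            }
        });

        return results;
    }
}
```

The query itself runs outside the lock, so the containers are still searched in parallel; only the short merge step is serialized. If Results already synchronizes internally, the lock is unnecessary.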

I will continue researching this and publish my findings. I'm quite curious about the performance numbers, so I'll be working on more formal tests. All I need now is a machine with 16 cores ;-).

The Parallel Programming blog is here.