Retrieving more than 1000 rows from windows azure storage



I recently hit an issue with my code when trying to retrieve
data from Windows Azure Diagnostics table, 
I noticed that I’m getting only 1000 rows every time even though I have
not set any limit on the number of rows I want to retrieve, it turns out that
there is a limit to 1000 rows for windows azure storage.

The limit is not only on the number of rows, there is also a limit on the amount of time it takes the query to execute, the
Table service may return a maximum of 1,000 items at one time and make take only up to 5 seconds. If the result set contains more than 1,000 items or takes too long to execute the table service will return a Continuation Token  that you can use to retrieve the rest of your
rows data, when you received all your data your token is null.

Originally this was my code:

1: WADPerformanceCountersTableContext wadContext = new WADPerformanceCountersTableContext(cloudStorageAccount.TableEndpoint.ToString(), cloudStorageAccount.Credentials);

   2:   
   3:  var items = from x in wadContext.WADPerformanceCountersTable
   4:  where x.EventTickCount >= start.Ticks && x.EventTickCount <= end.Ticks
   5:  select x;
   6:   
   7:  foreach (WADPerformanceCountersTable item in items)
   8:  {
   9:     //use item
  10:  }


This is the code after adding the use for continuation token:

1: ListRowsContinuationToken continuationToken = null;
   2:   
   3:  CloudStorageAccount cloudStorageAccount = CloudStorageAccount.Parse(connectionString);
   4:             
   5:  WADPerformanceCountersTableContext wadContext = new WADPerformanceCountersTableContext(cloudStorageAccount.TableEndpoint.ToString(), cloudStorageAccount.Credentials);
   6:   
   7:   
   8:  do
   9:  {
  10:      //Query using the start time and end time, take only n rows
  11:      //The response headers will contain a continuation key that can be used to retrieve the next n rows
  12:      var allItems = (from x in wadContext.WADPerformanceCountersTable
  13:          where x.EventTickCount >= start.Ticks && x.EventTickCount <= end.Ticks
  14:          select x).Take(c_pageSize);
  15:   
  16:      var query = allItems as DataServiceQuery<WADPerformanceCountersTable>;
  17:   
  18:      if (continuationToken != null)
  19:      {
  20:          query = query.AddQueryOption("NextPartitionKey", continuationToken.PartitionKey);
  21:          if (continuationToken.RowKey != null)
  22:          {
  23:             query = query.AddQueryOption("NextRowKey", continuationToken.RowKey);
  24:          }
  25:      }
  26:   
  27:      var response = query.Execute() as QueryOperationResponse;
  28:   
  29:      foreach (WADPerformanceCountersTable item in allItems)
  30:      {
  31:          //use item     
  32:      }
  33:      //now check if there are more rows left to be retrieved, if there are more rows execute another request to get the remaining items
  34:      if (response.Headers.ContainsKey("x-ms-continuation-NextPartitionKey"))
  35:      {
  36:          continuationToken = new ListRowsContinuationToken();
  37:          continuationToken.PartitionKey = response.Headers["x-ms-continuation-NextPartitionKey"];
  38:          if (response.Headers.ContainsKey("x-ms-continuation-NextRowKey"))
  39:          {
  40:             continuationToken.RowKey = response.Headers["x-ms-continuation-NextRowKey"];
  41:          }
  42:      }
  43:      else
  44:      {
  45:          continuationToken = null;
  46:      }
  47:   
  48:  } while (continuationToken != null);
  49:   

 

ListRowsContinuationToken is a simple class defined as
follows:

   1:  public class ListRowsContinuationToken
   2:  {
   3:      public string PartitionKey { get; set; }
   4:      public string RowKey { get; set; }
   5:  }



Finally, I found an even easier way to get this done! The trick is to convert the query into a CloudTableQuery – once you do this you will be able to retrieve all rows with a single call (just make sure that this is what you really want to do)

   1:  CloudStorageAccount cloudStorageAccount = CloudStorageAccount.Parse(connectionString);
   2:  WADPerformanceCountersTableContext wadContext = new WADPerformanceCountersTableContext(cloudStorageAccount.TableEndpoint.ToString(), cloudStorageAccount.Credentials);
   3:   
   4:  Console.Out.WriteLine("Reading data from WAD storage");
   5:   
   6:  //Query using the start time and end time, take only 1000 rows
   7:  //The response headers will contain a continuration key that can be used to retrieve the next 1000 rows
   8:  var query = (from x in wadContext.WADPerformanceCountersTable
   9:               where x.EventTickCount >= start.Ticks && x.EventTickCount <= end.Ticks
  10:               select x);
  11:   
  12:   
  13:  var allItemsAsTableService = query.AsTableServiceQuery();
  14:  IEnumerable<WADPerformanceCountersTable> allItems = allItemsAsTableService.Execute();
  15:   
  16:  foreach (WADPerformanceCountersTable item in allItems)
  17:  {
  18:  //use item
  19:  }


I chose the last solution because I needed minimal changes to my code, and I didn’t really need paging, if you need paging, continuation token will be the way to go!

Comments (4)

  1. Matt says:

    Where are you getting the WADPerformanceCountersTableContext & WADPerformanceCountersTable from?  What namespace is this available under?

  2. Riham says:

    Hi Matt,

    You can implement these yourself,

    public class WADPerformanceCountersTable : TableServiceEntity

    {

       public long EventTickCount { get; set; }

       public string DeploymentId { get; set; }

       public string Role { get; set; }

       public string RoleInstance { get; set; }

       public string CounterName { get; set; }

       public double CounterValue { get; set; }

    }

    public class WADPerformanceCountersTableContext : TableServiceContext

    {

       public WADPerformanceCountersTableContext(string baseAddress, StorageCredentials credentials)

           : base(baseAddress, credentials)

       { }

       public IQueryable<WADPerformanceCountersTable> WADPerformanceCountersTable

       {

           get

           {

               return this.CreateQuery<WADPerformanceCountersTable>("WADPerformanceCountersTable");

           }

       }

    }

  3. Sergei Almazov says:

    For sake of future readers landing on this page: it's much easier to use AsTableServiceQuery() extension method that encapsulates several query features of Azure Table Service including continuation tokens processing. So initial code can take advantage of all continuation tokens just by replacing the following lines:

    var items =  from x in wadContext.WADPerformanceCountersTable

      where x.EventTickCount >= start.Ticks && x.EventTickCount <= end.Ticks

      select x;

    with

    var items =  (from x in wadContext.WADPerformanceCountersTable

      where x.EventTickCount >= start.Ticks && x.EventTickCount <= end.Ticks

      select x).AsTableServiceQuery();

  4. Riham says:

    Hi Sergei, that's the recommended solution:

    var query = (from x in wadContext.WADPerformanceCountersTable

          where x.EventTickCount >= start.Ticks && x.EventTickCount <= end.Ticks  select x);  

    var allItemsAsTableService = query.AsTableServiceQuery();