Improving Windows Azure Table Storage query performance

With Windows Azure Table Storage, the data is stored in a way which can be searched through queries. By design, Windows Azure Table Storage provides the potential to store large amounts of data, while enabling efficient query access because:

  • No relations between tables (or entities sets)
  • Storing datasets that don't require complex joins, foreign keys, or stored procedures and can be denormalized for fast access
  • Quickly querying data using a clustered index
  • Tables doesn't have a specified schema

There are a number of good practices to follow when designing entities to get the best performance with your Azure Table Storage. However In this blog I would like to share my best practice as described below:

Query tables using both Partition and Row Key to achieve best performance with Windows Azure Table Storage.

In Windows Azure Tables, the string PartitionKey and RowKey properties work together as an index for your table. So when using Partition and Row Keys, the storage will use its index to find results really fast, while when using other entity properties the storage will result in table scan, significantly reducing performance. So from performance perspective, querying on the Partition Key is good, querying on both Partition and Row Key is better, querying only on Row Key is not good (will result in table scans).

Listing: Examples of Azure Table Storage query:

a) Querying on both Partition and Row Key: Query results in retrieving specific/selective data:

var q = (from movie in context.CreateQuery<Movie>(tableName)

where movie.PartitionKey == "Action" && movie.RowKey == "Terminator" select movie);

 

b) Full Table Scan: Query results in scanning the entire table:

var q = (from movie in context.CreateQuery<Movie>(tableName)

where movie.RowKey.CompareTo("Sh") >= 0 select movie);

 

How to collect Table Storage Metrics data from Windows Azure Management Portal:

Above information can be saved into a local .csv file for further investigation. In csv file you would be able use metrics specific data to understand more about your Azure Table Storage performance and find proper solutions to improve it as needed. The definition of Azure Table Storage columns are described below: