Essential Knowledge for Azure Table Storage


This post is devoted to Azure tables.
image

  1. Azure tables are used to store non-relational structured data at massive scale

image

What you get         How much
Compute              750 small compute hours per month
Web sites            10 web sites
Mobile services      10 mobile services
Relational database  1 SQL database
SQL reporting        100 hours per month
Storage              35 GB with 50,000,000 storage transactions
Bandwidth            Unlimited inbound and 25 GB outbound
CDN                  20 GB outbound with 50,000 transactions
Cache                128 MB
Service bus          1,500 relay hours and 500,000 messages

Azure Storage Options

image

  1. The Microsoft cloud offers many storage options. We will focus on Azure tables.
  2. Here is what we'll cover:
    • When to use Azure Tables
    • When they are appropriate to consider
    • Understanding that Azure Tables are a collection of entities
    • Access Azure Tables directly or through a cloud application
    • Key Features of Azure Tables
    • Relationship between accounts, Tables, and entities
    • Efficient Inserts and Updates
    • Designing for scale
    • Query Design and Performance
    • Understanding Partition Keys
    • How data is partitioned
    • Coding considerations
    • Azure Table Query Concepts
    • Understanding TableServiceEntity/TableServiceContext
    • Additional Resources

When to use Azure tables

image

  1. These are some typical use cases for Azure tables.
  2. Azure tables are optimized for capacity and performance (scale).

Azure Tables : When Appropriate

image

  1. SQL Database is limited currently to 150 GB without federation. Federation can be used to increase the size beyond 150 GB.
  2. If your code requires strong relational semantics, Azure tables are not appropriate. They don't allow for join statements.
  3. You can think of Azure tables as nothing more than a collection of objects. Note that each entity (similar to a row in a table) could have different attributes. In the diagram above, the second entity does not have a city property.
  4. One of the advantages of Azure Tables is that you can replicate data across data centers, which aids disaster recovery.

Tables: A collection of entities

image

  1. A table is a collection of entities.
  2. An entity is like an object. It has name/value pairs.
  3. An entity is kind of like a row in a relational database table, with the caveat that entities don't need to have the exact same attributes.
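
The idea that entities in one table need not share the same attributes can be sketched with plain Python dictionaries. This is only an in-memory illustration, not the storage API; the table contents and the helper names `property_names` and `entities_missing` are made up for the example.

```python
# A table modeled as a list of entities; each entity is a dict of
# name/value pairs. All names and values here are illustrative assumptions.
customers = [
    {"PartitionKey": "Smith", "RowKey": "John",
     "Email": "john@example.com", "City": "Seattle"},
    {"PartitionKey": "Smith", "RowKey": "Ann",
     "Email": "ann@example.com"},  # this entity has no City property
]


def property_names(table):
    """Return the sorted union of property names used by any entity."""
    names = set()
    for entity in table:
        names.update(entity)
    return sorted(names)


def entities_missing(table, name):
    """Return the entities that do not define the given property."""
    return [entity for entity in table if name not in entity]
```

Calling `entities_missing(customers, "City")` returns only the second entity, which is perfectly legal in a table even though it would violate a fixed relational schema.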

Accessing Azure Table Storage Directly

image

  1. Any application that can speak HTTP can communicate with Azure tables, because Azure tables are REST-based. This means a Java or PHP application can directly perform CRUD (create, read, update, delete) operations on an Azure Table.

Accessing Azure Table Storage From Azure

image

  1. Azure cloud applications can be hosted in the same data center as the Azure Table Storage they use. The compelling point is that latency between the application and the table is very low, so the application can read and update data at very high speeds.

Features: Azure Table Storage

image

  1. One of the key features of Azure tables is the low cost. You can use the Pricing Calculator to determine your predicted costs at https://www.windowsazure.com/en-us/pricing/calculator/
  2. It is important to remember that Azure tables are non-relational and therefore joins are not possible.
  3. Azure tables can automatically span multiple storage nodes, maintaining performance. This is based on the partition key that you define. It is very important to consider the partition key carefully as it determines performance.
  4. Transactions can occur only within a single partition, that is, among entities that share the same Partition Key. This is another reason to choose Partition Keys carefully.
  5. The data is replicated three times, and replicas can also be kept in an alternate data center.

Relationships among accounts, tables, and entities

image

  1. Note that an account can have multiple tables and that each table can have one or more entities.
  2. Note the URL that is used to access your tables. Any HTTP-capable client can use this URL.
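
As a rough sketch of how the account, table, and entity levels map onto a URL, the helpers below build addresses in the `https://<account>.table.core.windows.net/<table>` form used by the Table service. The account name, table name, and function names are placeholders chosen for this example. Real requests also need authentication headers, which are not shown here.

```python
from urllib.parse import quote


def table_url(account, table):
    """Address of a table inside a storage account."""
    return f"https://{account}.table.core.windows.net/{table}"


def _key_literal(value):
    # Keys are written as quoted string literals: a single quote inside
    # the value is doubled, then everything else is percent-encoded.
    return quote("'" + value.replace("'", "''") + "'", safe="'")


def entity_url(account, table, partition_key, row_key):
    """Address of one entity, identified by partition key and row key."""
    return (
        f"{table_url(account, table)}"
        f"(PartitionKey={_key_literal(partition_key)},"
        f"RowKey={_key_literal(row_key)})"
    )
```

For example, `entity_url("myaccount", "Emails", "hotmail.com", "bob")` produces an address that a client could target with GET, PUT, or DELETE once the request is signed.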

Efficient Inserts and Updates

image

  1. Special semantics are available to make inserts and updates efficient. The bottom line is that you can do either an update or insert in just one operation.
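
The "insert or update in one operation" idea corresponds to what the Table service calls Insert Or Replace and Insert Or Merge. The sketch below imitates those two behaviors on an in-memory dictionary so the difference is visible. It is an illustration under that assumption, not client-library code, and the function names are invented for the example.

```python
def _key(entity):
    """Entities are addressed by the (PartitionKey, RowKey) pair."""
    return (entity["PartitionKey"], entity["RowKey"])


def insert_or_replace(table, entity):
    """Insert the entity, or replace the stored one entirely."""
    table[_key(entity)] = dict(entity)


def insert_or_merge(table, entity):
    """Insert the entity, or merge its properties into the stored one.

    Properties that exist only on the stored entity are kept.
    """
    table.setdefault(_key(entity), {}).update(entity)


# Usage: no existence check (read) is needed before writing.
table = {}
insert_or_merge(table, {"PartitionKey": "p", "RowKey": "r", "City": "Seattle"})
insert_or_merge(table, {"PartitionKey": "p", "RowKey": "r", "Email": "a@example.com"})
merged = dict(table[("p", "r")])   # has both City and Email

insert_or_replace(table, {"PartitionKey": "p", "RowKey": "r", "Email": "b@example.com"})
replaced = dict(table[("p", "r")])  # City is gone after the replace
```

Either way, the caller makes a single call instead of a query followed by a separate insert or update.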

Designing For Scale

image

  1. The PartitionKey and RowKey are required properties for each entity. They play a key role in how the data is partitioned and scaled. They also determine performance for various queries. As mentioned previously, they also play a role in transactions (transactions cannot span Partition Keys).
  2. How to issue efficient queries will be addressed later in this post.
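
Because a transaction cannot span Partition Keys, code that writes many entities usually has to group them by partition first. Here is a minimal sketch of that grouping step. The helper name is hypothetical, and the limit of 100 entities per batch is the documented limit for entity group transactions as I understand it, so treat it as an assumption to verify.

```python
from collections import defaultdict

MAX_BATCH_SIZE = 100  # assumed per-transaction entity limit


def group_into_batches(entities, max_batch_size=MAX_BATCH_SIZE):
    """Split entities into batches that each share one PartitionKey."""
    by_partition = defaultdict(list)
    for entity in entities:
        by_partition[entity["PartitionKey"]].append(entity)

    batches = []
    for group in by_partition.values():
        # A large partition is cut into several batches of allowed size.
        for start in range(0, len(group), max_batch_size):
            batches.append(group[start:start + max_batch_size])
    return batches
```

Each returned batch could be submitted as one transaction. Entities in different batches are not updated atomically with respect to each other, which is one reason the choice of partition key shapes the design.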

Query Design & Performance

image

  1. Performance is always an important consideration. The spectrum of speed varies considerably, depending on the type of query you issue. Specific examples are provided later in this post.

Understanding Partition Keys

image

  1. This slide illustrates how your entities get distributed across partition nodes. Note that the partition key determines how data is spread across storage nodes.

How Data is Partitioned

image

  1. The key point here is that every entity is uniquely identified by the combination of partition key and row key. You can think of the partition key and row key together as being similar to a primary key in a relational table.
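
The uniqueness rule can be sketched with a dictionary keyed by the (partition key, row key) pair. This is an in-memory model for illustration only, and `DuplicateEntityError` is a name invented for this example. The real service rejects a plain insert of an existing key pair with a conflict error.

```python
class DuplicateEntityError(Exception):
    """Raised when an entity with the same key pair already exists."""


def insert(table, entity):
    """Insert an entity, enforcing uniqueness of (PartitionKey, RowKey)."""
    key = (entity["PartitionKey"], entity["RowKey"])
    if key in table:
        raise DuplicateEntityError(key)
    table[key] = entity


def get(table, partition_key, row_key):
    """Look up one entity by its full key; returns None if absent."""
    return table.get((partition_key, row_key))
```

The same row key may appear in two different partitions without conflict, because only the pair has to be unique.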

How Data is Partitioned (continued)

image

  1. Azure automatically manages both the partitioning and the replication of your entities. What remains up to you, and what matters most, is choosing the partition key and row key carefully.

Coding Considerations

image

  1. Note that Query 1 is fast because it performs an exact match on partition key and row key. It returns only one entity.
  2. Query 2 is slower than Query 1 because it does a range-based query.
  3. Query 3 is slower than Query 2 because it doesn't leverage the row key.
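
The ordering described above depends only on which keys a query constrains. The sketch below classifies a query from that information. The function, its parameters, and the category labels are assumptions made for illustration; the actual queries live in the slide image and are not reproduced here.

```python
# Categories ordered from fastest to slowest.
QUERY_KINDS = ["point query", "row range scan", "partition scan", "table scan"]


def classify_query(partition_key=None, row_key=None, row_key_range=None):
    """Classify a query by the keys it constrains.

    partition_key: exact partition key value, or None if not specified
    row_key:       exact row key value, or None
    row_key_range: (low, high) bounds on the row key, or None
    """
    if partition_key is None:
        return "table scan"          # every partition must be examined
    if row_key is not None:
        return "point query"         # exact match on both keys
    if row_key_range is not None:
        return "row range scan"      # one partition, a slice of its rows
    return "partition scan"          # one partition, all of its rows


def is_faster(kind_a, kind_b):
    """True if kind_a is expected to be faster than kind_b."""
    return QUERY_KINDS.index(kind_a) < QUERY_KINDS.index(kind_b)
```

A quick check such as `classify_query(partition_key="hotmail.com")` during design review can flag queries that ignore the row key before they reach production.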

Azure Table Query Concepts

image

  1. Queries 4 and 5 are very slow because they don't use the partition key. This is equivalent to a full table scan in SQL Server, and you should avoid it wherever possible. You may need to reconsider your partition keys and row keys if you find yourself issuing these types of queries.
  2. You may even want to keep duplicate copies of your data in other tables that are optimized for certain types of queries.
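
Keeping a duplicate copy keyed for a different query can be sketched as follows. The property names (`LastName`, `City`, `Email`) and the helper `rekey` are hypothetical. The point is only that the same entities get a second set of keys so that a lookup by city no longer needs a table scan.

```python
def rekey(entities, partition_prop, row_prop):
    """Build a copy of the data keyed by two other properties.

    Returns a dict mapping (PartitionKey, RowKey) to a copied entity
    whose keys are taken from partition_prop and row_prop.
    """
    copy = {}
    for entity in entities:
        duplicate = dict(entity)
        duplicate["PartitionKey"] = str(entity[partition_prop])
        duplicate["RowKey"] = str(entity[row_prop])
        copy[(duplicate["PartitionKey"], duplicate["RowKey"])] = duplicate
    return copy


people = [
    {"LastName": "Smith", "FirstName": "John", "City": "Seattle", "Email": "john@example.com"},
    {"LastName": "Jones", "FirstName": "Ann", "City": "Seattle", "Email": "ann@example.com"},
    {"LastName": "Smith", "FirstName": "Bo", "City": "Boston", "Email": "bo@example.com"},
]

by_name = rekey(people, "LastName", "FirstName")  # serves lookups by name
by_city = rekey(people, "City", "Email")          # serves lookups by city

seattle = [e for (pk, _), e in by_city.items() if pk == "Seattle"]
```

The trade-off is that every write must now update both copies, and the two writes are not covered by a single transaction.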

Understanding TableServiceEntity/TableServiceContext

image

  1. The table above stores email addresses. The partition key is the domain part of the email address and the mailname is the row key.
  2. TableServiceEntity and TableServiceContext are used when programming with C# or Visual Basic. By deriving from TableServiceEntity you can define your own entities that get stored in tables. TableServiceContext is used when you wish to perform CRUD operations on tables and is not illustrated here.
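
The email example can be sketched in a language-neutral way: split an address into its domain (the partition key) and mail name (the row key). This Python function is an illustration of the key design only, not the C# TableServiceEntity class. Lowercasing the domain is my own assumption, made so that addresses differing only in domain case land in the same partition.

```python
def email_entity(address):
    """Build an entity dict for an email address.

    PartitionKey is the domain part, RowKey is the mail name.
    """
    mail_name, separator, domain = address.rpartition("@")
    if not separator or not mail_name or not domain:
        raise ValueError(f"not a valid email address: {address!r}")
    return {
        "PartitionKey": domain.lower(),  # assumption: normalize domain case
        "RowKey": mail_name,
        "Address": address,
    }
```

With this layout, all addresses for one domain sit in a single partition, so listing them is a partition scan and finding one address is a point query.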

Additional Resources

image

  1. The Windows Azure Training Kit is the best way to get up and running.
  2. One of the labs is called Exploring Windows Azure Storage. It provides excellent examples on using storage.
  3. It can be found here (once you install the training kit) C:\WATK\Labs\ExploringStorage\HOL.htm

Thanks
I appreciate that you took the time to read this post. I look forward to your comments.