Building Scalable Database Solutions Using SQL Azure – Scale-out techniques such as Sharding or Horizontal Partitioning

As I spend time at conferences and customer event, the top faq I get has to be this one; SQL Azure is great but a single database is limited in size and computational capacity. How do I build applications that need larger computational capacity? My answer is; You build it the same way most application are built in the cloud ! That is by using scale-out techniques such as horizontal partitioning or commonly referred to as sharding… And here is my rationale;

Rewinding back to the top…

When building large scale database solutions, there are a number of approaches that can be employed.

Scaling-Up refer to building apps using a single large unified HW and typically single database that can house all the data of an app. This approach works as long as you are able to find hardware that can handle the peak load and you are ok with typically exponential incremental cost for a none linear increase in scalability. Scale-up class of hardware typically have high administration cost due to its complex configuration and management requirements.

Scaling-Out refer to building apps using multiple databases spread over multiple independent nodes. Typically nodes are cost effective, commodity class hardware. There are multiple approaches to scale-out but among alternatives, patterns such as sharding and horizontal partitioning provides the best scalability. With these patterns, apps can decentralize their processing, spread their workload to many nodes and harness the collective computational capacity. You can achieve linear cost to scalability ratio as you add more nodes.

How do you shard an app?

You will find many definition and approaches when it comes to sharding but here are the two principals I’ll recommend for best scalability characteristics.

1. Partition your workload to independent parts of data. Let’s call single instance of this data partition an atomic unit. The atomic unit can be a tenants data in a multi-tenant apps or a single customer in a SaaS app, a user in a web app or store for a branch enabled app etc. Atomic unit should be the focus of majority of your app workload. This boils down to ensuring that transactions are scoped to atomic units and the workload mostly filters to an instance of this atomic unit. In this case of a multi-tenant app for example, the tenant_id, user_id, store_id, customer_id etc is present in most interactions of the app with the database such as the following query;

 SELECT … FROM … WHERE … AND customer_id=55 OR UPDATE … WHERE … AND customer_id=55

2. Build elastic apps that understand that data is partitioned and has robust dynamic routing to the partition that contains the atomic unit. This is about developing app that discover just-in-time at runtime, where a given atomic unit is located and do not tie to a specific static distribution of data. This typically entails building apps that cache a directory of where data is at any given time.

Clearly, the two requirements above place some additional complexity on the developers and administrators but they result in great scalability and price-performance characteristics at the end. Once you have built the app with these principals, now app administrators can capacity plan flexibly based on the load they expect. In the inception of the app, maybe a few databases are enough to handle all the traffic to 100s of atomic units. As your workload grow such as more tenants, more traffic per tenant or larger tenants etc, you can provision new databases and repartition your data. If your workload shrink, again you can repartition your data and de-provision existing databases.

Why is SQL Azure the best platform for sharded apps?

1. SQL Azure will help reduce administrative complexity and cost. It eliminates physical administration from network management to maintenance of OS and SQL binaries. No patching or VMs to maintain... Built-in HA provides the 99.9% SLA for SQL Azure databases. Scalability features such as the SQL Azure load balancing help utilize capacity of the cluster. It all adds up to administrators focusing on running applications not the operating environment.

2. SQL Azure provides great elasticity with easy provisioning and de-provisioning of databases. So you don’t need to but HW, wire it up, install Windows and SQL on top, you can simply run “CREATE DATABASE” and be done. Better yet you can run “DROP DATABASE” and no longer incur a cost.

Given the benefits, many customers using SQL Azure service build large scale database solution like multi-tenant cloud apps such as Exchange Hosted Achieve service or internet scale apps such as TicketDirect that require massive scale on the web.

This is the way life is today. In the next few weeks with PDC and PASS 2010, we will be talking about features that will make SQL Azure even a better platform for sharded apps, enhancing the lives of both developers and DBAs. Stay tuned to PDC and PASS and to my blog. I’ll be posting details on the features here as we unveil the functionality at these conferences.

See you soon!