Apache HBase is an open source NoSQL database that provides real-time read/write access to large datasets. Facebook’s messaging infrastructure, Apple’s Siri, Bloomberg’s price history service, and Aadhaar, India’s biometric identity system and the largest in the world, all run on HBase.
HBase has a strong track record at the very largest data scales, but many people don’t realize that it is a complicated data store with a multitude of levers and knobs that must be adjusted to tune performance and achieve scale. In addition, compute and HDFS storage are tightly coupled, which means you can’t really scale storage and processing independently of each other.
HDInsight makes HBase even better in the following ways:
Out-of-the-box, highly tuned HBase clusters in minutes
Several large customers run their mission-critical HBase workloads in Azure, and over time the service has become more and more intelligent about the right configurations for running HBase workloads as efficiently as possible. This intelligence is then brought to you in the form of highly tuned clusters that will meet your needs. You can create clusters within minutes, either manually in the Azure Portal or by automating the creation workflow with Azure JSON templates, PowerShell, the REST-based API, or the Azure client SDK.
Decoupled storage and compute
HDInsight changes the game with a seemingly simple yet very powerful cloud construct in which compute and storage are decoupled. This is powerful because inexpensive, abundant cloud storage can be mounted to even the smallest HBase cluster. When you don’t need to read or write, you can delete the cluster completely and still retain the data. This flexibility helps our customers achieve the best price/performance.
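To make the price/performance point concrete, here is a rough back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration only: the function names are hypothetical, and the hourly compute rate, storage rate, and usage hours are placeholder numbers, not Azure or HDInsight prices. It simply compares a cluster that has to stay up all month because the data lives on its own disks with one that runs only when needed while the data stays in cloud storage.

```python
HOURS_IN_MONTH = 730  # average hours in a month, used only for this estimate


def coupled_monthly_cost(compute_rate_per_hour, hours_in_month=HOURS_IN_MONTH):
    """Cost when the cluster must run all month to keep its data available.

    Storage is assumed to be bundled into the node price in this simple model.
    """
    return compute_rate_per_hour * hours_in_month


def decoupled_monthly_cost(compute_rate_per_hour, active_hours,
                           storage_tb, storage_rate_per_tb_month):
    """Cost when the cluster runs only for active_hours and data stays in storage."""
    compute_cost = compute_rate_per_hour * active_hours
    storage_cost = storage_tb * storage_rate_per_tb_month
    return compute_cost + storage_cost


if __name__ == "__main__":
    # Placeholder rates, NOT real prices: 4.0 per cluster-hour, 20.0 per TB-month.
    always_on = coupled_monthly_cost(4.0)
    on_demand = decoupled_monthly_cost(4.0, active_hours=200,
                                       storage_tb=10, storage_rate_per_tb_month=20.0)
    print("always-on cluster:", always_on)
    print("on-demand cluster plus storage:", on_demand)
```

With your own rates plugged in, the gap between the two numbers grows as the cluster spends more of the month idle, which is where deleting the cluster and keeping the data pays off.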
Delivered as a service, without compromising on control
HDInsight delivers HBase as a service, so you don’t have to worry about setup, patching, upgrading, or maintenance. Moreover, you get a financially backed SLA of 99.9% as well as support, yet none of this takes away control. You can still fine-tune your cluster, install additional components, and make further customizations.
Best suited for mission-critical production workloads
Because Microsoft’s roots are in the enterprise, you will find that HBase fits very nicely into your enterprise architecture. You can host HBase clusters in a private virtual network to protect your valuable data. You can take advantage of Azure infrastructure to achieve high availability and disaster recovery. You can also find the continually updated compliance status of Azure and HDInsight at the Azure Trust Center.
You can get started with HDInsight here
In coming posts, I will talk about use cases, tips and tricks, and best practices for running HBase workloads in HDInsight.