I have a customer interested in leveraging Apache Drill for interactive queries on data resident in the Azure cloud. In developing a solution for this customer, I found the Apache Drill documentation to be a bit sparse, so I hope that what I post in this series will be of assistance to others interested in doing something similar.
NOTE: This series addresses Apache Drill 1.6, Apache ZooKeeper 3.4.8, and the Microsoft Azure cloud as of May 27, 2016.
The key topics I will address in this series of posts are:
- An Overview of an Apache Drill Topology in Azure
- The Deployment Mechanics for the Azure Infrastructure
- Configuration of the ZooKeeper Ensemble
- Configuration of the Drill Cluster
- Configuration of Azure Blob Storage (aka WASB) as a Drill Data Source
- Configuration of Azure SQL Database as a Drill Data Source
- Configuration of Hive on Azure HDInsight as a Drill Data Source
- Configuration of HBase on Azure HDInsight as a Drill Data Source
- Connecting to the Drill Cluster from a Client App
Before diving into these topics, I would like to acknowledge the many, many folks I pestered as I found myself stuck at various junctures. Thank you all for your assistance.