Connecting to the Drill Cluster from a Client App

NOTE This post is part of a series on a deployment of Apache Drill on the Azure cloud.

Drill supports two kinds of client connections:

  1. Direct Drillbit Connection
  2. ZooKeeper Quorum Connection

When I read the Drill documentation, it seems the ZooKeeper Quorum Connection (also known as a Random Drillbit Connection) is the preferred connection type. With this connection type, you point the JDBC/ODBC driver to the members of your ZooKeeper ensemble (over TCP port 2181), information about the nodes in the Drill cluster is returned, and the driver then connects to one of the Drill nodes (over TCP port 31010). To spread the workload across the Drill nodes, the node to which you connect leverages information held by ZooKeeper to enlist the other Drill nodes in the query. The Drill documentation presents this workflow as follows:

[Figure: ODBC driver connecting to Drill through the ZooKeeper quorum, from the Drill documentation]
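In JDBC terms, a ZooKeeper Quorum connection is expressed with a connection string along these lines (the ZooKeeper host names and cluster ID are placeholders, not values from my deployment):

```
jdbc:drill:zk=zk1:2181,zk2:2181,zk3:2181/drill/drillbits1
```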

This works fine so long as the ODBC/JDBC driver is employed from within the Azure Virtual Network (VNet) but falls apart when the driver is used outside the VNet. VMs deployed within an Azure VNet are assigned a private (non-routable) IP address, and this is the IP address that Drill records in ZooKeeper. When ZooKeeper returns the list of nodes in the Drill cluster to the ODBC/JDBC driver, those nodes are identified by their private IP addresses and therefore are not reachable by a client outside the VNet.

With the Direct Connection, we bypass the initial request for information from ZooKeeper and connect straight to a Drill node. That Drill node consults ZooKeeper to determine which other servers it might enlist to resolve the query, and because those servers are within the VNet, the private IP addresses returned by ZooKeeper are perfectly reachable.
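A Direct Drillbit connection, by contrast, names a single node in the connection string, something like the following (dr001 stands in for one of my Drill VMs and is just illustrative):

```
jdbc:drill:drillbit=dr001:31010
```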

The problem with the Direct Connection method is that my application is now dependent upon a specific Drill node for connectivity, which creates a single point of failure for the client app. Luckily, Azure provides a built-in Load Balancer that I can target for my connection requests and that will distribute those requests across the various nodes in the Drill cluster. The Load Balancer can even probe the Drill nodes in my cluster to determine their health.

Given all of this, I will set up an Azure Load Balancer with a public-facing name that redirects connection requests (on TCP port 31010) to the four VMs in my Drill cluster. I will configure the Load Balancer to probe these VMs via HTTP on port 8047, i.e. the Drill Web Console, to determine their health. Once configured, I will use the ODBC Driver for Drill installed on my local PC to connect to my cluster through the Load Balancer.

NOTE The documentation on the installation, configuration, and use of the ODBC Driver for Drill is very well done, so I will not cover those steps in this post.

NOTE If you configure Drill to employ HTTPS for the Web Console, you will need to define the probe using TCP rather than HTTP against port 8047; Azure Load Balancer currently does not support HTTPS probes.
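For what it's worth, a TCP version of the probe looks something like this (the cmdlet below follows the Az PowerShell module naming and is purely illustrative, as is the probe name):

```powershell
# TCP probe: the Load Balancer only verifies that the port accepts a connection,
# so no request path is involved and HTTPS on the Web Console is not a problem
$probe = New-AzLoadBalancerProbeConfig -Name "drill-tcp-probe" `
    -Protocol Tcp -Port 8047 -IntervalInSeconds 15 -ProbeCount 2
```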

The setup of the Load Balancer is addressed in this script.  As before, this script is provided purely for educational purposes and does not come with any warranties or guarantees.

If you read the PowerShell script provided earlier in this series to demonstrate the deployment of the cluster environment, the basic structure of this script should be very familiar. The key thing to understand about the script is the general flow of its steps:

In the first part of the script, I define a public IP address with a friendly name for the Load Balancer. This name, plus the location into which I am deploying, forms the first and second parts of the fully qualified name of the Load Balancer, i.e. drillcluster001.westus.cloudapp.azure.com per the settings in the sample script.
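The snippet below is a rough sketch of that step using Az PowerShell module cmdlet names, not an excerpt from the script; the resource group and resource names are placeholders:

```powershell
# Public IP with a DNS label; the label plus the region form the fully qualified name,
# e.g. drillcluster001.westus.cloudapp.azure.com
$publicIp = New-AzPublicIpAddress -ResourceGroupName "drill-rg" -Name "drill-lb-ip" `
    -Location "westus" -AllocationMethod Static -DomainNameLabel "drillcluster001"
```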

I then define a backend pool for the Load Balancer. This is the pool with which I will associate the NICs on my Drill VMs so that the Load Balancer knows which devices to route traffic to, but that assignment won't occur until the second-to-last block in the script. For now, I'm just creating the pool.
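The backend pool definition itself is tiny; sketched out, it is little more than:

```powershell
# An empty backend pool; the NICs on the Drill VMs are attached to it later in the script
$backendPool = New-AzLoadBalancerBackendAddressPoolConfig -Name "drill-backend-pool"
```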

Next, I define the probe the Load Balancer will use to test the Drill nodes assigned to its backend pool. It's a very simple HTTP probe that reaches out to the Drill Web Console on TCP port 8047 on each node and requests the status page. If a page is returned, i.e. an HTTP 200 code comes back, the node is considered healthy. Not terribly sophisticated, but good enough for now.
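Sketched out, the probe definition amounts to something like this (the request path is my assumption; any page the Web Console answers with an HTTP 200 will do):

```powershell
# HTTP probe against the Drill Web Console; an HTTP 200 response marks the node as healthy
$probe = New-AzLoadBalancerProbeConfig -Name "drill-web-probe" `
    -Protocol Http -Port 8047 -RequestPath "/status" `
    -IntervalInSeconds 15 -ProbeCount 2
```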

Then I create the rule that states that the Load Balancer is to route TCP requests on port 31010 to the same port on the devices in my backend pool.
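The rule needs a frontend IP configuration built on the public IP from earlier (a detail I gloss over above), along with the backend pool and the probe; roughly:

```powershell
# Frontend IP configuration built on the public IP defined earlier
$frontEnd = New-AzLoadBalancerFrontendIpConfig -Name "drill-frontend" -PublicIpAddress $publicIp

# Route TCP 31010 arriving at the frontend to TCP 31010 on the backend pool, gated by the probe
$rule = New-AzLoadBalancerRuleConfig -Name "drill-user-port" `
    -FrontendIpConfiguration $frontEnd -BackendAddressPool $backendPool -Probe $probe `
    -Protocol Tcp -FrontendPort 31010 -BackendPort 31010
```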

With all these elements in place, I create the Load Balancer and assign all these items to it.
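That step boils down to a single call along these lines:

```powershell
# Create the Load Balancer from the pieces assembled above
$lb = New-AzLoadBalancer -ResourceGroupName "drill-rg" -Name "drill-lb" -Location "westus" `
    -FrontendIpConfiguration $frontEnd -BackendAddressPool $backendPool `
    -Probe $probe -LoadBalancingRule $rule
```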

At this point, I have a Load Balancer set up but no devices associated with it to absorb the traffic it will route. To fix this, I grab the NICs associated with my Drill VMs and update each of them to tie it to the backend pool on the Load Balancer. You may notice that at the top of this block, I refresh my reference to the backend pool; I do this because the creation of the Load Balancer modified the definition of the object created earlier.
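A sketch of that block, with placeholder names for the resource group and the NICs:

```powershell
# Re-read the backend pool from the deployed Load Balancer; the in-memory object
# created earlier is stale once the Load Balancer itself has been created
$lb = Get-AzLoadBalancer -ResourceGroupName "drill-rg" -Name "drill-lb"
$backendPool = Get-AzLoadBalancerBackendAddressPoolConfig -LoadBalancer $lb -Name "drill-backend-pool"

# Attach the primary IP configuration of each Drill VM's NIC to the backend pool
foreach ($nicName in @("dr001-nic", "dr002-nic", "dr003-nic", "dr004-nic")) {
    $nic = Get-AzNetworkInterface -ResourceGroupName "drill-rg" -Name $nicName
    $nic.IpConfigurations[0].LoadBalancerBackendAddressPools.Add($backendPool)
    $nic | Set-AzNetworkInterface
}
```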

At this point, the Load Balancer is fully assembled, but there is one networking detail worth calling out. You may have noticed that I have not modified the Network Security Groups associated with any of my Drill VMs to accommodate the health probe. That is because the probe traffic originates from the Load Balancer itself, reaches the VMs over their private IP addresses, and is permitted by the default rules present in every Network Security Group; explicit Network Security Group rules are only needed for the ports I want reachable from the public IP space. If I wanted to tighten up communications security, I could remove the rule I created earlier to allow traffic on TCP port 8047 to reach dr004 and publish the Web Console through the Load Balancer instead.

Now I am just about ready to connect to Drill through the Load Balancer. The last thing I need to do is open up the Network Security Groups associated with my VMs to accept inbound traffic on TCP port 31010. The Azure Load Balancer documentation talks quite a bit about how the Load Balancer reaches the NICs on these systems over their internal IP addresses, but it forwards client traffic rather than originating it, so the Network Security Groups attached to those NICs still evaluate that traffic and will continue to block it unless I open up the targeted port.
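Assuming one Network Security Group per VM, named after the VM, that last step could look something like this (names and rule priority are placeholders):

```powershell
# Allow inbound TCP 31010 (the Drill user port) on each VM's Network Security Group
foreach ($nsgName in @("dr001-nsg", "dr002-nsg", "dr003-nsg", "dr004-nsg")) {
    Get-AzNetworkSecurityGroup -ResourceGroupName "drill-rg" -Name $nsgName |
        Add-AzNetworkSecurityRuleConfig -Name "allow-drill-user-port" `
            -Protocol Tcp -Direction Inbound -Priority 1010 -Access Allow `
            -SourceAddressPrefix "*" -SourcePortRange "*" `
            -DestinationAddressPrefix "*" -DestinationPortRange "31010" |
        Set-AzNetworkSecurityGroup
}
```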

As stated earlier, the ODBC Driver for Drill is pretty straightforward to install and set up. The key thing for this test is to choose the Direct Drillbit Connection type and point the driver at the fully qualified name of the Load Balancer.
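For a DSN-less connection, that amounts to a connection string along the following lines (the driver name and available options depend on the driver version you install, so treat this as illustrative rather than exact):

```
DRIVER=MapR Drill ODBC Driver;ConnectionType=Direct;HOST=drillcluster001.westus.cloudapp.azure.com;PORT=31010
```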