The Deployment Mechanics for the Drill Infrastructure

NOTE This post is part of a series on a deployment of Apache Drill on the Azure cloud.

In my last post, I described the topology of a Drill deployment in Azure.  In this post, I will present a PowerShell script that deploys that topology. If you are more comfortable with Bash scripting, you might wish to re-create this script (or portions of it) using the Azure CLI.  Even so, the PowerShell script is useful for understanding all the steps involved in setting up the solution.

If you have used Azure, you are probably familiar with the Azure Portal UI.  Everything I do in this script can be done through that UI.  The only reason I am using a script is to ensure the components of the solution are deployed exactly as planned, and the same way every time.

NOTE The PowerShell script in this post is for educational purposes only. It comes with no guarantees or warranties.

You can download the full script here.  It deploys everything into a Resource Group I call drill. I do this so that I can track expenses for this specific deployment and so that I can remove the entire deployment by deleting the drill Resource Group.

The name of the Resource Group is defined by the $resourceGroupName variable in the block of variables at the top of the script.  In that block you will find other variables that control the region into which the components are deployed ($regionName) and which of the Azure subscriptions associated with my login will hold them ($subscriptionName).
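To make this concrete, here is a minimal sketch of what such a variable block might look like. Only the variable names and the "drill" Resource Group name come from the script described in this post; the region and subscription values are placeholder assumptions that you would replace with your own.

```powershell
# Sketch of the variable block at the top of the script.
# "drill" is the Resource Group name used in this post; the other
# two values are placeholders (assumptions), not the author's settings.
$resourceGroupName = "drill"
$regionName        = "westus"                   # assumed region, pick your own
$subscriptionName  = "My Azure Subscription"    # assumed name, use your subscription's name
```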

There are many other variables in the script that you should review, but I need to make special mention of $storageAccountName & $diagStorageAccountName. These variables hold the globally unique names of the Premium Storage Account into which my VMs' OS & Data Disks (vhds) will land and the Standard Storage Account into which my VM diagnostics will land.  The script will attempt to create both of these when executed.
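Because the script creates both accounts, it can help to check the names you pick before running it. Azure requires storage account names to be 3 to 24 characters long and to contain only lowercase letters and digits. The sketch below checks that format locally; Test-StorageAccountName is a hypothetical helper of my own and is not part of the script, and the sample names are made up. It does not check whether the name is already taken by someone else.

```powershell
# Hypothetical helper (not in the original script): checks that a name
# satisfies Azure's storage account naming format.
function Test-StorageAccountName {
    param([Parameter(Mandatory)][string]$Name)
    # -cmatch is case-sensitive, so uppercase letters are rejected.
    return $Name -cmatch '^[a-z0-9]{3,24}$'
}

# Made-up sample names, for illustration only.
Test-StorageAccountName -Name "drillpremium01"   # True
Test-StorageAccountName -Name "Drill-Storage"    # False (uppercase letter and hyphen)
```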

You should also note the "base name" variables, $zkVmBaseName & $drVmBaseName.  These hold the prefixes for the ZooKeeper & Drill VM names.  In my topology, I will have three ZooKeeper VMs.  With $zkVmBaseName set to "zk", these VMs will be named "zk001", "zk002" & "zk003".  Because each name is used both as the VM's internal name and as its public name, you should choose base names that are likely to be globally unique.
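The naming scheme can be sketched in a couple of lines. This is an illustration of the pattern only, not necessarily how the script builds the names; $zkVmCount is a hypothetical variable I introduce here for the example.

```powershell
# Build VM names from a base name plus a zero-padded, three-digit counter.
$zkVmBaseName = "zk"
$zkVmCount    = 3    # hypothetical variable, used only in this sketch

$zkVmNames = 1..$zkVmCount | ForEach-Object { "{0}{1:d3}" -f $zkVmBaseName, $_ }
$zkVmNames   # zk001, zk002, zk003
```

The same pattern applied to $drVmBaseName yields the Drill VM names.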

As you review the script, take note of the use of Login-AzureRmAccount.  This will prompt you to log in to your Azure account.  In the block just after this, I trigger a second prompt to capture the user name and password for the SSH account that will be created on every VM in this topology.
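A minimal sketch of these two prompts follows. Login-AzureRmAccount, Select-AzureRmSubscription and Get-Credential are standard cmdlets, but the exact way the script wires them together is my assumption, as are the $vmCredential variable name and the placeholder subscription name.

```powershell
# Placeholder value (assumption); the script defines this in its variable block.
$subscriptionName = "My Azure Subscription"

# Prompt for Azure credentials, then select the target subscription.
Login-AzureRmAccount
Select-AzureRmSubscription -SubscriptionName $subscriptionName

# Prompt for the SSH account to create on each VM.
# $vmCredential is a hypothetical variable name used for this sketch.
$vmCredential = Get-Credential -Message "User name and password for the SSH account on the VMs"
```

Get-Credential returns a PSCredential object, so the password is held as a SecureString rather than as plain text in the script.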

Finally, before attempting to execute this script, make sure that the Azure PowerShell modules are not only installed but also up to date.  At the top of the script, I have provided comments which include the PowerShell cmdlets for installing and updating these on your system.
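For reference, installing and updating typically looks something like the sketch below. I am assuming the AzureRM module from the PowerShell Gallery, since the script uses AzureRm cmdlets; the comments in the script itself may differ, so treat those as authoritative.

```powershell
# Install the AzureRM module for the current user (first-time setup).
Install-Module -Name AzureRM -Scope CurrentUser

# Update an existing installation to the latest published version.
Update-Module -Name AzureRM

# Confirm which version(s) are available on this machine.
Get-Module -ListAvailable -Name AzureRM | Select-Object Name, Version
```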