Bare Metal - MPI library on Azure Linux nodes

This post was published to Ninad Kanthi's WebLog at 21:33:08 10/07/2015


Step-by-Step Guide

 

This article shows how an MPI application can be set up on "bare metal" Linux nodes on Azure.

It must be emphasized that while this article shows how easy it is to set up and configure the Azure environment to execute an MPI application, the configuration shown here is not recommended for a production environment. It is assumed that the reader has basic knowledge of Azure and medium to advanced knowledge of Linux, especially Ubuntu distros.

The techniques shown here have been extended from this Ubuntu article.

1.      Creating a Cluster

As the first step, we will configure a Linux cluster on Azure. It will consist of four nodes, one of which will be the master node. The cluster will be created inside a virtual network (VNet). To keep things very simple, we won't create a DNS server; instead we will modify the /etc/hosts file on each node.

The script for creating the Linux cluster is shown at the bottom of this article. A couple of things to note in the script:

-          We create the Linux cluster from the Ubuntu-14_04_2_LTS image. This image is available from the Azure gallery.

-          We use NFS to create shared storage between the Linux nodes. For NFS to work, the required ports, 2049 (NFS) and 111 (portmapper), are opened during the provisioning of the nodes.

After successful execution of the creation script, you should see the Linux nodes configured under the VNet in your subscription, as shown below.

Figure 1: Linux nodes under the Azure VNet

After the Linux nodes have been provisioned and are up and running, we use PuTTY to establish an SSH connection and log in to a node.

NOTE: The process of accessing and logging on to the provisioned Linux nodes is described in detail here

We will use the Linux node, linuxdistro-001, as the master node.

After logging on to the node, we edit the /etc/hosts file and add each node's private IP address and hostname. This step is replicated across all nodes.

linuxuser@linuxdistro-001:~$ sudo vi /etc/hosts

After editing, the content of /etc/hosts file should look like following:

linuxuser@linuxdistro-001:~$ cat /etc/hosts

127.0.0.1 localhost

10.0.0.4 ub0

10.0.0.5 ub1

10.0.0.6 ub2

10.0.0.7 ub3

After repeating the above step on each node, the nodes can resolve one another without a DNS server being provisioned, but bear in mind that DNS is always the better option.
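Before relying on the mappings, it is worth sanity-checking them. The sketch below writes the expected entries to a scratch file and resolves each name from it; the `lookup` helper is illustrative, not part of the article's setup:

```shell
# Write the expected host mappings to a scratch file and resolve names from it.
cat > hosts.test <<'EOF'
10.0.0.4 ub0
10.0.0.5 ub1
10.0.0.6 ub2
10.0.0.7 ub3
EOF

# lookup NAME -> prints the IP mapped to NAME
lookup() { awk -v n="$1" '$2 == n { print $1 }' hosts.test; }

lookup ub1   # prints 10.0.0.5
```

On the real nodes you would query /etc/hosts itself, for example with `getent hosts ub1`.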

As we need to execute the same commands across the Linux nodes, we will install the ssh and pssh utilities on the master node.

Install ssh and pssh on the master node

linuxuser@linuxdistro-001:~$ sudo apt-get install ssh

linuxuser@linuxdistro-001:~$ sudo apt-get install pssh

 

Test that pssh is working correctly

linuxuser@linuxdistro-001:~$ parallel-ssh -i -A -H 10.0.0.5 -H 10.0.0.6 -x "-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o GlobalKnownHostsFile=/dev/null" echo hi

 

Note: To suppress SSH warnings, we add the option -x "-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o GlobalKnownHostsFile=/dev/null" to every pssh command
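A tidier alternative to repeating the -x options on every pssh command is to put them in the calling user's SSH client configuration, which ssh reads automatically. This is a sketch of ours, not part of the original walkthrough:

```text
# ~/.ssh/config - suppress host-key prompts for the cluster's private range
Host 10.0.0.* ub0 ub1 ub2 ub3
    StrictHostKeyChecking no
    UserKnownHostsFile /dev/null
```

As with the -x flags themselves, disabling host-key checking is only acceptable in a throwaway test environment.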

2.      Provisioning shared Storage

 

We will create a /mirror folder on the master node; it will be shared across the nodes via NFS.

Install nfs-client on all nodes except the master node

linuxuser@linuxdistro-001:~$ parallel-ssh -i -A -H 10.0.0.5 -H 10.0.0.6 -H 10.0.0.7 -x "-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o GlobalKnownHostsFile=/dev/null" sudo apt-get install -y nfs-client

Create same folder structure on all nodes

linuxuser@linuxdistro-001:~$ parallel-ssh -i -A -h host_file -x "-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o GlobalKnownHostsFile=/dev/null" sudo mkdir /mirror

Install the nfs-server on the master node

linuxuser@linuxdistro-001:~$ sudo apt-get install nfs-server

Export the /mirror folder so that it is shared

linuxuser@linuxdistro-001:~$ echo "/mirror *(rw,sync)" | sudo tee -a /etc/exports

Restart the NFS server on the master node

linuxuser@linuxdistro-001:~$ sudo service nfs-kernel-server restart

Mount the share across the client nodes

linuxuser@linuxdistro-001:~$ parallel-ssh -i -A -H 10.0.0.5 -H 10.0.0.6 -H 10.0.0.7 -x "-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o GlobalKnownHostsFile=/dev/null" sudo mount ub0:/mirror /mirror
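Note that the mount above will not survive a reboot. A common way to make it persistent (an addition of ours, not covered in the article) is an /etc/fstab entry on each client node:

```text
# /etc/fstab - mount the master's NFS export at boot
ub0:/mirror  /mirror  nfs  defaults  0  0
```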

Test the share across the client nodes.

linuxuser@linuxdistro-001:~$ sudo cp /etc/hosts /mirror

linuxuser@linuxdistro-001:~$ parallel-ssh -i -A -H 10.0.0.5 -H 10.0.0.6 -H 10.0.0.7 -x "-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o GlobalKnownHostsFile=/dev/null" sudo ls /mirror

3.      Setting up mutual trust for non-admin user

We will create a non-admin user, mpiu. This user is needed because root login over SSH is disabled by default on Ubuntu, so we cannot establish trust for root. We also assign our shared folder, /mirror, as the home folder for this user.

Note: We use the newusers command to create the user across the nodes because it can execute in non-interactive mode. The parameters for the user mpiu are specified in a file, /mirror/userfile.

Creating non-admin user

linuxuser@linuxdistro-001:~$ cd /mirror

linuxuser@linuxdistro-001:/mirror$ vi userfile

linuxuser@linuxdistro-001:/mirror$ cat userfile

mpiu:<password removed>:1002:1000:MPI user:/mirror:/bin/bash

linuxuser@linuxdistro-001:/mirror$ parallel-ssh -i -A -h host_file -x "-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o GlobalKnownHostsFile=/dev/null" sudo newusers /mirror/userfile

Change the owner of shared folder to newly created user.

linuxuser@linuxdistro-001:/mirror$ sudo chown mpiu /mirror
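The userfile consumed by newusers uses passwd-style colon-separated fields: name:password:UID:GID:comment:home:shell. A quick sketch of how the line breaks down; the parsing here is purely illustrative, not part of the setup:

```shell
# Split a newusers entry into its seven passwd-style fields.
line='mpiu:password:1002:1000:MPI user:/mirror:/bin/bash'
oldIFS=$IFS
IFS=:
set -- $line          # word-split on ':' only; spaces inside fields survive
IFS=$oldIFS

name=$1; uid=$3; gid=$4; home=$6; shell=$7
echo "user=$name uid=$uid gid=$gid home=$home shell=$shell"
# prints: user=mpiu uid=1002 gid=1000 home=/mirror shell=/bin/bash
```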

Configuring password-less SSH communication across the nodes

1.       Install ssh components on the master node

linuxuser@linuxdistro-001:~$ sudo apt-get install ssh 

2.       Next, log in as our newly created user

linuxuser@linuxdistro-001:/mirror$ su - mpiu 

3.       Generate an RSA key pair for user mpiu

mpiu@linuxdistro-001:~$ ssh-keygen -t rsa

4.       Add this key to authorized keys file

mpiu@linuxdistro-001:~$ cd .ssh

mpiu@linuxdistro-001:~/.ssh$ cat id_rsa.pub >> authorized_keys

5.       As the home directory (/mirror) is common across all nodes, there is no need to run these commands on every node

Test an SSH run: it should not ask you for a password when connecting to the machine.

mpiu@linuxdistro-001:~$ ssh ub1 hostname

4.      MPI (HPC Application) installation and configuration

1.       Login as the admin user

We log out from our existing mpiu session, which returns us to our admin (linuxuser) session.

mpiu@linuxdistro-001:~$ logout

linuxuser@linuxdistro-001:~$

2.       Install GCC

We need a compiler to compile all the code. This needs to happen only on the master node.

linuxuser@linuxdistro-001:~$ sudo apt-get install build-essential

 

3.       Install MPICH2

The MPICH2 folder structure needs to be the same across all nodes, therefore we execute the following command across all nodes.

linuxuser@linuxdistro-001:~$ parallel-ssh -i -A -h /mirror/host_file -x "-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o GlobalKnownHostsFile=/dev/null" sudo apt-get install -y mpich2

 

Note: the content of /mirror/host_file is

linuxuser@linuxdistro-001:~$ cat /mirror/host_file

10.0.0.4

10.0.0.5

10.0.0.6

10.0.0.7

 

4.       Test installation

linuxuser@linuxdistro-001:/mirror$ su - mpiu

mpiu@linuxdistro-001:~$ which mpiexec

mpiu@linuxdistro-001:~$ which mpirun

 

5.       Setting up machine file

Create a configuration file, nodefile, in mpiu's home folder. Within the file, specify the node names, each optionally followed by the number of processes to spawn on that node. Each node has two cores, so we specify one or two processes per node.

 

mpiu@linuxdistro-001:~$ cat nodefile

ub0

ub1:2

ub2:2

ub3

mpiu@linuxdistro-001:~$
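Each line of the machine file is host[:slots], with slots defaulting to 1, so the file above describes 1+2+2+1 = 6 process slots in total, matching the -n 6 used later. A small sketch of that arithmetic (the parsing is illustrative, not MPICH code):

```shell
# Sum the process slots declared in a machine file (host[:slots], default 1).
cat > nodefile <<'EOF'
ub0
ub1:2
ub2:2
ub3
EOF

total=0
while IFS=: read -r host slots; do
  total=$((total + ${slots:-1}))
done < nodefile

echo "total slots: $total"   # prints: total slots: 6
```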

 

6.       Create a test MPICH2 C source file and compile it.

 

Dummy mpi_hello.c program.

mpiu@linuxdistro-001:~$ cat mpi_hello.c

#include <stdio.h>

#include <mpi.h>

 int main(int argc, char** argv) {

    int myrank, nprocs;

    MPI_Init(&argc, &argv);

    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    printf("Hello from processor %d of %d\n", myrank, nprocs);

    MPI_Finalize();

    return 0;

}

Compile the source code into an executable

mpiu@linuxdistro-001:~$ mpicc mpi_hello.c -o mpi_hello

5.      Job Execution

We launch six processes in total (-n 6), distributed across the hosts listed in the machine file (-f nodefile).

mpiu@linuxdistro-001:~$ mpiexec -n 6 -f nodefile ./mpi_hello

6.      Results capture and analysis

If everything is successful, you should see output similar to the following (the ordering of the lines may vary between runs).

mpiu@linuxdistro-001:~$ mpiexec -n 6 -f nodefile ./mpi_hello

Hello from processor 5 of 6

Hello from processor 0 of 6

Hello from processor 3 of 6

Hello from processor 1 of 6

Hello from processor 4 of 6

Hello from processor 2 of 6

mpiu@linuxdistro-001:~$

7.      Clean the environment

 

One of the biggest advantages of cloud computing is its pay-as-you-go model: once the experiment is over, the environment can be torn down completely, after which it costs the end user nothing.

Execute the clean-up script shown in the Appendix to clean the Azure environment.

Appendix

Script - Provisioning the Linux cluster

#CreateLinuxVirtualMachines - In NON AVAILABILITY GROUPS

#azure config mode asm

# The Subscription name that we want to create our environment in

$SubscriptionNameId = "<enter your subscription id here>"

# Storage account name. This must be created before executing the script

$StorageAccountname = "ubuntuimages"

# AffinityGroup name. This must be created before executing the script

$AffinityGroupName = "linuxdistros"

# Network name. This must be created before executing the script

$VnetName = "nktr21"

# Subnet name. This must be created before executing the script

$SubnetName = "linuxcluster"

# Availability Set

$AvailabilitySetName = "linuxdistro"

# Cloud Service name. This service will be created

$CloudServiceName = "linuxcloudsrv"

# Instance size

$InstanceSize = "Medium"

# Linux Admin Username

$username = "linuxuser"

# Linux Admin Password

$password = "<yourpassword>"

# Name Prefix of the VM machine name. Numeric counter number is appended to this to create the final

$LinuxImageNamePrefix = "linuxdistro-00"

# Load the keys for us to login to

$LoadSettings = Import-AzurePublishSettingsFile "NinadKNew.publishsettings"

# Important to specify CurrentStorageAccountName; otherwise you might get a "storage not accessible" error when creating the Linux machines.

Set-AzureSubscription -SubscriptionId $SubscriptionNameId -CurrentStorageAccountName $StorageAccountname -ErrorAction Stop

Select-AzureSubscription -SubscriptionId $SubscriptionNameId -ErrorAction Stop

# Get the image from the Azure repository that we want to create. It's the Ubuntu 14_04_LTS variant. We can speed up the creation script by caching the name.

# The name thus obtained is 'b39f27a8b8c64d52b05eac6a62ebad85__Ubuntu-14_04_2_LTS-amd64-server-20150309-en-us-30GB'

#$UbuntuImage = Get-AzureVMImage | where {($_.ImageName -match 'Ubuntu-14_04') -and ($_.ImageName -match '2015') -and ($_.ImageName -match 'LTS') -and ($_.Label -eq 'Ubuntu Server 14.04.2.LTS')} | Select ImageName

#Write-Host $UbuntuImage[0].ImageName

# Guid of the image name that we want to create

$ImageNameGuid = "b39f27a8b8c64d52b05eac6a62ebad85__Ubuntu-14_04_2_LTS-amd64-server-20150309-en-us-30GB"

# If the service does not exist, we'll create one.

$GetCloudService = Get-AzureService -ServiceName $CloudServiceName -Verbose -ErrorAction Continue

if (!$GetCloudService )

{

    # service does not exist.   

    $CreateCloudService = New-AzureService -ServiceName $CloudServiceName -Label "Created from Windows power shell" -Description '16 June 2015'

    Write-Host ("Return value from creating the cloud service = {0}" -f $CreateCloudService.OperationStatus.ToString())

}

$PortmapPort = 111

$NfsPort = 2049

$Portcounter = 0

$Counter = 1

# Loop to create the Linux machines.

do

{

    # Prepend the VM name

    $LinuxImagename = $LinuxImageNamePrefix + $Counter.ToString()

    # Configure VM by specifying VMName, Instance Size, ImageName, and specify AzureSubnet

    $VMNew = New-AzureVMConfig -Name $LinuxImagename -InstanceSize $InstanceSize -ImageName $ImageNameGuid

    # Add the username and password to the VM creation configuration

    $VMNew | Add-AzureProvisioningConfig -Linux -LinuxUser $username -Password $password | Set-AzureSubnet $SubnetName

    # Create and start the VM. Remember, it won't be fully provisioned when the call returns.   

    $Result = $VMNew | New-AzureVM -ServiceName $CloudServiceName -AffinityGroup $AffinityGroupName -VNetName $VnetName

    Write-Host ("Created VM Image {0}, return value = {1}" -f $LinuxImagename, $Result.OperationStatus )

    $Counter++

    $Portcounter++;

}

while ($Counter -le 4)

 

Script - Removing the Linux cluster

# CleanLinuxVirtualMachines

# The Subscription name that we want to create our environment in

$SubscriptionNameId = "<your subscription id>"

# Cloud Service name. This service will be created

$CloudServiceName = "linuxcloudsrv"

# Name Prefix of the VM machine name. Numeric counter number is appended to this to create the final

$LinuxImageNamePrefix = "linuxdistro-00"

# Load the keys for us to login to

Import-AzurePublishSettingsFile "NinadKNew.publishsettings"

Set-AzureSubscription -SubscriptionId $SubscriptionNameId -ErrorAction Stop

Select-AzureSubscription -SubscriptionId $SubscriptionNameId -ErrorAction Stop

$Counter = 1

$AllVmsStopped = 1

$OkToRemoveVM = 0

do

{

    $OkToRemoveVM = 0

    # Prepend the VM name

    $LinuxImagename = $LinuxImageNamePrefix + $Counter.ToString()

    Write-Host ("VM Image name = {0}" -f $LinuxImagename)

    $VMPresent = Get-AzureVM -ServiceName $CloudServiceName -Name $LinuxImageName

    if ($VMPresent)

    {

        if ($VMPresent.Status -eq "StoppedVM" -or $VMPresent.Status -eq "StoppedDeallocated" )

        {

            Write-Host ("{0} VM is either stopped or StoppedDeallocated" -f $LinuxImagename )

            $OkToRemoveVM =1

        }

        else

        {

            Write-Host ("[Stopping] VM {0}" -f $VMPresent.Name)

            $StopVM = Stop-AzureVM -Name $VMPresent.Name -ServiceName $VMPresent.ServiceName -Force

            if ($StopVM.OperationStatus.Equals('Succeeded'))

            {

                $OkToRemoveVM =1

            }

            else

            {

                Write-Host ("Not able to stop virtual machine {0}, cloud service will not be removed" -f $VMPresent.Name)

                $AllVmsStopped = 0  

            }

        }

        if ( $OkToRemoveVM -eq 1)

        {   

            Write-Host ("[Removing] virtual machine {0}" -f $VMPresent.Name)

            Remove-AzureVM -Name $VMPresent.Name -ServiceName $VMPresent.ServiceName -ErrorAction Continue

        }

    }

    else

    {

        Write-Warning ("No VM found {0}" -f $LinuxImageName)

    }

    $Counter++

}

while ($Counter -le 4)