Azure Batch Shipyard https://azure.github.io/batch-shipyard/

Azure Batch enables you to run large-scale parallel and high-performance computing (HPC) applications efficiently in the cloud. It's a platform service that schedules compute-intensive work to run on a managed collection of virtual machines, and can automatically scale compute resources to meet the needs of your jobs.

With the Batch service, you define Azure compute resources to execute your applications in parallel and at scale. You can run on-demand or scheduled jobs, and you don't need to manually create, configure, and manage an HPC cluster, individual virtual machines, virtual networks, or a complex job and task scheduling infrastructure.

https://azure.microsoft.com/en-us/services/batch/

Azure Batch Shipyard is a tool to help provision and execute batch-style Docker workloads on Azure Batch compute pools. No experience with the Azure Batch SDK is needed; run your Dockerized tasks with easy-to-understand configuration files!

Images

Use the following links to quickly navigate to the Azure Batch Shipyard image and recipe collections:

  1. Benchmarks
  2. Computational Fluid Dynamics (CFD)
  3. Deep Learning
  4. Molecular Dynamics (MD)
  5. Video Processing

Major Features

  • Automated Docker Host Engine installation tuned for Azure Batch compute nodes
  • Automated deployment of required Docker images to compute nodes
  • Accelerated Docker image deployment at scale to compute pools consisting of a large number of VMs via private peer-to-peer distribution of Docker images among the compute nodes
  • Comprehensive data movement support: move data easily between locally accessible storage systems, Azure Blob or File Storage, and compute nodes
  • Docker Private Registry support
  • Automatic shared data volume support
  • Seamless integration with Azure Batch job, task and file concepts along with full pass-through of the Azure Batch API to containers executed on compute nodes
  • Support for Azure Batch task dependencies allowing complex processing pipelines and DAGs with Docker containers
  • Transparent support for GPU accelerated Docker applications on Azure N-Series VM instances (Preview)
  • Support for multi-instance tasks to accommodate Dockerized MPI and multi-node cluster applications on compute pools with automatic job completion and Docker task termination
  • Transparent assistance for running Docker containers that use Infiniband/RDMA for MPI on HPC low-latency Azure VM instances:
    • A-Series: STANDARD_A8, STANDARD_A9
    • H-Series: STANDARD_H16R, STANDARD_H16MR
    • N-Series: STANDARD_NC24R (not yet available)
  • Automatic setup of SSH users on all nodes in the compute pool and optional tunneling to Docker Hosts on compute nodes
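
To make the task-dependency feature above more concrete, here is a minimal Python sketch of how a set of dependent tasks forms a DAG and can be grouped into waves that could run in parallel. It uses only the standard-library `graphlib` module (Python 3.9+) and does not call Azure Batch or Batch Shipyard; the task names and the `execution_waves` helper are hypothetical and chosen purely for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical task ids mapped to the set of tasks each one depends on.
# These names are illustrative and are not part of any Batch Shipyard config.
dependencies = {
    "preprocess": set(),
    "train": {"preprocess"},
    "evaluate": {"train"},
    "render": {"preprocess"},
    "report": {"evaluate", "render"},
}


def execution_waves(deps):
    """Group tasks into waves; every task in a wave has all of its
    dependencies satisfied by earlier waves, so a wave could run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave as finished
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(execution_waves(dependencies), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

In this sketch, "train" and "render" land in the same wave because both depend only on "preprocess". The actual scheduling on a pool is handled by the Azure Batch service, not by code like this.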

Installation

Installation is typically an easy two-step process. The CLI is also available as a Docker image: alfpark/batch-shipyard:cli-latest. Please see the installation guide for more information regarding installation and requirements.
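
As a rough illustration of using the Docker image mentioned above, the following Python sketch only assembles the `docker` command lines and does not execute them. The image name comes from the text; the `/configs` mount point, the helper names, and the `--help` argument are assumptions made for this example, as is the idea that arguments after the image name are passed through to the CLI. Consult the installation guide for the supported invocation.

```python
import shlex
from pathlib import Path

# Image name taken from the text above.
CLI_IMAGE = "alfpark/batch-shipyard:cli-latest"


def docker_pull_command(image=CLI_IMAGE):
    """Return the argument list that pulls the CLI image."""
    return ["docker", "pull", image]


def docker_run_command(config_dir, cli_args, image=CLI_IMAGE, mount_point="/configs"):
    """Return an argument list that runs the CLI image with a local
    configuration directory bind-mounted into the container.

    mount_point and cli_args are assumptions for illustration only.
    """
    host_dir = Path(config_dir).resolve()  # docker -v expects an absolute host path
    return [
        "docker", "run", "--rm", "-it",
        "-v", f"{host_dir}:{mount_point}",  # bind-mount the config directory
        "-w", mount_point,                  # start in the mounted directory
        image,
        *cli_args,                          # assumed to be forwarded to the CLI
    ]


if __name__ == "__main__":
    # Print the commands so they can be inspected before being run by hand.
    print(shlex.join(docker_pull_command()))
    print(shlex.join(docker_run_command(".", ["--help"])))
```

Building the command as a list rather than a single string avoids shell-quoting problems if the configuration path contains spaces; `shlex.join` is used only to display it.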

Documentation

Please refer to the Batch Shipyard Guide for a complete primer on concepts, usage and a quickstart guide.

Please visit the Batch Shipyard Recipes for various sample Docker workloads using Azure Batch and Batch Shipyard after you have completed the introductory sections of the Batch Shipyard Guide.

Batch Shipyard Compute Node OS Support

Batch Shipyard is currently compatible only with Marketplace Linux VMs that are supported by Azure Batch.

Resources

Azure Batch Shipyard Docker Images for Computational Fluid Dynamics (CFD), Deep Learning, Molecular Dynamics (MD), and Video Processing https://github.com/Azure/batch-shipyard/tree/master/recipes

Azure Batch Samples https://github.com/Azure/azure-batch-samples

Technical Overview of Azure Batch https://docs.microsoft.com/en-us/azure/batch/batch-technical-overview