Strategies for Realizing Cost Savings in Azure

Table of Contents

Introduction

Choosing the appropriate Compute Option in Azure

Planned (Pro-Active) approach to Compute Optimization

Laying down the scaffolding/structure so VMs are provisioned with cost optimization guidelines

Governance

Naming standards

Using Role Based Access Control (RBAC)

Using Azure Policies

Review Resource Group (RG) Costs periodically to mine opportunities for cost reduction

Making use of Tactical platform capabilities

Azure Hybrid Use Benefit (AHUB)

Use Reserved Instances (RI)

Shutdown

Right Sizing

Azure Low Priority VMs

Scaling (Auto)

Performance optimization/performance POV

Choose the right storage

Choose the right type of VMs

Choosing the appropriate platform construct

REACTIVE Approach - things to do post deployment

Azure Advisor

Cloudyn type of tools

Check if deployments can be converted for Reserved Instance(RI)

Check if deployments can be converted for AHUB

Study the entities within the RG

Azure Compute VMs – Capability and Design Guidance

VM Capability Guidance

Azure VM Design Guidance

Usage Analysis Tools

 

 

Introduction

 

This document provides a structured approach to consuming Azure Compute resources. At a high level it is divided into (a) a proactive approach and (b) a tactical/reactive approach. Beyond the proactive and reactive approaches described here, studying the application and arriving at an appropriate architecture with the platform-native constructs in mind is equally, if not more, important for the optimized use of all resources (people, time, DevOps cycles, compute costs, etc.). This should be done early in the life cycle, instead of merely lifting and shifting and then looking for opportunities to save costs. Another important dimension of Azure compute cost savings is taking an organizational view of the Azure services consumed and pooling resource usage across applications. Classic examples of this are:

(a)    using SQL Elastic pools if multiple applications use Azure SQL Services

(b)    using fewer App Service Environments (ASE) but packing them to the maximum density with 100s of Azure Web Applications per ASE

 

Choosing the appropriate Compute Option in Azure

Azure offers a wide and varied set of compute options that cater to many different processing needs, and the challenge is to figure out the appropriate, if not the optimal, compute choice for the application or solution that you want to deploy. For instance, what compute should one choose for a CPU-intensive or a memory-intensive application? In these cases the choices are easy. What should one choose for web applications? Again, there are easy choices. But what should one choose for a combination of requirements such as (a) a highly decomposed and componentized application, (b) sudden bursts, (c) high velocity for the development team when pushing code, (d) having the appropriate compute (not the old-style VM construct) for the task at hand, (e) support for cloud-native constructs, (f) smooth DevOps, and (g) fault tolerance and resilience? This is not an easy task, and both infrastructure architects and application architects should take the time to study and understand the platform capabilities and use the correct Azure platform constructs for the solution deployment. Taking the easy approach of deploying to VMs should be the last resort and avoided where possible, to save time and money in the long run.

The following link captures some of the well thought out decision criteria depending on the task or tasks the application or solution needs to accomplish.

Criteria for choosing an Azure Compute Option

Planned (Pro-Active) approach to Compute Optimization

Laying down the scaffolding/structure so VMs are provisioned with cost optimization guidelines

These are things mature organizations do so that end consumers of Azure resources stay within well-established guidelines (guard rails). The scaffolding is essentially a framework that combines enforceable governance rules with best practices. The graphic below captures the essential components used to build such a scaffolding. Bear in mind the evolving nature of the platform, as newer constructs and capabilities get added.

 

 

Governance

When moving to Azure, you must address the topic of governance early to ensure the successful use of the cloud within the enterprise. Unfortunately, the time and bureaucracy of creating a comprehensive governance system means some business groups go directly to vendors without involving enterprise IT. This approach can leave the enterprise open to vulnerabilities if the resources are not properly managed. The characteristics of the public cloud - agility, flexibility, and consumption-based pricing - are important to business groups that need to quickly meet the demands of customers (both internal and external). But, enterprise IT needs to ensure that data and systems are effectively protected.

In real life, scaffolding is used to create the basis of the structure. The scaffold guides the general outline and provides anchor points for more permanent systems to be mounted. An enterprise scaffold is the same: a set of flexible controls and Azure capabilities that provide structure to the environment, and anchors for services built on the public cloud. It provides the builders (IT and business groups) a foundation to create and attach new services.

The Azure enterprise scaffolding is based on practices we have gathered from many engagements with clients of various sizes. Those clients range from small organizations developing solutions in the cloud to Fortune 500 enterprises and independent software vendors who are migrating and developing solutions in the cloud. The enterprise scaffold is "purpose-built" to be flexible enough to support both traditional IT workloads and agile workloads, such as developers creating software-as-a-service (SaaS) applications based on Azure capabilities.

The enterprise scaffold is intended to be the foundation of each new subscription within Azure. It enables administrators to ensure workloads meet the minimum governance requirements of an organization without preventing business groups and developers from quickly meeting their own goals.

Governance is crucial to the success of Azure. This article targets the technical implementation of an enterprise scaffold but only touches on the broader process and relationships between the components. Policy governance flows from the top down and is determined by what the business wants to achieve. Naturally, the creation of a governance model for Azure includes representatives from IT, but more importantly it should have strong representation from business group leaders, and security and risk management. In the end, an enterprise scaffold is about mitigating business risk to facilitate an organization's mission and objectives.

The following image describes the components of the scaffold.

The foundation relies on a solid plan for departments, accounts, and subscriptions. The pillars consist of Resource Manager policies and strong naming standards. The rest of the scaffold comes from core Azure capabilities and features that enable a secure and manageable environment.

Naming standards

The first pillar of the scaffold is naming standards. Well-designed naming standards enable you to identify resources in the portal, on a bill, and within scripts. Most likely, you already have naming standards for on-premises infrastructure. When adding Azure to your environment, you should extend those naming standards to your Azure resources. Naming standards facilitate more efficient management of the environment at all levels.

For naming conventions:

Review and adopt where possible the Patterns and Practices guidance. This guidance helps you decide on a meaningful naming standard.

Use camelCasing for names of resources (such as myResourceGroup and vnetNetworkName).

Note: There are certain resources, such as storage accounts, where names may contain only lowercase letters and numbers (no special characters).

Consider using Azure Resource Manager policies (described in the next section) to enforce naming standards.
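The naming rules above can also be checked in scripts before a deployment is submitted. The following is a minimal sketch, assuming a camelCase convention that starts with a lowercase letter and the storage account rule of 3 to 24 lowercase letters and digits. The function names and the exact camelCase pattern are illustrative assumptions, not part of any Azure API.

```python
import re

# Assumed convention from the guidance above: camelCase, starting with a lowercase letter.
CAMEL_CASE = re.compile(r"[a-z][a-z0-9]*(?:[A-Z][a-z0-9]*)*")
# Storage account names: 3 to 24 characters, lowercase letters and digits only.
STORAGE_ACCOUNT = re.compile(r"[a-z0-9]{3,24}")


def is_valid_resource_name(name: str) -> bool:
    """Return True if the whole name follows the assumed camelCase convention."""
    return CAMEL_CASE.fullmatch(name) is not None


def is_valid_storage_account_name(name: str) -> bool:
    """Return True if the whole name satisfies the storage account rule."""
    return STORAGE_ACCOUNT.fullmatch(name) is not None
```

A check like this is only a local convenience. Enforcement at the platform level is still best done with Azure Policy, as noted above.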

 

Using Role Based Access Control (RBAC)

RBAC stands for Role Based Access Control. Azure provides a set of built-in roles for various namespaces and for the objects in each of those namespaces. Each role has a name, a set of operations that are allowed on those namespaces, and a set of operations that are not allowed. The general rule is that anyone using Azure in any capacity should be assigned clearly understood, well-defined roles that allow them to perform only the specific activities they need, nothing more and nothing less. As a best practice, every end user and administrator in Azure should be limited to the set of roles that are absolutely required. The following links are handy in this regard.

  1. What is RBAC?
  2. Using RBAC to manage access to resources
  3. RBAC – Built-in roles
  4. RBAC – Custom roles
  5. Assign custom roles for internal and external users
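To make the idea of allowed and not-allowed operations concrete, here is a small sketch of how a role's Actions and NotActions combine. The two roles are hypothetical examples laid out like an Azure role definition, and `is_allowed` is a local approximation written for illustration. It is not the Azure authorization engine.

```python
from fnmatch import fnmatchcase

# Hypothetical custom roles, using the Actions/NotActions layout of Azure role definitions.
vm_operator_role = {
    "Name": "VM Operator (example)",
    "Actions": [
        "Microsoft.Compute/virtualMachines/read",
        "Microsoft.Compute/virtualMachines/start/action",
        "Microsoft.Compute/virtualMachines/deallocate/action",
    ],
    "NotActions": [],
}

compute_no_delete_role = {
    "Name": "Compute Manager Without Delete (example)",
    "Actions": ["Microsoft.Compute/*"],
    "NotActions": ["Microsoft.Compute/virtualMachines/delete"],
}


def is_allowed(role: dict, operation: str) -> bool:
    """Effective permission = matches some Action and matches no NotAction."""
    op = operation.lower()
    granted = any(fnmatchcase(op, p.lower()) for p in role["Actions"])
    excluded = any(fnmatchcase(op, p.lower()) for p in role["NotActions"])
    return granted and not excluded
```

With these definitions, the operator role can start and deallocate VMs but cannot delete them, which matches the least-privilege rule described above.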

Using Azure Policies

Azure Policy is another capability of the Azure platform with which an enterprise can create policies and then assign them at an appropriate scope, such as a subscription or a resource group. Based on the scope at which the policy is assigned, incoming ARM-based requests to Azure are evaluated to determine whether they conform to the policies. If they do not, the requests are not honored. The following are some handy links for understanding and using Azure Policy.

  1. Overview of Azure Policies
  2. Azure Policy Definition structure
  3. Azure Policy Initiatives
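For cost control, a common use of Azure Policy is restricting which VM sizes may be provisioned. The sketch below builds such a policy rule as a Python dictionary that can be serialized to JSON. The parameter name `allowedSizes` is an assumption chosen for this example, and `would_deny` is only a local approximation of the rule for illustration, not the Azure Policy engine.

```python
import json

# Deny any virtual machine whose size is not in the allowed list.
policy_rule = {
    "if": {
        "allOf": [
            {"field": "type", "equals": "Microsoft.Compute/virtualMachines"},
            {
                "not": {
                    "field": "Microsoft.Compute/virtualMachines/sku.name",
                    "in": "[parameters('allowedSizes')]",  # hypothetical parameter name
                }
            },
        ]
    },
    "then": {"effect": "deny"},
}

policy_parameters = {
    "allowedSizes": {
        "type": "Array",
        "metadata": {"displayName": "Allowed VM sizes"},
    }
}


def policy_as_json() -> str:
    """Serialize the rule and its parameters for review or deployment tooling."""
    return json.dumps(
        {"policyRule": policy_rule, "parameters": policy_parameters}, indent=2
    )


def would_deny(resource_type: str, vm_size: str, allowed_sizes: list) -> bool:
    """Local approximation of the rule above: deny VMs with a size outside the list."""
    is_vm = resource_type.lower() == "microsoft.compute/virtualmachines"
    return is_vm and vm_size not in allowed_sizes
```

Assigning a rule like this at the subscription or resource group scope keeps expensive sizes from being provisioned by accident.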

 

Review Resource Group (RG) Costs periodically to mine opportunities for cost reduction

The best practice for application deployment is to deploy all the components of a solution into a single Resource Group. The costs for each RG can then be separated out and analyzed for the following:

(a)    Relevancy – is this component still required? If not, delete it.

(b)    Usage and load patterns – if the usage or the load handled by the component is very low (or conversely high), consolidate multiple instances of such components into fewer instances (or set scale up/down rules).

Making use of Tactical platform capabilities

Azure Hybrid Use Benefit (AHUB)

One can save up to 40 percent on Windows Server virtual machines with Azure Hybrid Benefit for Windows Server by using on-premises Windows Server licenses with Software Assurance. With this benefit, each license covers the cost of the OS on up to two virtual machines, and you pay only the base compute costs.

Overview Reference - https://azure.microsoft.com/en-us/pricing/hybrid-benefit/

Use Reserved Instances (RI)

As the name indicates, Azure Reserved Instances are VMs whose capacity you reserve for long-term usage. Prices for Reserved VM Instances are significantly lower than for non-reserved instances, so whenever there is a need for long-running VMs (say you are sure a VM is going to be used for a year or more), you should opt for Azure RIs to gain cost savings.

Reserved Instances can reduce costs by up to 72 percent compared to pay-as-you-go prices, with one-year or three-year terms on Windows and Linux virtual machines (VMs). When you combine the savings from Azure RIs with the Azure Hybrid Benefit, you can save up to 82 percent. Lower your total cost of ownership by combining RIs with pay-as-you-go prices to manage costs across predictable and variable workloads. A single upfront payment also makes budgeting and forecasting easier.
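Whether a reservation pays off depends on how many hours the VM actually runs. A reservation is paid for the whole term, while pay-as-you-go is billed only for hours used, so for a discount d the break-even point is a utilization of 1 - d. The sketch below works through that arithmetic. The hourly rate and the 40 percent discount in the usage note are hypothetical numbers chosen for illustration, not Azure prices.

```python
HOURS_PER_YEAR = 8760


def payg_cost(hourly_rate: float, hours_used: float) -> float:
    """Pay-as-you-go: billed only for the hours the VM runs."""
    return hourly_rate * hours_used


def reserved_cost(hourly_rate: float, discount: float,
                  term_hours: float = HOURS_PER_YEAR) -> float:
    """Reservation: discounted rate, paid for the whole term whether or not the VM runs."""
    return hourly_rate * (1 - discount) * term_hours


def break_even_utilization(discount: float) -> float:
    """Fraction of the term a VM must run for the reservation to match pay-as-you-go."""
    return 1 - discount
```

For example, with a hypothetical rate of 0.20 per hour and a 40 percent discount, a VM running all year costs about 1,752 on pay-as-you-go and about 1,051 reserved, and the reservation breaks even once the VM runs more than 60 percent of the time.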

Prepay for Virtual Machines with Reserved VM Instances

To learn more about Reserved Virtual Machine Instances, see the following articles.

 

Shutdown

Of all the strategies for saving costs on compute, shutting down a VM when it is not needed is the most effective one. There are various methods to accomplish this.

  1. The portal (or ARM) now has a provision to shut down a VM at a certain time every day. See if that works in your scenario and set it for the VMs where it can be applied.
  2. Shut down a VM when a certain idle threshold is reached, say when the VM CPU is 85% idle, or whatever is applicable to your specific application and its downtime requirements. Idle detection is part of the platform, and alerts can be generated which can then trigger automation to shut down the VM in question.
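The two methods above can be reasoned about with a few lines of code. The sketch below shows an idle check over recent CPU samples (85% idle corresponds to CPU utilization below 15%) and the fraction of compute hours saved by a daily shutdown schedule. The function names, the default of six consecutive samples, and the sample data are assumptions made for illustration. In practice the samples would come from platform metrics and the action from an alert-triggered automation.

```python
def should_shut_down(cpu_samples, idle_threshold_pct=15.0, min_idle_samples=6):
    """Return True if the most recent `min_idle_samples` CPU readings are all below the threshold."""
    if len(cpu_samples) < min_idle_samples:
        return False  # not enough history to decide
    recent = cpu_samples[-min_idle_samples:]
    return all(sample < idle_threshold_pct for sample in recent)


def scheduled_savings_fraction(hours_off_per_day, days_per_week=7):
    """Fraction of weekly compute hours avoided by a fixed shutdown schedule."""
    return hours_off_per_day * days_per_week / (24 * 7)
```

A VM that is shut down 12 hours every day avoids half of its compute hours. Note that this applies to compute charges for a deallocated VM, while its disks continue to be billed.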

Right Sizing

When provisioning VMs for various application workloads, it is worth spending time and effort on finding the metrics that the application generates under load, such as CPU load, I/O throughput, and memory requirements. It might take a day or two to get this information by load testing the application prior to migrating it. This pays rich dividends when you deploy to Azure, because you can choose the VM family and size based on data instead of guessing. A right-sized VM also has the advantage of not needing to be constantly resized.
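As an illustration of turning load-test metrics into a size choice, the sketch below takes the 95th percentile of observed CPU and memory usage, adds headroom, and picks the smallest size that fits. The three sizes are the Av2 rows from the A-series comparison table later in this post. The 95th percentile, the 20 percent headroom, and the function names are assumptions for this example, not Azure guidance.

```python
import math

# (name, vCPUs, memory in GiB), ordered from smallest to largest.
SIZES = [("A1_v2", 1, 2), ("A2_v2", 2, 4), ("A4_v2", 4, 8)]


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def pick_size(cpu_cores_used, memory_gib_used, headroom=1.2, pct=95):
    """Return the smallest size covering the chosen percentile plus headroom, or None."""
    need_cpu = percentile(cpu_cores_used, pct) * headroom
    need_mem = percentile(memory_gib_used, pct) * headroom
    for name, vcpus, mem_gib in SIZES:
        if vcpus >= need_cpu and mem_gib >= need_mem:
            return name
    return None  # nothing in this catalog is large enough
```

A result of `None` signals that the workload needs a larger family than the one considered, which is itself useful information before migrating.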

Azure Low Priority VMs

Low-priority VMs are allocated from Azure's surplus compute capacity and are available at up to an 80% discount, enabling certain types of workloads to run at a significantly reduced cost or allowing you to do much more for the same cost. Hence, in addition to considering AHUB and RI, also consider using Low Priority VMs whenever appropriate. The tradeoff is that low-priority VMs may not be available to be allocated, or may be preempted at any time, depending on available capacity.

The following links provide additional information on Low Priority VMs.

Low Priority VMs – Overview

Using Low Priority VMs in Batch

Low-priority VMs are offered at a significantly reduced price compared with dedicated VMs. For pricing details, see Batch Pricing.

Scaling (Auto)

The Azure platform has elastic and dynamic scaling baked in, which enables your Azure resources such as compute to be dialed up and down in an automated fashion. The recommendation is to make use of this platform capability in your solution deployments to scale up and down on demand, instead of using fixed-size compute resources.

As a prelude to using auto-scaling for your compute deployments, always deploy VMs into Availability Sets or VM Scale Sets. Once the VMs are deployed into these platform constructs, leverage the auto-scale capabilities they offer. If you are deploying applications into VMs in the old-fashioned IaaS manner, the minimum requirement is that the application be stateless.
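The logic behind a typical auto-scale rule is simple: add an instance when average CPU is high, remove one when it is low, and stay within a minimum and maximum instance count. The sketch below models that decision locally. The thresholds, bounds, and function name are hypothetical values for illustration. Real rules are configured on the scale set through the platform's autoscale settings.

```python
def desired_instance_count(current, avg_cpu_pct, min_count=2, max_count=10,
                           scale_out_above=70.0, scale_in_below=30.0):
    """Return the instance count after applying one scale-out or scale-in step."""
    if avg_cpu_pct > scale_out_above:
        target = current + 1   # under pressure: add one instance
    elif avg_cpu_pct < scale_in_below:
        target = current - 1   # mostly idle: remove one instance
    else:
        target = current       # within the comfortable band: no change
    # Never go below the minimum or above the maximum.
    return max(min_count, min(max_count, target))
```

Keeping a gap between the scale-out and scale-in thresholds avoids flapping, where instances are added and removed in quick succession.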

The following links provide various Scaling options.

Azure Platform Auto-Scaling capabilities

Using Azure Automation for Scaling VMs

VM Scale Set Overview

Scaling Low Priority VMs

 

 

Performance optimization/performance POV

Choose the right storage

This was already covered in detail in the documentation on Storage Optimization. Briefly, however, Azure provides the following types of storage from a replication perspective.

(a)    LRS – Locally redundant storage

(b)    ZRS – Zone-redundant storage

(c)    GRS – Geo-redundant storage

(d)    RA-GRS – Read-access geo-redundant storage

Choose the appropriate storage based on the application's replication needs. GRS types of storage cost more than locally redundant storage.

Additionally, there are the following options:

(a)    HDD [old-style spinning disks] – a cost-effective choice for low-throughput, non-mission-critical apps.

(b)    SSD [solid-state drives] – Premium disks for high-throughput I/O.

(c)    Managed Disks – it is recommended that all production and critical applications use managed disks.

Finally, the general recommendation is to use Managed Disks for all new VMs: they have significant advantages over unmanaged disks, come with guaranteed SLAs, and are managed by Azure on the customer's behalf.

Choose the right type of VMs

Once the application team has decided to use an IaaS-based solution landing on VMs, spending time on choosing the correct VM family and size will prove beneficial. There are several approaches to doing this, some of which are outlined below.

  1. For pure lift-and-shift scenarios, the current on-premises deployment serves as a starting baseline. Once you have the baseline, it is incumbent on each application team to profile the system characteristics under load. Once the load characteristics are known, they can be mapped to the appropriate Azure VM family and the Azure Compute Units required to run the same application in Azure on VMs.

Choosing the appropriate platform construct

Many customers do not have the time to understand the cloud-native constructs provided by Azure and go for the easy pattern of lift and shift. While this is an easy mental model for experienced IT staff, it is not in tune with the various PaaS, SaaS, and identity models that Azure offers. This approach almost invariably ends with choosing the wrong model. It is often advised by personnel with an infrastructure background who are less familiar with the developer capabilities of the platform and who tend to push the discussion toward the legacy approach of deploying to VMs. From both a cloud perspective and an evolving software architecture point of view, these legacy approaches are rapidly becoming obsolete. This is a trap that should be avoided, and application teams should take the help of seasoned application developers who have kept abreast of the cloud-native constructs baked into the platform. The cost savings of such an approach are immense. Some of the benefits of microservices and cloud-native capabilities are listed below.

(a)    Overall savings compared with blindly spinning up VMs, an old model which should *really* be the last resort. IaaS-based deployments carry the cost of managing them, as well as constantly monitoring, patching, backing up, and taking care of the environment. The lack of auto-scaling also prevents cost savings from being realized.

(b)    Several hundreds to thousands of hours of savings for the development and deployment team, as the cloud-native constructs are built for rapid, agile, iterative cycles and smooth DevOps-style deployment. None of these will be realized with the legacy approach of deploying into a VM/server.

(c)     Look for cloud native constructs the platform provides – a few examples

  1. Use of Cloud neutral and OS neutral capabilities like Service Fabric (SF)
  2. If SF is not feasible deploy the application into containers
  3. Serverless constructs like Azure Functions

(d)    Use PaaS capabilities such as the following: -

  1. Azure Web Applications
  2. Azure Logic Apps
  3. Azure Event Hubs
  4. Azure Service Bus
  5. Azure Event Grid

Most enterprise message bus type applications can be implemented by a combination of Azure Logic Apps and Azure Functions (or WebJobs) with Durable Functions and storage. Such decoupling leads to significant cost savings over existing legacy architectures, besides being more scalable and easier to manage. Application reviews done so far point to the fact that customers are still following the older IT model of deploying into servers, and this is a serious issue from a software architecture perspective. Such teams end up carrying huge technical debt and will have to undergo a painful refactoring later to modern software architectures that exploit cloud-native capabilities. Only then can they realize the benefits of rapid DevOps, time savings, lower costs, and agile development cycles, along with a decoupled and highly decomposed (componentized) architecture that allows independent versioning and management of the components that make up the solutions they need to continually manage, evolve, and operate.

 

REACTIVE Approach - things to do post deployment

Once the customer has started deploying into Azure, there are certain things that should be done after the fact, on a continuous basis, to wring out more cost efficiencies. These are spelled out below.

Azure Advisor

Turn on tools like Azure Advisor and implement all of its cost recommendations. The following link provides an introduction to Azure Advisor, which is simple to turn on and whose recommendations are simple to implement.

/en-us/azure/advisor/advisor-overview

Cloudyn type of tools

Cloudyn is another tool that can be used for usage and cost-benefit analysis. It is recommended to turn it on, study its findings and recommendations, and then implement them. Note that both Cloudyn and Azure Advisor use heuristics and AI to figure out usage patterns and make recommendations, so the longer these tools are turned on and the more data they have access to, the more useful their recommendations become. The following link provides information on how to view cost data with Cloudyn.

/en-us/azure/cost-management/quick-register-azure-sub

 

Check if deployments can be converted for Reserved Instance(RI)

After deploying into Azure, customers often try retroactively to make sure they are exploiting Azure RIs wherever they can. It is therefore incumbent on Azure administrators, or those responsible for the apps and their associated costs, to periodically scan the usage data and figure out which Azure VM instances RIs can be applied to.
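One simple way to scan usage data for reservation candidates is to flag VMs that ran for more than the break-even share of the period, which for a discount d is 1 - d of the hours. The sketch below assumes usage has already been exported as a mapping from VM name to hours run. The VM names, the 40 percent discount, and the function name are hypothetical and used only for illustration.

```python
def ri_candidates(hours_by_vm, period_hours, discount):
    """Return VM names (sorted) that ran longer than the break-even share of the period."""
    threshold = (1 - discount) * period_hours
    return sorted(vm for vm, hours in hours_by_vm.items() if hours > threshold)


# Hypothetical usage for a 30-day month (720 hours).
usage = {"web01": 720, "batch01": 100, "db01": 650}
candidates = ri_candidates(usage, period_hours=720, discount=0.40)
```

Here `web01` and `db01` run well above the 432-hour break-even point and are worth reviewing for a reservation, while `batch01` is better left on pay-as-you-go.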

Check if deployments can be converted for AHUB

The same applies to leveraging existing enterprise licenses for Windows Servers which the organization has already paid for. For all Windows VMs the AHUB benefits should be turned on.

 

Study the entities within the RG

This is an activity that should be done periodically, ideally at least every couple of months, by each application team as part of the sprint cycle. The suggestion is to review the usage and cost data for every resource group in the monthly consumption data. By picking the top ten spenders in each RG and studying how they can be optimized, the following can be realized:

  1. Cost savings, if any, can be effected.
  2. Valuable skills will be gained over time by doing this exercise repeatedly.

Rinse and repeat this at least every two months.
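The review above is easy to automate once the monthly consumption data has been exported. The following sketch assumes the data is available as (resource group, resource, cost) rows. The row layout, the sample names, and the function name are assumptions for illustration, since the real export format depends on the billing tool used.

```python
from collections import defaultdict


def top_spenders(cost_rows, n=10):
    """Sum cost per resource within each resource group and keep the n most expensive."""
    by_rg = defaultdict(lambda: defaultdict(float))
    for resource_group, resource, cost in cost_rows:
        by_rg[resource_group][resource] += cost
    return {
        rg: sorted(resources.items(), key=lambda item: item[1], reverse=True)[:n]
        for rg, resources in by_rg.items()
    }


# Hypothetical consumption rows.
rows = [
    ("rg-app", "vm1", 120.0),
    ("rg-app", "sql1", 300.0),
    ("rg-app", "vm1", 30.0),
    ("rg-data", "stor1", 15.0),
]
report = top_spenders(rows)
```

Running this every review cycle and comparing the lists over time shows whether earlier optimizations actually moved a resource out of the top ten.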

 

Azure Compute VMs – Capability and Design Guidance

The sizing and tiering options provide customers with a consistent set of compute sizing options, which expand as time goes on. From a sizing perspective, each sizing series represents various properties, such as:

  • Number of CPUs
  • Memory allocated to each virtual machine
  • Temporary local storage
  • Allocated bandwidth for the virtual machines
  • Maximum data disks
  • GPU availability

VM Capability Guidance

 

Some virtual machine series include the concept of Basic and Standard tiers. A Basic tier virtual machine is only available on A0-A4 instances, and a Standard tier virtual machine is available on all size instances. Virtual machines that are available in the Basic tier are provided at a reduced cost and carry slightly less functionality than those offered at the Standard tier. This includes the following areas:

Capability Consideration | Capability Decision Points
CPU | Standard tier virtual machines are expected to have slightly better CPU performance than Basic tier virtual machines.
Disk | Data disk IOPS for Basic tier virtual machines is 300 IOPS, which is slightly lower than Standard tier virtual machines (which have 500 IOPS data disks).
Features | Basic tier virtual machines do not support features such as load balancing or auto-scaling.

The following table is provided to illustrate a summary of key decision points when using Basic tier virtual machines:

Size | Available CPU Cores | Available Memory | Available Disk Sizes | Maximum Data Disks | Maximum IOPS
Basic_A0 – Basic_A4 | 1 – 8 | 768 MB – 14 GB | Operating system = 1023 GB; Temporary = 20 – 240 GB | 1 – 16 | 300 IOPS per disk

 

In comparison, Standard tier virtual machines are available for all compute sizes.

Capability Consideration | Capability Decision Points
CPU | Standard tier virtual machines have better CPU performance than Basic tier virtual machines.
Disk | Data disk IOPS for Standard tier virtual machines is 500. (This is higher than Basic tier virtual machines, which have 300 IOPS data disks.) If the DS series is selected, IOPS start at 3200.
Availability | Standard tier virtual machines are available on all size instances.

A-Series features
  • Standard tier virtual machines include load balancing and auto-scaling.
  • For A8, A9, A10, and A11 instances, hardware is designed and optimized for compute and network intensive applications including high-performance computing (HPC) cluster applications, modeling, and simulations.
  • A8 and A9 instances have the ability to communicate over a low-latency, high-throughput network in Azure, which is based on remote direct memory access (RDMA) technology. This boosts performance for parallel Message Passing Interface (MPI) applications. (RDMA access is currently supported only for cloud services and Windows Server-based virtual machines.)
  • A10 and A11 instances are designed for HPC applications that do not require constant and low-latency communication between nodes (also known as parametric or embarrassingly parallel applications). The A10 and A11 instances have the same performance optimizations and specifications as the A8 and A9 instances. However, they do not include access to the RDMA network in Azure.

Av2-Series features
  • Represent the new version of A-Series VMs, with the amount of RAM per vCPU raised from 1.75 GB or 7 GB to 2 GB or 8 GB per vCPU. Local disk random IOPS has been improved to be 2-10x faster than that of existing A version 1 sizes.

D-Series features
  • Standard tier virtual machines include load balancing and auto-scaling.
  • D-series virtual machines are designed to run applications that demand higher compute power and temporary disk performance. D-series virtual machines provide faster processors, a higher memory-to-core ratio, and a solid-state drive (SSD) for the temporary disk.

Dv2-Series features
  • Standard tier virtual machines include load balancing and auto-scaling.
  • Dv2-series, a follow-on to the original D-series, features a more powerful CPU. The Dv2-series CPU is about 35% faster than the D-series CPU. It is based on the 2.4 GHz Intel Xeon® E5-2673 v3 (Haswell) processor, and with Intel Turbo Boost Technology 2.0 can go up to 3.2 GHz. The Dv2-series has the same memory and disk configurations as the D-series.

Dv3-Series features
  • Standard tier virtual machines include load balancing and auto-scaling.
  • With Dv3-series, a follow-on to the original D/Dv2-series, Microsoft is introducing a new generation of Hyper-Threading Technology virtual machines for general purpose workloads.

Ev3-Series features
  • Standard tier virtual machines include load balancing and auto-scaling.
  • A new family for memory optimized workloads, introducing sizes with 64 vCPUs on the Intel® Broadwell E5-2673 v4 2.3 GHz processor and with 432 GB of memory on the largest Ev3 sizes.

DS-Series features
  • Standard tier virtual machines include load balancing and auto-scaling.
  • DS-series virtual machines can use premium storage, which provides high-performance and low-latency storage for I/O intensive workloads. It uses solid-state drives (SSDs) to host a virtual machine's disks and offers a local SSD disk cache. Currently, premium storage is only available in certain regions.
  • The maximum input/output operations per second (IOPS) and throughput (bandwidth) possible with a DS series virtual machine is affected by the size of the disk.

F-Series features
  • SKU is based on the 2.4 GHz Intel Xeon® E5-2673 v3 (Haswell) processor, which can achieve clock speeds as high as 3.1 GHz with Intel Turbo Boost Technology 2.0. Having the same CPU performance as the Dv2-series of VMs at a lower per-hour list price, the F-series is the best value in price-performance in the Azure portfolio based on the Azure Compute Unit (ACU) per core. The F-Series VMs are an excellent choice for gaming servers, web servers, and batch processing. Any workload which does not need as much memory or local SSD per CPU core will benefit from the value of the F-Series.

FS-Series features
  • FS-series virtual machines can use premium storage, which provides high-performance and low-latency storage for I/O intensive workloads. It uses solid-state drives (SSDs) to host a virtual machine's disks and offers a local SSD disk cache. Currently, premium storage is only available in certain regions.
  • The maximum input/output operations per second (IOPS) and throughput (bandwidth) possible with an FS series virtual machine is affected by the size of the disk.

G-Series features
  • Standard tier virtual machines include load balancing and auto-scaling.
  • Leverages local SSD disks to provide the highest performance virtual machine series that is available in Azure.

GS-Series features
  • Standard tier virtual machines include load balancing and auto-scaling.
  • Leverages local SSD disks to provide the highest performance virtual machine series that is available in Azure.
  • GS-series virtual machines can use premium storage, which provides high-performance and low-latency storage for I/O intensive workloads. It uses solid-state drives (SSDs) to host a virtual machine's disks and offers a local SSD disk cache. Currently, premium storage is only available in certain regions.
  • The maximum input/output operations per second (IOPS) and throughput (bandwidth) possible with a GS series virtual machine is affected by the size of the disk.

H-Series features
  • H-series VMs are based on Intel E5-2667 V3 3.2 GHz (with turbo up to 3.5 GHz) processor technology, utilizing DDR4 memory and SSD-based local storage. The H-series VMs also feature a dedicated RDMA backend network enabled by an FDR InfiniBand network, capable of delivering ultra-low latency. RDMA networking is dedicated to MPI (Message Passing Interface) traffic when running tightly coupled applications.
  • Provide great performance for HPC applications in Azure. H-series VM sizes are an excellent fit for any compute-intensive workload. They are designed to deliver cutting edge performance for complex engineering and scientific workloads like computational fluid dynamics, crash simulations, seismic exploration, and weather forecasting simulations.

N-Series features
  • A new family of Azure Virtual Machines with GPU capabilities suited for compute and graphics-intensive workloads, aiding in scenarios like remote visualization, high performance computing, and analytics. To be available in preview in Q1 CY16, the N-series will be based on NVIDIA's M60 and K80 GPUs and will feature the NVIDIA Tesla Accelerated Computing Platform as well as NVIDIA GRID 2.0 technology, providing the highest-end graphics support available in the cloud today.

 

 

The following table summarizes the capabilities of each virtual machine series:

 

Size | Available CPU Cores | Available Memory | Available Disk Sizes | Maximum Data Disks | Maximum IOPS
Basic_A0 – Basic_A4 | 1 – 8 | 768 MB – 14 GB | Operating system = 1023 GB; Temporary = 20 – 240 GB | 1 – 16 | 300 IOPS per disk
Standard_A0 – Standard_A11 (includes compute-intensive A8-A11) | 1 – 16 | 768 MB – 112 GB | Operating system = 1023 GB; Temporary = 20 – 382 GB | 1 – 16 | 500 IOPS per disk
Standard_A1_v2 – Standard_A8_v2; Compute-intensive Standard_A2m_v2 – Standard_A8m_v2 | 1 – 8 | 2 GB – 64 GB | Operating system = 1023 GB; Temporary SSD disk = 10 – 80 GB | 1 – 16 | 500 IOPS per disk
Standard_D1-D4; Standard_D11-D14 (high memory) | 1 – 16 | 3.5 GB – 112 GB | Operating system = 1023 GB; Temporary (SSD) = 50 – 800 GB | 2 – 32 | 500 IOPS per disk
Standard_D1v2-D5v2; Standard_D11v2-D14v2 (high memory, faster CPU) | 1 – 16 | 3.5 GB – 112 GB | Operating system = 1023 GB; Temporary (SSD) = 50 – 800 GB | 2 – 32 | 500 IOPS per disk
Standard_DS1-DS4; Standard_DS11-DS14 (premium storage) | 1 – 16 | 3.5 GB – 112 GB | Operating system = 1023 GB; Local SSD disk = 7 – 112 GB | 2 – 32 | 43 – 576 GB cache size; 3,200 – 50,000 IOPS total
Standard_F1 – F16 | 1 – 16 | 2 GB – 32 GB | Local SSD disk = 16 – 256 GB | 2 – 32 | 500 IOPS per disk
Standard_FS1 – FS16 | 1 – 16 | 2 GB – 32 GB | Local SSD disk = 4 – 64 GB | 2 – 32 | 4,000 – 64,000 IOPS
Standard_G1 – G5 (high performance) | 2 – 32 | 28 GB – 448 GB | Operating system = 1023 GB; Local SSD disk = 384 – 6,144 GB | 4 – 64 | 500 IOPS per disk
Standard_GS1 – GS5 (high performance, premium storage) | 2 – 32 | 28 GB – 448 GB | Operating system = 1023 GB; Local SSD disk = 56 – 896 GB | 4 – 64 | 264 – 4,224 GB cache size; 5,000 – 80,000 IOPS total
Standard_H8 – H16 | 8 – 16 | 56 GB – 224 GB | Local SSD disk = 1,000 – 2,000 GB | 16 – 32 | 500 IOPS per disk
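
As a rough illustration of how the table above can feed a right-sizing decision, the following Python sketch filters the series by required cores, memory, and premium storage. It is only a coarse first pass: the upper bounds are copied from the table, the names SeriesRange and candidate_series are hypothetical, and the premium-storage flag on the FS-series is an assumption not stated in the table. Always confirm against the current size documentation before provisioning.

```python
from typing import List, NamedTuple


class SeriesRange(NamedTuple):
    """Upper bounds for one row of the capability table (hypothetical type)."""
    name: str
    max_cores: int
    max_memory_gb: float
    premium_storage: bool


# Values copied from the table above; premium_storage for FS is an assumption.
SERIES = [
    SeriesRange("Basic_A0-A4", 8, 14, False),
    SeriesRange("Standard_A0-A11", 16, 112, False),
    SeriesRange("Standard_A1_v2-A8m_v2", 8, 64, False),
    SeriesRange("Standard_D1-D14", 16, 112, False),
    SeriesRange("Standard_D1v2-D14v2", 16, 112, False),
    SeriesRange("Standard_DS1-DS14", 16, 112, True),
    SeriesRange("Standard_F1-F16", 16, 32, False),
    SeriesRange("Standard_FS1-FS16", 16, 32, True),
    SeriesRange("Standard_G1-G5", 32, 448, False),
    SeriesRange("Standard_GS1-GS5", 32, 448, True),
    SeriesRange("Standard_H8-H16", 16, 224, False),
]


def candidate_series(cores: int, memory_gb: float,
                     premium_storage: bool = False) -> List[str]:
    """Return the series whose largest size can cover the stated requirements."""
    return [
        s.name
        for s in SERIES
        if s.max_cores >= cores
        and s.max_memory_gb >= memory_gb
        and (s.premium_storage or not premium_storage)
    ]


if __name__ == "__main__":
    # A workload needing 16 cores, 100 GB of memory, and premium storage.
    print(candidate_series(16, 100, premium_storage=True))
```

The filter only narrows the list of series to look at; picking the actual size within a series, and comparing prices, still has to be done against the per-size tables.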

 

 

 

The table below describes the differences between the A-series and the Av2-series:

 

A-series Size | vCPU | RAM (GiB) | Disk Size | Av2-series Size | vCPU | RAM (GiB) | Disk Size
A1 | 1 | 1.75 | 20 GB (HDD) | A1_v2 | 1 | 2 | 10 GB (SSD)
A2 | 2 | 3.50 | 70 GB (HDD) | A2_v2 | 2 | 4 | 20 GB (SSD)
A3 | 4 | 7 | 285 GB (HDD) | A4_v2 | 4 | 8 | 40 GB (SSD)
A4 | 8 | 14 | 605 GB (HDD) | A8_v2 | 8 | 16 | 80 GB (SSD)
A5 | 2 | 14 | 135 GB (HDD) | A2m_v2 | 2 | 16 | 20 GB (SSD)
A6 | 4 | 26 | 285 GB (HDD) | A4m_v2 | 4 | 32 | 40 GB (SSD)
A7 | 8 | 52 | 605 GB (HDD) | A8m_v2 | 8 | 64 | 80 GB (SSD)
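
When planning a move from the A-series to the Av2-series, the side-by-side rows above can be treated as a lookup. The Python sketch below is a minimal illustration under that assumption: it pairs each A-series size with the Av2 size on the same row and reports the change in vCPU, RAM, and temporary disk. All values are copied from the table, and the names A_TO_AV2 and compare are hypothetical.

```python
# Each entry: A-series size -> (vCPU, RAM GiB, temp disk GB) and the Av2 size
# listed on the same table row with its (vCPU, RAM GiB, temp disk GB).
A_TO_AV2 = {
    "A1": ((1, 1.75, 20), "A1_v2", (1, 2, 10)),
    "A2": ((2, 3.5, 70), "A2_v2", (2, 4, 20)),
    "A3": ((4, 7, 285), "A4_v2", (4, 8, 40)),
    "A4": ((8, 14, 605), "A8_v2", (8, 16, 80)),
    "A5": ((2, 14, 135), "A2m_v2", (2, 16, 20)),
    "A6": ((4, 26, 285), "A4m_v2", (4, 32, 40)),
    "A7": ((8, 52, 605), "A8m_v2", (8, 64, 80)),
}


def compare(a_size: str) -> dict:
    """Summarize what changes when an A-series size moves to its Av2 row partner."""
    (vcpu, ram, disk), v2_name, (v2_vcpu, v2_ram, v2_disk) = A_TO_AV2[a_size]
    return {
        "av2_size": v2_name,
        "vcpu_delta": v2_vcpu - vcpu,
        "ram_delta_gib": v2_ram - ram,
        "temp_disk_delta_gb": v2_disk - disk,  # negative: smaller (but SSD) disk
    }


if __name__ == "__main__":
    for size in A_TO_AV2:
        print(size, compare(size))
```

The pattern that falls out is the same on every row: the vCPU count is unchanged, RAM goes up slightly, and the temporary disk shrinks while moving from HDD to SSD, so workloads that rely on a large temporary disk need a closer look before switching.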

 

These sizes and capabilities reflect the current Azure Virtual Machines and might expand over time. For a complete list of size tables to help you configure your virtual machines, please see: Sizes for Virtual Machines.

 

Azure VM Design Guidance

 

Design Guidance

 

When you design solutions for using virtual machines, consider the following:

Capability Considerations | Capability Decision Points
Deployment order | If you intend to deploy an application that may require compute-intensive resources, first provision a virtual machine of the largest size (such as Standard_G5) in the cloud service, and then scale it down to a more appropriate size. Doing so places the virtual machines on the clusters that have the faster processors. It also makes scaling easier, and it is more efficient to combine resources.
Supportability | The following are not supported in a virtual machine on Microsoft Azure: multicast; Layer-2 routing; 32-bit OS versions; Windows Server OS versions prior to Windows Server 2008 R2. Note: Windows Server 2003 / 2008 32-bit OS versions are supported for deployment on Azure virtual machines with substantial limitations, including no support for agents or extensions.

 

 

Usage Analysis Tools

The following tools are available as part of the platform; consider turning them on.

(A)    Azure Advisor

(B)    Azure Security Center

(C)    Cloudyn

They analyze the underlying data emitted by various Azure resources and arrive at recommendations from the perspective of:

(i)    Cost

(ii)    Security

(iii)    Performance

In most cases these recommendations can be implemented in an automated fashion, often with a single click. They can also be reviewed and implemented manually if preferred.
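
For teams that prefer the manual review route, it helps to sort the recommendations before acting on them. The Python sketch below is a minimal illustration that groups recommendation records by the three perspectives listed above and orders each group by impact. The record layout (the "category", "impact", and "resource" keys) and the triage function are assumptions made for this example, not the export schema of any of the tools; adapt the keys to whatever the tool actually emits.

```python
from collections import defaultdict

# Assumed impact labels, most urgent first; unknown labels sort last.
IMPACT_ORDER = {"High": 0, "Medium": 1, "Low": 2}


def triage(recommendations):
    """Group recommendation records by category, highest impact first."""
    grouped = defaultdict(list)
    for rec in recommendations:
        grouped[rec["category"]].append(rec)
    for recs in grouped.values():
        recs.sort(key=lambda r: IMPACT_ORDER.get(r["impact"], len(IMPACT_ORDER)))
    return dict(grouped)


if __name__ == "__main__":
    # Hypothetical records, for illustration only.
    sample = [
        {"category": "Cost", "impact": "Low", "resource": "vm-a"},
        {"category": "Security", "impact": "High", "resource": "vm-b"},
        {"category": "Cost", "impact": "High", "resource": "vm-c"},
    ]
    for category, recs in triage(sample).items():
        print(category, [r["resource"] for r in recs])
```

Reviewing the high-impact cost items first keeps the periodic review short and focused on the savings that matter most.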