Since SAP certified Azure as a supported platform for production deployments of many SAP software solutions, a large number of customers have deployed non-production and production systems on Azure.
This blog details some of the most important feedback, concepts and topics we have observed from these first deployments.
1. Planning & Documentation Required Before Deploying SAP on Azure
Before deploying SAP software on Azure it is essential to read and understand the SAP on Azure documentation in full: https://msdn.microsoft.com/library/dn745892.aspx
Within the next few weeks there will be an update of those documents to represent new developments in Azure.
Care must be taken to follow all the guidance and recommendations contained in these documents. These configurations have been tested and proven and should be followed by all customers.
Before deploying SAP software on Azure it is essential to document and plan the configuration of VM types, networking, storage and disk layout. Before deploying any objects in Azure, a comprehensive landscape diagram (for example in Visio) and an inventory should exist.
It is very important to develop a clear naming convention for all Azure objects. Examples of a naming convention are below.
Example 1 – A virtual machine
A Production ECC database server VM with a SID “ECP” – vmprdecpdb01
vm denotes the Azure object is a Virtual Machine
prd denotes the object belongs to the Production landscape
ecp identifies the SAP SID the VM belongs to
01 identifies the object is the first in a series (02 would be the HA pair)
Example 2 – A Storage Account
A Storage Account used for non-SAP workloads – saprdgeoneu01
sa denotes the Azure object is a Storage Account
prd denotes the object belongs to the Production landscape
geo identifies the storage account is Geo-redundant
neu identifies the storage account is in North Europe
01 identifies the object is the first in a series
Example 3 – An Internal Load Balancer
A SQL Server AlwaysOn HA availability solution for a SAP system with a SID ECP requires an Azure Internal Load Balancer – ilbprdecp
ilb denotes the Azure object is an Internal Load Balancer
prd denotes the object belongs to the Production landscape
ecp identifies the SAP SID that the ILB belongs to
The above are just examples; customers should create a naming scheme that best suits their organization and policies. However, it is essential to have a clear naming standard.
Unless a clear and coherent naming scheme is established and implemented, it will be difficult to identify and troubleshoot objects in even a medium-sized Azure deployment.
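To make the convention concrete, here is a minimal sketch of a name builder following the three examples above. The abbreviation tables are illustrative only, not an official standard - substitute your own organization's codes.

```python
# Hypothetical helper that assembles Azure object names following the example
# convention from this post: <type><landscape><qualifier><sequence>.
# The abbreviation tables below are illustrative, not an official standard.

OBJECT_TYPES = {
    "virtual_machine": "vm",
    "storage_account": "sa",
    "internal_load_balancer": "ilb",
}
LANDSCAPES = {"production": "prd", "quality": "qas", "development": "dev"}

def azure_name(object_type, landscape, qualifier, sequence=None):
    """Build a name such as 'vmprdecpdb01' or 'ilbprdecp'."""
    name = OBJECT_TYPES[object_type] + LANDSCAPES[landscape] + qualifier.lower()
    if sequence is not None:
        name += f"{sequence:02d}"  # 01 = first in series, 02 = HA pair, etc.
    return name

# The three examples from this section:
print(azure_name("virtual_machine", "production", "ecpdb", 1))    # vmprdecpdb01
print(azure_name("storage_account", "production", "geoneu", 1))   # saprdgeoneu01
print(azure_name("internal_load_balancer", "production", "ecp"))  # ilbprdecp
```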
2. SAP on Azure Sizing & Landscapes
There are too many topics in SAP sizing, and SAP sizing on Azure, to discuss in detail in a summary blog like this one, so instead here is a checklist for sizing:
1. Obtain accurate information about the current SAP landscape SAPS
2. Make sure to carefully separate Database SAPS (Scale up only) from SAP Application Server SAPS (which can scale out). Dedicated DB SAPS is typically less than 20% of total SAPS. Do not directly map DB/CI total SAPS to a dedicated DB server on Azure.
3. Check whether the maximum DB SAPS is larger than the values listed in SAP Note 1928533 – SAP Applications on Azure: Supported Products and Azure VM types
4. Carefully plan and size Storage Accounts. Consider each storage account a “virtual SAN” with a limited number of IOPS and throughput. Consider separate Storage Accounts for Development, QAS and Production for any sized landscape. Consider dedicated storage accounts and/or Premium Storage for medium or larger systems
5. Carefully plan VNets. Multiple network adapters inside a virtual machine do not increase the network throughput
6. Test on-premise to Azure latency with the Azure Speed Test utilities (see later in this blog). Check the ExpressRoute maximum bandwidth and the individual gateway bandwidth. ExpressRoute Premium increases the routing table size
7. On-premise deployments are sized with large amounts of buffer to allow infrastructure to run for 3-5 years or more. Azure deployments do not require a large sizing buffer as more resources can be added as the business volume increases, when acquisitions occur or at peak load (such as year end).
8. Consider using Autoscale to run SAP application servers only when required. Remember to set the Autoscale scale-down increment to 0 so that running application servers are not deallocated automatically
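As an illustration of points 2 and 4 above, the arithmetic can be sketched as follows. The 20% DB share comes from the guidance in point 2; the 20,000 IOPS per standard storage account and 500 IOPS per standard disk are assumptions based on typical Azure limits at the time of writing, and should be verified against the current Azure scalability targets documentation.

```python
# Illustrative sizing arithmetic for checklist points 2 and 4. The IOPS
# figures are assumptions (typical 2015 Azure standard storage limits) -
# verify current values before relying on them.

def split_saps(total_saps, db_fraction=0.20):
    """Separate scale-up DB SAPS from scale-out application server SAPS.

    Dedicated DB SAPS is typically less than 20% of the total, so do not
    map the whole DB/CI figure onto a dedicated DB server on Azure.
    """
    db_saps = total_saps * db_fraction
    app_saps = total_saps - db_saps
    return db_saps, app_saps

def max_disks_per_storage_account(account_iops=20000, disk_iops=500):
    """How many fully loaded standard disks one storage account sustains."""
    return account_iops // disk_iops

db, app = split_saps(50000)
print(db, app)                          # 10000.0 40000.0
print(max_disks_per_storage_account())  # 40
```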
3. SAP on Azure Support Matrix
SAP has certified and supported all NetWeaver 7.0 and higher based ABAP and Java based applications for productive use on certain Azure VM types. SAP Note 1928533 – SAP Applications on Azure: Supported Products and Azure VM types is updated with the most recent information.
Operating Systems (April 2015)
The Azure platform supports both Windows and Linux. Microsoft and SAP plan to support both Windows and Linux distributions for SAP customers.
Future Planning: SUSE Linux and possibly others
Released: Windows 2008 R2, Windows 2012 & Windows 2012 R2
Recommended: Windows 2012 R2
Comment: New deployments on Windows 2008 R2 are not recommended. Use Windows 2012 R2 (or Windows 2012 for Oracle customers)
Databases (April 2015)
The Azure platform is open and supports many non-Microsoft applications and databases. Oracle, IBM DB2 and Sybase are all already supported on Azure for non-SAP workloads.
Future Planning: Other DBMS platforms currently supported for on-premise deployments
Released: SQL Server 2008 R2, SQL Server 2012, SQL Server 2014, Oracle 11gR2, Sybase ASE 16.0 PL02
Recommended: SQL Server 2014, SQL Server 2012 or latest releases of other DBMS
Comment: It is recommended to deploy the latest DBMS and patches. New deployments of SQL Server 2008 R2 are not recommended.
Standalone Engines & non-NetWeaver applications (April 2015)
Due to the very large number of SAP and 3rd-party applications resold by SAP there is no plan to individually certify each standalone engine and non-NetWeaver application.
In response to customer requests, Microsoft and SAP have tested some of the most popular components and provided support statements.
Future Planning: Other standalone engines
Released: TREX, Business Objects
Comment: It is recommended to test Standalone Engines & non-NetWeaver applications on Azure before running in production.
If a SAP application requires higher disk IOPS and lower latency than Standard Storage can provide then Premium Storage should be used.
4. Storage Design for SAP on Azure
There are at least 3 types of Azure storage relevant to SAP on Azure
Azure blob storage is suitable for almost all small and medium sized SAP systems. Standard Azure blob storage has four variants:
Locally Redundant – keeps 3 data replicas within a single facility within a single region. Use this type of Storage Account for all SQL Server or DBMS storage
Zone Redundant – data is replicated across multiple facilities within a region
Geo-Redundant – data is replicated asynchronously between fixed pairs of facilities. Do not use this type of storage account for DBMS workloads. SAP application directories (such as the Transport directory) can use this storage type. Geo-Redundant is selected by default when a storage account is created, so care must be taken to disable this feature for DBMS systems
Geo-Redundant Read Only – a variation of geo-redundant that allows read only access
For SAP application servers it is strongly recommended to install the \usr\sap\<SID> directory onto C: drive and not to create a new Azure disk just for the SAP installation path.
Cache settings for Standard Storage
Cache settings can be configured for each individual disk. Disable write caching for all disks on standard Blob storage other than the OS boot disk. All VM types supported for SAP applications can use standard Blob storage. It is recommended to create disks at the maximum size (currently 1TB), as Azure billing for Standard Blob Storage only charges for space that is actually used, not the provisioned size
Non-persistent local storage
Each Azure VM has a specific amount of local non-persistent storage (meaning certain operations will re-initialize the disk). The amount of non-persistent storage depends on the VM type; larger VMs generally have more. On Windows the non-persistent storage is presented as drive D:. It is recommended to avoid installing software on drive D: even for on-premise deployments, as this will cause complications for scenarios like Azure Site Recovery. The D: drive on VMs larger than A7, and on D, DS and G series VMs, can be considered high performance.
Tempdb and the SQL Server 2014 buffer pool extension can be placed onto this temporary SSD disk. It is recommended to put tempdb into the root of D:, because SQL Server will not start up if the tempdb path is unavailable after the disk is re-initialized.
Azure Premium Storage is backed by SSD disks and is currently only available on DS-series VMs.
Leave the default "Read Only" cache setting for most workloads. For disks with write-intensive workloads, such as the DBMS transaction log, switch off caching altogether.
Premium Storage is particularly useful for DBMS transaction logs.
5. Networking Design for SAP on Azure
Azure networking is a broad topic with several important technologies.
This blog can only cover these topics briefly. The SAP deployment documentation has more information about the configuration of Azure VNETs. Care should be taken to size VNET subnets sufficiently large to support future requirements. There is no easy way to reconfigure VNETs after creation.
Common Network Topologies
1. Dev & QAS on Azure with a Site-2-Site (S2S) VPN or ExpressRoute back to on-premise where Production is running. SAP TRANS_DIR is typically kept on an Azure VM.
2. Dev & QAS on Azure with S2S/ExpressRoute connecting to on-premise where Production is running. Production replicating to Azure using Azure Site Recovery for Disaster Recovery. SAP TRANS_DIR is typically kept on an Azure VM.
3. Dev, QAS, Prod & DR all running on Azure. S2S/ExpressRoute connecting back to the internal Corporate network to provide user connectivity. SAP TRANS_DIR is kept on an Azure VM.
Azure speed test
The performance of a Site-2-Site VPN can be approximated using the Azure Speed Test utilities.
Perform this test from a WIRED connection close to the core switch. Do not run this test over a wireless link or on a laptop on a user VLAN. Most customers achieve 5-60ms latency from their site to the nearest Azure Region; if significantly higher values are observed, this may indicate routing issues at an ISP. Try testing via another ISP (even a residential fibre or DSL link can provide a useful comparison)
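Where the speed test utilities are not available, a rough latency check can be scripted. The sketch below times TCP connects over port 443 and reports the median, which is less sensitive to outliers than the mean; the hostname shown is a placeholder, not a real endpoint - substitute any reachable HTTPS endpoint in the target Azure region.

```python
# A minimal latency probe - a sketch, not a replacement for the Azure Speed
# Test utilities. Assumes outbound TCP 443 is open; the hostname used below
# is a placeholder to be replaced with a real endpoint in the target region.
import socket
import statistics
import time

def connect_latency_ms(host, port=443, timeout=5.0):
    """Time a single TCP connect to host:port, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0

def summarize(samples_ms):
    """Median and spread of latency samples; the median resists outliers."""
    return {"median": statistics.median(samples_ms),
            "min": min(samples_ms), "max": max(samples_ms)}

# Example over the network (endpoint is a placeholder):
#   samples = [connect_latency_ms("youraccount.blob.core.windows.net")
#              for _ in range(5)]
#   print(summarize(samples))
print(summarize([12.5, 14.0, 13.1, 55.0, 13.3]))  # synthetic samples
```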
S2S VPN is a link between a customer network and a specific Azure data center across the Internet secured with industry standard encryption.
The maximum aggregate bandwidth of a S2S VPN is much lower than ExpressRoute; the High Performance Gateway provides around 200 megabits per second.
It is also possible to create a VNet-to-VNet VPN. This kind of VPN can join two different VNets even if they are in different datacenters, which might be required when creating a Windows cluster spanning, for example, the Singapore and Hong Kong Regions.
Note: the "key" in the very last step of the VNet-to-VNet setup is not a Storage Key obtained from the Azure Portal; it is an arbitrary hexadecimal shared key and must be the same in both commands
Point to Site VPN
Point to Site VPN allows individual PCs or devices to securely connect to a VNet. This functionality is useful for DR configurations or remote offices.
ExpressRoute is a direct private link between a customer network and a specific Azure data center.
The maximum aggregate bandwidth of ExpressRoute is 10 gigabits per second. The maximum bandwidth to a specific VNET is substantially less. The High Performance Gateway provides for 2 gigabits per second.
Typically SAP Router connections use Forced Tunneling to route back via the main enterprise firewall.
6. High Availability on Azure
The Azure IaaS platform SLA is documented in detail and available for download. In summary, Microsoft commits to having Virtual Machines configured in an Availability Set available to external sources 99.95% of the time.
Occasionally the Azure platform itself will be updated in order to provide more features and functionality. The impact of planned Azure restarts can be minimized through various techniques, including application-level High Availability.
Database High Availability
Many DBMS High Availability solutions on Windows require shared disks. The Azure platform fully supports Windows clustering but does not currently natively offer shared disks. iSCSI on Azure is not recommended.
SQL Server AlwaysOn is a newer HA technology that does not require shared disks. SQL Server AlwaysOn is documented and supported on Azure and is currently the only HA solution for SAP DBMS platforms.
SAP does not currently support the Azure SQL DB which is a PaaS service.
SAP ASCS High Availability
The SAP single points of failure (SPOF) currently require a shared disk. This blog explains how the SIOS Datakeeper tool is used to provide High Availability for the SAP SPOF. Clustering SAP ASCS Instance using Windows Server Failover Cluster on Microsoft Azure with SIOS DataKeeper and Azure Internal Load Balancer
SAP Application Server Availability
The availability of SAP application servers is improved by configuring Autostart. In a scenario where an Azure component fails and the Azure platform self-heals and moves a VM to another node, the impact of the restart is much smaller if the application servers restart automatically.
Autostart is configured by adding Autostart = 1 to the SAP default profile (default.pfl)
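As a sketch, the profile entry looks like this. Verify the placement for your kernel release, since many installations set the parameter in the instance profile rather than the default profile:

```
# default.pfl (or the instance profile, depending on your installation)
Autostart = 1
```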
To achieve high availability on Azure the following is required as a minimum.
1. Two DBMS VMs* in one Azure Availability Set. Use native DBMS layer technologies to achieve HA
2. Two VMs that run ASCS and ERS in their own Azure Availability Set
3. Two VMs running SAP application instances in their own Azure Availability Set
Do not locate app servers onto DB or ASCS VMs
*small databases can be consolidated onto one SQL Server AlwaysOn Availability Group
7. Disaster Recovery on Azure
Azure Site Recovery (ASR) provides a comprehensive disaster recovery solution suite. Typical RPO with ASR should be approximately 15 minutes or less, with an RTO of 5-60 minutes, and it offers a very quick and effective way to do a Disaster Recovery test. ASR allows customers to test DR failovers in a safe, secure and non-disruptive way by copying the DR resources to temporary resources; performing a DR test does not compromise the readiness of the DR infrastructure.
There are several technologies and scenarios. Azure Site Recovery comprises two fundamental components:
Data Replication Component (Replication Channel) – replicating VMs or applications from one site to another
Orchestration Component – scheduling, sequencing and coordinating disaster recovery failover and failback recovery plans. An example might be ensuring Active Directory services are available before attempting to start VMs that have Services that require Domain authentication.
ASR can support three different scenarios:
ASR E2E scenarios use the on-premise Data Replication mechanism to replicate data from one on-premise site to another. Failover and Failback Orchestration is controlled via cloud services in Azure.
ASR E2A scenarios use the on-premise Data Replication mechanism (Hyper-V Replica, InMage Agents) to replicate data from on-premise to Azure. Orchestration is controlled via cloud services in Azure.
ASR A2A scenarios use the Azure Data Replication mechanisms to replicate data from one Azure Region or Data Center to another Azure Region or Data Center. Orchestration is controlled via cloud services in Azure. This feature is currently under development
The ASR product offers two technologies:
ASR based on Hyper-V Replica
Hyper-V has a built in technology to replicate VMs from one Hyper-V cluster to another Hyper-V cluster. ASR for Hyper-V extends this from an E2E scenario to an E2A scenario. The technology is virtualization host based and is simple and easy to deploy and configure. Because the solution is host based this technology works only with Hyper-V guests.
Status: Azure Site Recovery based on Hyper-V is fully released and GA for E2E and E2A scenarios. A whitepaper has been written describing how to implement a ASR Hyper-V solution for SAP. Protecting SAP Solutions with Azure Site Recovery
ASR based on InMage
Microsoft acquired InMage and has integrated this technology into Azure Site Recovery.
InMage tools are guest based, meaning InMage deploys an agent into the VM. The InMage toolset can therefore be used on physical servers, VMware or Hyper-V. InMage also has additional functionality, such as the ability to take consistent snapshots of database servers (by calling the VDI interface), to configure groups of VMs that are consistent at an exact point in time, and to intelligently avoid replicating OS files such as the pagefile.
Status: Azure Site Recovery based on InMage is fully released and GA for E2E. E2A scenarios are in Preview as of April 2015
8. How to Upload SAP Databases to Azure
There are several ways to transfer a database from on-premise into Azure public cloud.
There are 3 key factors to consider:
1. Size of the Database
2. Acceptable Downtime
3. Source and Target Database Version
If the compressed database size is less than 1-2TB then experience has shown most customers can upload to Azure using tools such as AzCopy in an acceptable timeframe.
If the compressed database size is larger than 2-3TB then some customers use SQL Server Log Shipping, Mirroring or AlwaysOn to synchronize databases between on-premise and Azure. A brief outage can then be taken, a final transaction log backup and restore performed, and operations switched over to the copy of the database in Azure.
If the compressed database size is truly huge then it is recommended to use the Azure import/export service
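The decision logic above, plus a rough transfer-time estimate, can be sketched as follows. The size thresholds are the approximate figures from this section, not hard limits, and the transfer estimate ignores protocol overhead.

```python
# Rough decision helper mirroring the guidance above. The thresholds are the
# approximate compressed-database sizes from this section, not hard limits.

def upload_method(compressed_size_tb):
    """Pick an upload approach based on compressed database size in TB."""
    if compressed_size_tb <= 2:
        return "AzCopy (or similar) upload of compressed backup"
    if compressed_size_tb <= 3:
        return "SQL Server Log Shipping / Mirroring / AlwaysOn synchronization"
    return "Azure Import/Export service (ship physical disks)"

def transfer_hours(compressed_size_tb, bandwidth_mbit_s):
    """Naive transfer-time estimate: 1 TB ~ 8,000,000 megabits; ignores
    protocol overhead and contention, so treat the result as a lower bound."""
    return compressed_size_tb * 8_000_000 / bandwidth_mbit_s / 3600

print(upload_method(0.8))   # AzCopy (or similar) upload of compressed backup
print(upload_method(12))    # Azure Import/Export service (ship physical disks)
print(round(transfer_hours(1, 200), 1))  # 11.1 hours over a 200 Mbit/s link
```

The last line illustrates why the thresholds matter: even a 1TB compressed backup needs roughly half a day over a fully utilized 200 Mbit/s gateway.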
9. VM Types Resize Vectors
It is possible to move from any VM type to any other VM type by recreating the VM from the original storage. Automatic resizing from the Azure Portal or PowerShell is possible for some combinations of VMs (such as A5 -> A7). Special care needs to be taken when changing VMs contained within a cloud service, since there are special restrictions.
10. What Use Cases are Popular for Azure?
Based on our discussions we see very clear patterns:
1. Customers with SAP project implementations are deploying an entire landscape (dev, QAS and production) on Azure. The few that don't are putting sandbox, dev, QAS and DR on Azure, running Production on Hyper-V and leveraging Azure Site Recovery
2. Development, QAS and temporary project systems for upgrades and support packs are the most common use case for SAP on Azure
3. Customers are moving old "Legacy" archived systems from UNIX/Oracle or DB2 to Windows and SQL on Azure. The customer will typically not buy a SQL Server license, especially where such a system will run only a few hours per month, but will instead use the pay-per-minute SQL Server VM image. These "Legacy" archive systems are typically old SAP implementations that are no longer active but must be kept for tax and compliance purposes. Sometimes these systems run on expensive-to-maintain IBM or HP UNIX servers that consume a lot of data center space, electricity and cooling. If legal requirements mandate that the data must reside in a particular country, the customer can take a compressed backup of the "Legacy" archive system and store the backup on-premise; in most jurisdictions this meets the legal requirements
4. SAP Disaster Recovery Systems on Azure. Another popular use case is to use Azure as a Disaster Recovery data center and use Azure Site Recovery as a very cost effective solution to provide DR
5. Since the release of Premium Storage and larger VMs more and more customers are deploying Production systems onto Azure.
Azure Site Recovery Blogs
http://azure.microsoft.com/en-us/services/site-recovery/ (Sign up for Free Trial)
Win2012 R2 Hyper-V + SQL Server 2014 Free Trial Software
How to setup AlwaysOn between on-premise and Azure
http://weblogs.asp.net/scottgu/azure-new-documentdb-nosql-service-new-search-service-new-sql-alwayson-vm-template-and-more (go to section Virtual Machines: Support for SQL Server AlwaysOn, VM Depot images)
AlwaysOn Availability Groups Now Support Internal Listeners on Azure Virtual Machines
How to setup AlwaysOn between Azure Regions
Test Latency between on-premise and Azure Regions – WARNING: For accurate test results run this on a server on a wired connection somewhere close to the core switch. Results on laptops on wireless connections are not accurate