Protecting SAP Systems Running on VMware with Azure Site Recovery

Introduction

Azure Site Recovery delivers a powerful toolset to customers seeking to deploy or improve their Disaster Recovery solutions.

The key differentiators between Azure Site Recovery (ASR) and competing technologies are:

  1. Azure Site Recovery substantially lowers the cost of DR solutions. No Azure compute cost is charged for VMs that are synchronizing to Azure; only the storage cost is charged. Virtual Machines are charged for only once they are started in Azure, either in an actual DR event (such as fire, flood or power loss) or during a test failover
  2. Azure Site Recovery allows customers to perform non-disruptive DR tests. ASR Test Failovers copy all the ASR resources to a test region and start up all the protected infrastructure in a private test network. This eliminates any issues with duplicate Windows computer names. Another important capability is that Test Failovers do not stop, impair or disrupt VM replication from on-premises to Azure. A test failover takes a “snapshot” of all the VMs and other objects at a particular point in time
  3. The resiliency and redundancy built into Azure far exceed what most customers and hosters are able to provide. Azure blob storage stores at least 3 independent copies of data, thereby protecting against data loss even in the event of a failure on a single storage node
  4. ASR “Recovery Plans” allow customers to create sequenced DR failover / failback procedures or runbooks. For example, a customer might create an ASR Recovery Plan that first starts up Active Directory servers (to provide authentication and DNS services), then executes a PowerShell script to perform a recovery on DB servers, then starts up SAP Central Services and finally starts the SAP application servers. This allows “Push Button” DR
  5. Azure Site Recovery is a heterogeneous solution and works with Windows and Linux, supports VMware 5.0, 5.5 and 6.x and works well with SQL Server, Oracle, Sybase and DB2.

The Azure Site Recovery toolset includes solutions for different Hypervisors and deployment patterns.

There are two tools:

H2A – Hyper-V to Azure uses the native Windows Hyper-V Replica technology to replicate VMs to Azure. This is a Host based replication technology.

V2A – VMware or Physical Servers to Azure uses the InMage toolset to replicate VMs to Azure. V2A is a guest based technology.

More documentation is linked for H2A and V2A

ASR is lower cost than all competing technologies because only the storage cost is charged while VMs are in Synchronization Mode.
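
The cost argument can be made concrete with a small calculation. The sketch below is illustrative only: the function name, the storage rate and the hourly VM rates are placeholder assumptions, not Azure list prices, and it models only the two cost components named above (storage while synchronizing, compute only while VMs actually run).

```python
def monthly_dr_cost(storage_gb, storage_rate_per_gb, vm_hourly_rates, hours_running):
    """Monthly cost when VMs are billed only for the hours they actually run."""
    storage_cost = storage_gb * storage_rate_per_gb
    compute_cost = sum(vm_hourly_rates) * hours_running
    return storage_cost + compute_cost


# Placeholder figures for illustration only (NOT real Azure prices).
replicated_gb = 2000
rate_per_gb = 0.05
vm_rates = [0.50, 0.50, 1.00]  # one hypothetical hourly rate per protected VM

sync_only = monthly_dr_cost(replicated_gb, rate_per_gb, vm_rates, hours_running=0)
with_test = monthly_dr_cost(replicated_gb, rate_per_gb, vm_rates, hours_running=8)
always_on = monthly_dr_cost(replicated_gb, rate_per_gb, vm_rates, hours_running=730)

print(f"synchronizing only:        {sync_only:.2f}")
print(f"plus one 8-hour DR test:   {with_test:.2f}")
print(f"VMs running all month:     {always_on:.2f}")
```

Substituting real rates from the Azure pricing pages shows how far a synchronize-only DR site sits below a permanently running standby landscape.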

ASR also allows customers to do a complete DR test without compromising the DR capabilities. ASR copies all resources before completing a test failover and there is no impact on the DR SLA during DR testing

Top 10 Key Points for Protecting SAP Systems Running on VMware with ASR

  1. Use ASR to protect SAP Application Servers and Central Services Servers.
  2. Use the native DBMS replication technology to protect the database (such as SQL Server AlwaysOn or Oracle Data Guard)
  3. ASR Disaster Recovery scenarios work best with 3 tier configurations (DBMS running on separate VM with no SAP instances). ASCS instances should not have SAP application servers installed
  4. The DR DBMS server in Azure can be scaled down to smaller and lower cost VMs. The VM CPU and RAM can be upgraded if a DR failover occurs
  5. Set up Active Directory and DNS in Azure first. Test the ASR scenario by failing over and failing back a non-SAP test VM. Only once failover and failback have been tested successfully should ASR be deployed on the SAP servers. Check that name resolution and AD authentication are working correctly
  6. Ensure network bandwidth from the VMware cluster to Azure is sufficient to replicate. Monitor in the Azure Portal
  7. Do regular Test Failovers into an isolated vNet to test the solution. Test functionality and performance. Increase target Azure VM sizes as required
  8. Use technologies such as SQL Server Transparent Data Encryption to secure the database and backups. Use SQL Server Connector for Azure Key Vault to centrally manage Certificates for TDE
  9. Always use Azure Premium Storage for DBMS datafiles for SAP systems. Premium storage is of no use for SAP application servers
  10. Set up one or more Domain Controller(s) on a small VM in Azure and allow AD and DNS to synchronize over the VPN or ExpressRoute link. Name resolution is a critical service if a DR event occurs
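
For point 6, a rough first estimate of the link speed needed can be derived from the daily change volume of the protected VMs. This is a back-of-the-envelope sketch, not the method used by any ASR tool: the function name and the default 30% headroom factor are assumptions, and the daily churn figure has to come from your own measurements.

```python
def required_mbps(daily_churn_gb, replication_window_hours=24.0, headroom=1.3):
    """Average link speed (Mbit/s) needed to ship one day's changed data.

    headroom > 1 leaves room for bursts and protocol overhead (assumed value).
    """
    if replication_window_hours <= 0:
        raise ValueError("replication_window_hours must be positive")
    megabits = daily_churn_gb * 1000 * 8           # GB -> megabits (decimal units)
    seconds = replication_window_hours * 3600
    return megabits / seconds * headroom


# Hypothetical landscape: 108 GB of changed data per day, replicated around the clock.
print(f"{required_mbps(108):.1f} Mbit/s")
```

The result is an average; the actual replication traffic should still be monitored in the Azure Portal as recommended above.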

Prerequisites

Networking: VPN or ExpressRoute is a prerequisite for failback. It is possible to synchronize VMs from on-premises to Azure and perform a failover without VPN or ExpressRoute because this traffic runs over HTTPS (port 443). Failback requires direct access to the VMware cluster via ExpressRoute or VPN.

Foundation Services include Networking, Active Directory and Name Resolution Services.

SAP systems running on Windows require Active Directory, and domain accounts must be used for 3-tier systems.

DNS services are essential for consistent name resolution.

It is technically possible to use ASR to replicate a Domain Controller, but this is not recommended. Rather than failing over AD VMs, it is recommended to set up a small low-cost VM in Azure that is permanently running and synchronizing.

Azure Site Recovery is a Multi-OS solution. Both Windows & Linux are supported. It is recommended to use a Windows Server as the Configuration/Management server since monitoring is simpler on Windows. A Windows Configuration server can monitor and deploy to Linux Guest VMs.

Recommendation:

  1. Set up VPN or ExpressRoute
  2. Set up at least one Active Directory Domain Controller and DNS Server in Azure. AD and DNS configuration will replicate over the VPN or ExpressRoute connection. Assign a static IP address to this VM.

Recommendations & Limitations

3-Tier systems are strongly recommended for all but the very smallest SAP systems running on Azure.

The use of ASR to provide DR protection for 2-Tier SAP landscapes is not recommended, is not tested and is not discussed in this blog. Production systems should leverage the native DBMS replication technology such as SQL Server AlwaysOn.

Premium Storage is generally recommended for the DBMS datafiles for all SAP systems, especially production systems.

At the time of writing (April 2016) ASR does not support the following features. Many of these will be released in the coming weeks and months and will be removed from this list:

  1. ASR replicating a VM onto Premium Storage. Note: it is possible to set up the native DBMS replication technology (AlwaysOn, Data Guard etc.) on a VM in Azure that does use Premium Storage.
  2. Azure Resource Manager (ARM) deployments are not yet supported
  3. Shared Disk clusters are not supported
  4. VMs with multiple IP addresses and VMs with a single IP in the same vNet
  5. Advanced Disk Encryption (BitLocker on target VMs) is not supported
  6. VMs with a single disk larger than 1 TB are not supported on Azure
  7. Windows Server 2016 is currently not supported by ASR
  8. While it is technically possible to change the Azure temporary storage disk from D: to another drive letter, this increases complexity. If possible change the VMs on VMware so they are not using drive letter D:
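
Several of the limitations above can be checked against a VM inventory before protection is enabled. The following is a minimal sketch under stated assumptions: the `VmInfo` structure and its fields are hypothetical (the inventory would have to be exported from vCenter by whatever means you use), and only the disk size, shared disk, Windows Server 2016 and drive letter D: items are covered.

```python
from dataclasses import dataclass, field

MAX_DISK_GB = 1023  # largest single disk supported, per the limits in this post


@dataclass
class VmInfo:
    """Hypothetical inventory record for one VMware VM."""
    name: str
    os: str
    disk_sizes_gb: list
    drive_letters: list = field(default_factory=list)
    uses_shared_disk: bool = False


def preflight_issues(vm):
    """Return a list of human-readable problems for one VM (empty if none)."""
    issues = []
    for size in vm.disk_sizes_gb:
        if size > MAX_DISK_GB:
            issues.append(f"disk of {size} GB exceeds {MAX_DISK_GB} GB")
    if vm.uses_shared_disk:
        issues.append("shared disk clusters are not supported")
    if vm.os.lower().startswith("windows") and "2016" in vm.os:
        issues.append("Windows Server 2016 is not supported")
    letters = {letter.upper().rstrip(":") for letter in vm.drive_letters}
    if "D" in letters:
        issues.append("drive letter D: clashes with the Azure temporary disk")
    return issues


vm = VmInfo("sapapp1", "Windows Server 2012 R2", [127, 2048], ["C:", "D:"])
for issue in preflight_issues(vm):
    print(f"{vm.name}: {issue}")
```

Because the list of unsupported features was expected to shrink over time, the checks should be revisited against the current ASR documentation before use.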

Deployment Phase

Task 1: Establish Foundation Services

  1. Create a vNet in Azure and set up subnets in the chosen target Azure location. This should be the same location where the Azure Recovery Vault will be created
  2. Setup Site to Site VPN or ExpressRoute. Connect VPN or ER to the SAP vNet
  3. Deploy a small VM in the SAP vNet that will function as the AD Domain Controller for the ASR vNet
  4. Assign a static IP address to the Active Directory VM

    Get-AzureVM -ServiceName "xxx" -Name "xxx" | Set-AzureStaticVNetIP -IPAddress x.x.x.x | Update-AzureVM

    *See the appendix for the ARM command line for static private IP address.

  5. Install the AD and DNS roles and join the VM to the existing corporate AD.
  6. Register the AD and DNS VM as the “DNS1” IP address for the SAP vNet via the menu option in the Azure Portal (path New -> Network Services -> Virtual Network -> Register DNS Server). Assign DNS1 to the vNet
  7. Carefully test AD services and DNS resolution. Create a test VM, then fail it over and fail it back. Run nslookup in the on-premises and Azure locations. If there are inconsistencies, wait for some time for DNS synchronization to occur
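
The nslookup comparison in step 7 is easier to repeat if the answers from both sites are collected and compared programmatically. The sketch below assumes the lookups have already been gathered into two dictionaries (hostname to IP address); the function name and the hostnames are hypothetical.

```python
def dns_mismatches(on_prem, azure):
    """Return {hostname: (on_prem_ip, azure_ip)} for every name that differs.

    A value of None means the name did not resolve at that site.
    """
    mismatches = {}
    for host in sorted(set(on_prem) | set(azure)):
        on_prem_ip = on_prem.get(host)
        azure_ip = azure.get(host)
        if on_prem_ip != azure_ip:
            mismatches[host] = (on_prem_ip, azure_ip)
    return mismatches


# Hypothetical results of running nslookup at each location.
on_prem_answers = {"sapascs": "10.0.1.10", "sapapp1": "10.0.1.11", "sapdb": "10.0.1.20"}
azure_answers = {"sapascs": "10.0.1.10", "sapapp1": "10.0.1.99"}

for host, (left, right) in dns_mismatches(on_prem_answers, azure_answers).items():
    print(f"{host}: on-premises={left} azure={right}")
```

An empty result means both DNS servers agree; any remaining entries after waiting for synchronization point to a replication or registration problem.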

Task 2: Follow Azure Installation Procedure on SAP Application Servers and Central Services

The ASR documentation is very clear. It is strongly recommended to watch the videos before deploying each step.

It is recommended to review the entire documentation end to end at this link

https://azure.microsoft.com/en-us/documentation/articles/site-recovery-vmware-to-azure-classic/

Start at Step 1 and continue through Step 11. The documentation is complete and should be simple to follow

As discussed earlier it is recommended to create a small simple VM on VMware and test steps 1 – 11 on this non-SAP VM before attempting to deploy ASR to the SAP landscape

The steps to deploy the Management Server(s) [Step 5] are here

https://azure.microsoft.com/en-us/documentation/articles/site-recovery-vmware-to-azure-classic/#step-5-install-the-management-server

Important Points:

  1. Review the sizing and disk requirements for the Management Server(s)
  2. Ensure the VMware admin account (used to connect ASR to vCenter) and Windows Admin accounts (used to deploy the replication agents into the Guest VMs) are identified
  3. Ensure the Firewall and Ports are opened as documented
  4. Consider using VMware HA to protect the SAP ASCS instead of a shared disk cluster
  5. Ensure that the VMs do not have any one individual disk larger than 1023GB (1TB)
  6. Use the latest release of VMware vSphere PowerCLI 6.0 – downloaded from VMware (free registration required)
  7. ASR obengine.exe and other processes may need to access Internet via the corporate proxy server. A user id may be required. See the documentation for PowerShell to configure the proxy settings including the password
  8. Take care when entering username and password information into cspsconfigtool.exe
    1. Typically there would be at least two sets of credentials stored in this utility
    2. One ID would be for vCenter – for example “root” and password xxxxxx
    3. One ID would be a user account that ASR uses to remotely install the ASR replication agents onto the VMware VMs. The permissions required are in the installation documentation. It is recommended to use a Domain account – for example “DOMAIN\ASRInstaller” and password xxxxxx
    4. After entering data with cspsconfigtool.exe it will take some time for this to be synchronized with the Azure Backend Services. To speed up this process click on “Refresh” button on the Configuration Server under the “Servers” menu in the Azure Portal


  9. Each VM must be sized and the network configured
    1. Azure Dv2 Series is an excellent choice for SAP application servers as these VMs have powerful new Intel processors


  10. Replication Settings of each Recovery Group
    1. SAP Application Servers do not contain data that requires a low RPO. SAP Central Services and SAP Application Servers generally only contain work process trace files, job log files, security logs and other technical logs
    2. SAP Application Servers must not be used as File Servers. Do not place interface or download/upload shares and paths on SAP Application Servers. This is a severe security risk and bad operational practice
    3. In general the RPO threshold for SAP Application Servers can be set to around 120 minutes
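
The 120 minute RPO threshold can be turned into a simple check over the latest recovery point of each protected VM. This is a sketch only: how the timestamps are obtained is left open (they are assumed to have been read from the portal or exported by some other means), and the function and VM names are hypothetical.

```python
from datetime import datetime, timedelta


def rpo_breaches(last_recovery_points, now, threshold=timedelta(minutes=120)):
    """Return the names of VMs whose newest recovery point is older than threshold."""
    return sorted(
        name
        for name, recovery_point in last_recovery_points.items()
        if now - recovery_point > threshold
    )


# Hypothetical recovery point times for two SAP application servers.
now = datetime(2016, 4, 1, 12, 0)
points = {
    "sapapp1": datetime(2016, 4, 1, 11, 30),  # 30 minutes old
    "sapapp2": datetime(2016, 4, 1, 9, 0),    # 180 minutes old
}
print(rpo_breaches(points, now))
```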


Task 3: Setup Native DBMS Replication from on-premises to Azure VM

The ASR Toolset for VMware and Physical servers can create consistent replicas of DBMS servers such as SQL Server on both Windows and Linux.

Due to the size and data churn volumes with SAP databases it is a general recommendation to use the native DBMS replication technology to replicate the database.

Using the native DBMS replication allows greater control over restore and recovery points. For example, if there is a failure in the on-premises datacenter, it may be possible to take a tail-log backup from SQL Server and apply it to the database in Azure (which will be in recovery mode).

Guidance for setting up DBMS in Azure:

  1. SQL Server 2012 and higher Integration with ASR and Hybrid documentation
  2. Oracle 12c DataGuard documentation
  3. Other DBMS such as MaxDB, Livecache, Sybase – refer to vendor documentation

After the installation of the DR solution for the DBMS server conduct tests and become familiar with any change to the application connection string if required (not required with SQL Server AlwaysOn).

Task 4: Protect the SAP Transport Directory and other File System Level VMs

The SAP Transport Directory, File servers for Interface files and SAP and non-SAP standalone engines also need to be protected.

There are several options for handling the SAP Transport Directory and interface directory:

  1. Use a PowerShell script to copy files to Azure Files
  2. Use a PowerShell script to copy files to a standalone file server in Azure
  3. Use DFS-R in combination with a CNAME to replicate files automatically

Other SAP and non-SAP standalone applications need to be tested with ASR on a case by case basis.

Test Failover

Test failover into an isolated vNet without connection back to live users, printers and interfaces is recommended.

A Test Failover does not impair the DR SLA because all of the VMs and objects are cloned as part of a test failover.

The original VMs continue synchronizing and the DR SLA is not impacted.

Even during the execution of a Test Failover it is possible to execute a “real” Failover should a DR event occur during a Test Failover

After executing a Test Failover it may be required to add a public RDP endpoint onto one VM in order to access the VMs. This is because there is no external connectivity to the Test Failover vNet as there is no ExpressRoute or VPN

A typical sequence for a Test Failover is:

  1. Create a new vNet and assign IP subnets. Connecting this vNet with VPN or ExpressRoute is not recommended as this may lead to issues with duplicate Windows hostnames and might risk DR test activities communicating with live interfaces
  2. Clone the AD Domain Controller in Azure and copy to new vNet
  3. Register the AD Domain Controller as the DNS1 on the test failover vNet
  4. Restore the DBMS backup & transaction logs onto a new VM and run SAP DBMS specific post processing to create users. Note: this step is a very useful test of the Backup/Restore mechanism
  5. Press “Test Failover” and select the vNet created in step 1
  6. Test applications – typically a VM is created in the vNet with SAPGUI
  7. At the conclusion of testing press the “Confirm” button in ASR portal – this will purge all ASR managed VMs
  8. Delete AD and DB VMs
  9. Delete the test vNet

Illustration of selecting a vNet during a test failover

Failover

It is recommended to create a Recovery Plan to sequence and orchestrate the failover.

It is possible to make one recovery plan that would fail over the entire SAP production environment, meaning one recovery plan would trigger ECC, BW, PI and all other Production systems to fail over. Most customers prefer not to do this. It is generally recommended to create a Recovery Plan for each SAP application. Examples: RecPlan-ECC-Prod, RecPlan-SCM-Prod, RecPlan-SCM-Livecache

The Recovery Plan allows sequencing the startup of VMs. For example it is recommended to put the Central Services into a separate Recovery Plan Group and to ensure these are started before starting the SAP Application Servers
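
The grouping idea can be sketched as a small data structure: VMs are tagged with a role, and the roles are emitted in the order in which they must start. This only models the sequencing; it does not call any ASR API, and the role labels, function name and VM names are hypothetical.

```python
DEFAULT_ORDER = ("central_services", "app_server")


def build_recovery_groups(vm_roles, order=DEFAULT_ORDER):
    """Return a list of VM-name groups, one per role, in startup order."""
    unknown = sorted(set(vm_roles.values()) - set(order))
    if unknown:
        raise ValueError(f"no startup position defined for roles: {unknown}")
    groups = []
    for role in order:
        members = sorted(name for name, r in vm_roles.items() if r == role)
        if members:
            groups.append(members)
    return groups


# Hypothetical VMs belonging to one SAP application (e.g. RecPlan-ECC-Prod).
plan = build_recovery_groups({
    "ecc-app2": "app_server",
    "ecc-ascs": "central_services",
    "ecc-app1": "app_server",
})
for number, members in enumerate(plan, start=1):
    print(f"Group {number}: {', '.join(members)}")
```

Keeping one such mapping per SAP application mirrors the recommendation to maintain a separate Recovery Plan for each system.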

A typical sequence for a Failover is:

  1. Press “Failover”
  2. Select the “Latest recovery point in time” and “Shutdown” VMs
  3. Recover the databases in Azure (if required)
  4. Press “Reprotect” to reverse the replication from Azure back to the on-premises VMware cluster

Failback

The Failback procedure is similar to failover. At the conclusion of a successful failback, press the Reprotect button to re-enable synchronization from the VMware-based VMs to Azure.

Protection Against Total Loss of Corporate Network

Azure Site Recovery can be used in combination with other Azure features and capabilities to protect against the total catastrophic loss of an on-premises datacenter and all the associated WAN connections to branch offices. For example, RemoteApp with a custom server image including necessary SAP client software can be planned to protect SAP client services.

RemoteApp documentation: https://azure.microsoft.com/en-us/documentation/services/remoteapp/

Create a Custom Image, install SAPGUI and sysprep the Image. Make sure SAPGUI is in the Start Menu and Publish SAP Logon via the Start Menu in the Azure Portal.

If more complex scenarios such as BEx and other GUI features are used, test these thoroughly. Check the RemoteApp documentation for Active Directory authentication options. Ensure the AD design is tolerant of the complete loss of the on-premises AD infrastructure.

Protecting Non-NetWeaver Based Applications

SAP applications that are not based on ABAP or Java can be protected with ASR as well. Care should be taken to validate the amount of “churn” or changes on the file system(s). SAP ABAP or Java servers have very little churn, but systems such as TREX could have considerable amounts of writing. It is recommended to use the Azure Site Recovery Capacity Planner tool. This is typically not required for ABAP and Java systems as the amount of data written to these VMs is small.
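
Before running the Capacity Planner, a very rough feel for file system churn can be obtained by summing the sizes of files modified within a recent time window. This sketch is not how the Capacity Planner works and is only a coarse proxy: it counts the full size of every changed file once, so it overstates small appends to large files and misses repeated rewrites. The function name is an assumption.

```python
import os
import time


def bytes_modified_since(root, window_seconds, now=None):
    """Sum the sizes of files under root modified within the last window_seconds."""
    now = time.time() if now is None else now
    cutoff = now - window_seconds
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if info.st_mtime >= cutoff:
                total += info.st_size
    return total
```

For example, `bytes_modified_since(r"E:\TREX", 24 * 3600)` (path hypothetical) gives an indication of whether a stand-alone engine writes megabytes or gigabytes per day.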

| Stand-alone component | Description | Examples |
| --- | --- | --- |
| File-system based | These components can be protected by Azure Recovery Services by simply replicating the VM asynchronously. In general, we recommend minimizing the number of disks on the VM and, ideally, implementing the operating system and application on C: drive, if possible. | CTM Optimizer, Adobe Document Server, KW, TREX, Transport & Interface Directories |
| DBMS-based | These SAP stand-alone components use a database and, therefore, must be protected using database tools and Azure Recovery Services. For more information, see the appropriate SAP guidance, such as the LiveCache High Availability Guide. | LiveCache, Business Objects, Content Server |

It is critical to ensure the server holding the Transport Directory and any interface directories is also protected.

It is strongly recommended to use a dedicated Windows server for file server purposes (such as DIR_TRANS) and not to use SAP application servers as file servers. One option to protect and synchronize transport and other interface directories is to use Windows DFS-R.

The SAP /sapmnt file system should never be used for storing interface files or any other data.

Error Handling

Useful tips and tricks for V2A ASR

  1. Monitor network utilization and activity in Windows Task Manager -> Resource Monitor. Filter on the ASR executables such as obengine.exe (tick the executable in the Network or CPU tab)
  2. If the Failback or Reprotect phase fails during Agent Installation, press retry after a few minutes. These errors can be due to DNS propagation delay
  3. If the error “Disks with the same VMware UUID cannot be attached to Master Target” is seen, it is likely the VMs were created from a template. Refer to VMware admins and VMware KB 2006865
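
Duplicate disk UUIDs can be spotted ahead of time if a list of VM names and disk UUIDs is available. The sketch below assumes such a list has already been exported from vCenter (how is left open); the function name and the sample values are hypothetical.

```python
from collections import defaultdict


def duplicate_disk_uuids(disks):
    """disks: iterable of (vm_name, disk_uuid) pairs.

    Returns {uuid: [vm names]} for every UUID that appears more than once.
    """
    owners = defaultdict(list)
    for vm_name, disk_uuid in disks:
        owners[disk_uuid.lower()].append(vm_name)
    return {uuid: sorted(names) for uuid, names in owners.items() if len(names) > 1}


# Hypothetical inventory: two VMs cloned from the same template share a UUID.
inventory = [
    ("sapapp1", "6000C29A-0001"),
    ("sapapp2", "6000c29a-0001"),
    ("sapascs", "6000C29B-0002"),
]
print(duplicate_disk_uuids(inventory))
```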

SAP Azure Monitoring Agent

If a Disaster Recovery event occurs and it appears likely that the SAP systems will need to run in Azure for more than a few days, it is important that the VMs meet all the supportability requirements. Implement SAP Note 2015553 – SAP on Microsoft Azure: Support prerequisites.

In case of any issues refer to The Azure Monitoring Extension for SAP on Windows – Possible Error Codes and Their Solutions

Links

Guidance for Sizing SAP Solutions on Azure

New White Paper on Sizing SAP Solutions on Azure Public Cloud

Static IP, Reserved IP and Instance Level IP in Azure

http://blogs.msdn.com/b/lalitesh_kumar/archive/2014/10/06/static-ip-reserved-ip-and-instance-level-ip-in-azure.aspx

http://social.technet.microsoft.com/wiki/contents/articles/23447.how-to-assign-a-private-static-ip-to-an-azure-vm.aspx

Static IP (Private IP) for ARM Deployments:

https://azure.microsoft.com/en-us/documentation/articles/virtual-networks-static-private-ip-arm-ps/

https://azure.microsoft.com/en-us/documentation/articles/virtual-networks-static-private-ip-arm-pportal/

How to register a DNS server


Azure Site Recovery Blogs

http://azure.microsoft.com/en-us/services/site-recovery/ (Sign up for Free Trial)

http://blogs.technet.com/b/scvmm/archive/2014/10/30/monitoring-azure-site-recovery.aspx

http://azure.microsoft.com/en-us/documentation/services/site-recovery/

Win2012 R2 Hyper-V + SQL Server 2014 Free Trial Software

http://www.microsoft.com/en-us/evalcenter/evaluate-windows-server-2012-r2

http://www.microsoft.com/en-us/evalcenter/evaluate-sql-server-2014

Gartner Magic Quadrant for SQL Server

http://www.gartner.com/technology/reprints.do?id=1-237UHKQ&ct=141016&st=sb

Content from VMware, SAP and other sources reproduced in accordance with Fair Use criticism, comment, news reporting, teaching, scholarship, and research

SAP Notes

1928533 – SAP Applications on Azure: Supported Products and Azure VM types

1999351 – Troubleshooting Enhanced Azure Monitoring for SAP

1380654 – SAP support in public cloud environments

2015553 – SAP on Microsoft Azure: Support prerequisites

2039619 – SAP Applications on Microsoft Azure using the Oracle Database: Supported Products and Versions

1329848 – Oracle Support for Microsoft Hyper-V