Measuring Latency and Throughput Between Azure Regions


If you want to implement disaster recovery, latency and throughput between your primary and secondary location are key factors. They determine how quickly, and how much, data you can transport to the secondary location, and hence how much data will be lost if the primary location fails. More formally, latency and throughput determine the achievable recovery point objective (RPO). In this post I will walk through measuring latency and throughput between Azure regions, so you can determine the best configuration for your scenario.

Latency is determined by the speed of light and the number of infrastructure components between the source and destination. This means that the closer the primary and secondary location are, the lower the latency. The catch is that if the primary and secondary location are too close together, they could potentially be hit by the same (natural) disaster. When Microsoft picks locations for Azure regions, they are selected to be a safe distance apart for most major disasters (earthquake, flood, and so on).

Options for connecting virtual networks in different regions

In Azure there are three ways to connect virtual networks in different regions:

  1. VPN, using a VPN gateway in each virtual network
  2. Cross-region virtual network peering (currently in preview)
  3. ExpressRoute (with the Premium add-on)

In this post I will focus on VPN and virtual network peering, but you could use the same method to measure latency and throughput in a configuration with ExpressRoute. Before I go into how you measure latency and throughput, let’s look at the architectural differences between the three options.

VPN

With VPN, you link two virtual networks together by deploying a VPN gateway in each virtual network and then setting up a connection between the two, as shown below.


Diagram 1: Virtual networks VNET1 and VNET2 in different Azure regions connected via VPN

Cross-region virtual network peering

The downside of using a VPN is the need for VPN gateways, which incur cost, add latency, and limit bandwidth. Within a region, you can use virtual network peering instead, which connects virtual networks without a gateway. Cross-region virtual network peering, which recently went into preview, extends this to virtual networks in different regions, letting you securely connect them and effectively stretch your network across Azure regions, as shown below.


Diagram 2: Virtual networks VNET1 and VNET2 in different Azure regions connected via virtual network peering (preview)

ExpressRoute

ExpressRoute is a dedicated private connection from your on-premises location to Azure (and Office 365) data centers. You can link virtual networks to ExpressRoute, and with the ExpressRoute Premium feature you can link virtual networks from multiple regions, as shown below.


Diagram 3: Virtual networks VNET1 and VNET2 in different Azure regions connected via ExpressRoute

When you connect two virtual networks in different Azure regions with VPN or virtual network peering, the traffic between the two regions runs over the Microsoft backbone, and the traffic is routed as any normal IP-based traffic. Although we currently don’t offer an SLA on latency and throughput, the fact that the traffic stays within our backbone means latency and throughput typically stay within a certain range.

When using ExpressRoute (with the Premium add-on) this is mostly the same, but there is a subtle difference: the virtual networks aren’t connected directly. Instead, each virtual network is connected to the Microsoft edge location where your partner edge terminates. In most cases that won’t make a lot of difference, but I’ll illustrate the impact through an extreme case. Suppose your on-premises location in Australia is connected via ExpressRoute, terminating at a Microsoft edge location in Australia. If you connect virtual networks in West Europe and North Europe to this ExpressRoute circuit, the traffic between the virtual networks will now run through the Microsoft edge location in Australia.

Creating two virtual networks connected by VPN

Creating two virtual networks in different regions connected through VPN is quite easy using this ARM template from the Azure QuickStarts. You can download my parameter file here, so you can see what kind of values you need to fill in. I used the IP address ranges shown in diagram 1 for the virtual networks; note that the ranges don’t overlap, which is a requirement for connecting the networks. Also, the VPN gateway public IP addresses (VIPs) in your deployment will differ from mine, as these are assigned automatically.
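If you prefer scripting the deployment over using the portal, something along these lines should work with the Azure CLI. This is a minimal sketch: the resource group name and location are placeholders, the template URI should point to the vnet-to-vnet QuickStart template, and azuredeploy.parameters.json is the parameter file you downloaded.

  # Create a resource group to hold the test environment (name and location are placeholders)
  az group create --name vnet-latency-rg --location westcentralus

  # Deploy the QuickStart template with the downloaded parameter file
  az group deployment create --resource-group vnet-latency-rg --template-uri <URL of the vnet-to-vnet QuickStart template> --parameters @azuredeploy.parameters.json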

Deploying the template will take a while, mostly due to the VPN gateway configuration. You can continue with the next steps in the setup once the virtual networks have been created. You don’t have to wait for the VPN gateway configuration to finish.

Adding virtual machines

To measure latency and throughput, you need a virtual machine in both virtual networks. For the tests I’m using Windows Server 2016 on a D3_v2 virtual machine. A D3_v2 virtual machine has 4 vCPUs, which may seem like overkill. However, the size of your virtual machine determines the network bandwidth assigned to it. By choosing a larger virtual machine I’m making sure I have some bandwidth to play with. Of course, if you are targeting a certain size machine for your workload, it makes sense to test with that size.

Note: you could select accelerated networking when deploying the virtual machine, but that feature mainly impacts networking within a single region.
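If you scripted the environment, the test virtual machines can also be created from the command line. The sketch below assumes the resource group, virtual network, and subnet names used earlier in this post; the VM name and credentials are placeholders, and the same command is repeated against the second virtual network.

  # Create a Windows Server 2016 VM of size D3_v2 in Subnet-1 of the first virtual network
  az vm create --resource-group vnet-latency-rg --name vm-uswc --image Win2016Datacenter --size Standard_D3_v2 --vnet-name USWC-VNET --subnet Subnet-1 --admin-username azureuser --admin-password <password>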

Once the virtual machines are created, each has an internal IP address. If you used the subnet configuration I used, and deployed each virtual machine in Subnet-1 of its region, the internal IP addresses are 10.10.1.4 for the virtual machine in the West Central US virtual network and 10.20.1.4 for the one in the West US 2 virtual network.

Go to the first virtual machine in the portal, and connect to it with remote desktop protocol (RDP). Once the VPN gateway configuration is finished, you can test connectivity by using RDP to connect from the first virtual machine to the second virtual machine on the internal IP address.

  1. Press the Windows key+R to open the Run dialog.
  2. Enter mstsc.
  3. In the RDP dialog that pops up, enter the IP address of the second virtual machine (in my case 10.20.1.4).
  4. Click Connect.
  5. Enter your credentials and confirm any certificate warning.
  6. You should now be logged into the second virtual machine. If this doesn’t work, first check that you used the right IP address and that you can connect to the second virtual machine from the portal the same way you did with the first; if both are fine, there may be a configuration error in the connection between the virtual networks.
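Tip: as a shortcut, you can also launch the RDP client directly from a command prompt on the first virtual machine, assuming the same internal IP address as above:

  mstsc /v:10.20.1.4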

Measuring latency and throughput

The easiest way to measure latency and throughput is using PsPing, part of the Sysinternals suite. You can download PsPing from here. PsPing can be used for simple ping functionality without a server, but for latency and bandwidth tests you need to set up a PsPing server.

Start the PsPing server

  1. Open or switch to the remote desktop connection to one of the virtual machines you created.
  2. Open a command prompt and change directory to the location where you extracted PsPing.
  3. Type psping -f -s <ipaddress>:<port>, for example, psping -f -s 10.20.1.4:81 to listen on port 81 on a virtual machine that has a local IP address of 10.20.1.4.

Run the PsPing client

  1. Open or switch to the remote desktop connection to the other virtual machine.
  2. Open a command prompt and change directory to the location where you extracted PsPing.
  3. Type psping -l <message size> -n <number of messages> -f <ipaddress>:<port>, for example, psping -l 8k -n 1000 -f 10.20.1.4:81 to send 1000 messages of 8 kilobytes to port 81 on the virtual machine with IP address 10.20.1.4.

This will get you a measurement over the VPN, but for reference we also want a measurement over the public IP address, bypassing the VPN. To enable this, you need to modify the network security group on the virtual machine running the server, so it doesn’t block traffic on the port PsPing is listening on (port 81 in my case). It also makes sense to test with different message sizes, to see how that affects the latency.
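As a sketch of how such a rule could be added with the Azure CLI (the resource group and network security group names are placeholders for the ones in your deployment, and the priority only needs to be unique within the NSG):

  # Allow inbound TCP traffic on the PsPing test port (81) to the server VM
  az network nsg rule create --resource-group vnet-latency-rg --nsg-name vm-usw2NSG --name Allow-PsPing-81 --priority 310 --direction Inbound --access Allow --protocol Tcp --destination-port-ranges 81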

To test the bandwidth, you can use the -b parameter with PsPing, as follows:

psping -b -l <message size> -n <number of messages> -f <ipaddress>:<port>, for example, psping -b -l 8k -n 1000 -f 10.20.1.4:81 to send 1000 messages of 8 kilobytes.

Public IP vs. VPN gateway

The table below shows the differences between going over the public IP address versus going through the VPN gateway.

              Public IP     VPN
10000 x 8k
  Avg         21.48 ms      23.47 ms
  Min         21.15 ms      21.93 ms
  Max         31.62 ms      148.56 ms
1000 x 1000k
  Avg         23.70 ms      38.31 ms
  Min         23.15 ms      30.07 ms
  Max         45.54 ms      80.20 ms

Disclaimer: Measurement results will vary between different Azure regions, time of day, and other factors. The maximum value, in particular, varies greatly from test to test. Please test thoroughly to understand the impact on your scenario.

With the small message size the difference is only a few milliseconds on average, but the maximum latency varies much more. The increased latency with a VPN gateway is caused by the additional hop of going through the gateway. The difference between connecting over the public IP address and through the VPN gateway becomes significant with larger requests. This is due to the bandwidth limitations of the VPN gateway: I used the Basic SKU, which has the least bandwidth, and using a more powerful SKU (see About VPN Gateway) will likely have a positive impact.
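If you want to repeat the test with a larger gateway SKU, something along these lines should work with the Azure CLI. This is a sketch only: the gateway and resource group names are placeholders, and not every SKU change can be done as an in-place resize (moving off the Basic SKU, for example, may require recreating the gateway).

  # Resize the VPN gateway to a higher-throughput SKU
  az network vnet-gateway update --resource-group vnet-latency-rg --name USWC-VNET-Gateway --sku VpnGw1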

Note: For maximum throughput and minimum latency you can turn off encryption of the traffic between two virtual networks connected using VPN gateways. Do not use this unless you understand the security implications.

Measuring with cross-region virtual network peering

With the tests over the public IP and through the VPN done, it’s time to move on to cross-region peering. You could set up two entirely new virtual networks and virtual machines, but you can also remove the connection between the VPN gateways and then set up virtual network peering. You don’t have to delete the VPN gateways, but you can. To remove the connections, take the following steps (an Azure CLI equivalent is shown after the list):

  1. Open the Azure portal.
  2. Open the resource group you created for the environment.
  3. Click the USWC-USW2 connection.
  4. Click Delete.
  5. Close the blade.
  6. Repeat steps 3-5 for the USW2-USWC connection.
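If you scripted the environment, the same connections can be removed with the Azure CLI, using the connection names from the steps above (the resource group name is the placeholder used earlier):

  # Remove the VPN connections in both directions
  az network vpn-connection delete --resource-group vnet-latency-rg --name USWC-USW2
  az network vpn-connection delete --resource-group vnet-latency-rg --name USW2-USWC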

To configure virtual network peering, take the following steps (again, an Azure CLI equivalent follows the list):

  1. Open the Azure portal.
  2. Open the resource group you created for the environment.
  3. Click the USWC-VNET virtual network.
  4. Click Peerings.
  5. Click Add.
  6. Create a peering to the USW2-VNET, as shown below.

    Note that if you don’t see the virtual network in the West US 2 region, virtual network peering is not enabled yet for your subscription. You can register for the public preview here.
  7. Provide a name and select the virtual network to peer to, then click OK.
  8. Repeat steps 2 through 7 to create a connection from the USW2-VNET back to the USWC-VNET.
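As a rough Azure CLI equivalent, assuming both virtual networks live in the same resource group (the placeholder used earlier); note that older CLI versions expect the full resource ID of the remote virtual network instead of its name:

  # Peer USWC-VNET to USW2-VNET, and create the reverse peering
  az network vnet peering create --resource-group vnet-latency-rg --name USWC-to-USW2 --vnet-name USWC-VNET --remote-vnet USW2-VNET --allow-vnet-access
  az network vnet peering create --resource-group vnet-latency-rg --name USW2-to-USWC --vnet-name USW2-VNET --remote-vnet USWC-VNET --allow-vnet-access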

Public IP vs. cross-region virtual network peering

The table below shows the differences between going over the public IP address versus going through virtual network peering.

              Public IP     Virtual network peering
10000 x 8k
  Avg         21.48 ms      21.29 ms
  Min         21.15 ms      21.03 ms
  Max         31.62 ms      27.68 ms
1000 x 1000k
  Avg         23.70 ms      23.55 ms
  Min         23.15 ms      22.81 ms
  Max         45.54 ms      45.40 ms

The difference between the public IP address and cross-region virtual network peering is only a fraction of a millisecond, and as such negligible. This makes sense, as the traffic takes pretty much the same physical route through the Microsoft backbone.

Conclusion

Using PsPing you can measure the latency and bandwidth characteristics between different regions. This will help you determine the best options for scenarios such as disaster recovery, and how you should configure synchronization between systems in different regions.

Cross-region virtual network peering is a great addition to Azure networking. It provides lower latency and higher throughput than connections over a VPN gateway. Once it becomes generally available, it will be the preferred method to peer networks in different regions.

--- Thanks to Steffen Vorein and RoAnn Corbisier for editing. ---

