Stress Testing Azure Accelerated Networking

Posted by Aidan Finn in Microsoft Azure

In this post, I will share the results of some stress tests that I ran on Azure virtual machines with and without the Accelerated Networking feature enabled.

 

The Test Environment

I deployed 4x Azure Resource Manager (ARM) virtual machines in a single virtual network, all connected to the same single subnet. Each virtual machine had a single NIC with a public IP address. All were deployed with Standard tier managed disks. A single network security group was assigned to the virtual network.

The virtual network design, captured using Azure Network Monitor [Image Credit: Aidan Finn]
 

The four virtual machines were:

  • vm-petri2-an1: Accelerated Networking enabled
  • vm-petri2-an2: Accelerated Networking enabled
  • vm-petri2-sw1: Accelerated Networking not enabled
  • vm-petri2-sw2: Accelerated Networking not enabled

The purpose of having two sets of machines was:

  • Accelerated networking can only be enabled at the time of NIC and virtual machine creation.
  • I wanted to be able to compare results between two scenarios; one where the sender and receiver had Accelerated Networking enabled, and the other where both machines did not have Accelerated Networking enabled.

To make life easier for the tests, the Windows Firewall was disabled in the guest OS of each virtual machine.

The last configuration was to ensure that all the machines were spread onto different hosts; this was achieved by creating the machines in a single availability set. This ensured that the packets had to hit a physical network instead of being routed end-to-end inside of a single host:
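As a sketch only, a layout like this could be reproduced with the Azure CLI; every resource name, the size, and the image below are placeholder assumptions, not the values used in the test:

```shell
# Placeholder names throughout. Accelerated Networking is set per NIC at
# creation time, and the shared availability set spreads VMs across hosts.
az vm availability-set create -g rg-petri2 -n as-petri2

az network nic create -g rg-petri2 -n nic-an1 \
    --vnet-name vnet-petri2 --subnet default \
    --accelerated-networking true

az vm create -g rg-petri2 -n vm-petri2-an1 --nics nic-an1 \
    --availability-set as-petri2 --size Standard_DS4_v2 \
    --image Win2016Datacenter
```

The same pattern, with `--accelerated-networking false` (or the flag omitted), would create the non-accelerated pair.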

Ensuring a valid test by placing all the #Azure virtual machines on different hosts [Image Credit: Aidan Finn]

The Tests

To run the tests, I used the free tool from Microsoft, NTTTCP.EXE. This tool will stress the network connection by sending data to a receiver. Bandwidth and throughput are measured; unfortunately, as you’ll learn later, latency and jitter are not.

The x64 executable for NTTTCP was copied onto the sender and receiver and was executed on both.

NTTTCP was run in receiver mode on one virtual machine and in sender mode on the other, with matching thread counts and test durations on both sides.
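The exact commands were not preserved in this post; a plausible pair of NTTTCP invocations, assuming 8 threads, a 300-second run, and a receiver at the placeholder address 10.0.0.5, would look like this:

```shell
# On the receiver (start this side first):
ntttcp.exe -r -m 8,*,10.0.0.5 -t 300

# On the sender (the -m mapping also names the receiver's IP):
ntttcp.exe -s -m 8,*,10.0.0.5 -t 300
```

The `-m threads,cpu,receiver-IP` mapping controls how many parallel sessions are opened, which matters when trying to saturate a multi-gigabit link.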

Test 1 – DS4_v2

My first attempt at running this test was done by deploying the above virtual machines with the DS4_v2 size. This machine (8 cores) should be capable of reaching 6000Mbps, or roughly 5.8Gbps.
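The Gbps figures quoted here appear to use 1,024-based division of the documented Mbps limits, which a quick calculation confirms:

```shell
# Convert the documented Mbps limits to the Gbps figures seen in Task Manager
# (assumption: the article's Gbps numbers divide by 1,024 rather than 1,000).
awk 'BEGIN { printf "DS4_v2:  %.2f Gbps\n", 6000 / 1024 }'
awk 'BEGIN { printf "DS15_v2: %.2f Gbps\n", 25000 / 1024 }'
```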

In my first test run, I ran the test between the machines with Accelerated Networking enabled, vm-petri2-an1 and vm-petri2-an2. While the test was running, I opened Task Manager to view the bandwidth. As hoped/expected, a bandwidth of 5.7Gbps to 5.8Gbps was being achieved. I was very happy! The test results are shared below:

Bandwidth stress test results on an Azure DS4_v2 with Accelerated Networking [Image Credit: Aidan Finn]
Bandwidth Stress Test Results on an Azure DS4_v2 with Accelerated Networking [Image Credit: Aidan Finn]
 

On to the machines without Accelerated Networking. Identical tests were run between vm-petri2-sw1 and vm-petri2-sw2. And the bandwidth achieved was roughly 5.8Gbps … wha-chu talkin’ ‘bout, Willis? Here are the results from that test, which are roughly the same:

Bandwidth stress test results on an Azure DS4_v2 without Accelerated Networking [Image Credit: Aidan Finn]
Bandwidth Stress Test Results on an Azure DS4_v2 Without Accelerated Networking [Image Credit: Aidan Finn]
 

I was extremely surprised by this. I expected to see much lower bandwidth from the virtual machines without Accelerated Networking, and my first (wrong) reaction was to think, “What is the point of Accelerated Networking?” I reached out to the MVP (Microsoft Most Valuable Professional) and Microsoft community and was reminded that Accelerated Networking (like Hyper-V SR-IOV) is about more than just bandwidth.

  • Without Accelerated Networking, we get pretty good bandwidth from each virtual machine. The number of cores [interrupt capacity] affects this.
  • Only at the upper end of virtual machine sizes will Accelerated Networking show a difference in bandwidth capabilities.
  • Accelerated Networking will reduce CPU utilization on the host, which we cannot see in Azure. This positively affects virtual machines by leaving them more CPU for services.
  • Jitter is reduced, which is important for streaming services.

In short, the DS4_v2 won’t show bandwidth improvements; what my test couldn’t show was the CPU utilization improvement at the host level and the reductions in latency and jitter that would have been achieved.

Unfortunately, Windows Server doesn’t ship with a tool for measuring latency. Linux has qperf (which Microsoft has used in demonstrations), but I’m Linux-disabled! One might say, “Try ping,” but the ICMP protocol is not optimized by Accelerated Networking, so latency testing using ping would be useless.
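For reference, a typical qperf latency and bandwidth test between two Linux machines looks like this (10.0.0.5 stands in for the server's IP):

```shell
# On the server side, start qperf in its default listening mode:
qperf

# On the client, measure TCP bandwidth and round-trip latency:
qperf 10.0.0.5 tcp_bw tcp_lat
```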

So without a means to measure latency/jitter, I decided to upgrade my lab from DS4_v2 to the $2.43/hour or $1,778.76/month DS15_v2 (North Europe, RRP pricing).

Test 2 – DS15_v2

I shut down the four virtual machines, and resized them to DS15_v2, capable of up to 25,000Mbps or approximately 24.4Gbps. Then I re-ran the tests, starting with the machines that had Accelerated Networking enabled.

As the test ran, the bandwidth achieved fluctuated between 23.9Gbps and 24.4Gbps, normally sitting at around 24.1Gbps; I was happy that the numbers from the official virtual machine sizing documentation were valid. The NTTTCP results were as follows:

Bandwidth stress test results on an Azure DS15_v2 with Accelerated Networking [Image Credit: Aidan Finn]
Bandwidth Stress Test Results on an Azure DS15_v2 with Accelerated Networking [Image Credit: Aidan Finn]
 

I then ran the same test between the two machines without Accelerated Networking. While the test was running, the bandwidth fluctuated between 17.9Gbps and 19.6Gbps but was regularly around 18.1Gbps. The results of the test are shown below:

Bandwidth stress test results on an Azure DS15_v2 without Accelerated Networking [Image Credit: Aidan Finn]
Bandwidth Stress Test Results on an Azure DS15_v2 Without Accelerated Networking [Image Credit: Aidan Finn]
 

When we compare those results:

  • Throughput (MB/s): Accelerated Networking improved throughput from 2226 MB/s to 2858 MB/s, which is an improvement of 28 percent.
  • Throughput (Buffers/s): Enabling Accelerated Networking gave me a 28 percent improvement.
  • Packets Sent and Packets Received: There were huge improvements here. Packets received, which I would guess is more important than the sent statistic, also went up by 28 percent.
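The headline figure is easy to check from the throughput numbers in the NTTTCP results:

```shell
# Throughput improvement: 2226 MB/s without Accelerated Networking
# versus 2858 MB/s with it enabled.
awk 'BEGIN { printf "Improvement: %.0f%%\n", (2858 - 2226) / 2226 * 100 }'
```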

28 percent seems to be a trend, so we can safely assume that enabling Accelerated Networking, at no extra cost, improved the bandwidth/transmission performance of the DS15_v2 virtual machine. What I don’t know is how latency, jitter, and host CPU were improved.

We can see in the results that CPU utilization went from around 18.5 percent to around 30 percent when pushing this traffic with Accelerated Networking enabled. That should be expected because a lot more interrupts are being handled in the guest OS. If that worried me, in an HPC scenario, then I’d look at one of the RDMA-enabled virtual machines (H-Series) that have an extra RDMA-capable NIC.

Analysis and Opinion

Enabling Accelerated Networking when you create the NIC and virtual machine does have a positive effect. I typically turn it on if the virtual machine size supports it. The results might not be obvious in the machine sizes that I normally deploy or price up for customers, but they are there. Maybe if I can work up the courage, I’ll have another go at these tests with some Linux virtual machines, where I can test latency using qperf.

If you are deploying larger machines, then enabling Accelerated Networking is a must-do. After the recent deployment of the Intel Meltdown security flaw mitigation in Azure, Microsoft even recommended enabling Accelerated Networking (by redeploying the machine using its existing disks). If you experienced a drop in performance after the mitigation, Accelerated Networking helps because traffic bypasses the host’s virtual switch, reducing the user/kernel mode context switches that the mitigation makes more expensive.
