Windows Server and Hyper-V include support for a number of hardware offloads to improve the performance of networking. These offloads reduce the resource consumption of a server, host, or even a virtual machine, and they make it possible for these computers to scale up their workloads. In this article I will introduce the significant hardware offloads and explain what they offer. You can then use this information to steer your host or server designs.
The Role of a Hardware Offload
Windows Server and Hyper-V can function without networking hardware offloads. However, as you scale up your workloads, the requirements of growing services place an increasing demand on the hardware. This can reduce performance and limit scalability.
For example, you might wish to use very dense Hyper-V hosts because your cost-of-ownership calculations determine that this is the best way forward. This means that there will be a huge flow of networking traffic into the host in question. Without assistance, a significant percentage of the host’s resources will be consumed by that networking traffic, thus reducing the virtual machine density that is possible on the host.
The normal flow of traffic through a Type 1 hypervisor such as Hyper-V or vSphere introduces a tiny amount of latency. You might have services that require lower network latency than an unassisted hypervisor can offer. Once again, without an offload, you might have no choice but to deploy these services on expensive, inflexible, difficult-to-manage, and non-cloud physical servers.
Hardware offloads for networking can improve performance of services and reduce the resource utilization of the physical servers that they are enabled on. This can increase virtualization density, positively impact service responsiveness, and make virtualization acceptable in some niche cases.
Understanding Hardware Offloads
You cannot just blindly install and use hardware offloads. Some offloads will require very specialized hardware, while others have very limited support. Also, there may be compatibility issues between closely related features.
Remote Direct Memory Access (RDMA)
RDMA is actually an old hardware offload that was mostly forgotten about until the initial announcement of Windows Server 8, which was later released as Windows Server 2012 (WS2012). The purpose of RDMA is to allow a server to process huge amounts of traffic while minimizing physical processor utilization. The traffic bypasses much of the networking stack in Windows Server and is processed by a NIC that offers RDMA support. In fact, the NIC probably has a processor that looks like something from the motherboard of an older PC.
RDMA is supported by WS2012 and Windows Server 2012 R2 (WS2012 R2) on the following kinds of NIC:
- iWARP: A relatively affordable 10 GbE NIC. It is strongly recommended that Datacenter Bridging (DCB) is enabled on the NIC and the switches to implement Quality of Service (QoS) if you converge networks on these NICs.
- RDMA over Converged Ethernet (RoCE): This type of NIC can run at 10 Gbps or 40 Gbps. You will require DCB, and you must also enable Priority Flow Control (PFC) if converging networks on this NIC.
- InfiniBand: This specialized form of data center network can run at 56 Gbps, and there is talk of a 100 Gbps format on the way.
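The requirements in the list above can be captured in a small lookup. This is a minimal Python sketch for design checklists only; the function name `fabric_requirements` and its parameters are hypothetical and do not correspond to any Windows API.

```python
def fabric_requirements(nic_type, converged):
    """Return the DCB/PFC needs the article lists for an RDMA NIC type.

    converged is True when other networks share the same NICs.
    """
    if nic_type == "iWARP":
        # DCB is strongly recommended only when networks are converged.
        return ["DCB (strongly recommended)"] if converged else []
    if nic_type == "RoCE":
        # DCB is always required; PFC is added when networks are converged.
        needs = ["DCB (required)"]
        if converged:
            needs.append("PFC (required)")
        return needs
    if nic_type == "InfiniBand":
        # The article lists no DCB or PFC requirement for InfiniBand.
        return []
    raise ValueError(f"unknown RDMA NIC type: {nic_type}")


for nic_type in ("iWARP", "RoCE", "InfiniBand"):
    print(nic_type, fabric_requirements(nic_type, converged=True))
```

Always confirm the actual requirements with the NIC and switch manufacturers before committing to a design.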
RDMA is used by WS2012 and later to accelerate SMB 3.0 traffic. This was originally used just for storage traffic, but it is also used for:
- Cluster Shared Volume Redirected IO: The much-maligned Redirected IO is no longer used when backing up CSVs with WS2012 or later. However, it is used for brief metadata operations (starting a Hyper-V VM, creating or extending a virtual hard disk, and so on) and for storage path fault tolerance (a host loses connectivity to the place where VMs are stored and it redirects IO through the CSV owner to the CSV to prevent VM outage).
- WS2012 R2 Live Migration: It is recommended that you enable SMB Live Migration on WS2012 R2 when your Live Migration networks are faster than 1 Gbps. Live Migration will use host processor capacity, but enabling RDMA networking reduces this significantly.
Receive Side Scaling (RSS)
Most people won’t know this, but the processing of inbound networking in a machine is limited to core 0 (not logical processor 0) of processor 0 in that machine. This can lead to the following happening:
- A server, such as an SMB 3.0 file server, is capable of receiving a huge amount of inbound traffic.
- Core 0 is the only core used to process inbound traffic.
- Eventually the high amount of traffic arrives. Core 0 is 100-percent utilized, but all other cores in the server remain idle.
- The scalability of the services on the server is impacted because only so much inbound traffic can be processed. Without RSS, more servers are required.
With RSS enabled, a base processor is selected. This is done automatically by Windows, but you can override the configuration using PowerShell. This base processor could be core 1, or logical processor 2 if Hyper-Threading is enabled. RSS will then use this core or logical processor as the starting point for processing inbound traffic. If the load increases, RSS will scale the workload out across available cores or logical processors, thus increasing the capacity of the server and its services without the limitations that were otherwise imposed by a single core.
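The scale-out idea can be pictured with a small model. The Python sketch below is an illustration only and not how Windows or a NIC implements RSS: real hardware hashes each flow on the NIC and spreads load dynamically, whereas this sketch substitutes CRC32 and a fixed processor count. The names `rss_processor`, `base_processor`, and `max_processors`, and the sample addresses, are hypothetical.

```python
import zlib
from collections import Counter


def rss_processor(flow, base_processor, max_processors):
    """Map a flow to a processor index, counting up from base_processor.

    flow is a (src_ip, src_port, dst_ip, dst_port) tuple. Packets of the
    same flow always map to the same processor, which keeps them in order.
    CRC32 is a stand-in for the hash a real NIC computes in hardware.
    """
    key = "|".join(str(part) for part in flow).encode()
    return base_processor + zlib.crc32(key) % max_processors


def spread(flows, base_processor, max_processors):
    """Count how many flows land on each processor."""
    return Counter(rss_processor(f, base_processor, max_processors) for f in flows)


# 1,000 made-up client flows arriving at a file server on port 445.
flows = [("10.0.0.%d" % (i % 250 + 1), 49152 + i, "10.0.1.10", 445)
         for i in range(1000)]

# One processor only: every flow is handled by processor 0.
without_rss = spread(flows, base_processor=0, max_processors=1)
# Base processor 2 with four processors: flows can land on processors 2 to 5.
with_rss = spread(flows, base_processor=2, max_processors=4)

print(without_rss)
print(with_rss)
```

In the single-processor case all the work piles onto processor 0, which is the bottleneck described earlier; with a base processor and several processors available, the same flows are divided among them.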
One of the services that makes use of RSS is SMB 3.0; this gives us one of the features of SMB Multichannel where a single NIC can have multiple parallel threads of communication, thus making better use of the bandwidth that is available.
Dynamic Virtual Machine Queue (dVMQ)
RSS is considered an offload for physical networking; in other words, it enhances the scalability of non-virtual NIC traffic. dVMQ is the cousin of RSS – it enhances the scalability of traffic into a host that is destined for virtual NICs. Without dVMQ, core 0 is once again the bottleneck. With dVMQ enabled, a host is capable of handling much more traffic that is destined for a virtual NIC.
dVMQ and RSS actually use the same queues or circuitry on a NIC. That means that you cannot use dVMQ and RSS at the same time on a single NIC. You will notice in the two illustrated examples that I have shown two pairs of NICs in a Hyper-V host. The first pair of NICs has RSS enabled and is used to accelerate SMB traffic to the host from an SMB 3.0 file server. The second pair of NICs is used primarily for the virtual machines; dVMQ is enabled to increase host networking capacity.
You need to keep the RSS versus dVMQ balance in mind when designing Hyper-V hosts with converged networks, especially when you are restricted to two NICs. In that case, you will use dVMQ (the traffic passes through a virtual switch) and you will not get the benefits of RSS for non-virtual traffic.
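The design rule can be written down as a tiny decision function. This is a minimal Python sketch under the assumption that the only deciding factor is whether a physical NIC carries virtual switch traffic; the `Nic` class, its fields, and the NIC names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Nic:
    name: str
    bound_to_virtual_switch: bool


def choose_offload(nic):
    """Return the one queue-based offload a physical NIC can use.

    RSS and dVMQ share the same queues on the NIC, so a NIC gets one or
    the other: dVMQ when it carries virtual switch traffic, RSS otherwise.
    """
    return "dVMQ" if nic.bound_to_virtual_switch else "RSS"


# Four-NIC host: one pair for SMB storage, one pair for the virtual switch.
four_nic_host = [
    Nic("Storage1", False), Nic("Storage2", False),
    Nic("VM1", True), Nic("VM2", True),
]
# Two-NIC converged host: all traffic passes through the virtual switch.
two_nic_host = [Nic("Converged1", True), Nic("Converged2", True)]

for host in (four_nic_host, two_nic_host):
    print({nic.name: choose_offload(nic) for nic in host})
```

The two-NIC host ends up with dVMQ on both NICs and no RSS at all, which is the trade-off to weigh when converging networks.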
Virtual RSS (vRSS)
WS2012 R2 adds a new feature that brings the benefits of RSS to the guest operating system of virtual machines. With dVMQ enabled, you can enable vRSS in the advanced settings of a virtual NIC in the guest OS of a multi-virtual processor virtual machine. This will allow the processing of inbound traffic to that virtual machine to scale beyond virtual CPU 0 in the guest OS, and therefore allow the networking services of that virtual machine to scale out beyond the limitations of a single logical processor in the host.
To use vRSS the virtual machine must route traffic through a virtual switch that is connected to one or more dVMQ enabled physical NICs.
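The prerequisites for vRSS can be expressed as a simple pre-flight check. The Python sketch below is a planning aid only, not a Windows API; the function name `vrss_blockers` and its parameters are hypothetical, and it assumes that every physical NIC behind the virtual switch should have dVMQ enabled.

```python
def vrss_blockers(virtual_processors, dvmq_by_physical_nic):
    """List the reasons vRSS cannot help a VM; an empty list means it can.

    dvmq_by_physical_nic maps each physical NIC behind the virtual switch
    to True (dVMQ enabled) or False (dVMQ disabled).
    """
    blockers = []
    if virtual_processors < 2:
        blockers.append("the VM needs more than one virtual processor")
    if not dvmq_by_physical_nic:
        blockers.append("the virtual switch has no physical NIC")
    else:
        for name, enabled in dvmq_by_physical_nic.items():
            if not enabled:
                blockers.append(f"dVMQ is disabled on physical NIC {name}")
    return blockers


print(vrss_blockers(4, {"VM1": True, "VM2": True}))   # empty list: eligible
print(vrss_blockers(1, {"VM1": True, "VM2": False}))  # two blockers listed
```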
There is widespread support by NIC vendors for RSS and dVMQ. Consult manufacturer documentation for implementation guidance.
IPsec Task Offload (IPsecTO)
Some organizations may have a requirement to enable encryption of network traffic. IPsec allows Windows administrators to define policies to encrypt traffic based on predefined rules. This can be useful for regulatory compliance or for secrecy in a public or hosted private cloud. The problem with IPsec is that it uses physical processor resources. This sacrifice might have been acceptable on traditional physical servers, but with ever-increasing virtual machine density, a significant percentage of capacity could be lost to encryption and decryption computation.
WS2012 Hyper-V added support for IPsec Task Offload (IPsecTO). This allows a compatible NIC to perform the encryption and decryption processing on behalf of the host. This reduces the demands on the processor made by IPsec rules within the guest OSs of the virtual machines, making that now-unused capacity available to other services or additional virtual machines.
Note that at this time very few NICs offer support for IPsecTO.
Single-Root IO Virtualization (SR-IOV)
A networking packet makes a “long” journey when it travels from the physical switch into a host on the way to a virtual machine:
- The packet arrives into the physical NIC on the host.
- It passes into the virtual switch, running in User Mode in the management OS.
- There the packet is filtered and routed out a virtual switch port.
- The packet passes back into Kernel Mode to a Virtualization Service Provider (VSP).
- The VMBus copies the packet into memory of the virtual machine in the Virtualization Service Client.
- Now the guest OS of the virtual machine can process the packet before responding across the same path.
At first this might sound like quite a road trip, but every Type 1 hypervisor has something like this, and the latency created for the packet isn’t actually that much. Most networking services will never have a problem. However, this form of networking does have an impact:
- Latency: There can be some latency, even if it is tiny. This can dissuade a tiny minority of organizations from virtualizing some services.
- Host Processor Utilization: In truly huge workloads, the involvement of the host CPUs in processing this route will limit scalability of services.
SR-IOV, added in WS2012 Hyper-V, does something strange in virtualization:
- The virtual NIC in a virtual machine uses a feature called a Virtual Function (VF) to connect directly to a Physical Function (PF) on a host’s physical NIC.
- A driver for the VF is installed in the guest OS of the virtual machine. Yes, you would install a driver for a physical device into a virtual machine!
- The traffic to/from the virtual machine completely bypasses the networking stack of the management OS, and therefore the virtual switch (and functions like per-virtual NIC QoS) and host NIC teaming (implement NIC teaming in the guest OS instead).
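The difference between the two routes can be made concrete by listing the hops. The Python sketch below only restates the path described in this article as data; the hop names and the helper `host_software_hops` are hypothetical labels chosen for illustration, not identifiers from Hyper-V.

```python
# Hops for an inbound packet without SR-IOV, as listed in this article.
SOFTWARE_PATH = [
    "physical NIC",
    "virtual switch",
    "virtual switch port",
    "Virtualization Service Provider (VSP)",
    "VMBus",
    "Virtualization Service Client (VSC)",
    "guest OS",
]

# Hops with SR-IOV: the Virtual Function connects the VM to the NIC directly.
SRIOV_PATH = [
    "physical NIC (Physical Function)",
    "Virtual Function driver",
    "guest OS",
]

# Hops that rely on host software to move the packet.
HOST_SOFTWARE_HOPS = {
    "virtual switch",
    "virtual switch port",
    "Virtualization Service Provider (VSP)",
    "VMBus",
}


def host_software_hops(path):
    """Return the hops on a path that are handled by host software."""
    return [hop for hop in path if hop in HOST_SOFTWARE_HOPS]


print(len(host_software_hops(SOFTWARE_PATH)), "host software hops without SR-IOV")
print(len(host_software_hops(SRIOV_PATH)), "host software hops with SR-IOV")
```

The absence of the virtual switch from the SR-IOV path is also why features that live there, such as per-virtual NIC QoS and host NIC teaming, are unavailable to that traffic.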
SR-IOV generated quite a bit of talk at the launch of WS2012, but it is a niche feature. In fact, some server manufacturers only support this feature in their top-end models. In reality, very few organizations will have a legitimate need to use this hardware offload.