Preview for NVIDIA-Powered Azure VMs Begins

Microsoft has launched a preview of a new set of NVIDIA-powered virtual machines in Azure that can be used for compute-intensive and graphics-intensive workloads.

The Need for NVIDIA

To be quite frank, almost every sizing job that I have done for a customer has been satisfied with A-Series, D-Series, and sometimes F-Series Azure virtual machines. Rarely have I been asked about something more powerful. But there have been occasions where people have asked for something a little different. And a few times I have been asked if Microsoft had any virtual machines with NVIDIA chipsets.
The request for NVIDIA hardware wasn’t to power some game in the cloud – although maybe this could power game streaming. The first reason has to do with graphics acceleration. Some “desktop” workloads do make use of graphics processing units (GPUs). Two examples are:

  • Desktop virtualization: Remote Desktop Services and Citrix can leverage a GPU to provide a better graphical solution.
  • Graphics-intensive applications: I’ve been asked a few times about running applications like AutoCAD in Azure. This wasn’t possible because accessing AutoCAD remotely would require a powerful GPU on the remote server.

The second reason that GPUs are requested is that a GPU is a very powerful number cruncher. This makes it perfect for processing a lot of data, such as analyzing whale song (one of the most interesting scenarios that I’ve ever heard of, which I learned about at a Petri meetup at TechEd North America) or simulating cancer treatments. This is the sort of hardware solution you will find in a high-performance computing (HPC) cluster.
Microsoft announced at AzureCon last year that they would release an N-Series of virtual machines that would be based on hosts with NVIDIA chipsets, and Microsoft launched the preview on August 4, 2016.

The N-Series Virtual Machines

The N-Series of virtual machines is actually two series of Azure virtual machines, each focused on a different type of task.
The first is the NC-Series, which is intended for compute-intensive CUDA or OpenCL workloads. The hosts use Tesla K80 GPUs, which Microsoft claims are the “fastest computational GPU available in the cloud”, delivering 4992 CUDA cores with a dual-GPU design, and offering up to 2.91 teraflops of double-precision computation and 8.74 teraflops of single-precision computation.
In case you are curious, the Tesla K80 card (it looks more like a small computer) has 2 processors and 24 GB of GDDR5 RAM, requires 2 PCIe slots (!), and is listed at $5,000.
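The teraflops figures quoted above can be roughly reproduced from the core count. The sketch below is a back-of-the-envelope check, not an official formula: the 875 MHz boost clock, the 2 floating-point operations per core per cycle (one fused multiply-add), and the 1:3 double-to-single-precision ratio are my assumptions about the K80 and do not come from Microsoft's announcement.

```python
# Rough peak-throughput estimate for the Tesla K80.
# Only the CUDA core count comes from the article; the rest are assumptions.
CUDA_CORES = 4992             # from the article (both GPUs on the card)
BOOST_CLOCK_GHZ = 0.875       # assumption: maximum boost clock
FLOPS_PER_CORE_PER_CYCLE = 2  # assumption: one fused multiply-add per cycle
DP_TO_SP_RATIO = 1 / 3        # assumption: double precision runs at 1/3 rate


def peak_teraflops(cores, clock_ghz, flops_per_cycle):
    """Return theoretical peak throughput in teraflops."""
    gigaflops = cores * clock_ghz * flops_per_cycle
    return gigaflops / 1000.0


sp = peak_teraflops(CUDA_CORES, BOOST_CLOCK_GHZ, FLOPS_PER_CORE_PER_CYCLE)
dp = sp * DP_TO_SP_RATIO

print(f"single precision: {sp:.2f} teraflops")
print(f"double precision: {dp:.2f} teraflops")
```

Under those assumptions the estimate lands on the quoted 8.74 and 2.91 teraflops. These are theoretical peaks; real workloads will achieve less.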

The NVIDIA Tesla K80 GPU [Image Credit: NVIDIA]
There are currently (this is likely to change) three sizes of NC-Series virtual machine.
Sizes of the Azure NC-Series virtual machines [Image Credit: Microsoft]
There is a second version of the NC24 virtual machine, called the NC24r; the “r” stands for Remote Direct Memory Access (RDMA). This virtual machine has a second virtual NIC with low-latency, high-throughput networking, perfect for the large data transfers that are common in large HPC deployments, ensuring that parallel computing across many virtual machines is not slowed down by the network.

The second series of NVIDIA-powered virtual machines is the NV-Series, aimed at “data visualization” workloads, such as graphics-accelerated applications or desktop virtualization. The hosts are armed with NVIDIA Tesla M60 GPUs, each card offering 4096 CUDA cores and capable of up to 36 1080p H.265 streams. The card has 2 processors and 16 GB RAM. The following NV-Series virtual machine sizes are currently available.
Sizes of the Azure NV-Series virtual machines [Image Credit: Microsoft]

Direct Device Assignment

Azure is using a feature called Direct Device Assignment (DDA) that allows a Hyper-V virtual machine to connect directly to a device, similar to SR-IOV, instead of virtualizing the device. This gives the virtual machine better GPU performance than a virtual device can offer. Note that DDA is coming to on-premises Hyper-V in Windows Server 2016.

Accessibility

Microsoft’s ambition is to make HPC available to any scientist (research, data, etc.) who requires it. The computational power that these NVIDIA chipsets offer isn’t readily available because they are expensive. Thanks to the OPEX model of Azure, one can deploy an HPC cluster, perform a computation, delete the cluster, and pay only for the resources that were used, while they were used.
The visualization capabilities of the NV-Series make it possible to move graphical workloads to Azure, again using the utility bill model.
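To make the utility bill model concrete, here is a minimal sketch of how the cost of a short-lived cluster might be estimated, assuming per-minute billing while the machines are running. The function name `job_cost` and the $2.00 hourly rate are hypothetical placeholders for illustration only; they are not Azure prices.

```python
import math


def job_cost(hourly_rate, vm_count, runtime_minutes):
    """Estimate the cost of running vm_count machines for runtime_minutes.

    Assumes each machine is billed per minute, with partial minutes
    rounded up, and that the cluster is deleted when the job finishes.
    """
    per_minute_rate = hourly_rate / 60
    billed_minutes = math.ceil(runtime_minutes)
    return vm_count * billed_minutes * per_minute_rate


# Hypothetical example: 4 machines at $2.00 per hour for a 90-minute job.
cost = job_cost(hourly_rate=2.00, vm_count=4, runtime_minutes=90)
print(f"Estimated cost: ${cost:.2f}")
```

The point of the sketch is that cost scales with the minutes actually used rather than with the purchase price of the hardware, which is what makes an occasional HPC job affordable.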

Availability


The preview is limited, and you will have to register your interest in the program. According to the virtual machine pricing page, the NV- and NC-Series virtual machines are only available in the South Central US region; the pricing is shared below. Remember that this price includes the cost of the virtual machine (charged per minute, while the machine is running), Windows Server (no CALs required), and the share of the NVIDIA chipset(s) and host hardware that you are using.

Pricing of the Azure N-Series virtual machines [Image Credit: Microsoft]