Windows Server 2016 Feature: ReFS Accelerated VHDX Operations
One of the more interesting features in Windows Server 2016 (WS2016) is Accelerated VHDX Operations, a feature that speeds up and reduces the impact of in-volume operations on VHDX and VHD files. This article explains how the feature works and uses some test results to show how it will benefit you.
I used to work in the hosting business, using Windows Server 2008 and Windows Server 2008 R2 Hyper-V to run customer virtual machines. Great performance was part of our service offering. If you’re familiar with Hyper-V, then you know that we can use dynamic and fixed-size virtual hard disks of either VHD or VHDX format.
- Dynamic disks: A disk that starts small and grows to your maximum defined size. This type is quick to deploy and consumes slightly more space than the amount of the data contained within. Performance usually degrades over time due to fragmentation.
- Fixed disks: A file the size of the disk’s maximum size is created. The contents are zeroed out as the disk is created, causing the creation process to be quite slow unless enhanced by storage systems. This type offers the best performance.
With every version of Hyper-V, Microsoft claims that dynamic disks can match the performance of fixed disks. And every time, some poor sucker falls for it and has to convert the database server’s data disks from dynamic to fixed when storage latency is unacceptable after a short amount of time. This is why, despite the fact that fixed disks consume space and are slow to create, I always use fixed disks for data and logging with production Hyper-V virtual machines.
Why is this relevant to my experience in the hosting business? I was a service provider: my customer would open a call, looking for a new machine or additional space to meet some pressing need. I’d start the process of creating a VHD file and go for lunch … a long lunch … knowing that when I got back, the disk might be ready. That was hardly a fast response, and in this era of public or private cloud computing, where service responsiveness dictates your employability, being able to respond quickly to business requests is very important.
Offloaded Data Transfer (ODX)
ODX was added in Windows Server 2012 (with support in System Center 2012 R2 Virtual Machine Manager, or SCVMM) to help with these scenarios. If a customer had an ODX-capable SAN, the host could offload data operations to the SAN:
- Rapid data movement
- Creation and extension of VHD/X files
If I’d had ODX-capable systems back in my hosting job, I could have deployed new disks in seconds instead of an entire lunch break. Deploying virtual machines from an on-SAN SCVMM library would have been much quicker than data transfer over the LAN. You can even move data quickly between SANs using ODX.
But there are some issues with ODX:
- Performance variation: I’ve seen that both NetApp and Dell SANs offer amazing performance enhancements with ODX. My tests with HP’s 3PAR seemed to indicate that there was little to no improvement.
- Stability depends on the SAN OEM: Many customers found that their SAN OEM’s implementation of ODX was unstable and they had to disable the feature in their hosts to prevent crashes or data corruption.
- Software-defined storage: Microsoft is moving away from being dependent on the world of SANs. They needed a solution that works independently of a SAN, including (but not limited to) Storage Spaces Direct (S2D) and Scale-Out File Servers.
ReFS Adding Performance
Microsoft introduced ReFS with Windows Server 2012, as their data center file system of the future. Other than data integrity services, ReFS wasn’t ready for the big leagues. This is probably going to start changing with the release of Windows Server 2016. This is because ReFS adds a new feature called Accelerated VHDX Operations, which will improve VHDX and VHD functions, such as:
- Creating and extending a virtual hard disk
- Merging checkpoints (previously called Hyper-V snapshots)
- Backups, which are based on production checkpoints
One of the core features of ReFS is metadata, which is used in protecting data integrity. This metadata is also used when creating or extending a virtual hard disk; instead of zeroing out new blocks on disk, the file system writes metadata. The way to think of this is that when an application, such as Hyper-V, asks to read zeroed-out blocks, the file system checks the metadata, sees zeroes, and responds with “nothing to see here.”
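The idea can be illustrated with a toy model. This is only a sketch of the concept as described above, not ReFS's actual on-disk format; the `ToyVolume` class, its 4 KB block size, and its method names are all invented for illustration.

```python
class ToyVolume:
    """Toy model (hypothetical): a file records its size, and any block
    that was never written is treated as zeroes when it is read."""

    BLOCK = 4096  # assumed block size, chosen only for this sketch

    def __init__(self):
        self.blocks_written = 0  # counts physical block writes
        self.files = {}          # name -> {"size_blocks": int, "data": {index: bytes}}

    def create_fixed_eager(self, name, size_blocks):
        # Traditional fixed disk creation: physically write zeroes to every block.
        data = {}
        for index in range(size_blocks):
            data[index] = bytes(self.BLOCK)
            self.blocks_written += 1
        self.files[name] = {"size_blocks": size_blocks, "data": data}

    def create_fixed_metadata(self, name, size_blocks):
        # Metadata-only creation: record the size, write no blocks at all.
        self.files[name] = {"size_blocks": size_blocks, "data": {}}

    def write(self, name, index, payload):
        entry = self.files[name]
        if not 0 <= index < entry["size_blocks"]:
            raise IndexError("block outside the disk")
        entry["data"][index] = payload
        self.blocks_written += 1

    def read(self, name, index):
        entry = self.files[name]
        if not 0 <= index < entry["size_blocks"]:
            raise IndexError("block outside the disk")
        # "Nothing to see here": unwritten blocks read back as zeroes.
        return entry["data"].get(index, bytes(self.BLOCK))
```

Both creation methods produce a disk that reads back as all zeroes; the difference is that the eager version performs one physical write per block, while the metadata version performs none.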
Checkpoints can be costly in terms of IOPS, impacting all virtual machines on that LUN. When a checkpoint is merged, the last modified blocks of an AVHD/X file are written back into the parent VHD/X file. Hyper-V will use checkpoints to perform consistent backups (not relying on VSS at the volume level) to get greater scalability, but merging lots of checkpoints could be costly. Instead, ReFS will perform a metadata operation and delete unwanted data. This means that no data actually moves on the volume, and merging a checkpoint will also be much quicker and have less impact on services hosted on that disk.
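The difference between the two merge styles can be sketched with a toy model. It assumes a disk is just a mapping from virtual block numbers to extents held in a shared store; the function names and data layout are hypothetical and are not how ReFS or Hyper-V actually represent checkpoints.

```python
def merge_by_copy(parent, child, store):
    """Copy each modified block from the child (AVHDX) into the parent's extent.

    Returns the number of bytes that had to be moved on the volume.
    """
    moved = 0
    for vblock, child_extent in child.items():
        data = store.pop(child_extent)           # read the child's data
        parent_extent = parent.setdefault(vblock, child_extent)
        store[parent_extent] = data              # write it into the parent
        moved += len(data)
    child.clear()
    return moved


def merge_by_metadata(parent, child, store):
    """Repoint the parent's block map at the child's extents.

    No block data is copied; superseded extents are simply dropped.
    Returns the number of bytes moved, which is always zero here.
    """
    for vblock, child_extent in child.items():
        old_extent = parent.get(vblock)
        parent[vblock] = child_extent            # metadata update only
        if old_extent is not None:
            del store[old_extent]                # delete unwanted data
    child.clear()
    return 0
```

After either merge the parent resolves to the same contents, but only the copy-based version generates IO proportional to the amount of changed data.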
I created a test rig using Technical Preview 3 (TPv3) of Windows Server 2016. Three machines are involved in the test:
- The hosts: A pair of Dell R420 servers with 10 GbE iWARP cluster networking and 1 GbE iSCSI connections. The speed of storage networking I deployed is pretty common, even in larger companies.
- Storage: A Windows Server 2012 R2 Hyper-V virtual machine that is hosted on a RAID5 array of 10 x 2 TB SATA drives. This virtual machine is running a version of StarWind Virtual SAN that does not offer ODX support (a current version supports ODX, but I have an older version installed).
Note that the bottleneck in performance is actually the 1 GbE networking, which is easily saturated by creating large virtual hard disks.
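A quick back-of-the-envelope calculation shows why. The sketch below assumes that a traditional fixed disk creation must push every zeroed byte across the 1 GbE link, that a "GB" is 1024^3 bytes, and that the link runs at its full nominal rate with no iSCSI or TCP overhead; real transfers will be slower than this lower bound.

```python
LINK_BITS_PER_SECOND = 1_000_000_000  # nominal 1 GbE line rate (assumption: no overhead)
BYTES_PER_GB = 1024 ** 3              # assumption: sizes are binary gigabytes


def min_zeroing_minutes(size_gb, link_bps=LINK_BITS_PER_SECOND):
    """Lower bound, in minutes, to send size_gb of zeroes over the link."""
    bits = size_gb * BYTES_PER_GB * 8
    return bits / link_bps / 60


for size_gb in (1, 10, 100, 500):
    print(f"{size_gb:>4} GB: at least {min_zeroing_minutes(size_gb):6.2f} minutes")
```

Under these assumptions, zeroing out a 500 GB disk needs roughly 72 minutes of link time at best, before any storage latency is considered.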
I decided to perform three tests using the iSCSI storage, to compare fixed-size disk creation on NTFS and ReFS volumes. Every volume I used was formatted with a 64 KB allocation unit size, which is optimal for Hyper-V. Each test would measure the time it takes to create a series of fixed-size VHDX files:
- 1 GB
- 10 GB
- 100 GB
- 500 GB
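A series like this could be timed with a small harness along the following lines. This is a sketch rather than the method used for the results in this article: it assumes a Windows host with the Hyper-V PowerShell module installed, shells out to the `New-VHD` cmdlet with its `-Path`, `-SizeBytes`, and `-Fixed` parameters, and uses made-up file names.

```python
import subprocess
import time

TEST_SIZES_GB = (1, 10, 100, 500)


def new_vhd_command(path, size_gb):
    """Build a PowerShell command line that creates a fixed-size VHDX."""
    size_bytes = size_gb * 1024 ** 3
    script = f"New-VHD -Path '{path}' -SizeBytes {size_bytes} -Fixed"
    return ["powershell.exe", "-NoProfile", "-Command", script]


def time_creation(path, size_gb):
    """Create the disk and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(new_vhd_command(path, size_gb), check=True)
    return time.perf_counter() - start


def run_series(volume_root, sizes_gb=TEST_SIZES_GB, timer=time_creation):
    """Time one disk creation per size on the given volume.

    volume_root is a drive or folder such as "D:" (hypothetical);
    timer can be swapped out, for example with a stub for a dry run.
    """
    results = {}
    for size_gb in sizes_gb:
        path = volume_root.rstrip("\\") + "\\" + f"test-{size_gb}GB.vhdx"
        results[size_gb] = timer(path, size_gb)
    return results
```

Calling `run_series("D:")` and then `run_series("E:")` would give one set of timings per volume to compare.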
Test 1 – Non-Clustered Host
I created two disks on the SAN and zoned them to a non-clustered host as:
- D: Formatted with ReFS
- E: Formatted with NTFS
I ran the first test against the NTFS volume. Note that a 500 GB drive, nothing outrageous, took 77 minutes (a long lunch) to create.
I then ran the same tests against the volume that was formatted with ReFS. Creating the same 500 GB VHDX file took just 2 seconds, 8 seconds faster than creating a 1 GB disk on NTFS!
Test 2 – Host-Owned CSV
In the remaining tests I used a Hyper-V cluster. The two drives on the SAN were zoned to both Hyper-V nodes and added as Cluster Shared Volumes (CSVs). A CSV can be owned, and the owner of the CSV always offers the best performance of metadata operations on that volume. I decided to compare the performance of an NTFS formatted CSV with that of a ReFS CSV, with both volumes owned by the host that was creating the VHDX files.
The performance of the CSV with NTFS was slightly worse than that of the NTFS non-CSV drive. A 500 GB drive took 79 minutes to create — that’s enough time to catch a movie during lunch.
The ReFS-formatted CSV offered much greater performance once again, with a 500 GB VHDX file taking just over 3 seconds, barely more than it took with a comparable dedicated (non-CSV) volume.
Test 3 – CSV Owned by Other Host
I wanted to see if transferring the ownership of each CSV, thus causing redirected IO, would create much more impact. My cluster networks are based on iWARP, which offers 10 Gbps of bandwidth with RDMA. RDMA gives low latency connectivity between hosts, so my connectivity would be pretty good. Redirected IO uses SMB Multichannel and SMB Direct, so that’s 20 Gbps of low latency connectivity between the hosts.
There is an increase in the time required. With a locally owned NTFS-formatted CSV it took 79 minutes to create a 500 GB VHDX file. When we move ownership to a different host in the cluster, it takes 82.6 minutes; it’s not a huge increase, but it is an increase, and one has to assume that this increase on a cluster with 1 GbE cluster networks would be much higher.
The ReFS volume did degrade in performance … by a few seconds, increasing from 3 seconds to 12 seconds. Will anyone complain much that creating a 500 GB fixed-size VHDX file takes slightly longer than the 100 m sprint at the Olympics?
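To put the three tests side by side, the short script below computes how many times faster ReFS was for the 500 GB disk in each scenario. It uses only the figures reported above; the one assumption is that "just over 3 seconds" in the second test is rounded down to 3 seconds.

```python
# 500 GB creation times from the three tests: (NTFS minutes, ReFS seconds).
# Assumption: "just over 3 seconds" in test 2 is treated as 3 seconds.
RESULTS_500_GB = {
    "Non-clustered host": (77.0, 2.0),
    "Host-owned CSV": (79.0, 3.0),
    "CSV owned by other host": (82.6, 12.0),
}


def speedup(ntfs_minutes, refs_seconds):
    """Return how many times faster the ReFS creation was."""
    return ntfs_minutes * 60 / refs_seconds


for scenario, (ntfs_minutes, refs_seconds) in RESULTS_500_GB.items():
    factor = speedup(ntfs_minutes, refs_seconds)
    print(f"{scenario}: ReFS was about {factor:,.0f}x faster")
```

Even in the worst case, with redirected IO, the ReFS volume comes out hundreds of times faster than NTFS.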
ODX or ReFS?
My Hyper-V MVP colleague, Didier Van Hoye (@workinghardinit), did some tests similar to mine, comparing ReFS Accelerated VHDX Operations and ODX on a Dell Compellent SAN (one of the better implementations of ODX). It was clear that while ODX offers huge improvements over simpler storage, ReFS was much quicker at creating fixed VHDX files.
If you have a SAN that supports ODX, then don’t dump it just yet. ReFS is limited to operations within a volume, whereas ODX can optimize transfers between volumes and even SANs. For example, creating a new virtual machine from an SCVMM library template can be optimized by ODX but ReFS can offer nothing for you.
At this point, I don’t know if ODX and ReFS will live nicely beside each other, but if they do, then the technologies should offer the best of both worlds.
Become a Storage Genius
ReFS Accelerated VHDX Operations might be less acclaimed than other features like containers, nested virtualization, or rolling cluster upgrades, but this is a feature that is going to:
- Save you time (money)
- Make your storage more responsive to customers
- Reduce the impact of disk operations on services
In other words, just by formatting a volume with ReFS you are going to look like a storage genius!