Microsoft announced several infrastructure-related feature enhancements for Azure Storage at the recent Build conference. I will discuss what these features are and what we might expect from them.
The big news first: Microsoft is starting a preview of 4TB disks for Azure virtual machines. The preview starts in the West Central US region and will expand globally in June. Administration is initially limited to PowerShell and the CLI. Support in the Azure Portal will follow soon after.
The addition of support for 4TB disks will make people very happy. Today, Azure virtual hard disks are limited to 1023GB. This is not a restriction from the VHD format or Hyper-V. It is actually a restriction from the bespoke storage system that Azure offers to the compute service.
Standard and Premium storage will both be supported. Note that with flash-based Premium storage, larger virtual hard disks offer faster performance: the new size will offer 7,500 IOPS, compared with 5,000 IOPS for a 1023GB disk. It also offers a data transfer rate of 250MBps.
When we deploy virtual machines to Azure, we are not restricted to 1023GB data volumes, just as we are not restricted to 900GB data volumes in a physical server. We can use Storage Spaces inside the guest OS of the virtual machine to aggregate the capacity of individual data disks. As an extra benefit, we aggregate the performance of the disks too, which makes the data volume faster.
There is still a need for larger data disks:
- Azure Site Recovery (ASR): Customers of ASR have pre-existing on-premises virtual machines with larger-than-1TB VHD/VHDX/VMDK disks. They want to replicate those disks without re-engineering the volumes to span multiple virtual hard disks.
- Bigger data volumes: A virtual machine with support for four data disks can have a maximum data volume of nearly 16TB, instead of 3.99TB.
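The arithmetic behind the second bullet can be sketched in a few lines. This is an illustration only: the function name is made up, and the 4095GB figure for a "4TB" disk is an assumption chosen to match the "nearly 16TB" total above, not a number from the announcement.

```python
# Illustrative arithmetic only: raw capacity of a data volume that
# aggregates several equally sized data disks (for example, with
# Storage Spaces inside the guest OS).
GB_PER_TB = 1024


def max_volume_tb(disk_count: int, disk_size_gb: int) -> float:
    """Return the aggregate raw capacity, in TB, of disk_count disks."""
    return disk_count * disk_size_gb / GB_PER_TB


# Four data disks at today's 1023GB limit.
current_tb = max_volume_tb(4, 1023)

# Four data disks at the preview size.
# Assumption: a "4TB" disk is 4095GB, just under a full 4TB.
preview_tb = max_volume_tb(4, 4095)

print(f"Today: {current_tb:.3f}TB, with larger disks: {preview_tb:.3f}TB")
```

The same function works for any disk count, so it can be reused for virtual machine sizes that support more than four data disks.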
Larger Storage Accounts
If you are working with unmanaged disks, then larger disks require larger storage accounts. We are keeping more data than ever and some infrastructure services, such as ASR and Azure Backup, use storage accounts.
Microsoft has announced that storage accounts, which are currently limited to 500TB each, will expand to up to 5PB in a preview this summer. Throughput will also increase:
- 100k+ requests/second
- 100+Gbps ingress and egress bandwidth
Hopefully, these speed improvements will benefit backup and restore times for virtual machines and item-level backups in Azure Backup.
Microsoft plans to introduce these limits in the first phase of the preview and then push them even higher. The goal is to improve performance and to simplify management with fewer storage accounts.
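To get a feel for the "fewer storage accounts" point, here is a rough sketch that counts how many accounts a given amount of data needs under each capacity limit. The 1,800TB workload is hypothetical, 1PB is treated as 1,000TB for simplicity, and the sketch ignores throughput limits, which may also force data to be spread across accounts.

```python
import math


def accounts_needed(total_tb: float, account_limit_tb: float) -> int:
    """Minimum number of storage accounts needed on capacity alone."""
    return math.ceil(total_tb / account_limit_tb)


# Hypothetical workload size, chosen only for illustration.
TOTAL_TB = 1800

# Today's limit: 500TB per storage account.
accounts_today = accounts_needed(TOTAL_TB, 500)

# Preview limit: 5PB per storage account.
# Assumption: 1PB is treated as 1,000TB.
accounts_preview = accounts_needed(TOTAL_TB, 5 * 1000)

print(f"Accounts today: {accounts_today}, in preview: {accounts_preview}")
```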
Tiered Storage Accounts
Today, we can deploy two kinds of storage accounts:
- General usage
- Blob storage
A general usage storage account can be used for all kinds of storage, such as virtual hard disks and Azure Files (which is not a file server in the cloud). It can also be used for Standard IO – Block Blob, which is a form of blob or at-rest file.
A blob storage account allows us to choose a specific usage and pricing tier for blob storage. We can use hot blob storage for cost-effective storage of frequently accessed data and cool blob storage for infrequently accessed data.
The problem with hot versus cool storage is that we do not always know the access rates of our data. Some data is accessed constantly and some not so much, and this can change in unpredictable ways. We can change the tier of the storage account, but the access rates of the individual blobs within it can differ. As cheap as cool blob storage is, Amazon has a cheaper ice tier of storage.
Microsoft is moving from account-level tiering to blob-level tiering. This means that you do not manage the tier of the storage account anymore. Blobs reside in the most suitable tier. Hot and cool data can reside in the same storage account.
A third layer of tiering is being added. This is an archive tier where data is kept offline for cost effective long-term storage. The retrieval time of this data will not be instant like it is for hot and cool. It will be hours.
Systems will not have to change to use this new blob-based tiering storage account. APIs will remain consistent so that existing services can take advantage of these improvements. This assumes that they can tolerate a read latency of hours for offline data.
The initial release of tiered blob storage will not have automated tiering. You can manually or programmatically move a blob between the tiers or you can write it directly to a tier. In the future, automated lifecycle management or automatic policy-based tiering will be added.
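Until automated tiering arrives, the decision logic has to live in your own scripts. The following is a minimal sketch of what such a policy might look like. The thresholds, blob names, and the `choose_tier` function are all hypothetical, and the sketch only decides on a tier. Actually moving a blob would be done through the storage API or tooling, which is not shown here.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds for illustration; not Microsoft's policy.
COOL_AFTER = timedelta(days=30)
ARCHIVE_AFTER = timedelta(days=180)


def choose_tier(last_accessed: datetime, now: datetime) -> str:
    """Pick a tier name based on how long a blob has been idle."""
    idle = now - last_accessed
    if idle >= ARCHIVE_AFTER:
        return "archive"
    if idle >= COOL_AFTER:
        return "cool"
    return "hot"


# Made-up blobs and last-access times.
now = datetime(2017, 6, 1)
blobs = {
    "invoice.pdf": datetime(2017, 5, 28),
    "q1-report.xlsx": datetime(2017, 3, 1),
    "2015-logs.zip": datetime(2015, 12, 31),
}

# Build a plan of which tier each blob should be moved to.
plan = {name: choose_tier(ts, now) for name, ts in blobs.items()}
for name, tier in plan.items():
    print(f"{name}: {tier}")
```

Keeping the policy in one small function makes it easy to retire once policy-based tiering is available in the platform.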
A preview release is coming soon, and Microsoft is inviting feedback by email.
Storage Network Access Control
A storage account is an internet service. The storage account has a globally unique DNS name and is accessible to anyone with the name and an access key. The access key is long and strong but that is not enough for many customers. They want to restrict where a storage account can be accessed from.
Network access control for storage accounts will limit which Azure virtual networks or Internet/LAN IP addresses and ranges can access a storage account. This is a useful safeguard in case an access key is compromised.
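Conceptually, this kind of restriction is an allow list of address ranges. The sketch below shows the idea using Python's standard `ipaddress` module. It is not how Azure implements the feature, and the ranges and the `is_allowed` function are made up for illustration.

```python
import ipaddress

# Hypothetical allow list: one private virtual network range and one
# public range (taken from the reserved documentation address space).
ALLOWED_NETWORKS = [
    ipaddress.ip_network("10.1.0.0/16"),
    ipaddress.ip_network("203.0.113.0/24"),
]


def is_allowed(client_ip: str) -> bool:
    """Return True if client_ip falls inside any allowed range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in ALLOWED_NETWORKS)


for ip in ("10.1.4.7", "203.0.113.50", "198.51.100.9"):
    verdict = "allowed" if is_allowed(ip) else "denied"
    print(f"{ip}: {verdict}")
```

With a check like this in front of the account, a stolen access key is of no use to a caller outside the listed ranges.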
This is in private preview now but will be coming to the rest of us in preview in the second half of 2017.