Best Practices with Azure ARM Network Security Groups

This post will explain some of the best practices for creating, configuring, and associating network security groups (NSGs) in Azure Resource Manager or CSP.

Reminder – What Are Network Security Groups

I have written two previous posts that explain what NSGs are and how to deploy them.

In summary, a network security group is a policy that contains a collection of rules that block or allow traffic to and from a virtual NIC or a virtual network subnet in Azure. You can specify network addresses, locations, TCP/UDP/all protocols, and port numbers. Using a priority value, you can stack rules: the lower the number, the higher the priority, and the first rule that matches the traffic wins. For example, a generic rule with a high number (low priority) can block or allow everything, and more specific rules with lower numbers can override it, thanks to their higher priority.
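To make the stacking behavior concrete, here is a minimal Python sketch of priority-ordered evaluation. It is only a model for reasoning about rule sets, not Azure's implementation: the `Rule` class, the `evaluate` function, and the two sample rules are hypothetical names invented for this illustration, and matching is reduced to protocol and port.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical, simplified stand-in for one NSG rule."""
    name: str
    priority: int          # lower number = higher priority = evaluated first
    protocol: str          # "TCP", "UDP", or "*" for any protocol
    port: Optional[int]    # None means any port
    action: str            # "Allow" or "Deny"


def evaluate(rules: List[Rule], protocol: str, port: int) -> str:
    """Return the action of the first matching rule, lowest number first."""
    for rule in sorted(rules, key=lambda r: r.priority):
        protocol_ok = rule.protocol in ("*", protocol)
        port_ok = rule.port is None or rule.port == port
        if protocol_ok and port_ok:
            return rule.action
    # Assumption for this sketch only: unmatched traffic is dropped.
    return "Deny"


rules = [
    Rule("deny-everything", 4000, "*", None, "Deny"),   # generic, low priority
    Rule("allow-https", 100, "TCP", 443, "Allow"),      # specific, high priority
]

print(evaluate(rules, "TCP", 443))    # the specific rule is reached first
print(evaluate(rules, "TCP", 3389))   # only the generic rule matches
```

The specific rule at 100 is consulted before the generic rule at 4000, so HTTPS gets through while everything else falls to the catch-all.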

Start with a Plan

Starting with a plan is a difficult conversation, especially when developers are granted the ability to deploy machines for themselves. There are those who think that devs are geniuses. In my experience, devs (and thus devops) are clueless when it comes to planning, security, and so on. But try, we must!
You need to plan out the infrastructure that you are going to deploy. This will impact your network design and your network security groups design. Document:

  • The virtual machines
  • The network rules that you will require
  • Your desired virtual network design – this will change!

Will You Use a DMZ?

Are you planning on implementing a DMZ or a perimeter network? There are a number of ways to accomplish this, all of which leverage NSGs to block or allow traffic. Some designs rely entirely on NSGs to filter traffic based on protocol and port. Others add a next-generation firewall for application layer inspection/policies, and this can be extended with multiple virtual network subnets and user-defined routing.

Choose a Scope of Association

You can associate an NSG as follows:

  • Virtual machine (Azure V1 / Classic / Service Management)
  • Virtual machine NIC (Azure V2 / Resource Manager / ARM / CSP)
  • The subnet of a virtual network

The best practice is to create 1 NSG per subnet and associate that NSG with only that subnet. The rule set within that NSG should affect the entire subnet and not just specific machines. Here is why:

  • Simplicity: 1 set of rules for each subnet. If you need to create a new set of rules for some machines, then put those machines in another subnet and create another NSG.
  • Predictability: When you make a change in 18 months’ time, you know what the scope of the change will be.

If you do decide to be silly and create complexity, then you need to ensure that one rule type doesn’t block another. For example, if you block TCP 80 into a subnet, but allow it into a NIC, the NIC rule will never get a chance to be applied.

Inbound traffic is evaluated in this order:

  1. Subnet
  2. NIC

Outbound traffic is evaluated in this order:

  1. NIC
  2. Subnet
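The evaluation order above can be sketched in a few lines of Python. This is a toy model, assuming each NSG can be reduced to a function that answers "is this port allowed?"; the names `traverse`, `inbound`, `outbound`, and the two sample policies are hypothetical and exist only for this illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for an NSG: takes a port, returns True if allowed.
Nsg = Callable[[int], bool]


def traverse(layers: List[Tuple[str, Nsg]], port: int) -> Tuple[bool, List[str]]:
    """Walk the layers in order; stop at the first one that denies."""
    consulted: List[str] = []
    for name, nsg in layers:
        consulted.append(name)
        if not nsg(port):
            return False, consulted
    return True, consulted


def inbound(subnet_nsg: Nsg, nic_nsg: Nsg, port: int) -> Tuple[bool, List[str]]:
    # Inbound traffic meets the subnet NSG first, then the NIC NSG.
    return traverse([("subnet", subnet_nsg), ("nic", nic_nsg)], port)


def outbound(subnet_nsg: Nsg, nic_nsg: Nsg, port: int) -> Tuple[bool, List[str]]:
    # Outbound traffic meets the NIC NSG first, then the subnet NSG.
    return traverse([("nic", nic_nsg), ("subnet", subnet_nsg)], port)


def subnet_blocks_80(port: int) -> bool:
    return port != 80


def nic_allows_80(port: int) -> bool:
    return port == 80


print(inbound(subnet_blocks_80, nic_allows_80, 80))
print(outbound(subnet_blocks_80, nic_allows_80, 80))
```

In the inbound case the subnet NSG denies TCP 80 and the NIC NSG is never consulted, which is the trap described earlier: traffic has to be allowed by every NSG on its path.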

Priorities

It is unlikely that you will think of every possibility when you are still in the planning stage. This next piece of advice isn’t constrained to the Microsoft world; you should increment the priority values of your rules by 100. For example:

  • Rule 1 has a priority of 100.
  • Rule 2 is 200.
  • Rule 3 comes in at 300.

Then when I test my service out and realize that I need a rule between 1 and 2, I can give it a priority of 150. This leaves room to add more rules before and after the new rule. Note that custom rules can use values from 100 to 4096.
If you are creating custom rules that are generic, such as block or allow all, you might want to start near the end of the range, such as 4000, and work your way up in priority in units of 100 (3900, 3800, and so on).
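The spacing habit is easy to express as arithmetic. The sketch below is illustrative only: `spaced_priorities` and `priority_between` are hypothetical helper names, and the bounds assume the 100 to 4096 range that Azure documents for custom rules.

```python
from typing import List

MIN_PRIORITY = 100    # assumed lowest number accepted for a custom rule
MAX_PRIORITY = 4096   # assumed highest number accepted for a custom rule


def spaced_priorities(count: int, start: int = 100, step: int = 100) -> List[int]:
    """Return `count` priorities spaced `step` apart, beginning at `start`."""
    values = [start + i * step for i in range(count)]
    if values and not (MIN_PRIORITY <= values[0] and values[-1] <= MAX_PRIORITY):
        raise ValueError("priorities fall outside the custom rule range")
    return values


def priority_between(lower: int, upper: int) -> int:
    """Pick the midpoint between two existing priorities."""
    if not (MIN_PRIORITY <= lower < upper <= MAX_PRIORITY):
        raise ValueError("expected MIN_PRIORITY <= lower < upper <= MAX_PRIORITY")
    if upper - lower < 2:
        raise ValueError("no free priority between the two rules")
    return lower + (upper - lower) // 2


print(spaced_priorities(3))          # [100, 200, 300]
print(priority_between(100, 200))    # 150
print(priority_between(100, 150))    # 125
```

Taking the midpoint each time keeps the largest possible gap on both sides of the new rule, so a gap of 100 allows several rounds of insertion before anything needs renumbering.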

Don’t Modify Defaults

This is another not-specific-to-Microsoft piece of advice: Do not modify the default rules in your NSG. Think of these as a way to get back to the factory defaults. I find that it’s always best to create a new rule that overrides the low-priority default rules.
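As a rough illustration of overriding rather than editing, the sketch below lists the default inbound rules as Microsoft documents them (names and priorities only, heavily simplified) and shows a custom rule winning because its number is lower. The `winning_rule` function and the `deny-vnet-inbound` rule are hypothetical, and matching is reduced to the source tag.

```python
from typing import List, Tuple

# (priority, name, source tag, action): a simplified view of the
# documented default inbound rules of an NSG.
RuleRow = Tuple[int, str, str, str]

DEFAULT_INBOUND: List[RuleRow] = [
    (65000, "AllowVNetInBound", "VirtualNetwork", "Allow"),
    (65001, "AllowAzureLoadBalancerInBound", "AzureLoadBalancer", "Allow"),
    (65500, "DenyAllInBound", "*", "Deny"),
]


def winning_rule(rules: List[RuleRow], source_tag: str) -> Tuple[str, str]:
    """Return (name, action) of the first rule matching the source tag."""
    for _priority, name, source, action in sorted(rules):
        if source in ("*", source_tag):
            return name, action
    raise LookupError("no rule matched")


# Hypothetical custom rule: it sits far above the defaults in priority,
# so the defaults stay untouched and can be restored by deleting it.
custom: List[RuleRow] = [(4000, "deny-vnet-inbound", "VirtualNetwork", "Deny")]

print(winning_rule(DEFAULT_INBOUND, "VirtualNetwork"))
print(winning_rule(custom + DEFAULT_INBOUND, "VirtualNetwork"))
```

Deleting the custom rule returns the NSG to its default behavior, which is the "factory defaults" safety net described above.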

Deny All to Internet Is Dangerous

Denying all traffic to the Internet is a sledgehammer that can secure your network from data leakage or malware, but it can break things. A downside is that Azure virtual machines can intermittently require access to Azure IP addresses (which fall under the Internet tag) for essential services. If this traffic is blocked, then those services can fail. Microsoft’s Keith Mayer has documented a solution for this.

Enable Diagnostics

You’ll probably find yourself in a situation where you need to troubleshoot how your NSG is impacting (or not) your traffic. If so, you should enable diagnostics. To do this, open the settings of the NSG, click Diagnostics, set the status to On, and select a storage account to save the logs to. You can then inspect the logs later if there is an error or you manage to recreate one.

Gateway Subnet

Do not apply NSGs to your gateway subnet. There is simply no need to do this; doing it will break things, and it is unsupported.