GDPR Came and the World Didn’t Stop
GDPR went live on May 25, 2018, and no puppies, kittens, or other small animals were harmed, much as the world carried on after Y2K. But GDPR matters more because of its influence over how companies protect and manage personal data.
One of Microsoft’s responses to GDPR was the introduction of a GDPR Data Loss Prevention (DLP) template policy together with a set of GDPR sensitive data type definitions. The idea is that Office 365 tenants can use the template to create a DLP policy that stops sensitive data within the scope of GDPR from being sent outside the organization by accident.
Two Types of DLP
DLP comes in two flavors inside Office 365:
- DLP as originally introduced in Exchange 2013. This variant uses transport rules to make sure that emails that violate the tenant policy cannot be sent. The transport block is backed up by client-side checking in Outlook desktop and OWA to detect potential violations as users compose messages. The system works well and has some unique characteristics like support for document fingerprinting, but it is limited to email.
- Office 365 DLP policies support Exchange, SharePoint, and OneDrive for Business – with the potential for support in Teams soon. Because of its multi-workload support, Microsoft is putting its weight behind this variant as the go-forward choice for Office 365. If you want to deploy DLP from scratch, you should use Office 365 DLP policies, and if you already use Exchange-based DLP, you should move to the Office 365 variant once it delivers equivalent functionality.
The GDPR template policy is only available for Office 365 DLP. You can create your own version for Exchange-based DLP, but you will have to define custom sensitive data types for use by Exchange. One approach is to create document fingerprints of government-issued documents like tax forms. This will stop people circulating similar documents via email outside the organization, but it won’t stop behavior like including a batch of passport numbers or driving license details in a message.
In any case, I deployed the GDPR DLP policy in my tenant and met some interesting situations that illustrate the difficulty of defining sensitive data types, even for Microsoft.
Email Header Violations
The first issue came to light when I noticed many DLP policy violation alerts in the Office 365 audit log. I use activity alerts to flag these events and choose to receive email when Office 365 detects a violation. All the events occurred when SharePoint Online imported email for storage – in this case, via an email address that delivered messages to a Teams channel. When you send email to a Teams channel, Teams displays the text in the channel (if the message is small) and captures the message in the SharePoint document library belonging to the team.
Where Exchange uses transport rules to block DLP violations, SharePoint and OneDrive for Business depend on the SharePoint crawler to detect violations and mark problematic files to prevent them being shared. The DLP policy rule fired when the crawler examined the imported messages because some of the message header information matched the definition for a European tax identification number (TIN). European TINs differ from country to country across the 28-state European Union, so it is difficult to come up with a definition that can accurately match TINs.
Email headers often include 8- or 9-digit strings that resemble a TIN. Because the original definition was a tad imprecise, Office 365 treated those strings as violations. Microsoft has since updated the data definition and it doesn’t seem to have the same problem.
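To see how easily a loose definition trips over header data, here is a minimal Python sketch. The pattern and the header lines are assumptions made up for illustration. This is not the definition Microsoft ships, which I have not seen.

```python
import re

# Deliberately loose pattern: any standalone run of 8 or 9 digits.
# Assumption for illustration only; not Microsoft's TIN definition.
NAIVE_TIN = re.compile(r"\b\d{8,9}\b")

# Hypothetical header lines of the kind stored with a message.
headers = [
    "Message-ID: <123456789.987654@mail.example.com>",
    "X-Tracking-Ref: 20180525-00417",
    "Subject: Minutes from the team meeting",
]

def naive_hits(lines):
    """Return (line, value) pairs for every digit run the loose pattern accepts."""
    return [(line, m.group()) for line in lines for m in NAIVE_TIN.finditer(line)]

hits = naive_hits(headers)
for line, value in hits:
    print(f"{value!r} matched in: {line}")
```

Neither match is a tax number, yet both would count as hits for a rule built on digit length alone. That is why a usable definition needs more evidence than the shape of the number.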
Next, I ran into a problem when I received email from Poland. When I replied, Exchange detected a GDPR DLP policy violation and rejected the message. Upon examination, I noticed that the original message included some company information in an auto-signature (maybe inserted by CodeTwo, a Polish company specializing in Exchange auto-signature management).
European companies often include registration numbers or similar information in email auto-signatures, sometimes because they are required to by local regulations. For example, UK limited companies often include company registration numbers.
The extract from the signature shown below contains a 10-digit KRS (business register) number, a 10-digit NIP (tax registration number), and the Kapitał zakładowy (initial capital), a 7-digit amount followed by the two-character currency symbol. The NIP is the most obvious source of the violation.
Sąd Rejonowy dla m. st. Warszawy, XII Wydział Gospodarczy KRS 0000282268
NIP: PL 5260012190 | Kapitał zakładowy: 4 085 500 zł
When I removed the auto-signature from the reply, Exchange sent the message. On the surface, DLP worked as it should because it detected and blocked an attempt to send a tax number outside the tenant.
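A Polish NIP carries a check digit, so it is possible to confirm that the number in the signature is structurally a real tax number rather than a random 10-digit string. The sketch below applies the published NIP checksum (weights 6, 5, 7, 2, 3, 4, 5, 6, 7, modulo 11). The function name is my own, and I am not claiming that the Office 365 definition performs this check.

```python
import re

# Weights for the first nine digits of a Polish NIP.
NIP_WEIGHTS = (6, 5, 7, 2, 3, 4, 5, 6, 7)

def is_valid_nip(candidate):
    """Check a Polish NIP: 10 digits whose last digit is a weighted checksum."""
    digits = re.sub(r"[\s-]", "", candidate)   # drop spaces and hyphens
    if digits.upper().startswith("PL"):        # drop the optional country prefix
        digits = digits[2:]
    if not re.fullmatch(r"\d{10}", digits):
        return False
    check = sum(w * int(d) for w, d in zip(NIP_WEIGHTS, digits)) % 11
    # A remainder of 10 cannot be a single digit, so such numbers are not valid.
    return check != 10 and check == int(digits[9])

print(is_valid_nip("PL 5260012190"))  # NIP from the signature -> True
print(is_valid_nip("0000282268"))     # KRS from the signature -> False
```

The NIP passes and the KRS number does not, which fits the view that the NIP triggered the rule. It also shows the limit of pattern matching: the check can tell that a number is a tax identifier, but not that the company publishes it in every signature.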
The same problem happened a day later, this time when DLP mistook a Spanish phone number for a TIN.
Obviously, blocking perfectly legitimate business email because it contains telephone numbers is a bad thing, and Microsoft is now rolling out a fix to make sure that people can continue to use email auto-signatures with telephone numbers.
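I don’t know how Microsoft’s fix works. One common way to cut this kind of false positive is to look at the text just before a candidate number and skip it when a phone label is there. The sketch below shows the idea. The label list, the window size, and the sample signature are all assumptions for illustration.

```python
import re

# Candidate identifier: a standalone run of 9 digits (illustrative only).
CANDIDATE = re.compile(r"\b\d{9}\b")

# Phone label, optionally followed by a country code, at the end of the
# text that precedes the candidate. The label list is hypothetical.
PHONE_LABEL = re.compile(r"(tel|phone|mobile|fax)\W*(\+\d{1,3}\W*)?$", re.IGNORECASE)

def likely_identifiers(text, window=20):
    """Return 9-digit runs that are not introduced by a phone label."""
    found = []
    for m in CANDIDATE.finditer(text):
        preceding = text[max(0, m.start() - window):m.start()]
        if PHONE_LABEL.search(preceding):
            continue  # looks like a phone number, so ignore it
        found.append(m.group())
    return found

signature = "Tel: +34 912345678 | Ref 123456789"
print(likely_identifiers(signature))
```

Only the second number survives the filter. A real definition would need a much longer label list in several languages, which hints at why getting this right across Europe is hard.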
Even if It Needs Tweaking, DLP Is Good
I like DLP a lot, which is why I use it in my tenant. Exchange-based DLP has its strengths and some unique characteristics but will eventually be replaced by Office 365 DLP. The difficulty of detecting sensitive data types accurately means that you can expect some hiccups along the way, but that shouldn’t put you off including DLP as part of your Office 365 data governance strategy.
Follow Tony on Twitter @12Knocksinna.