Using the Office 365 Audit Log to Track Retention Labels

Office 365 Classification Audits

Office 365 Data Governance, Labels, and Protection

Office 365 retention labels (previously called classification labels) are part of the Office 365 data governance framework. Users apply them to mark documents and messages to be kept for a certain retention period, or simply to add a visual cue that shows a document’s importance, state of processing, or other status. Retention labels differ from sensitivity labels, which indicate how sensitive documents and messages are and can optionally apply protection (encryption) with rights management.

Auditing Classification

Every time a user assigns, changes, or removes a retention label on a document, folder, or list item in a SharePoint Online or OneDrive for Business site (Figure 1), Office 365 captures the event in an audit record.

Figure 1: Applying a label to a SharePoint document (image credit: Tony Redmond)

Office 365 also captures audit records when auto-label policies (available in Office 365 E5 and the Advanced Data Governance SKU) scan SharePoint document libraries and classify documents based on keyword queries or sensitive data types.

You can review audit records for retention labels through the Audit log search in the Security and Compliance Center. Select “Changed compliance policy label” (Figure 2) as the target event for the search. A record for a classification action includes the site, the document name, the user, and the name of the label applied to the document (the DestinationLabel). If the document was previously classified with another label, that label is captured too (the SourceLabel).

Figure 2: Choosing to search for classification events (image credit: Tony Redmond)

Retention labels also show up in Exchange Online, where they appear as personal retention tags. However, Exchange doesn’t generate audit records when users apply retention tags to messages.

GDPR Classification

Many Office 365 tenants use retention labels to mark documents that hold personal data that comes within the scope of the European Union General Data Protection Regulation (GDPR). For example, if your organization defined a label called “Personal Data,” you could then assign that label to files such as Excel worksheets holding personnel information, Word documents for annual performance reviews, and so on.

Microsoft uses the audit data captured for classification activities to generate two graphs in the GDPR dashboard of the Security and Compliance Center. The first graph shows how many labels users apply to documents over a 90-day window (the period that Office 365 keeps audit records for users with E3 licenses; those with E5 licenses have records kept for 365 days), while the second shows the top five labels used. And if you have Office 365 E5 licenses, you can use the Label Activity Explorer to view information about label usage.

Processing Office 365 Audit Records

Nice as it is to have some graphs, it’s a little more interesting to look behind the scenes and do our own processing. When in doubt, write some PowerShell.

Office 365 audit records are normalized across workloads to ensure that all records have a set of common fields (like the timestamp and user identifier). However, the most interesting data in these records is stored in a JSON-format element, which needs to be split into individual elements to make it easier to review and work with. To make things more complicated, the fields in the JSON element differ from workload to workload and (sometimes) action to action. In other words, you must know what you’re looking for.
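To make the point about the JSON element concrete, here is a minimal sketch that unpacks a made-up AuditData payload. The field names are the ones used later in this article; the values (including the user@contoso.com address and the file name) are invented for illustration, and real records hold more fields than this.

```powershell
# Hypothetical AuditData payload: field names match those used in this article,
# values are invented for illustration only.
$SampleAuditData = @'
{
  "CreationTime": "2018-11-21T14:48:36",
  "Operation": "ComplianceSettingChanged",
  "UserId": "user@contoso.com",
  "DestinationLabel": "Published",
  "SourceLabel": "Draft",
  "DestinationFileName": "Example.docx",
  "ItemType": "File"
}
'@

# ConvertFrom-Json turns the JSON string into an object with one property per field
$AuditData = $SampleAuditData | ConvertFrom-Json

# Listing the property names is a quick way to discover what a record type holds
$FieldNames = $AuditData.PSObject.Properties.Name
$FieldNames

# Individual fields are then available through dot notation
$Summary = "{0} changed label from '{1}' to '{2}'" -f $AuditData.UserId, $AuditData.SourceLabel, $AuditData.DestinationLabel
$Summary
```

Dumping the property names like this for one record of each operation you care about is probably the easiest way to learn which fields a workload writes.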

Looking for Classification Events

I have a simple routine to extract and process Office 365 audit records that I adjust depending on the kind of record I’m dealing with. This code looks for ComplianceSettingChanged events in a date range, extracts the audit data from each record, and stores the events as an array of custom objects, each built from an ordered hash table. The label applied to the document is stored in the Label field.

# Find label events (Search-UnifiedAuditLog needs an Exchange Online PowerShell session)
$Records = Search-UnifiedAuditLog -StartDate 22-Sep-2018 -EndDate 22-Nov-2018 -Operations ComplianceSettingChanged -ResultSize 1000
If (-not $Records) {
   Write-Host "No compliance policy label records found."
}
Else {
   Write-Host "Processing $(@($Records).Count) audit records..."
   # A list avoids rebuilding an array each time a report line is added
   $Report = [System.Collections.Generic.List[Object]]::new()
   ForEach ($Rec in $Records) {
      # The workload-specific details are held as JSON in the AuditData property
      $AuditData = ConvertFrom-Json $Rec.AuditData
      $ReportLine = [PSCustomObject][Ordered]@{
           TimeStamp   = $AuditData.CreationTime
           User        = $AuditData.UserId
           Action      = $AuditData.Operation
           Label       = $AuditData.DestinationLabel   # Label now on the item
           OrgLabel    = $AuditData.SourceLabel        # Previous label, if any
           Document    = $AuditData.DestinationFileName
           ItemType    = $AuditData.ItemType
           Site        = $AuditData.SiteURL
           Folder      = $AuditData.SourceRelativeURL }
      $Report.Add($ReportLine)
   }
}
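The search above uses hard-coded dates. If you would rather look back over a rolling window, something like the following sketch could build the parameters. The 90-day figure is my assumption, chosen to match the retention period for E3 licenses mentioned earlier, and $SearchParams is just a name picked for illustration. The actual search is commented out because it needs a connected Exchange Online PowerShell session.

```powershell
# Assumption: a rolling 90-day window that ends now
$EndDate   = Get-Date
$StartDate = $EndDate.AddDays(-90)

# Collect the parameters in a hash table so they can be splatted onto the cmdlet
$SearchParams = @{
    StartDate  = $StartDate
    EndDate    = $EndDate
    Operations = 'ComplianceSettingChanged'
    ResultSize = 1000
}

# Run this line once connected to Exchange Online:
# $Records = Search-UnifiedAuditLog @SearchParams

Write-Host "Search window runs from $StartDate to $EndDate"
```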

With the report populated, we can group, sort, and output the data as needed. For instance, to find out how many times individual labels are used to classify documents, we can do this:

$GroupData = $Report | Where-Object {$_.ItemType -eq "File"} | Group-Object -Property Label
$GroupData | Sort-Object Count -Descending | Select-Object @{n="Classification Label"; e={$_.Name}}, Count

Classification Label                 Count
--------------------                 -----
Office 365 for IT Pros eBook Content    47
Approved                                29
Draft                                   18
Published                               12
Audit Material                           9
                                         2
Contractual Information                  1

The filter removes records for retention labels applied to list items or folders. The blank entry in the list is for audit events logged when a user removes a retention label from a document. Another point to note about document classification events is that SharePoint can assign a default label to documents added to a library. When this happens, SharePoint inserts the label name into the UserId property for the audit event where a user principal name is normally found.
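The two quirks described above (blank labels for removals, and the label name turning up in UserId for default labels) can be separated out with a small amount of code. This is only a sketch under my own assumptions: the sample rows are invented, the Kind property and its three values are names I made up, and the test that User equals Label is a heuristic based on the behavior described here rather than anything documented.

```powershell
# Hypothetical sample rows shaped like the report objects built earlier
$SampleReport = @(
    [PSCustomObject]@{ User = 'user@contoso.com'; Label = 'Draft' }
    [PSCustomObject]@{ User = 'Office 365 for IT Pros eBook Content'; Label = 'Office 365 for IT Pros eBook Content' }
    [PSCustomObject]@{ User = 'user@contoso.com'; Label = '' }
)

$Classified = ForEach ($Row in $SampleReport) {
    If ([string]::IsNullOrEmpty($Row.Label)) {
        $Kind = 'Label removed'      # No destination label means the label was taken off
    } ElseIf ($Row.User -eq $Row.Label) {
        $Kind = 'Default label'      # Heuristic: UserId holds the label name, not a user
    } Else {
        $Kind = 'Applied by user'
    }
    [PSCustomObject]@{ User = $Row.User; Label = $Row.Label; Kind = $Kind }
}

# Count how many events fall into each category
$Classified | Group-Object -Property Kind | Select-Object Name, Count
```

Comparing User with Label is stricter than checking for a missing @ sign, which would also sweep up odd entries such as a GUID in the UserId field.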

To see who’s applying labels, we change the grouping like this:

$GroupData = $Report | Group-Object -Property User
$GroupData | Sort-Object Count -Descending | Select-Object @{n="User"; e={$_.Name}}, Count

User                                 Count
----                                 -----
[email protected]        99
Office 365 for IT Pros eBook Content    18
d300edac-7b62-4214-9dc5-ebf693d5179f     1

Clearly, I am very active in applying labels, while the other users in the tenant still aren’t in the habit of using them. The “Office 365 for IT Pros eBook Content” entry is for labels applied by SharePoint because a document library has a default label, as explained above. The solitary audit record for the GUID seems to be a glitch: SharePoint recorded that someone updated a document with a different classification but failed to capture the user name.

Finally, let’s find out when documents were classified with a different label. We can scan the report for records where a source label exists and output the details as follows. The sample record shows that the document classification was changed from Draft to Published.

$Report | Where-Object {$null -ne $_.OrgLabel} | Select-Object TimeStamp, Document, User, Site, Folder, @{n="Original classification"; e={$_.OrgLabel}}, @{n="New classification"; e={$_.Label}}

TimeStamp               : 2018-11-21T14:48:36
Document                : Support for Guest Access in Planner Rolls Out.docx
User                    : [email protected]
Site                    : https://office365itpros.sharepoint.com/sites/projects/
Folder                  : Shared Documents/Blog Posts
Original classification : Draft
New classification      : Published
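If there are many relabeling events, listing them one by one gets tedious, and it might be more useful to count how often each change of label occurs. Here is a minimal sketch that uses invented sample rows with the same OrgLabel and Label properties as the report; the Transition column name and the document names are mine.

```powershell
# Hypothetical sample rows; the last one has no original label, so it is not a relabel
$SampleReport = @(
    [PSCustomObject]@{ Document = 'A.docx'; OrgLabel = 'Draft'; Label = 'Published' }
    [PSCustomObject]@{ Document = 'B.docx'; OrgLabel = 'Draft'; Label = 'Published' }
    [PSCustomObject]@{ Document = 'C.docx'; OrgLabel = 'Draft'; Label = 'Approved' }
    [PSCustomObject]@{ Document = 'D.docx'; OrgLabel = $null;   Label = 'Draft' }
)

# Keep only records with an original label, then group on the "old -> new" pair
$Transitions = $SampleReport |
    Where-Object { -not [string]::IsNullOrEmpty($_.OrgLabel) } |
    Group-Object -Property { '{0} -> {1}' -f $_.OrgLabel, $_.Label } |
    Sort-Object -Property Count -Descending |
    Select-Object @{n='Transition'; e={$_.Name}}, Count

$Transitions
```

To try this against real data, replace $SampleReport with the report built from the audit records.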

Understanding Classifications

Some might ask why we should spend so much effort finding out who’s classifying documents and which labels are most popular. Well, if you create retention labels, you’re probably interested in knowing whether what you did was useful. And if you didn’t, you don’t care. But at least you know that Office 365 audit records will be there if you care to look. Which is always a good thing.

Follow Tony on Twitter @12Knocksinna.
