Information Protection Automatic Labelling

One of the features of Microsoft Information Protection is the ability for a user to classify a document using a sensitivity label. The user selects the desired label in the Office apps from a predefined list of sensitivity labels provided by the organisation.

This process requires users to understand the sensitivity labels and when to use them. Label tooltips can be configured to assist with this, but mistakes can still be made even when users understand the labels and their intended usage scenarios.

Automatic application of sensitivity labels

There may be scenarios where information needs to be labelled automatically rather than just relying on a user to select the correct sensitivity label. Microsoft Information Protection offers the ability to recommend that a specific sensitivity label is applied to a document when defined sensitive information is detected in that document. The policy to do this can also be configured to automatically apply a specific sensitivity label, not just recommend it.

For example, suppose that an organisation has created a custom sensitive information type in Office 365 to detect information containing the keyword ‘Project X’. It has also configured its sensitivity labels to recommend that the desired label is applied to documents containing Project X sensitive information. An example of Microsoft Word recommending a specific sensitivity label is shown below.

Note that, if a label is automatically applied or recommended to be applied, the user can accept or reject the label. In label rejection scenarios, it is possible to configure the sensitivity labels to require a justification from the user for lowering the label classification.
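The justification requirement described above can be pictured with a small Python model. This is only an illustrative sketch, not Microsoft's implementation or API: the label names, their ordering in LABEL_PRIORITY and the change_label function are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical label names, ordered from least (0) to most sensitive.
LABEL_PRIORITY = {
    "Public": 0,
    "General": 1,
    "Confidential": 2,
    "Highly Confidential": 3,
}


def change_label(current: str, requested: str,
                 justification: Optional[str] = None) -> str:
    """Return the new label, insisting on a justification for any downgrade."""
    is_downgrade = LABEL_PRIORITY[requested] < LABEL_PRIORITY[current]
    has_justification = bool(justification and justification.strip())
    if is_downgrade and not has_justification:
        raise ValueError("A justification is required to lower the classification")
    return requested
```

In this model, raising the classification or keeping it the same never needs a reason; only a move to a lower-priority label is blocked until the user supplies some text.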

Sensitivity labels and SharePoint Online

A new feature currently in public preview allows SharePoint Online to recognise sensitivity labels applied to Office files. For example, below you can see a file labelled as ‘Confidential’ stored in a SharePoint Online document library. Note the use of the sensitivity column and the tooltip showing that this file has been manually labelled.

The tooltip can also show when files have been automatically labelled. In the example below, a sensitivity label was automatically applied to the file based on specific sensitive information being detected in that file.

Automatic labelling at scale

Organisations are generating more and more data. If an organisation has many thousands or hundreds of thousands of documents, labelling these with an appropriate sensitivity label can be a challenge. There needs to be a way of labelling files at scale without the need for a user to open each file and label it.

Another new feature in public preview is auto-labelling policies, a service-side labelling feature for files stored at rest in SharePoint Online and OneDrive for Business. It also supports email in transit in Exchange Online. Microsoft states that this feature does not include emails at rest.

For on-premises data, Microsoft offers the Microsoft Information Protection Scanner. This may be the subject of another article. For now, this article will focus on data in SharePoint Online.

Auto-labelling policies: how to label at scale in SharePoint Online

Consider the scenario where a user creates or uploads a file containing sensitive financial information into a SharePoint Online document library. This file has not been manually labelled with a sensitivity label by the user.

An auto-labelling policy can be created to help protect this data. The policy configuration first requires the identification of the sensitive information that should be detected in files, together with any other conditions – such as the files having been shared externally, for example. If that sensitive information is found, the sensitivity label specified in the policy will be applied to those files. The policy configuration can be adjusted in a similar manner to Data Loss Prevention policies, via changes to accuracy and instance counts of the sensitive information as well as other conditions.
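To make the interplay of accuracy, instance count and additional conditions more concrete, here is a minimal Python sketch of how a single rule might be evaluated. It is a conceptual model only: the Detection and Rule classes, their field names and the rule_matches function are hypothetical and do not correspond to any Microsoft API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    info_type: str   # e.g. "Credit Card Number"
    confidence: int  # detection accuracy, 0-100


@dataclass
class Rule:
    info_type: str
    min_confidence: int   # ignore detections below this accuracy
    min_count: int        # minimum number of qualifying instances
    label: str            # sensitivity label to apply on a match
    require_external_sharing: bool = False  # optional extra condition


def rule_matches(rule: Rule, detections: List[Detection],
                 shared_externally: bool = False) -> bool:
    """Return True when the file meets every condition in the rule."""
    if rule.require_external_sharing and not shared_externally:
        return False
    hits = sum(
        1 for d in detections
        if d.info_type == rule.info_type and d.confidence >= rule.min_confidence
    )
    return hits >= rule.min_count
```

Raising min_confidence or min_count in this model makes the rule stricter and so reduces false positives, at the cost of missing files that contain only a few or only low-confidence instances.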

As with other elements of information protection, scenario and configuration planning is a key consideration. If the sensitivity label specified in the policy includes encryption, a large number of files could be encrypted automatically, so it is critical to plan policies carefully.

To help with understanding the effects of the policies, auto-labelling policies are first created to run in simulation mode. This is helpful to ensure that the label will be applied correctly. The simulation needs to run for 24 hours before the Turn on policy button becomes available:

Items matched by the policy can be viewed in the Matched items tab. An example is shown below where the spreadsheet containing credit card information that was uploaded by a user has been detected.

When there is confidence that the policy configuration is correct, it can be turned on to apply sensitivity labels to matching files. Let us look again at the scenario where a user has uploaded a file containing sensitive financial information into a SharePoint Online document library. After the policy has been turned on, the file has now been automatically labelled with the sensitivity label defined in the auto-labelling policy.
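The difference between simulation mode and a policy that has been turned on can be summarised in a short Python sketch. This is a simplified illustration rather than the service's behaviour: run_policy, its parameters and the sample file names are hypothetical, and the choice to label only files that currently have no label is a simplifying assumption made for this example.

```python
from typing import Callable, Dict, List, Optional


def run_policy(files: Dict[str, str],
               labels: Dict[str, Optional[str]],
               matches: Callable[[str], bool],
               label: str,
               simulation: bool = True) -> List[str]:
    """Return the names of matched files; apply the label only when not simulating."""
    matched = []
    for name, content in files.items():
        if not matches(content):
            continue
        matched.append(name)
        # Assumption for this sketch: only unlabelled files receive the label.
        if not simulation and labels.get(name) is None:
            labels[name] = label
    return matched


# Hypothetical library contents and a crude stand-in for sensitive-info detection.
files = {"cards.xlsx": "4111 1111 1111 1111", "menu.docx": "soup of the day"}
labels = {"cards.xlsx": None, "menu.docx": None}
has_card = lambda text: "4111" in text

preview = run_policy(files, labels, has_card, "Confidential", simulation=True)
applied = run_policy(files, labels, has_card, "Confidential", simulation=False)
```

After the first call, preview lists the matched file but labels is unchanged, which mirrors reviewing the Matched items tab. Only the second call, with simulation switched off, writes the label.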

Remember, features such as auto-labelling policies and sensitivity labels for Office files in SharePoint Online are in public preview – Microsoft warns that these are subject to change. When auto-labelling policies are made generally available, they will require careful planning if they are to be implemented. This article is aimed at giving a general introduction to the automatic labelling capabilities in Microsoft Information Protection. There are many additional factors to understand, particularly in auto-labelling policies, such as what happens to content that was already manually labelled and how multiple conditions are evaluated.