DLP Analytics has the potential to transform the way that you protect your organization from data loss by identifying and closing data security risks. Let’s take a closer look and explore some of the advantages of enabling this great feature.
What is DLP Analytics?
At a high level, DLP Analytics utilizes machine learning to:
- Analyse your organization’s information sharing activities,
- Identify your organization’s highest sharing risks, and
- Generate new DLP policies or updates to existing policies to help mitigate risks.
To understand why this is important, we first need to consider how most organizations deploy DLP. It’s very common to turn on one or more Purview DLP policy templates, maybe add a policy or two targeting items with sensitivity labels applied and then call it a day. This might provide a great starting point but should in no way be considered a mature approach to data loss prevention.
Most templated DLP policies utilize sensitive information types (SITs), which are pattern or keyword-based methods of identifying information. They can do a great job in certain situations but are far from foolproof. SITs may fail to identify data when its format changes, for example when the issuing authority changes the format of a license or passport number. Effective use of SITs also requires a high level of administrator understanding, as combinations of SITs are often required to successfully match items. For example, logic to effectively identify credit card numbers should look for a credit card number AND full name AND verification AND date. Just looking for a number will lead to false positives, as the same digit pattern may align with other data types. Establishing requirements and logic for such matching can be quite tricky and sometimes leads to misconfigurations.
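To make the false positive problem concrete, here is a minimal Python sketch of pattern-plus-evidence matching. It is not how Purview implements its SITs; the regular expression, the keyword list, the 100-character window and the function names are all assumptions chosen for illustration. It shows the general idea: a number only earns a high-confidence match when it passes a checksum and supporting evidence appears nearby.

```python
import re

# Candidate card numbers: 13 to 19 digits, optionally separated by spaces or hyphens.
# (Illustrative pattern, not Purview's actual definition.)
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")

# Supporting evidence keywords (hypothetical list for this sketch).
EVIDENCE_KEYWORDS = ("cvv", "cvc", "security code", "expiry", "expires", "cardholder")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in the string pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    # Walk right to left, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_card_matches(text: str, window: int = 100) -> list[dict]:
    """Find card-like numbers and grade each one by nearby supporting evidence."""
    lowered = text.lower()
    matches = []
    for found in CARD_PATTERN.finditer(text):
        if not luhn_valid(found.group()):
            continue  # The pattern matched, but the checksum rules it out.
        start = max(0, found.start() - window)
        end = min(len(text), found.end() + window)
        nearby = lowered[start:end]
        evidence = [kw for kw in EVIDENCE_KEYWORDS if kw in nearby]
        matches.append({
            "value": found.group(),
            "evidence": evidence,
            "confidence": "high" if evidence else "low",
        })
    return matches
```

Calling `find_card_matches` on a sentence that mentions a cardholder, an expiry date and a CVV next to a checksum-valid number yields a high-confidence match, while the same number sitting alone in a tracking message is graded low. Tuning that trade-off by hand is exactly the kind of work that tends to lead to misconfigurations.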
Microsoft Purview offers more advanced methods of data classification, including trainable classifiers. Trainable classifiers use machine learning to build a prediction model and then apply it to information. Rather than just pattern matching, they can be used to gauge sentiment or the intent of an item or piece of information, which can result in more accurate matching. Organizations can build their own classifiers by feeding them example items that align with the type of information that they are looking to detect.
Organizations do not need to build their own classifiers to utilize these capabilities as Microsoft provides a long list of pretrained classifiers that are already tuned and ready for use. Some great examples of these are classifiers aligning with health/medical forms, financial statements, regulatory collusion or stock manipulation. There are also some great behavioural classifiers that can detect customer complaints, discrimination, threat or harassment.
So, how is this relevant to DLP Analytics? Well, to put it simply, if DLP Analytics can see that your current policies are not doing the job, it will recommend new policies or improvements to existing policies. It does this by looking through recent sharing activities.
- If it sees data being shared that most organizations would typically block, it will provide you with a recommended policy to stop the data loss.
- If it sees files aligned with a trainable classifier that are being shared, and a policy that looks to have the intention of blocking a SIT that aligns with the classifier, it will suggest adding the classifier to the SIT-based policy. For example, a DLP policy might block medical information via a SIT while medical information identified by the health forms classifier is still actively being shared.
DLP Analytics provides recommendations aligning with these two categories of mitigations through ‘Policy Improvement’ and ‘Risk Spotlighting’ tiles that are available from the DLP Overview page of the Purview console.
Following the recommendations provided by DLP Analytics has the potential to close major holes in your organization’s data security configuration. Consider the following DLP enhancement workflow:
- A sensitive file is shared.
- Existing DLP policies will apply to the sensitive file.
- The sharing activity is written to the audit log.
- DLP Analytics can see the sharing activity in the audit log and may determine that the file aligns with a sensitive trainable classifier.
- DLP Analytics can provide recommendations back to administrators to enhance the existing DLP policy by adding an aligned classifier.
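The workflow above can be sketched in a few lines of Python. This is a toy model only: the `SharingEvent` and `DlpPolicy` shapes, the `ALIGNED_CLASSIFIER` mapping, the placeholder SIT and classifier names and the `min_hits` threshold are all hypothetical, and the real service relies on machine learning rather than a simple count.

```python
from dataclasses import dataclass, field


@dataclass
class SharingEvent:
    """One audit-log entry for a shared file (hypothetical shape)."""
    file_name: str
    matched_classifiers: set[str] = field(default_factory=set)
    blocked: bool = False  # True if an existing DLP policy stopped the share


@dataclass
class DlpPolicy:
    """A simplified DLP policy: a name plus the SITs and classifiers it uses."""
    name: str
    sits: set[str]
    classifiers: set[str] = field(default_factory=set)


# Hypothetical mapping from a SIT to the classifier covering the same intent.
ALIGNED_CLASSIFIER = {"medical-sit": "health-forms-classifier"}


def recommend_improvements(policies, events, min_hits=3):
    """Suggest adding an aligned classifier when unblocked shares keep matching it."""
    recommendations = []
    for policy in policies:
        candidates = {ALIGNED_CLASSIFIER[s] for s in policy.sits if s in ALIGNED_CLASSIFIER}
        # Only consider classifiers the policy does not already use.
        for classifier in sorted(candidates - policy.classifiers):
            hits = [e for e in events
                    if classifier in e.matched_classifiers and not e.blocked]
            if len(hits) >= min_hits:
                recommendations.append(
                    f"Add classifier '{classifier}' to policy '{policy.name}' "
                    f"({len(hits)} unblocked shares)"
                )
    return recommendations
```

The point of the sketch is the shape of the loop: sharing activity that slipped past a SIT-based policy, but that an aligned classifier recognizes, becomes the evidence behind a policy improvement recommendation.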
Why enable DLP Analytics?
There are some great reasons why organizations should consider turning this capability on.
1. Industry best practice configuration
In addition to policy improvements, DLP Analytics will also recommend DLP policies that are considered industry best practices. This means that if you’re missing fundamental policies that are important to your organization’s data security, then the solution will be able to identify this and advise on how to rectify the situation. As both the Purview DLP solution and types of information being used evolve, DLP Analytics will be able to provide fresh recommendations to keep your organization on the front foot.
2. One-click DLP Policy creation to address data security risks
If risky data security anomalies are identified, then DLP Analytics will generate DLP policies that can be enabled to mitigate the identified risks. Generated policies are presented to administrators who can tweak them to provide the desired level of business impact. For example, policies could be configured to block all aligning activity, require business justification, show a policy tip, or just monitor/report on the activity.
3. DLP Policy Validation
DLP Analytics provides a method of validating your organization’s existing DLP configuration. For example, organizations that have made a commitment to protecting employee health information may have configured a set of DLP policies utilizing sensitive information types to prevent exfiltration of this type of information. In this situation, if DLP Analytics were to identify that a high volume of items aligning with classifiers for healthcare or health/medical forms were being shared, then it would be a good indication that the implemented DLP policies may need refinement.
4. DLP Policy Improvement
DLP Analytics can identify potential improvements to deployed DLP policies. For example, a policy that is protecting information via a set of sensitive information types could be generating false positives. DLP Analytics could investigate recent detections and recommend that the policy be supplemented with an aligned trainable classifier to help reduce false positives.
Once enabled, DLP Analytics will provide administrators with recommendations every 7 days based on activities over the last 30 days. The service will facilitate the continual improvement of DLP configurations through repeated cycles of validating the DLP configuration and providing improvement actions.
Enablement Considerations
It’s hard to find a reason not to enable DLP Analytics as it should provide positive outcomes for all, or at the very least help to build confidence in your current DLP configuration. However, it is worth pausing to consider licensing requirements.
DLP Analytics is an E5 capability (at the time of writing) and will only provide recommendations to customers with E5 licensing available.
Organizations with split licensing scenarios (for example, a mix of E3 and E5) should consider the impact of some of the recommendations that DLP Analytics might provide.
The use of advanced classifiers, such as trainable classifiers, in Data Loss Prevention policies is currently an E5, E5 Compliance, or E5 Information Protection & Governance feature.
If all users are licensed for these capabilities (e.g., all users are E5), then you should have no problems enabling advanced classifiers in new or existing DLP policies.
However, if only some of your users have the licensing required to enable DLP Analytics, the service might assume that all of your users are licensed at the same level and recommend E5 capabilities on policies intended for users with lower licensing levels.
If licensing is a concern to you, then my advice is to stop and think before following DLP Analytics recommendations. For example:
- If a new policy is recommended that includes advanced classifiers, scope it to licensed users only.
- If an enhancement is recommended to a policy that is deployed to users of mixed license types, take a copy of the policy and maintain two versions of it: one for your E3 users and one for your E5 users, with the E5 version enhanced via the advice that DLP Analytics provides.
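As a rough illustration of the scoping advice above, the following Python sketch partitions users into two groups: those who can receive the enhanced policy and those who should stay on the baseline version. The license names come from the licensing note earlier in this section, while the function name, the input shape and the "enhanced"/"baseline" labels are assumptions of mine. How the resulting lists map onto actual policy scoping in your tenant is outside the sketch.

```python
# Licenses that, per this article at the time of writing, cover advanced
# classifiers in DLP policies. Treat this set as an assumption to verify.
ADVANCED_CLASSIFIER_LICENSES = {
    "E5",
    "E5 Compliance",
    "E5 Information Protection & Governance",
}


def split_policy_scope(user_licenses: dict[str, set[str]]) -> dict[str, list[str]]:
    """Partition users into scopes for the enhanced and the baseline policy copy."""
    scopes: dict[str, list[str]] = {"enhanced": [], "baseline": []}
    for user, licenses in sorted(user_licenses.items()):
        # A user qualifies if they hold at least one qualifying license.
        key = "enhanced" if licenses & ADVANCED_CLASSIFIER_LICENSES else "baseline"
        scopes[key].append(user)
    return scopes
```

Users holding any of the qualifying licenses land in the "enhanced" scope, and everyone else stays on the SIT-only copy of the policy.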
For more information on the licensing of Microsoft Purview Data Loss Prevention features, see:
- Get started with trainable classifiers | Microsoft Learn
- Microsoft 365 E5 | M365 Maps
- Microsoft 365 Compliance Licensing Comparison
How to enable DLP Analytics?
DLP Analytics can be enabled via the Microsoft Purview portal under Data Loss Prevention > Overview.
Once enabled, the service will take 7 days to generate insights and recommendations, which will then appear in the Overview tab of the Data Loss Prevention solution.
For more information on DLP Analytics, see Get Started with data loss prevention analytics | Microsoft Learn.