
Crawl, Walk, and Run Your Way to More Effective Data Protection

Published 10/18/2023

Originally published by CXO REvolutionaries

Written by Daan Huybregts, CTO in Residence, Zscaler.

Leverage a CASB to minimize data leakage

By now, most security professionals recognize that, as data loss prevention (DLP) solutions go, you can’t do better than a cloud access security broker (CASB).

That’s because CASBs stand logically between all users and all cloud-hosted services and apps, and thus provide a defensive layer of visibility, analysis, and control capable of both detecting and blocking sensitive data loss anywhere it occurs.

But it’s one thing to have a CASB. It’s a very different thing to get the most out of it. Configuring one requires careful thought about how to create security policies that minimize not just false positives (believing you have a data loss problem when you really don’t) but also false negatives (believing data loss isn’t happening when it really is). Creating such policies can be tricky.

You can think of CASB policy creation and evolution in terms of the stages of human locomotion: crawl, walk, run.

Crawl is the most critical stage because it’s about creating apt policies based on your users, architecture, data classes, and other variables. Walk and run, meanwhile, are predictably about accelerating matters to establish more efficient (and therefore more secure) parameters with the smallest possible windows of vulnerability.


Crawl: Getting from nowhere to somewhere to somewhere good

Let’s consider the crawl phase, which, being the most crucial phase, receives the most attention here.

The first principle to understand is that you want to pursue a top-down design that takes into account where your users are going, what they’re doing, and the relationship between data types and their job roles.

Context is all-important. Divide your apps/services into logical categories and develop policies based on the context of those categories. As you get deeper into the weeds of specific apps/services, refine policies to take that more specific context into account.

For instance, at many companies, apps can be divided into sanctioned apps (universally used and approved for all), departmental apps (used and approved only for members of certain departments), and unsanctioned apps (available, but not approved for everyone or all business contexts).

One of the great strengths of an unusually advanced CASB is that it can both discover and track the use of an app portfolio automatically, thus granting security managers deep insight into who is using which apps and what kinds of data tend to be involved. Obtaining this kind of intel, whether the CASB delivers it or not, is an essential first step in the top-down process of policy definition, and the best way to divide apps into logical categories.

It’s also wise to consider ways to reduce the total app portfolio. Over time, most organizations acquire more apps than they need. Since each app presents some form of attack surface, a smaller portfolio will lead to a smaller attack surface, a smaller number of policies, and faster and more comprehensive security.

Individual users play a surprisingly large role, too. It may be that at this stage, you discover that in accordance with the 80/20 rule, a small minority of users creates the large majority of actual or potential data leakage/loss. This is certainly something you’ll want to recognize and address, ideally through both technology and user education.
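One rough way to check whether the 80/20 pattern holds in your environment is to count policy triggers per user and see how few users account for most of them. The sketch below assumes a hypothetical incident log with one user name per DLP trigger; the function name and the sample data are illustrative only.

```python
from collections import Counter

def users_covering_share(incident_users, share=0.8):
    """Return the smallest group of top users that accounts for `share` of incidents."""
    counts = Counter(incident_users)
    total = sum(counts.values())
    covered = 0
    top_users = []
    for user, n in counts.most_common():
        if covered >= share * total:
            break
        top_users.append(user)
        covered += n
    return top_users

# Hypothetical incident log: one entry per DLP policy trigger.
incidents = ["ana"] * 6 + ["raj"] * 2 + ["li", "sam"]
print(users_covering_share(incidents))  # ['ana', 'raj']
```

Here two of the four users account for 80% of the triggers, which is the kind of concentration worth addressing with targeted education.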


Evolve your policies to match the specific context

Now, let’s move on to the logical process of policy creation to illustrate how DLP policies can be refined based on context.

Suppose you want to recognize and block unauthorized transmission of credit card numbers, a critically important capability for both business purposes and regulatory compliance. How best to do that?

You might begin by looking for sequences of sixteen digits, since most credit card numbers are that long. But, of course, that won’t suffice on its own, because plenty of other sixteen-digit sequences exist too.

A logical added criterion would be for the policy to look for matching text patterns falling within a certain logical proximity of the number sequence — for instance, keywords such as credit or CCN, occurring within X hundred bytes of the number sequence.
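As a minimal sketch of that idea, the following combines a sixteen-digit pattern with a keyword proximity check. The keyword list, the 200-character window, and the function name are all assumptions made for illustration, and the window is measured in characters rather than bytes for simplicity. A real CASB's detection engine will be considerably more sophisticated.

```python
import re

# Sixteen digits, optionally grouped in fours with spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")
# Hypothetical keyword list; tune it to your own documents.
KEYWORDS = re.compile(r"credit|ccn|card", re.IGNORECASE)

def find_card_candidates(text, window=200):
    """Return sixteen-digit sequences with a keyword within `window` characters."""
    hits = []
    for match in CARD_PATTERN.finditer(text):
        start = max(0, match.start() - window)
        end = match.end() + window
        # Look at the text on both sides of the number, not the number itself.
        context = text[start:match.start()] + text[match.end():end]
        if KEYWORDS.search(context):
            hits.append(match.group())
    return hits

print(find_card_candidates("Credit card: 4111 1111 1111 1111"))
print(find_card_candidates("Order ID 1234567890123456 shipped"))
```

The first call reports the number because "Credit card" appears nearby; the second returns an empty list because no keyword appears near the sixteen digits.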

You could continue to add more criteria for even more refinement. Perhaps, for instance, you specify that this policy doesn’t apply to certain departments (or it does, but those departments can have up to X number of credit card numbers in a given file, whereas other departments are allowed Y or Z).
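Per-department thresholds like these can be expressed as a simple lookup. The department names and limits below are hypothetical placeholders, standing in for the X, Y, and Z values you would choose yourself.

```python
# Hypothetical per-department limits on card numbers allowed in a single file.
DEPARTMENT_LIMITS = {"finance": 50, "support": 5}
DEFAULT_LIMIT = 0  # every other department: no card numbers allowed

def violates_policy(department, card_count):
    """Return True if a file holds more card numbers than the department may have."""
    limit = DEPARTMENT_LIMITS.get(department, DEFAULT_LIMIT)
    return card_count > limit

print(violates_policy("finance", 10))    # False
print(violates_policy("marketing", 1))   # True
```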

Real-time user education is an option well worth factoring into policies too. If a user tries to upload a document that the CASB flagged as probably containing credit card numbers to Google Drive, what should your policy do?

It might shut that down, or it might (depending on context) ask the user if this is really intended, and if some better destination – such as a sanctioned drive operating in a private cloud – might be more appropriate.
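One way to picture this choice is as a small decision function that maps context to an action. The sketch below is an assumption-laden illustration: the three actions, the mapping from app category to action, and the function name are hypothetical, and a real policy would weigh more context than this.

```python
from enum import Enum

class Action(Enum):
    ALLOW = "allow"
    COACH = "coach"  # ask the user to confirm and suggest a sanctioned destination
    BLOCK = "block"

# Hypothetical mapping from app category to the response for sensitive uploads.
SENSITIVE_UPLOAD_ACTIONS = {
    "sanctioned": Action.ALLOW,
    "departmental": Action.COACH,
    "unsanctioned": Action.BLOCK,
}

def decide_action(app_category, sensitive_match):
    """Pick a response for an upload, defaulting to BLOCK for unknown categories."""
    if not sensitive_match:
        return Action.ALLOW
    return SENSITIVE_UPLOAD_ACTIONS.get(app_category, Action.BLOCK)

print(decide_action("departmental", True))  # Action.COACH
```

Defaulting unknown categories to the strictest action is a design choice that favors fewer false negatives; you may prefer coaching if false positives are the bigger concern.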

In every case, your goal is to minimize both false positives and false negatives. Ideally, if a user triggers a policy, that user should be aware of the reason for it and capable of resolving the issue themselves.

Another variable to consider: encrypted and password-protected files. Even an extraordinary CASB can’t read these. However, it can still recognize them and thus provide security departments with the insights needed to create policies based on their existence. As always, the specifics of context, on user/departmental/data/compliance levels, should be taken into account in deciding how to handle such files.
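To illustrate how a file can be recognized as encrypted without being readable, here is a small sketch for ZIP archives using Python's standard `zipfile` module. It covers ZIP only; other formats such as password-protected PDFs or office documents need their own checks. The function name is hypothetical.

```python
import io
import zipfile

def has_encrypted_entries(data: bytes) -> bool:
    """Return True if any entry in a ZIP archive is flagged as encrypted."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        # Bit 0 of the general-purpose flags marks an encrypted entry.
        return any(info.flag_bits & 0x1 for info in archive.infolist())

# Build a plain, unencrypted archive in memory to try it out.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("notes.txt", "nothing sensitive")
print(has_encrypted_entries(buffer.getvalue()))  # False
```

The check reads only the archive's directory metadata, so it works even though the contents of an encrypted entry cannot be inspected.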

There’s also CASB performance to consider. The timeframe the CASB requires to scan data will obviously depend on the volume of data and level of concurrent activity, but advanced CASBs should still be responsive even under intense workloads, and should try to scan data in a prioritized fashion that reflects commonplace business requirements. For instance, new data will typically be scanned first exactly because it’s new.
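The "newest first" idea can be sketched with a priority queue. This is only an illustration of the ordering, assuming hypothetical (name, modified timestamp) pairs; it says nothing about how any particular CASB actually schedules its scans.

```python
import heapq

def scan_order(files):
    """Yield file names newest first; `files` holds (name, modified_timestamp) pairs."""
    # heapq is a min-heap, so negate the timestamp to pop the newest file first.
    heap = [(-modified, name) for name, modified in files]
    heapq.heapify(heap)
    while heap:
        _, name = heapq.heappop(heap)
        yield name

pending = [("old.csv", 100), ("new.csv", 300), ("mid.csv", 200)]
print(list(scan_order(pending)))  # ['new.csv', 'mid.csv', 'old.csv']
```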

Try to develop a sense of how quickly the CASB responds under different conditions and workloads, and incorporate that insight into policy creation for best results. The more serious the consequences of a particular class of data leakage, the smaller the potential window of vulnerability should be.


Walk and run: Get great results, faster and faster

Once you have policies you’re largely satisfied with, how can you squeeze even more value from DLP and CASB?

This is where the walk and run phases of policy creation come in — they’re all about getting things done efficiently.

Typically this happens via cross-app, cross-service integration. Suppose, for instance, that your CASB notices mass data transactions happening via an authorized user on Continent A, and that same user, operating on Continent B, is attempting to upload newly-encrypted files to a cloud service operating on Continent C (where your company does no business).

Clearly, this is a problematic situation requiring the fastest possible response. Toward that end, it helps if the CASB integrates naturally with both an alerting service, to notify relevant managers, and a ticket generation service, to address the issue or even shut down the workflow automatically.
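The scenario above can be sketched as a simple rule that turns a user's recent events into response steps. Everything here is a hypothetical illustration: the `Event` fields, the set of business continents, and the step names stand in for whatever your CASB, alerting service, and ticketing service actually expose.

```python
from dataclasses import dataclass

@dataclass
class Event:
    user: str
    source_continent: str
    destination_continent: str
    encrypted_upload: bool

# Hypothetical: continents where the company actually does business.
BUSINESS_CONTINENTS = {"Europe", "North America"}

def assess(events):
    """Return suggested response steps for one user's events in a short time window."""
    sources = {e.source_continent for e in events}
    multiple_locations = len(sources) > 1
    risky_upload = any(
        e.encrypted_upload and e.destination_continent not in BUSINESS_CONTINENTS
        for e in events
    )
    steps = []
    if multiple_locations:
        steps.append("alert")            # same account active on two continents
    if risky_upload:
        steps.append("ticket")           # encrypted upload to a non-business region
    if multiple_locations and risky_upload:
        steps.append("suspend_session")  # both signals together: stop the workflow
    return steps

recent = [
    Event("jo", "Europe", "Europe", False),
    Event("jo", "Asia", "South America", True),
]
print(assess(recent))  # ['alert', 'ticket', 'suspend_session']
```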

For even more acceleration, consider choosing a CASB advanced enough to automatically discover third-party apps, determine the relevant users/privileges, resources, and data classes, and in many cases integrate with those apps to make the best use of that information.

Many organizations today have hundreds or even thousands of third-party apps that leverage productivity apps like Google Workspace for services and data. It’s enormously beneficial for internal security teams when their DLP and CASB can integrate with those apps for the fastest possible recognition of, and response to, situations involving potential data loss/leakage.
