Top Ways to Find and Protect Sensitive Data in the Cloud

Published 06/06/2023

Originally published by Laminar.

Written by Michael Holburn, Solutions Engineer, Laminar.

Cloud data risk is more prevalent than ever. Laminar Labs scanned publicly facing cloud storage buckets and found personally identifiable information (PII) in 21% of these buckets – or one in five. Despite advancements in cloud infrastructure security, it’s clear that something is still amiss.

Cloud security solutions such as CSPM can detect publicly exposed storage buckets, but they cannot identify what those buckets contain or take action to protect the sensitive data stored in them.

In this post, we’ll cover some of the most common ways companies accidentally expose their sensitive data in the cloud, and how to remediate these issues by focusing on the data itself.

What are examples of sensitive data?

Most businesses handle some form of PII. Examples of PII include contact information, Social Security numbers, ID numbers, and other sensitive personal information. Some businesses also work with PHI (protected health information), meaning they process and store confidential medical data such as prescriptions and diagnoses. Others work with PCI (payment card industry) data, which is confidential data pertaining to credit or debit cards.

Exposing PII, PHI, PCI, or any other type of confidential data can have severe consequences. According to IBM’s Cost of a Data Breach study, the global average cost of a data breach is $4.35M, not to mention the loss of reputation and subsequent customer attrition.

How sensitive cloud data gets exposed

A few common missteps lead to cloud data exposure:

Accidental exposure

It’s common for a team member to accidentally leave a data asset exposed to the internet. Because cloud environments change so quickly, major exposure can happen fast and unintentionally. Laminar Labs’ study of public S3 buckets uncovered several examples of sensitive data exposed to the public, including:

  • An asset containing PII from people who used a third-party chatbot service on different websites
  • An asset containing loan details – name, loan amount, credit score, interest rates, etc.
  • An asset with first names, last names, Ethereum address and Bitcoin address information, and block card email addresses

This type of exposure often happens because organizations lack proper cloud data governance. If there aren’t policies or enforcement mechanisms to prevent improper data storage in the cloud, sensitive data will get exposed. After all, most team members don’t have the bandwidth to keep data security top of mind, especially since securing data is not their primary responsibility. Their job is to innovate and drive the organization’s business. Along the way, they accidentally store sensitive data in insecure locations, such as an S3 bucket that’s public by design. Realistically speaking, improper cloud data storage is inevitable.

Data movement

Even if sensitive data is properly secured at first, there’s always a risk that it could be moved or copied to an unsecured environment.

This problem is specific to the cloud because moving or copying information in an on-premises environment takes a lot of effort and requires extensive permission from gatekeepers. But in the cloud, virtually any staff member can copy or move data within seconds. Unregulated data movement leads to shadow data (i.e. unknown, unmanaged data) in the form of copied data, orphaned backups, unlisted embedded databases, or cached application logs.
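One simple way to think about shadow data is as the difference between what actually exists in the cloud account and what the organization has on record. The sketch below assumes you already have two lists: the data stores found by scanning the account and the data stores listed in a catalog. The function name, the store identifiers, and the input lists are all hypothetical, and a real discovery process would need to look inside the stores rather than just compare names.

```python
def find_shadow_data(discovered_stores, cataloged_stores):
    """Return stores that exist in the account but are missing from the catalog."""
    return sorted(set(discovered_stores) - set(cataloged_stores))


# Hypothetical identifiers, for illustration only.
discovered = [
    "s3://prod-customer-data",
    "s3://prod-customer-data-copy",
    "rds://orders-db",
    "rds://orders-db-backup-2022",
]
cataloged = ["s3://prod-customer-data", "rds://orders-db"]

for store in find_shadow_data(discovered, cataloged):
    print("Unmanaged data store:", store)
```

Anything this comparison returns is a candidate for review: it may be a forgotten copy, an orphaned backup, or a legitimate store that was never registered.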

Improper access management

This security misstep happens when extraneous users or third parties are granted access to sensitive data. For example, a user who has an overly permissive role or policy might copy sensitive data into an alternate location within the organization without anyone’s knowledge. If they leave the company, the sensitive data stays in that unauthorized location, and nobody will know about it.

The same applies to third-party access. Granting too much access to an external tool (e.g. a CI/CD tool or SaaS application) means that part of your risk mitigation is out of your hands. If the third-party tool is compromised, your data is also at risk.
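As a rough illustration of what checking for overly permissive access can look like, the sketch below scans a policy document written in the AWS IAM JSON policy format and flags "Allow" statements that use wildcard actions or grant access to everyone. The function name and the sample policy are hypothetical, and the checks are deliberately simplistic: they ignore conditions, resources, and "NotAction", so treat this as a starting point rather than a complete analysis.

```python
def _as_list(value):
    """Policy fields may hold a single string or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_overly_permissive_statements(policy):
    """Return (statement_index, reasons) for each risky Allow statement."""
    findings = []
    for index, stmt in enumerate(_as_list(policy.get("Statement"))):
        if stmt.get("Effect") != "Allow":
            continue
        reasons = []
        actions = _as_list(stmt.get("Action"))
        if any(a == "*" or a.endswith(":*") for a in actions):
            reasons.append("wildcard action")
        principal = stmt.get("Principal")
        if principal == "*":
            reasons.append("public principal")
        elif isinstance(principal, dict) and "*" in _as_list(principal.get("AWS")):
            reasons.append("public principal")
        if reasons:
            findings.append((index, reasons))
    return findings


# Hypothetical policy, for illustration only.
sample_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Principal": "*", "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Effect": "Allow", "Action": "s3:*", "Resource": "*"},
        {"Effect": "Deny", "Action": "*", "Resource": "*"},
    ],
}

for index, reasons in find_overly_permissive_statements(sample_policy):
    print(f"Statement {index}: {', '.join(reasons)}")
```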

3 best practices for protecting sensitive data in the cloud

1. Discover and classify all cloud data

You can only protect what you know about. So to combat the shadow data problem, your organization needs to precisely identify and classify sensitive data. You can keep tabs on your data by instituting a data catalog. It’s also important to set up a continuous data discovery and classification method to keep up with the dynamic nature of the cloud.
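To make classification more concrete, here is a minimal sketch of pattern-based detection for a few of the data types mentioned earlier: email addresses, US Social Security numbers, and payment card numbers. The patterns, label names, and function names are assumptions chosen for illustration. Regular expressions alone produce false positives and miss many formats, so the card check adds a Luhn checksum, and a production classifier would need considerably more context and validation.

```python
import re

# Illustrative patterns only; real-world formats vary widely.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card_number": re.compile(r"\b\d(?:[ -]?\d){12,18}\b"),
}


def luhn_valid(digits):
    """Check a string of digits against the Luhn checksum used by card numbers."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:  # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def classify(text):
    """Return the set of sensitive data labels detected in the text."""
    labels = set()
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            if label == "card_number":
                digits = re.sub(r"\D", "", match.group())
                if not luhn_valid(digits):
                    continue
            labels.add(label)
    return labels


print(classify("Contact jane@example.com, SSN 123-45-6789"))
print(classify("Card on file: 4111 1111 1111 1111"))
```

Running a check like this on a schedule against sampled objects, rather than once, is what keeps the classification in step with data that keeps moving.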

2. Secure and control your cloud data

You also need to secure and control your sensitive cloud data. The best way to do this is by instituting security policies and continuously validating data against your company’s predetermined guardrails. As this data gets validated regularly, any violations need to be prioritized for remediation based on sensitivity level, security posture, volume, and exposure.
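The prioritization described above can be sketched as a scoring function over the four factors: sensitivity level, security posture, volume, and exposure. Everything in this example is an assumption made for illustration, including the `Violation` fields, the weight tables, the use of encryption as a stand-in for security posture, and the way the factors are multiplied together. Each organization would tune these to its own policies.

```python
import math
from dataclasses import dataclass

# Hypothetical weights; adjust to match your own guardrails.
SENSITIVITY_WEIGHT = {"low": 1, "medium": 2, "high": 3}
EXPOSURE_WEIGHT = {"private": 1, "internal": 2, "public": 4}


@dataclass
class Violation:
    asset: str
    sensitivity: str   # "low", "medium", or "high"
    exposure: str      # "private", "internal", or "public"
    record_count: int  # rough volume of sensitive records
    encrypted: bool    # simple stand-in for security posture


def risk_score(violation):
    """Combine the four factors into a single number; higher means more urgent."""
    volume_factor = 1 + math.log10(max(violation.record_count, 1))
    posture_factor = 1.0 if violation.encrypted else 1.5
    return (
        SENSITIVITY_WEIGHT[violation.sensitivity]
        * EXPOSURE_WEIGHT[violation.exposure]
        * volume_factor
        * posture_factor
    )


def prioritize(violations):
    """Return violations ordered from most to least urgent."""
    return sorted(violations, key=risk_score, reverse=True)


violations = [
    Violation("app-logs", "low", "private", 100, True),
    Violation("orders-db", "high", "internal", 1_000_000, True),
    Violation("public-bucket", "high", "public", 100_000, False),
]

for v in prioritize(violations):
    print(f"{v.asset}: {risk_score(v):.1f}")
```

The logarithm keeps very large stores from drowning out everything else, so a small but public and unencrypted asset can still rise to the top of the queue.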

3. Remediate and monitor without interrupting data flow

Lastly, you need to remediate sensitive data exposure without interrupting data flow.

This process enables your team to protect data in the cloud without compromising agility. Some remediation steps involve implementing data security best practices, such as enabling encryption. Others are general data hygiene practices, such as removing unused sensitive data from your environment.

In addition, organizations need to set up measures for continuous data monitoring. Data security isn’t a one-and-done process. Instead, you must continuously monitor your crown jewels against stated security posture and regulations, regardless of where they move in the cloud.

How a DSPM helps

DSPM (data security posture management) is an emerging cloud data security strategy that makes best practices for data security in the cloud a reality. It does this by discovering all cloud data, classifying it by data type and sensitivity level, detecting and alerting on data security policy violations, prioritizing those alerts, and providing remediation playbooks.
