The Big Guide to Data Security Posture Management (DSPM)

Published 03/31/2023

Originally published by Dig Security.

Written by Sharon Farber.

DSPM is a crucial piece of your cloud security puzzle. Learn what it is, why it matters, and how to choose the best solution to protect your sensitive data while growing your business.

What is DSPM?

Data security posture management (DSPM) is a set of practices and technologies used to assess, monitor, and reduce the risk related to data residing in cloud data stores – with a focus on multi-cloud environments. DSPM is data-centric, in that it looks at the context and content of the data being protected, placing the focus on sensitive records such as PII or medical records.

How Does DSPM Work?

While DSPM is gaining traction and recognition in the industry, including in a 2022 report by Gartner, it is still an emerging technology, and vendors and analysts still describe it in somewhat different ways. In most cases, a DSPM software solution includes the following key capabilities:

1. Data discovery – identifying where sensitive data is stored in cloud environments

DSPM tools provide visibility into your cloud data inventory – the various services where sensitive data is stored across IaaS, PaaS, and DBaaS deployments. This could include managed cloud warehouses such as Amazon Redshift, Google BigQuery, or Snowflake; unmanaged or semi-managed databases running on virtual machines; as well as object storage such as Amazon S3, Google Cloud Storage, or Azure Blob.

Object stores can pose significant risks due to their unstructured nature and the tendency to use them for backups, landing zones, replications, and raw data storage. A company might store both its public web assets and its most confidential customer information in cloud storage; it’s easy to see how misconfiguration or human error can cause a mix-up between the two. Virtual machines pose their own set of problems, as security teams might be completely unaware that they are being used to store sensitive data.

DSPM addresses this by identifying every data asset in the cloud account, and regularly scanning the content of the data in search of sensitive records. This maps the way sensitive data is being stored and processed, and provides the basis for policy enforcement and alerting.
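To make the scanning step more concrete, here is a minimal sketch in Python of what searching data content for sensitive records could look like. It is an illustration only, not how any particular DSPM product works: the two detectors (a regular expression for email addresses, and a card-number pattern confirmed with the Luhn checksum), the function names, and the asset names are all assumptions made for this example. Real tools rely on much richer classifiers and read directly from the data stores.

```python
import re

# Hypothetical detectors for illustration; real classifiers are far richer.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def scan_text(text: str) -> dict[str, int]:
    """Count candidate sensitive records found in a blob of text."""
    findings = {"email": len(EMAIL_RE.findall(text)), "credit_card": 0}
    for match in CARD_RE.finditer(text):
        if luhn_valid(match.group()):
            findings["credit_card"] += 1
    return findings


def scan_assets(assets: dict[str, str]) -> dict[str, dict[str, int]]:
    """Map each asset name to its findings, keeping only assets with hits."""
    report = {}
    for name, content in assets.items():
        findings = scan_text(content)
        if any(findings.values()):
            report[name] = findings
    return report
```

Calling `scan_assets` with a dictionary of asset names and their text content returns only the assets where something was found, which is the kind of inventory that later policy checks can build on.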

2. Classifying sensitive data to prioritize risks

There are different types of sensitive data, each posing a different level of risk and warranting a specific response. An organization might store IP addresses, PII data, credit card details, and access keys. None of these should fall into the wrong hands, but some pose a larger threat than others.

DSPM tools automatically classify each dataset in the cloud account(s), allowing security teams to prioritize policies and incident response on the most critical data assets. By prioritizing the assets that contain the highest-risk data, organizations can more effectively manage their data security posture and ensure that the appropriate security controls are in place according to the context of the data, as well as understand where an incident requires an immediate response.

For example, a dataset containing personally identifiable information (PII) related to specific named customers might be prioritized over a dataset containing aggregated, anonymized user data. If the security team identifies suspicious data flows related to the first type of asset, they would treat it as a high-priority issue to address; whereas if it’s the latter, it might not be as urgent.
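The prioritization logic described above can be sketched in a few lines of Python. Everything here is hypothetical: the risk weights, the `Dataset` fields, and the tie-breaking rule are assumptions chosen for illustration, and each organization would define its own.

```python
from dataclasses import dataclass

# Illustrative weights only; not taken from any standard or product.
RISK_WEIGHTS = {
    "access_key": 10,
    "credit_card": 9,
    "pii": 7,
    "ip_address": 3,
    "anonymized": 1,
}


@dataclass
class Dataset:
    name: str
    data_types: set[str]
    record_count: int


def risk_score(dataset: Dataset) -> int:
    """Score a dataset by the most sensitive data type it contains."""
    return max((RISK_WEIGHTS.get(t, 0) for t in dataset.data_types), default=0)


def prioritize(datasets: list[Dataset]) -> list[Dataset]:
    """Order datasets from highest to lowest risk; larger datasets win ties."""
    return sorted(
        datasets,
        key=lambda d: (risk_score(d), d.record_count),
        reverse=True,
    )
```

With these weights, a dataset of named customer PII sorts ahead of a much larger dataset of aggregated, anonymized usage data, which mirrors the example in the text.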

3. Static risk analysis related to sensitive data

Once the sensitive data has been detected and classified, DSPM tools help to enforce practices meant to enhance the overall security posture related to data access – such as permissions, encrypted storage, and user management.

Monitoring and managing static risk involves examining the various security configurations and access controls associated with data stores that hold sensitive information. DSPM solutions continuously assess the cloud environment for misconfigurations, improper access controls, and other vulnerabilities that can lead to data breaches or unauthorized access. By identifying and remediating these issues, organizations can significantly reduce the likelihood of a security incident and maintain a strong data security posture.

Using DSPM capabilities, security teams can audit and adjust user permissions, identify over-privileged accounts, and enforce role-based access controls (RBAC) to limit the potential attack surface. In addition, DSPM solutions can verify that data is encrypted both at rest and in transit, and that proper key management practices are in place to protect sensitive information from unauthorized access.
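As a rough illustration of these static checks, the Python sketch below evaluates a simplified description of a data store against a handful of rules. The `DataStoreConfig` fields, the `MAX_PRINCIPALS` threshold, and the wording of the findings are all assumptions made for this example; an actual tool would read the equivalent settings from each cloud provider's APIs.

```python
from dataclasses import dataclass


@dataclass
class DataStoreConfig:
    """Hypothetical, simplified view of one data store's settings."""
    name: str
    contains_sensitive_data: bool
    encrypted_at_rest: bool
    tls_required: bool
    publicly_accessible: bool
    principals_with_access: int


# Illustrative threshold; not taken from any standard.
MAX_PRINCIPALS = 10


def static_findings(store: DataStoreConfig) -> list[str]:
    """Return posture issues for a store; non-sensitive stores are skipped."""
    if not store.contains_sensitive_data:
        return []
    findings = []
    if not store.encrypted_at_rest:
        findings.append("sensitive data is not encrypted at rest")
    if not store.tls_required:
        findings.append("connections are not required to use TLS")
    if store.publicly_accessible:
        findings.append("store is publicly accessible")
    if store.principals_with_access > MAX_PRINCIPALS:
        findings.append("access granted to an unusually large number of principals")
    return findings
```

Skipping stores that hold no sensitive data is a deliberate choice in this sketch: it reflects the data-centric idea of spending attention where the classified data actually lives.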

Why DSPM? The shift to data-centric security

With all the cybersecurity tools currently available, it makes sense to ask: why do you need a whole new category of tool, just to protect data? Why not rely on the existing cybersecurity stack? To answer this question, it’s important to understand both the usage patterns of data in the modern enterprise, as well as the unique risks associated with it.

Data breaches are one of the main concerns keeping CISOs up at night. The combination of digital transformation processes, an increased appetite for data and analytics, and the proliferation of cloud data stores means every enterprise is storing more sensitive data than it can easily monitor or control.

Data has always been a prime target for hackers and criminals. However, ransomware attacks and other data breaches have increased in recent years, as have the associated costs. According to IBM, between 2021 and 2022, the total cost of a breach increased by 10%. At the same time, privacy and compliance requirements around sensitive data increase the overhead for DevOps and security teams. As a result, engineering organizations, already overstretched, must now tackle an increasingly challenging data security landscape.

Enterprises are realizing that sensitive data is putting them at risk, and existing solutions are not keeping up with the rapid adoption of cloud data infrastructure. This has given rise to the new practice of data security posture management (DSPM). DSPM addresses the core challenges that arise when sensitive data is stored across many cloud repositories. It provides organizations with a set of practical tools to discover and secure sensitive data. And it's designed for a reality where data lives in multiple clouds and dozens of services.

Permissions, Policies, and Endpoint Security Aren't Enough

Previous approaches to securing enterprise data were focused on securing network entry points (legacy solutions) or on managing permissions, tools, and user access (CSPM). However, neither of these approaches is sufficient for the cloud era:

No endpoint to secure: The majority of cloud data breaches never hit an endpoint. Attackers target services hosted on the public cloud, which might not be covered by enterprise VPN services. The perimeter has dissolved, and security efforts need to happen inside the cloud rather than at the entry point.
Impossible to track data usage after permissions have been granted: Due to the proliferation of data, the tendency to broaden access to datasets, and the way data is replicated for various analytic services, it's almost inevitable that sensitive data will end up where it doesn't belong. Permissions alone will struggle to cover every contingency, every 'quick fix' that becomes permanent, and every case where a developer accidentally pulls more data than they need, or forgets to delete a copy of the data sitting in a loosely-monitored S3 bucket.

DSPM offers a way around these limitations by scanning the data itself, in the cloud repositories where it's stored. This allows for more proactive monitoring, including shadow data that is being generated on the cloud or moved from one cloud service to another. Importantly, DSPM tools work regardless of whether the original access to the data was authorized.

The Business Logic Behind DSPM

The technical challenge of keeping cloud data secure can create financial risk for organizations that store sensitive data. Almost every company falls into this definition, but larger enterprises are at higher risk. They store more customer data and face more severe reputational and financial harms in case the data is compromised.

And the threat is far from theoretical. 2,690 ransomware attacks were reported in 2021 - a 92.7% rise from the previous year. On average, the cost of a data breach for enterprises was $4.35 million, according to IBM Security. The foremost reason to improve data security is to prevent a data breach from occurring – and to reduce the amount of data that is exposed if one does occur.

Additional drivers for increasing scrutiny around data assets include:

Compliance: Most businesses are impacted by some kind of regulation around data security. This could be data privacy laws such as GDPR and CCPA, legislation related to medical data such as HIPAA, or standards such as SOC 2 which can have a material impact on a company's ability to do business with certain entities. Complying with regulatory requirements, or collecting evidence in order to achieve compliance, will often require an organization to have a clear inventory of its sensitive data.
Mergers and acquisitions / divestitures: During the process of buying or selling companies, businesses need to have a clear picture of the data they hold. This can be part of due diligence and risk assessment processes; unsecured data might be a large enough risk to affect the buyer’s decision. On the other hand, the ability to monetize data without risking a privacy or security mishap can affect the price of the transaction.
Cost efficiencies: Improving data security posture can reduce costs on multiple fronts – both in terms of insurance against incidents such as ransomware attacks, and in savings driven by automation of manual processes such as policy checks, data classification, or periodic sampling and scanning of stored data.

DSPM and the broader cloud security landscape

While it’s clear that DSPM has a unique role to play, it’s still important to understand where it falls in the overall cybersecurity ecosystem, where it overlaps with other technologies, and what is unique about it:

DSPM vs CSPM

How does DSPM differ from cloud security posture management (CSPM) – which, until recently, was seen as the prevailing approach for protecting cloud data assets?

CSPM solutions focus on protecting the infrastructure itself, rather than the actual data. CSPM policies are geared towards reviewing data replication rules, fine-tuning access control, or finding weaknesses in cloud infrastructure or design – without scanning the data itself.

DSPM looks beyond the policy level at the content of the data. By scanning and classifying enterprise data, it allows an organization to see the true picture of where sensitive data is located and how it is being utilized. It also helps prioritize the long list of discovered issues and prevents alert fatigue (which can lead to important issues being ignored).


DSPM vs other cloud security tools

We've talked about the differences between DSPM and CSPM. But how does it compare to other types of cloud data security solutions, and where would it fit in an enterprise's overall cybersecurity suite?

Data loss prevention (DLP) tools were designed to prevent data exfiltration from the endpoint, and are irrelevant in the cloud because cloud data breaches don't reach the endpoint. DSPM tools that also offer real-time data detection and response (DDR) capabilities can be seen as a form of cloud DLP.
Cloud access security brokers (CASBs) help enforce security policies between data consumers and SaaS applications. However, they do not cover data after it is stored on IaaS, PaaS or DBaaS – which is where DSPM comes in.
Native solutions offered by public cloud vendors (AWS, Azure, Google) do not support multi-cloud environments and are often limited in coverage and functionality (for example, only covering one type of service or database). DSPM provides holistic coverage, including in multi-cloud environments.

Advantages of DSPM compared to other approaches

Better visibility into where sensitive data lives: DSPM solutions scan cloud data repositories, discover sensitive data, and classify it. This creates an accurate map and inventory of the organization's data assets. It helps to understand where sensitive data is stored, who is accessing the data, and where it is going.
Identify data risks: Static risk analysis identifies data that is not fully protected and prevents misuse of data assets. The types of checks performed here include ensuring that data is encrypted and that logging is enabled in any situation where sensitive data is being accessed.
Policy controls: DSPM solutions provide a policy engine that is supported by a deep data threat model. They can detect risks in real time as they appear, allowing for immediate remediation to stop a potential breach.

Why DSPM is Dominating the Conversation in 2023

In recent years, many new DSPM vendors have emerged to address the evolving needs of enterprise data security. Let’s try to understand what has changed, and why data-centric security has climbed to the top of the CISO agenda. (We’ve also covered this in our predictions for cloud data security in 2023.)

The public cloud has changed the way organizations work with data. Organizations no longer rely on monolithic databases or DevOps platforms; instead, developers leverage the cloud's elasticity to adopt a wide range of tools and microservices. These new paradigms give product and analytics teams more space to innovate and iterate quickly. At the same time, they create many potential risks when it comes to sensitive data.

Let's look a bit closer at three related trends, and see how they impact data security.

1. The Breakdown of the Enterprise Data Warehouse

Data used to reside in a single enterprise-wide data warehouse (EDW), such as the ones provided by Oracle or Teradata. Security teams had a well-defined attack surface to worry about: protecting data meant securing the data warehouse. Access to data was often through DBA teams, which could maintain strict oversight.

This is no longer the case. Enterprises are leaning towards democratizing data, expanding access to it, and using a variety of best-in-breed tools to tackle specific data challenges. This makes teams more data-driven, but also means the attack surface sprawls into dozens or hundreds of potential data stores.

Today, very few large organizations rely on a single EDW. The elasticity of the cloud makes it easy to spin up new services and retain larger amounts of raw data. It is much more common to see cloud architectures using lower-cost object storage (such as Amazon S3) to store raw data, which can then be processed in a wide range of databases or analytic services to satisfy various use cases in the organization.

For example, a financial services organization might be storing the raw transaction log in object storage, copying a subset of the data to Snowflake for analytics purposes, moving some logs into Elasticsearch for application troubleshooting, and giving data science teams access to the raw data to run Spark ETL and machine learning jobs. For each use case, data is copied and moved, adding another potential location where sensitive data might end up.

2. Microservice-Based Development

It's not just storage that has become distributed. Modern software engineering also favors breaking apart monolithic applications into microservices – smaller applications or pieces of code, which communicate via APIs. This is aided by containerized application development, which allows developers to deploy new environments in a few clicks.

Microservices give developers flexibility and free them from overreliance on DevOps. However, each microservice has data assets assigned to it, leading to a further proliferation of data copies with minimal oversight.

A great deal of modern engineering work revolves around processing or analyzing data. It's almost inevitable that developers will move or replicate sensitive data in the process of writing new code.

3. Multi-Cloud Architectures

The previous challenges are inherent to the way organizations use cloud infrastructure. The adoption of multi-cloud environments exacerbates them. The ease of moving data between services leads organizations to adopt tools from different cloud providers – again, in order to solve a specific data problem. For example, the same dataset might find itself in Amazon Aurora and Azure Synapse due to different teams needing to run a different SQL query, or in order to optimize costs.

As data moves between clouds, tracking lineage and classification becomes even more challenging. Native tools offered by the public cloud providers are limited to that specific cloud. Sensitive data has even more possibilities to seep into unmonitored corners. In these circumstances, creating effective oversight can be extremely difficult.

Improving DSPM solutions with dynamic monitoring

The DSPM capabilities we have covered so far focus mostly on static risk: finding sensitive data, classifying it, and reviewing the access controls and configurations that are applied to it.

However, to maintain an effective data security posture, it is essential to continually monitor and analyze data access patterns and user behavior. This is where data detection and response (DDR) comes into play - providing real-time monitoring and alerting capabilities to help security teams quickly detect and respond to potential threats or suspicious activities, while prioritizing the ones that put sensitive data at risk. By leveraging machine learning algorithms and advanced log analytics, these tools can identify anomalies in user behavior or access patterns that could indicate a compromised account or an insider threat.

For instance, if a user suddenly downloads a large volume of sensitive data or accesses resources outside of their typical work hours, the DDR tool can generate an alert for the security team to investigate further. This proactive approach to monitoring and anomaly detection helps organizations stay ahead of potential security incidents, minimizing the impact of breaches and ensuring that sensitive data remains protected.
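The two signals in this example, unusual volume and unusual timing, can be sketched in Python as follows. Note that this is a deliberately simple stand-in: instead of the machine learning models mentioned above, it uses a basic z-score against the user's past read volumes, and the `AccessEvent` fields, the work-hours window, and the threshold are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean, pstdev


@dataclass
class AccessEvent:
    user: str
    timestamp: datetime
    bytes_read: int


def detect_anomalies(
    history: list[int],
    event: AccessEvent,
    work_hours: tuple[int, int] = (8, 18),  # assumed typical hours, 24h clock
    z_threshold: float = 3.0,               # assumed sensitivity
) -> list[str]:
    """Return the reasons an access event looks suspicious, if any.

    `history` holds the user's past per-event read volumes in bytes.
    """
    reasons = []
    start, end = work_hours
    if not start <= event.timestamp.hour < end:
        reasons.append("access outside typical work hours")
    if len(history) >= 2:
        average = mean(history)
        spread = pstdev(history)
        # Flag reads that sit far above the user's usual volume.
        if spread > 0 and (event.bytes_read - average) / spread > z_threshold:
            reasons.append("unusually large data volume")
    return reasons
```

An empty list means nothing stood out; any returned reasons could be attached to the alert so that the security team sees why the event was flagged.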

By incorporating the key capabilities of data discovery, classification, static risk management, and continuous monitoring, a comprehensive DSPM solution can provide organizations with the visibility and control necessary to effectively manage their data security posture in multi-cloud environments. By choosing a DSPM solution that aligns with their unique requirements and risk tolerance, businesses can confidently secure their sensitive data while continuing to grow and innovate in the cloud.
