An Introduction to Data Detection and Response (DDR)
Published 03/20/2023
Originally published by Dig Security.
Written by Sharon Farber, Director of Product Marketing, Dig Security.
How long would it take you to respond to a cloud data breach?
For most organizations, the answer is ‘far too long’. According to a 2022 report by IBM, businesses took an average of 207 days to identify a breach and an additional 70 days to contain it. When it takes less than a minute for an authorized user to download an entire database’s worth of customer data, it’s clear that something is amiss.
The move to cloud infrastructure requires new approaches to data loss prevention (DLP). Monitoring data in real-time becomes much more complicated in the cloud; you can't install an agent on a database hosted by Amazon or Google, or place a proxy in front of thousands of data stores. The emerging solution is data detection and response (DDR), which is often seen as the new cloud DLP – a means to introduce dynamic monitoring to cloud environments.
In this guide, we'll cover the basics of DDR: why it's needed, how it compares to other solutions, and how you can get started.
What Is Data Detection and Response (DDR)?
DDR describes a set of technology-enabled solutions used to secure cloud data from exfiltration. DDR focuses on real-time monitoring, detection, and response. It provides dynamic monitoring on top of the static defense layers provided by CSPM and DSPM tools.
How DDR Solutions Work:
(We'll be referencing DSPM throughout this section - if you're not sure what that means, please check out our Big Guide to DSPM first.)
In a nutshell: DDR solutions use real-time log analytics to monitor cloud environments that store data, and detect data risks as soon as they occur.
In more detail:
- Today's organizations store data across a wide variety of cloud environments. For a midsize or larger enterprise, data will be found in multiple PaaS (e.g., Amazon RDS), IaaS (virtual machines running data stores), and DBaaS (e.g., Snowflake) tools.
- Monitoring every data action is not feasible in the age of big data, so DDR solutions include DSPM capabilities that lay the groundwork by discovering and classifying data assets. The DSPM process then highlights risks detected in those assets, such as unencrypted sensitive data or a programmatic data flow that violates data sovereignty rules. These risks are prioritized by DSPM and then remediated by the relevant data owners or IT.
- Once sensitive data assets have been mapped, the DDR solution starts monitoring activity related to these assets. This is done via the cloud-native logging available in every public cloud – the cloud provider will generate an event log for every query or read request.
- The DDR tool will parse the log in near-real time and apply a threat model to identify suspicious activity, such as data flowing to an external account.
- If this is a new risk, the DDR tool will issue an alert and suggest the best response. Such alerts are generally urgent and need to be acted upon immediately.
- DDR alerts are often consumed directly in a SIEM/SOAR solution so that they fit into existing security operations and are resolved faster.
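The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular DDR product works: the asset inventory, the trusted-account list, the `DataEvent` shape, and the `evaluate` function are all hypothetical names introduced here.

```python
from dataclasses import dataclass

# Hypothetical inventory produced by the DSPM step: asset id -> classification.
SENSITIVE_ASSETS = {
    "rds:customers-prod": "PII",
    "s3:payroll-exports": "financial",
}

# Hypothetical allow-list of accounts that are expected to receive data.
TRUSTED_ACCOUNTS = {"111111111111"}


@dataclass(frozen=True)
class DataEvent:
    """A simplified, provider-neutral view of one parsed log record."""
    asset_id: str
    actor: str
    action: str
    destination_account: str


def evaluate(event, seen_alerts):
    """Return an alert dict for a new risk, or None if nothing should fire."""
    classification = SENSITIVE_ASSETS.get(event.asset_id)
    if classification is None:
        return None  # not a mapped sensitive asset, so outside the monitoring surface
    if event.destination_account in TRUSTED_ACCOUNTS:
        return None  # data stays inside known accounts
    key = (event.asset_id, event.actor, event.destination_account)
    if key in seen_alerts:
        return None  # this risk has already been alerted on
    seen_alerts.add(key)
    return {
        "asset": event.asset_id,
        "classification": classification,
        "actor": event.actor,
        "reason": "sensitive data flowing to an external account",
        "suggested_response": "review the actor's access and the external copy",
    }


# Example: the first event raises an alert, an identical repeat does not.
seen = set()
event = DataEvent("rds:customers-prod", "alice", "export", "999999999999")
alert = evaluate(event, seen)
repeat = evaluate(event, seen)
```

The returned dictionary stands in for whatever payload would be forwarded to the alerting system; a real threat model would apply many rules rather than the single external-account check shown here.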
Policy Examples - How DDR Is Used in Practice
To better understand the types of incidents a DDR solution detects, let's look at a few examples we've seen from our customers. These policies allow organizations to detect risky behavior as well as actual exfiltration:
- Data sovereignty issues: Legislation from recent years creates obligations to store data in specific geographical areas (such as the EU or California). DDR can help detect when data is flowing to an unauthorized physical location, preventing compliance issues down the line.
- Assets moved to unencrypted / insecure storage: As data flows between databases and cloud storage, it can easily 'find itself' in a data store that is not as secure as it should be (often as a temporary workaround that is then forgotten about and becomes permanent). DDR would alert on this type of movement when it happens.
- Snapshots and shadow backups: Teams are under increasing pressure to do more with data; this can lead to a lot of shadow analytics happening outside of regular, authorized workflows. DDR helps find copies of data that are being stored or shared in ways that can lead to a breach.
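As a rough illustration of how the first two policies might be expressed, the sketch below checks one observed copy of a data asset against a region allow-list and an encryption requirement. The classification tag, the policy table, and the `check_copy` function are assumptions made for this example; the region names are ordinary AWS region identifiers used only as sample values.

```python
# Hypothetical policy: data classification tag -> regions where copies may live.
ALLOWED_REGIONS = {
    "eu-personal-data": {"eu-west-1", "eu-central-1"},
}


def check_copy(tag, destination_region, destination_encrypted):
    """Return a list of policy violations for one observed copy of a data asset."""
    violations = []
    allowed = ALLOWED_REGIONS.get(tag)
    # Sovereignty: only enforced for tags that have a region policy.
    if allowed is not None and destination_region not in allowed:
        violations.append(f"sovereignty: {tag} copied to {destination_region}")
    # Encryption: any copy of a classified asset should land in encrypted storage.
    if not destination_encrypted:
        violations.append("encryption: destination store is unencrypted")
    return violations


# Example: EU-tagged data copied to an unencrypted store outside the EU.
problems = check_copy("eu-personal-data", "us-east-1", destination_encrypted=False)
```

Returning a list rather than a boolean lets a single movement report several violations at once, which matches the way one forgotten workaround can break more than one rule.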
DDR vs CSPM and DSPM
How does DDR fit into the cloud data security landscape, and how does it differ from existing approaches?
- Cloud security posture management (CSPM) is about protecting the posture of the cloud infrastructure (such as overly-generous permissioning or misconfiguration). It doesn’t take the data into consideration – its context and how it flows across different cloud services.
- Data security posture management (DSPM) protects from the data outwards. DSPM adds a layer of data awareness: The DSPM tool would scan the actual data stored, detect assets that contain sensitive data (such as PII or access codes), classify the data, and assess the risk associated with it. This gives security teams a clearer picture of data risk and data flow, allowing them to prioritize the cloud assets where a breach would be most damaging.
While DSPM offers more granular and fine-tuned cloud data protection, both of these solutions are static and are focused on posture. They allow organizations to understand where the risk lies, but they offer little in terms of real-time incident response.
On the other hand, DDR is dynamic. It focuses on data events happening in real time, sending alerts, and giving security teams a chance to intervene before the damage is done (or while it is still minimal). It monitors at the level of individual events, whereas other solutions look at configurations and data at rest.
An Example Scenario:
An employee has access to a database containing customer data. This access is legitimately authorized due to the nature of the employee’s role at the company. However, when the employee decides to leave the company (and before notifying her managers of her intention to do so), she copies the entire database to her personal laptop in order to take it to the next company.
In this scenario, everything was fine permissions-wise, but the end result is a major exfiltration event. A well-calibrated threat model could detect that this export contained an unusual batch of data, or other irregularities in this event that should raise a red flag. The DDR tool would send an alert and provide full forensics – pinpointing the exact asset and actor involved in the exfiltration. This would save precious time and allow security teams to intervene before any real damage is done.
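One simple way a threat model might notice "an unusual batch of data" is to compare an export against the same user's own history. The sketch below is only an illustration of that idea: the `is_unusual_export` function, its inputs, and the threshold of three standard deviations are assumptions, and a production threat model would weigh many more signals.

```python
from statistics import mean, pstdev


def is_unusual_export(history, rows_exported, threshold=3.0):
    """Flag an export whose size is far above the actor's own baseline.

    history: row counts from the actor's earlier exports (hypothetical input).
    threshold: number of standard deviations treated as unusual (an assumption).
    """
    if len(history) < 2:
        return False  # not enough history to form a baseline
    baseline = mean(history)
    spread = pstdev(history)
    if spread == 0:
        # Every earlier export had the same size; anything larger stands out.
        return rows_exported > baseline
    return (rows_exported - baseline) / spread > threshold


# Example: an employee who usually exports about a hundred rows
# suddenly copies two million.
usual_exports = [100, 120, 90, 110, 105]
flagged = is_unusual_export(usual_exports, 2_000_000)
```

The check passes no judgment on permissions, which is the point of the scenario: the access was authorized, and only the behavior was out of pattern.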
The Importance of the Threat Model in DDR
As the example above demonstrates, the threat model is a crucial component of a DDR solution.
It's not enough to be able to access and parse cloud logs in real time. Every day, a massive volume of 'data events' (new data services onboarded, new data stores, backups and snapshots created) takes place in an enterprise's cloud account. If the DDR tool cannot identify which of these pose an actual risk, it will either miss critical incidents or overwhelm security teams – who already suffer from notification overload – with false positives.
Threat models are developed by cybersecurity researchers, taking into account:
- Attack patterns revealed in previous data breaches
- The specific weaknesses in every data service, and how these can be exploited by bad actors
- The unique footprint a security incident leaves in the cloud logs
While many DDR solutions are likely to appear in the next few years, the true differentiator between them will be the quality and accuracy of the threat model that powers their real-time detection.
Why DDR? Technical and Business Benefits
There is no shortage of cybersecurity tools on the current CISO agenda. Is another type of tool really needed, or will it contribute to further tooling bloat?
As we will explain in the next section, DDR provides mission-critical functionality that is missing from the existing cloud security stack. When agents are not feasible, you still need a way to monitor every activity that concerns your data. DDR protects your data from being exfiltrated or misused, and from being handled in ways that violate compliance rules. It also helps reduce operational overhead by integrating with SIEM/SOAR solutions, so teams can consume all alerts in one place.
The Need for Agentless DLP
Monitoring data assets in real time might seem obvious when you consider that data is the #1 target for cyberattacks (such as ransomware). But the transition to the public cloud, and the differences between IaaS and PaaS compared to traditional on-premises environments, have left many organizations without an adequate way to protect sensitive data.
Before the cloud, work was done mainly on personal computers, connected via an intranet to a server. Security teams could monitor traffic and activity by installing agents (specialized software components such as antivirus tools) on every device and endpoint that had access to organizational data.
This is no longer the case. To quote the 2010s-era meme, "there is no cloud – it's just someone else's computer." And when you don't own the computer – the machine that is running your cloud database or Kubernetes environment – you can't install an agent on it. Data loss prevention (DLP) becomes much trickier.
Hence, the industry gravitated toward static solutions geared towards improving the security posture of cloud data stores (CSPM, DSPM). These tools would try to minimize the attack surface by detecting misconfigurations and exposed data assets. But these methods are insufficient to stop a determined attacker.
When Static Defense Layers Are Not Enough: Lessons From a Breach
The 2018 Imperva breach started with an attacker getting access to a snapshot of an Amazon RDS database containing sensitive data. The attacker used an AWS API key stolen from a misconfigured compute instance that was publicly accessible. Could a similar event happen today?
While CSPM tools might have been able to identify the misconfiguration, and DSPM might have detected that sensitive data was stored on the misconfigured instance, this example also highlights the limitations of these approaches. Once the attacker has access that 'looks' legitimate, posture tools cannot identify the unusual behavior: exporting a snapshot of the database to an unknown device.
Indeed, the Imperva breach itself was only discovered ten months after the fact, via a third party. During this period, the company was not aware, and could not notify its users, that sensitive data had been leaked.
A DDR solution, which monitors the AWS account at the event log level, could have potentially identified such an attack in real time and alerted internal security teams – allowing them to respond immediately, rather than many months later.
Challenges Exacerbated By Data Tooling Sprawl, Microservices, and Multi-Cloud Environments
As organizations adopt the cloud ever more enthusiastically, the challenge of monitoring data assets becomes more pronounced. Rather than a single data warehouse or data platform, today's sensitive data can be found in dozens or hundreds of separate data stores.
Business and data teams are constantly on the lookout for new ways to drive profitability through analytics. This often means adopting many different tools and technologies – databases, serverless query engines, BI and data science platforms. On the dev side, microservice architecture splinters a codebase into dozens or hundreds of smaller services, each with an attached data asset.
The result of these processes is that data can move more freely than ever before, making it harder to keep track of who is accessing, altering, or exfiltrating data. Policies should prevent sensitive data from being copied without good reason, but the reality is inevitably messier. This is before mentioning hybrid and multi-cloud environments, which are growing increasingly popular and add significant complexity.
With data distributed across so many potential targets, the attack surface becomes very difficult to monitor. The cloud or service providers will offer some native solutions, but these will not cover tools offered by different vendors – especially when they are hosted in a different cloud.
A Single Threat Model Across Environments
DDR tools address this challenge by monitoring activity across cloud environments, as it is recorded in each cloud provider's centralized logging systems (for example, Amazon CloudTrail or Azure Monitor).
Consider, for example, a CloudTrail event in which a malicious actor has made a snapshot of a sensitive RDS database public. This means that the snapshot, along with any sensitive data it contains, can now be restored in any AWS account, potentially exposing that data to unauthorized access.
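The original log is not reproduced here. As a hedged illustration, the sketch below uses a hand-written, heavily abbreviated record of the kind CloudTrail emits for the RDS `ModifyDBSnapshotAttribute` call, where adding the value `all` to the `restore` attribute makes a snapshot public. The identifiers are placeholders, the `makes_snapshot_public` function is hypothetical, and the exact field casing should be verified against real logs before relying on it.

```python
# Hand-written, abbreviated record for illustration only. It is not the log
# from the original article, and real CloudTrail records carry many more fields.
sample_event = {
    "eventSource": "rds.amazonaws.com",
    "eventName": "ModifyDBSnapshotAttribute",
    "userIdentity": {"arn": "arn:aws:iam::123456789012:user/example-user"},
    "requestParameters": {
        "dBSnapshotIdentifier": "example-snapshot",
        "attributeName": "restore",
        "valuesToAdd": ["all"],
    },
}

# RDS API calls that change who may restore a snapshot.
SNAPSHOT_SHARING_EVENTS = {
    "ModifyDBSnapshotAttribute",
    "ModifyDBClusterSnapshotAttribute",
}


def makes_snapshot_public(event):
    """Return True if the record shows an RDS snapshot being shared with everyone."""
    if event.get("eventSource") != "rds.amazonaws.com":
        return False
    if event.get("eventName") not in SNAPSHOT_SHARING_EVENTS:
        return False
    params = event.get("requestParameters") or {}
    return (
        params.get("attributeName") == "restore"
        and "all" in (params.get("valuesToAdd") or [])
    )


is_public = makes_snapshot_public(sample_event)
```

Sharing a snapshot with one specific account would put that account's ID in `valuesToAdd` instead of `all`, so this check deliberately ignores it; a fuller rule might compare the ID against a list of trusted accounts.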
DDR allows organizations to monitor all cloud activity from one place, rather than building ad-hoc solutions per data service or relying on a patchwork of security tools. Once a threat model is deployed, it can be applied to every environment where sensitive data is stored.
DDR replaces the labor- and compute-intensive processes of collecting, parsing, and analyzing data from each database or VM. This helps security teams reduce their overhead and focus their efforts on managing strategic risk, rather than playing whack-a-mole with dozens or hundreds of potential vulnerabilities.
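The idea of one threat model applied across environments can be sketched as a normalization step followed by a single rule. Everything below is illustrative: the second provider's field names and its `snapshot.share` operation are invented stand-ins, and the shared vocabulary (`snapshot_sharing_changed`) is an assumption made for this example.

```python
# Map (provider, provider-specific action name) to one shared vocabulary.
ACTION_MAP = {
    ("aws", "ModifyDBSnapshotAttribute"): "snapshot_sharing_changed",
    ("other", "snapshot.share"): "snapshot_sharing_changed",  # hypothetical name
}


def from_cloudtrail(record):
    """Normalize an abbreviated CloudTrail-style record."""
    return {
        "provider": "aws",
        "actor": record["userIdentity"]["arn"],
        "action": ACTION_MAP.get(("aws", record["eventName"]), "unknown"),
    }


def from_other_cloud(record):
    """Normalize a record from a second provider (hypothetical field names)."""
    return {
        "provider": "other",
        "actor": record["caller"],
        "action": ACTION_MAP.get(("other", record["operation"]), "unknown"),
    }


def is_risky(event):
    """A single rule, written once against the normalized shape."""
    return event["action"] == "snapshot_sharing_changed"


aws_event = from_cloudtrail({
    "eventName": "ModifyDBSnapshotAttribute",
    "userIdentity": {"arn": "arn:aws:iam::123456789012:user/example-user"},
})
other_event = from_other_cloud({"caller": "user@example.com", "operation": "snapshot.share"})
results = [is_risky(aws_event), is_risky(other_event)]
```

Keeping provider-specific parsing in small adapter functions means a new environment needs only a new adapter and new entries in the action map, while the rules themselves stay unchanged.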
Getting Started With DDR
DDR is relatively new. As with any technology decision, it's important to understand how it would fit into your organization's way of doing things, and to map out the prerequisites for a successful implementation beforehand.
Here are a few pointers to look out for if you're thinking of investing in a DDR solution:
- Know where your sensitive data is: A strong DSPM foundation, which can identify and classify sensitive data assets, is essential. Monitoring every single query on every single data store is not feasible (unless you have infinite resources to spend). You want to narrow down the monitoring surface to the places where sensitive data is actually stored. This should be part of your DDR solution, rather than require separate tooling.
- Have a clear policy for dealing with incidents. DDR solutions will surface potential issues, but if you don't have a well-defined process for responding to them, you will only further overwhelm your security and engineering teams.
- Decide on an owner. Every enterprise has its own ways of dealing with data in general, and data security in particular. Make sure you know where DDR falls within your org chart. If it's with SOC teams, they need to understand data context and how to avoid production-breaking changes; if it's DataOps or DevOps, they need to know how to respond to a security incident.
Supporting Innovation Without Sacrificing Security
The cloud is here to stay, as are microservices and containers. As cybersecurity professionals, we can't prevent the organization from adopting technologies that accelerate innovation and give developers more flexibility. But we need to do everything we can to prevent the potential calamity of a data breach or ransomware attack.
DDR offers a critical aspect that was previously missing in the cloud security landscape: dynamic monitoring of complex and multi-cloud environments. Monitoring real-time data activity, in addition to static security posture, can help security teams catch incidents earlier, averting disastrous data loss or minimizing the damage it causes.
About the Author
Sharon Farber is the Director of Product Marketing at Dig Security and as such believes that good technology needs to be accompanied by simple words. A veteran in Cyber Security, Sharon has worked for several big software vendors, including Computer Associates, as well as small, nimble start-ups. She has held a variety of positions, some more technical than others. Sharon holds a B.S. degree in Computer Science and a Master's in Operations Research. Whenever she gets time, Sharon enjoys swimming in the Mediterranean.