Cloud Data Access – From Chaos to Governance

Blog Article Published: 08/16/2023

Originally published by Dig Security.

Written by Benny Rofman.

Controlling access to sensitive data is the bread and butter of any cybersecurity strategy. However, the cloud adds many complications on the road to least privilege. Below we delve into the realities of data access governance in today’s multi-cloud architectures, why IAM isn’t enough, and how a DSPM-based solution can streamline operations and help you focus on protecting the data that matters.


What is Data Access Governance?

Data access governance (DAG) is the process of implementing policies, procedures, and controls to manage access to organizational data. When correctly implemented, DAG ensures that only authorized users and systems can access, manipulate, or share sensitive information, in accordance with data security and compliance requirements.


How Cloud Architecture Challenges Centralized Access Governance Models

The principle of least privilege states that any process, program, or user must only be able to access the resources and information necessary for their legitimate purpose. In the context of data, entities should be granted only the minimum permissions to view or modify records that are required to perform their tasks. For example, an analyst looking at customer purchase patterns does not need access to user email addresses if they can rely on anonymized user IDs.
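To make the analyst example concrete, here is a minimal Python sketch of a least-privilege view over purchase records. It assumes a keyed hash is an acceptable way to derive anonymized user IDs; the key, field names, and sample records are hypothetical and not taken from any particular product.

```python
import hashlib
import hmac

# Hypothetical secret held by the data platform, never shared with analysts.
PSEUDONYM_KEY = b"example-key-rotate-me"


def pseudonymize(email: str, key: bytes = PSEUDONYM_KEY) -> str:
    """Derive a stable ID from an email address using a keyed hash (HMAC-SHA256)."""
    normalized = email.strip().lower().encode("utf-8")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()[:16]


def analyst_view(purchases: list[dict]) -> list[dict]:
    """Keep only the fields an analyst needs; the raw email is dropped."""
    return [
        {
            "user_id": pseudonymize(row["email"]),
            "item": row["item"],
            "amount": row["amount"],
        }
        for row in purchases
    ]


purchases = [
    {"email": "ana@example.com", "item": "book", "amount": 12.5},
    {"email": "Ana@example.com ", "item": "pen", "amount": 2.0},
    {"email": "raj@example.com", "item": "lamp", "amount": 30.0},
]
view = analyst_view(purchases)
```

The analyst can still group purchases by `user_id`, because the same customer always maps to the same ID, but the view never exposes an email address.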

When enterprises stored their data in monolithic, on-premises data warehouse architectures, this was easier to achieve. The data platforms were tightly ruled by sysadmins. To prevent unauthorized access, they could rely on a combination of network security solutions (firewall and enterprise VPN), role-based access control (RBAC) at the database level, and agent-based monitoring of user activity and audit trails.

The cloud removed the hard coupling between data systems and IT, and made business teams much more agile. However, the drive to become more data-driven means many more individuals and systems need access to data. And while the basic approach of least privilege hasn’t changed, it gets very difficult to implement when there are hundreds of users and dozens of roles, all of whom legitimately need some level of access to some resources.

Challenges include:

  • Decentralization: Cloud infrastructure is elastic and easy to provision. Different business units manage their own set of resources, which may or may not share a common governance layer. Consistent access control policies are difficult to enforce.
  • Multiple access pathways and complex dependencies: Cloud services can be exposed via multiple interfaces and APIs, and often form an intricate web of dependencies. Deciphering which permissions are legitimately needed requires significant technical effort.
  • IAM complexity across multiple vendors: Organizations deal with multiple cloud vendors – each with its own set of user identities, authentication mechanisms, and access policies.
  • Fragmented, API-based architectures: Businesses are gravitating towards data lakes and the ‘modern data stack’, which means building an entire ecosystem of API-connected tools for data ingestion, integration, storage, and compute. Many of these tools can potentially access sensitive data, piling on additional requirements for permissions management.

Source: LakeFS: The State of Data Engineering 2023

Let’s look at a few examples to understand how these challenges can quickly become major security issues.


3 Common Risk Scenarios Related to Data Access Governance

  1. Lack of accountability: Lack of proper oversight can result in unauthorized or over-privileged access going unnoticed. Without a clear audit trail of user activity and permissions granted, security incidents become much more difficult to triage and investigate.
    An example: A large enterprise discovers its customer details have been leaked to the dark web. Due to complicated cross-account permissions, it takes weeks to determine that a sales engineer’s account was compromised and responsible for the breach. In the meantime, the company is unable to identify other compromised resources.
  2. Compliance violations: Regulatory and industry frameworks such as GDPR and PCI-DSS explicitly require certain controls related to data access. Having a clear picture of who has access to which dataset can also play a part in evidence collection ahead of an audit.
    An example: A financial services company is subject to PCI-DSS compliance, which requires strict control over access to payment card information. Due to mishandled permissions, a developer copies the cardholder data to a non-compliant staging environment. The violation is discovered during a routine audit.
  3. Data exfiltration: Uncontrolled access to sensitive data significantly increases the risk of data loss, especially through insider threats or compromised credentials. This can lead to extreme unpleasantness such as ransomware attacks or customer data leaks.
    An example: A contractor needs temporary access to a database containing customer details, but their access is not revoked after the project is done. This goes unnoticed until six months later, when the contractor downloads the database and sells it to a competitor.
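
The contractor scenario above comes down to a grant that outlived its purpose. As a rough sketch, assuming access grants can be exported as simple records with an optional expiry date (the `Grant` type and the sample data are hypothetical), a periodic check might look like this:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Grant:
    principal: str
    resource: str
    expires_on: Optional[date]  # None means the grant never expires


def grants_to_revoke(grants: list[Grant], today: date) -> list[Grant]:
    """Grants whose expiry date has passed but which are still listed."""
    return [g for g in grants if g.expires_on is not None and g.expires_on < today]


def grants_without_expiry(grants: list[Grant]) -> list[Grant]:
    """Open-ended grants, which deserve a periodic manual review."""
    return [g for g in grants if g.expires_on is None]


grants = [
    Grant("contractor-7", "customers_db", date(2023, 3, 31)),
    Grant("analyst-ana", "customers_db", date(2023, 12, 31)),
    Grant("svc-reporting", "customers_db", None),
]
today = date(2023, 8, 16)
expired = grants_to_revoke(grants, today)
open_ended = grants_without_expiry(grants)
```

Here `expired` contains only the contractor's grant, and `open_ended` surfaces the service account that was never given an end date.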

Why IAM and CIEM Won’t Solve Your Cloud Data Access Problems

At this point you might be thinking: this sounds like a permissions issue. Wouldn’t you solve those with an IAM tool? Unfortunately, trying to prevent unauthorized access on the configuration level often ends up as a game of whack-a-mole, bogging down security teams with dozens of daily alerts and no real way to prioritize incidents.

Identity and access management (IAM) tools are used to manage users, permissions, and access controls within an organization. On a practical level, IAM is infamously complex and difficult to manage. But in any case, it’s not built to solve data access issues. IAM solutions treat data stores as just another cloud resource; they are not aware of which data is stored, how data flows between services, or more granular permission levels (such as access to certain datasets within a database).

Cloud infrastructure entitlement management (CIEM) is a more recent approach designed to simplify and consolidate IAM for cloud environments, and to reduce permissions sprawl. However, CIEM is also not data-centric, and not designed to address the basic challenge of understanding and prioritizing sensitive data.

What’s important to understand is that data access challenges are not a normal permissions issue:

  1. Data stores have unique access patterns. They typically require broader and more fluid access than other cloud resources, because many people have a legitimate purpose for wanting access to data.
  2. Not all data is created equal. While any data exposure is undesirable, the risks associated with sensitive data (PII, PCI, etc.) are far greater. Businesses would benefit most from a targeted approach that focuses on preventing unauthorized access to these data assets.
  3. High stakes. Sensitive data is the most common target for modern cyber attacks, and when it is compromised, the financial and reputational consequences can be dire.

Hence, modern enterprises are looking beyond IAM and towards a data-centric approach to cloud data access.


The DSPM Solution to DAG

What does a data-centric solution to DAG look like? The answer starts with data security posture management (DSPM).

DSPM is a set of practices and technologies used to discover, classify, and prioritize sensitive data stored in cloud environments. By building on the insights provided by DSPM, businesses can gain immediate visibility into who has access to which cloud data store, and prioritize the issues that put sensitive data at risk.

Start with an inventory of your sensitive data: Before you can prioritize and implement effective data access governance, you need to know what data you hold and where it resides. DSPM tools provide a comprehensive inventory of your sensitive data, which enables you to identify the most critical assets that require a higher level of attention.
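
To illustrate the idea of an inventory, the following Python sketch labels data stores based on sampled values. It is a deliberately simplified assumption of how classification could work (two regular expressions plus a Luhn checksum), not a description of how any DSPM product does it; the store names and labels are hypothetical.

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(candidate: str) -> bool:
    """Luhn checksum, used to filter out digit runs that are not card numbers."""
    digits = [int(c) for c in candidate if c.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def classify(text: str) -> set[str]:
    """Return the sensitivity labels detected in a single sampled value."""
    labels = set()
    if EMAIL_RE.search(text):
        labels.add("PII:email")
    if any(luhn_valid(m.group()) for m in CARD_RE.finditer(text)):
        labels.add("PCI:card_number")
    return labels


def build_inventory(samples: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each data store to the union of labels found in its samples."""
    inventory = {}
    for store, values in samples.items():
        labels = set()
        for value in values:
            labels |= classify(value)
        inventory[store] = labels
    return inventory


samples = {
    "crm-exports": ["ana@example.com, gold tier"],
    "billing-archive": ["card 4111 1111 1111 1111 exp 12/30"],
    "public-assets": ["logo.png", "order 1234567890123456"],
}
inventory = build_inventory(samples)
```

Stores with a non-empty label set are the ones whose access should be reviewed first; the order number in `public-assets` is ignored because it fails the checksum.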

Visualize and map data access: DSPM tools can show a visual mapping of data access in your organization, which helps you understand the relationships between users, roles, resources, and data stores.

Prioritize high-risk issues: Once you have a clear picture of your sensitive data assets and can easily understand who has access to what, you can identify and prioritize issues that need your immediate attention. For example, you might want to look at:

  • Cross-account access to sensitive data assets: There are occasions where an external account needs access to a cloud resource, such as an ETL tool reading data from Azure Blob Storage. However, many exfiltration and compliance issues start with permissions that are kept active despite no longer being needed. DSPM can help you keep track of cross-account permissions and continuously question whether a specific permission is needed.
  • Non-SSO users with access to sensitive data: Users who are not authenticated via single sign-on (SSO) pose a risk as their access may be more difficult to manage and secure. DSPM lets you identify users with hard-coded credentials who have access to sensitive data. You can then verify that this access is justifiable and monitored.
  • Multiple permission paths to a dataset: Complex permission architectures can result in multiple permission paths to a single dataset, including by the same principal. This increases the likelihood of overprivileged or unauthorized access: an admin might remove privileges granted through one account, without being aware of the other. Identifying and consolidating access controls reduces risks and simplifies your permissions architecture.
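
The three checks above can be sketched against a simple access graph. This is a minimal illustration that assumes permissions have already been exported as "grants access to" edges between principals, roles, groups, and datasets; every name in it (`Principal`, `acct-main`, `customers_db`, and so on) is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    name: str
    account: str
    sso: bool


def all_paths(edges, start, target, _path=None):
    """Enumerate every simple path from start to target (depth-first search)."""
    path = (_path or []) + [start]
    if start == target:
        return [path]
    paths = []
    for nxt in edges.get(start, []):
        if nxt not in path:  # skip nodes already on the path to avoid cycles
            paths.extend(all_paths(edges, nxt, target, path))
    return paths


def findings(principals, edges, sensitive, home_account):
    """Flag cross-account, non-SSO, and multi-path access to sensitive datasets."""
    out = []
    for p in principals:
        for dataset in sorted(sensitive):
            paths = all_paths(edges, p.name, dataset)
            if not paths:
                continue
            if p.account != home_account:
                out.append((p.name, dataset, "cross-account access"))
            if not p.sso:
                out.append((p.name, dataset, "non-SSO access"))
            if len(paths) > 1:
                out.append((p.name, dataset, f"{len(paths)} permission paths"))
    return out


edges = {
    "dana": ["role:analyst", "group:finance"],
    "role:analyst": ["customers_db"],
    "group:finance": ["role:billing-reader"],
    "role:billing-reader": ["customers_db"],
    "etl-bot": ["role:analyst"],
    "svc-legacy": ["customers_db"],
}
principals = [
    Principal("dana", "acct-main", sso=True),
    Principal("etl-bot", "acct-vendor", sso=True),
    Principal("svc-legacy", "acct-main", sso=False),
]
result = findings(principals, edges, {"customers_db"}, home_account="acct-main")
```

In this sample, `dana` reaches `customers_db` both through a role and through a group, so removing one of the two would not actually revoke her access.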

DAG is Necessary but not Sufficient for Comprehensive Data Security

Adopting a data-centric approach to DAG should be part of a broader cloud data security strategy – which also includes data detection and response for real-time monitoring, malware analysis, and cloud DLP.
