How the Incident Response Lifecycle Changes for Cloud

Published 11/13/2021

Written by Nicole Krenz, Website Project Manager, CSA.

Incident Response (IR) is a critical facet of any information security system. Most organizations have some sort of IR plan to govern how they will investigate an attack, but as the cloud presents distinct differences in both access to forensic data and governance, organizations must consider how their IR processes will change.

In this blog, see how the four phases of the Incident Response Lifecycle are affected by the cloud.

Phase 1: Preparation

When preparing for cloud incident response, here are some major considerations:

SLAs and governance: Incidents involving a public or hosted cloud provider require an understanding of the service level agreements (SLAs) and coordination with the provider. Depending on the relationship, you may not have direct points of contact and may be limited to whatever standard support offers.
IaaS/PaaS vs. SaaS: In multitenant environments, how can data specific to your cloud deployment be provided for investigation? Understand and document what data will be available in an incident for each major service.
“Cloud jump kit”: These are the tools you need to investigate in a remote location. Do you have the tools to collect logs and metadata from the cloud platform?
Architect the cloud environment: Ensure you have the proper configuration and architecture to support incident response for faster detection and investigation.
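
One way to document what data will be available for each major service is a small inventory that can be checked for gaps before an incident. The sketch below is illustrative only: the service names, the list of required data sources, and the ServiceLogProfile and find_gaps names are all assumptions, not part of any CSA tooling.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceLogProfile:
    service: str   # hypothetical service name
    model: str     # "IaaS", "PaaS", or "SaaS"
    available: set = field(default_factory=set)  # data sources you can actually obtain


# Data sources the IR plan expects to have (an assumed list; adapt to your plan).
REQUIRED_SOURCES = {"management_plane_logs", "network_flow_logs", "application_logs"}


def find_gaps(profiles):
    """Return {service: sorted missing sources} for services lacking required data."""
    gaps = {}
    for profile in profiles:
        missing = REQUIRED_SOURCES - profile.available
        if missing:
            gaps[profile.service] = sorted(missing)
    return gaps


# Hypothetical inventory: availability typically narrows from IaaS to SaaS.
inventory = [
    ServiceLogProfile("vm-fleet", "IaaS",
                      {"management_plane_logs", "network_flow_logs", "application_logs"}),
    ServiceLogProfile("managed-db", "PaaS",
                      {"management_plane_logs", "application_logs"}),
    ServiceLogProfile("crm-suite", "SaaS", {"application_logs"}),
]

gaps = find_gaps(inventory)
for service, missing in gaps.items():
    print(f"{service}: missing {', '.join(missing)}")
```

Each gap reported here is a candidate for an SLA conversation with the provider or for additional logging of your own.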

Phase 2: Detection and Analysis

Data sources for cloud incidents can differ from those used in incident response for traditional computing. Cloud platform logs are one option, but they are not universally available; ideally, they show all management-plane activity. In a serious incident, providers may have other logs that are not normally available to customers. Where there are gaps, you can instrument the technology stack with your own logging, which can run within instances, containers, and application code to capture the telemetry an investigation needs.
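
As a minimal sketch of instrumenting application code with your own logging, the example below uses Python's standard logging module to emit one JSON object per event. The logger name and the context fields (principal, source_ip, resource) are assumptions chosen for illustration; use whatever fields your investigators would need.

```python
import io
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        entry = {
            "time": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Copy optional investigation context passed through `extra=`.
        for key in ("principal", "source_ip", "resource"):
            if hasattr(record, key):
                entry[key] = getattr(record, key)
        return json.dumps(entry)


def build_logger(stream):
    """Create an audit logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("app.audit")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()  # stands in for stdout or a log-shipping agent
audit = build_logger(buffer)
audit.info(
    "object downloaded",
    extra={"principal": "svc-report", "source_ip": "203.0.113.7",
           "resource": "reports/q3.csv"},
)

event = json.loads(buffer.getvalue())
print(event["principal"], event["resource"])
```

Structured records like this are easier to search and correlate with provider logs than free-form text, which matters when the provider's own logs leave gaps.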

Forensics and investigative support will need to adapt beyond understanding changes to data sources. There is a greater need to automate many of the forensic or investigation processes in cloud environments because of their dynamic and higher-velocity nature. You can automate tasks as well as leverage the capabilities of the cloud platform to determine the extent of the potential compromise.
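
To illustrate the kind of task that can be automated, the sketch below walks a set of management-plane events and collects everything a suspect identity touched, following any identities it acted on. The event format, the action names, and the "user/" resource prefix are hypothetical; real provider logs will differ and need their own parsing.

```python
from collections import defaultdict

# Hypothetical, simplified management-plane events.
EVENTS = [
    {"time": "2021-11-01T10:02:00Z", "principal": "dev-key-17",
     "action": "CreateAccessKey", "resource": "user/backup-admin"},
    {"time": "2021-11-01T10:05:00Z", "principal": "ci-runner",
     "action": "StartInstance", "resource": "vm/build-3"},
    {"time": "2021-11-01T10:09:00Z", "principal": "backup-admin",
     "action": "CreateSnapshot", "resource": "disk/db-1"},
    {"time": "2021-11-01T10:11:00Z", "principal": "dev-key-17",
     "action": "ModifyFirewallRule", "resource": "network/prod"},
]


def scope_compromise(events, suspect):
    """Return {principal: [(action, resource), ...]} for the suspect and any
    identity the suspect (or an already tainted identity) acted on."""
    tainted = {suspect}
    touched = defaultdict(list)
    # Process in time order so pivots are only followed after they happen.
    for event in sorted(events, key=lambda e: e["time"]):
        if event["principal"] in tainted:
            touched[event["principal"]].append((event["action"], event["resource"]))
            # Acting on a user identity may mean the attacker now controls it.
            if event["resource"].startswith("user/"):
                tainted.add(event["resource"].split("/", 1)[1])
    return dict(touched)


scope = scope_compromise(EVENTS, "dev-key-17")
for principal, actions in scope.items():
    print(principal, actions)
```

In this sample data, activity by backup-admin is pulled into scope because the suspect key created credentials for that user, while the unrelated ci-runner activity is left out.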

Phase 3: Containment, Eradication and Recovery

First, ensure that the cloud management plane/metastructure is free of an attacker. This involves invoking break-glass procedures to access the root or master credentials for the cloud account to ensure that attacker activity isn’t being masked or hidden from lower-level administrator accounts.

The cloud often provides a lot more flexibility in this phase of the response, especially for IaaS. Software-defined infrastructure allows you to quickly rebuild from scratch in a clean environment. For more isolated attacks, inherent cloud characteristics—such as auto-scale groups, API calls for changing virtual network or machine configurations, and snapshots—can speed quarantine, eradication, and recovery processes.

These capabilities are not universal: With SaaS and some PaaS you may be very limited and will thus need to rely on the cloud provider.
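
For the IaaS case, a quarantine runbook built on provider API calls might look like the sketch below. The FakeCloud class is an in-memory stand-in and every method name on it is hypothetical; a real implementation would call the equivalent operations in your provider's SDK. The ordering shown (preserve evidence, then isolate) is one reasonable choice, not a prescribed procedure.

```python
class FakeCloud:
    """In-memory stand-in for a provider SDK. All method names are hypothetical."""

    def __init__(self):
        self.calls = []  # records each API call in order, for review

    def snapshot_disks(self, instance_id):
        self.calls.append(("snapshot_disks", instance_id))
        return [f"snap-{instance_id}-0"]

    def remove_from_scaling_group(self, instance_id):
        self.calls.append(("remove_from_scaling_group", instance_id))

    def set_network_policy(self, instance_id, policy):
        self.calls.append(("set_network_policy", instance_id, policy))

    def tag(self, instance_id, tags):
        self.calls.append(("tag", instance_id, tags))


def quarantine(cloud, instance_id, case_id):
    """Preserve evidence, then isolate the instance. Returns the snapshot IDs."""
    # 1. Snapshot disks first so evidence exists before anything changes.
    snapshots = cloud.snapshot_disks(instance_id)
    # 2. Detach from the auto-scale group so the instance is not replaced or
    #    terminated while under investigation.
    cloud.remove_from_scaling_group(instance_id)
    # 3. Cut network access to contain the attacker.
    cloud.set_network_policy(instance_id, "deny-all")
    # 4. Label the instance so other responders know its status.
    cloud.tag(instance_id, {"ir-case": case_id, "status": "quarantined"})
    return snapshots


cloud = FakeCloud()
snapshots = quarantine(cloud, "vm-42", "IR-001")
print(snapshots)
print([call[0] for call in cloud.calls])
```

Passing the client in as a parameter keeps the runbook testable against a stand-in like this one before it is ever pointed at a live account.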

Phase 4: Post-Mortem

Work with the internal response team and provider to figure out what worked and what didn’t, then pinpoint any areas for improvement. Pay particular attention to the limitations in the data collected and figure out how to address the issues moving forward.

Recommendations

Setting expectations around what the customer versus the provider does is an important aspect of incident response for cloud-based resources.
Cloud customers must set up proper communication paths with the provider.
Cloud customers must understand the content and format of data that the cloud provider will supply.
Cloud customers should embrace continuous and serverless monitoring of cloud-based resources to detect issues earlier.
Cloud-based applications should leverage automation and orchestration to streamline and accelerate the response.
For each cloud service provider, the approach to detecting and handling incidents must be planned and described in the enterprise incident response plan.
The SLA with each cloud service provider must guarantee support for the incident handling required for the effective execution of the enterprise incident response plan.
Incident response testing should be conducted at least annually, or whenever there are significant changes to the application architecture.

For a comprehensive look at incident response in the cloud, refer to CSA’s Cloud Incident Response Framework. To learn more about other domains of cloud security, check out our Security Guidance for Critical Areas of Focus in Cloud Computing.
