How to Use Kubernetes Audit Logs to Identify Potential Security Issues
This blog was originally published by ARMO.
Written by Amir Kaushansky, ARMO.
Audit logging involves recording transactions and system events, making it an invaluable tool for regulatory compliance, digital forensics, and information security. In a typical Kubernetes ecosystem, auditing involves providing chronological, activity-relevant records documenting events and actions in a cluster. Modern logging tools come with aggregation and analytical functionalities so that teams can use log data to mitigate security threats.
In this post, I’ll explore Kubernetes audit logging and its importance in cloud-native security. I’ll also cover best practices for maintaining secure audit files.
An audit trail is a time-stamped record of events and system changes that provides a comprehensive history of activities performed by users, workloads, and cluster services. Audit logs are crucial for Kubernetes security because they document which activities took place (some of which might affect an application’s behavior), when each operation happened, which components were called, and which users were responsible. By developing effective audit logging, you establish a foundation for accountability, security, and compliance.
While audit logs can be used to analyze trends over time, organizations most commonly use them to monitor the performance of a Kubernetes cluster and to administer its security.
Audit logs fundamentally help in the following areas:
Audit logs record each activity that happens in the cluster. For each activity, the log adds metadata such as the IP address from which the action originated, the user agent, and so on. Using the audit log and its metadata, some solutions can look for Indicators of Attack (IoA) and enforce policies. For example, you can create a policy that allows changes to the production cluster only from the organization's approved IP addresses; any action from outside this approved list raises an alert.
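The allowlist policy described above can be sketched in Python. This is only an illustration, not a Kubernetes feature: the `APPROVED_NETWORKS` ranges and both helper functions are hypothetical, and the event shape assumes the `sourceIPs` and `verb` fields of an `audit.k8s.io/v1` event.

```python
import ipaddress

# Hypothetical approved ranges (documentation-only address blocks).
APPROVED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

# Verbs that change cluster state.
MUTATING_VERBS = {"create", "update", "patch", "delete", "deletecollection"}


def is_from_approved_ip(event: dict) -> bool:
    """Return True only if every source IP of the event is in an approved range."""
    source_ips = event.get("sourceIPs", [])
    if not source_ips:
        return False  # assumption: a missing source address is treated as suspicious
    return all(
        any(ipaddress.ip_address(ip) in net for net in APPROVED_NETWORKS)
        for ip in source_ips
    )


def should_alert(event: dict) -> bool:
    """Alert on state-changing requests that come from outside the allowlist."""
    return event.get("verb") in MUTATING_VERBS and not is_from_approved_ip(event)
```

Because `sourceIPs` can also list intermediate proxies, requiring every address to be approved is a deliberately strict choice; a real policy may want to inspect only the originating address.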
Security Incident Response and Investigation
Audit logging provides deep insight into a cluster’s actions and events, so it’s easy to reconstruct a problem if there’s a security incident. Teams can utilize audit trails to understand why, when, and how components of a cluster underperformed during operations. By understanding the conditions that led to a security incident, security professionals can create enhanced monitoring, damage assessment, and remediation strategies.
Organizations can use audit logs to stay in compliance with regulations, such as PCI DSS, SOC2, HIPAA, or GDPR. Since the audit trail serves as an official record of system activity, organizations can use it to find and close gaps, or share the logs with security researchers and auditors for deeper analysis. Some regulatory bodies also accept audit logs as proof of compliance.
Through forensic analysis and real-time alerts, log files help system administrators and security professionals identify malicious user actions and behavior. Audit trails can also flag unusual user and bot activity in real time, which helps with intrusion detection as the activity occurs. Some solutions use UEBA (User and Entity Behavior Analytics) to identify abnormal activity, for example a new user creating a large number of objects, or the DevOps manager logging in to the system from an unusual location.
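As a very rough stand-in for the "new user creating a lot of objects" example, the sketch below counts `create` requests per user and reports users above a fixed threshold. The function name and the threshold are hypothetical, and real UEBA products build per-user baselines rather than using a single cutoff.

```python
from collections import Counter


def users_with_create_bursts(events, threshold=50):
    """Return the users whose number of "create" requests exceeds threshold."""
    creates = Counter(
        e.get("user", {}).get("username", "<unknown>")
        for e in events
        if e.get("verb") == "create"
    )
    return sorted(user for user, count in creates.items() if count > threshold)
```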
Kubernetes audit records are generated by the kube-apiserver component. Every client request generates an audit event, which is processed using an audit policy then written to the backend. Below is an outline of important fields covered in the audit log.
The audit log primarily records transactions between the Kubernetes API server and end-users. As a server processes client requests, it sends certain information to the log file, including:
- Source IP
- Time of the request
- Decision (allow/deny)
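To make these fields concrete, here is a minimal sketch that pulls them out of one line of a JSON-lines audit log. It assumes the `audit.k8s.io/v1` event layout, where the authorization decision is stored in the `authorization.k8s.io/decision` annotation; the `summarize_event` helper itself is hypothetical.

```python
import json


def summarize_event(line: str) -> dict:
    """Extract the who/when/where/decision fields from one audit log line."""
    event = json.loads(line)
    annotations = event.get("annotations", {})
    return {
        "source_ips": event.get("sourceIPs", []),
        "time": event.get("requestReceivedTimestamp"),
        "user": event.get("user", {}).get("username"),
        "verb": event.get("verb"),
        "uri": event.get("requestURI"),
        # The annotation may be absent, in which case this is None.
        "decision": annotations.get("authorization.k8s.io/decision"),
    }
```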
Audit logs capture important account activities and information, such as:
- Successful authentication attempts
- Failed login attempts
- Use of application privileges
- Changes to the account (e.g., deletion, creation, and privilege escalation)
- The audit log records the original request that the user was asking the API server to perform.
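The activities listed above can be approximated from two parts of an audit event: the HTTP status in `responseStatus.code` and the object reference in `objectRef`. The sketch below is one possible classification, not an official mapping; both helper names are hypothetical, and treating any write to the `rbac.authorization.k8s.io` group as a privilege change is a simplifying assumption.

```python
RBAC_GROUP = "rbac.authorization.k8s.io"
MUTATING_VERBS = {"create", "update", "patch", "delete", "deletecollection"}


def classify_outcome(event: dict) -> str:
    """Map the response status code of an audit event to a coarse outcome."""
    code = (event.get("responseStatus") or {}).get("code")
    if code is None:
        return "unknown"  # e.g. an event logged before the response was sent
    if code == 401:
        return "authentication_failed"
    if code == 403:
        return "authorization_denied"
    if 200 <= code < 300:
        return "success"
    return "other"


def changes_privileges(event: dict) -> bool:
    """Return True for write requests against RBAC objects (roles, bindings)."""
    ref = event.get("objectRef") or {}
    return ref.get("apiGroup") == RBAC_GROUP and event.get("verb") in MUTATING_VERBS
```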
In Kubernetes, you need to pass the --audit-policy-file flag to the API server for an audit policy to be enforced. A Policy is an object that defines which events are logged and what data the records should include. When an event is processed, Kubernetes compares it against the list of rules in order, and the first matching rule sets the audit level of the event. A sample audit policy specification would look similar to the following:
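    apiVersion: audit.k8s.io/v1
    kind: Policy
    # Skip the RequestReceived stage for all requests.
    omitStages:
      - "RequestReceived"
    rules:
      # Log pod changes with request and response bodies.
      - level: RequestResponse
        resources:
        - group: ""
          resources: ["pods"]
      # Log pod log and status subresources at the Metadata level.
      - level: Metadata
        resources:
        - group: ""
          resources: ["pods/log", "pods/status"]
      # Don't log requests to the configmap named "controller-leader".
      - level: None
        resources:
        - group: ""
          resources: ["configmaps"]
          resourceNames: ["controller-leader"]
      # Don't log watch requests by kube-proxy on endpoints or services.
      - level: None
        users: ["system:kube-proxy"]
        verbs: ["watch"]
        resources:
        - group: ""
          resources: ["endpoints", "services"]
      # Don't log authenticated requests to these non-resource URLs.
      - level: None
        userGroups: ["system:authenticated"]
        nonResourceURLs:
        - "/api*"
        - "/version"
      # Log the request body of configmap changes in kube-system.
      - level: Request
        resources:
        - group: ""
          resources: ["configmaps"]
        namespaces: ["kube-system"]
      # Log secrets and configmaps everywhere else at the Metadata level.
      - level: Metadata
        resources:
        - group: ""
          resources: ["secrets", "configmaps"]
      # Log all other core and extensions resources at the Request level.
      - level: Request
        resources:
        - group: ""
        - group: "extensions"
      # Catch-all: log everything else at the Metadata level.
      - level: Metadata
        omitStages:
          - "RequestReceived"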
Each rule's level field takes one of the following values:
- None - don't log events that match this rule.
- Metadata - log request metadata (requesting user, timestamp, resource, verb, etc.) but not request or response body.
- Request - log event metadata and request body but not response body. This does not apply for non-resource requests.
- RequestResponse - log event metadata, request and response bodies. This does not apply for non-resource requests.
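The way rules and levels interact can be modeled in a few lines of Python. This is a simplified sketch of first-match evaluation, not the API server's implementation: it only looks at `users`, `verbs`, `namespaces`, and `resources`, ignores wildcards, subresources, `userGroups`, and `nonResourceURLs`, and the request dictionary format is hypothetical.

```python
def rule_matches(rule: dict, req: dict) -> bool:
    """Return True if the request satisfies every field the rule specifies.

    An omitted or empty field in the rule matches any request.
    """
    if rule.get("users") and req.get("user") not in rule["users"]:
        return False
    if rule.get("verbs") and req.get("verb") not in rule["verbs"]:
        return False
    if rule.get("namespaces") and req.get("namespace") not in rule["namespaces"]:
        return False
    if rule.get("resources"):
        # At least one group entry must match; an entry without a
        # "resources" list covers every resource in that API group.
        return any(
            g.get("group", "") == req.get("group", "")
            and (not g.get("resources") or req.get("resource") in g["resources"])
            for g in rule["resources"]
        )
    return True


def audit_level(rules: list, req: dict) -> str:
    """Return the level of the first matching rule, evaluated in order."""
    for rule in rules:
        if rule_matches(rule, req):
            return rule["level"]
    return "None"  # in this sketch, a request that matches no rule is not logged
```

Because the first match wins, specific rules belong above general ones. In the sample policy, the `kube-system` configmap rule only takes effect because it appears before the broader `secrets`/`configmaps` rule.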
Kubernetes gives two options for saving the audit log:
- Log (writes events to the local filesystem)
- Webhook (sends events to a 3rd party using HTTP)
If you are saving the audit log to the local filesystem, you need to pass the following to the API server flags:
- --audit-log-path specifies the log file path that the log backend uses to write audit events. Not specifying this flag disables the log backend. A value of - means standard out
- --audit-log-maxage defines the maximum number of days to retain old audit log files
- --audit-log-maxbackup defines the maximum number of audit log files to retain
- --audit-log-maxsize defines the maximum size in megabytes of the audit log file before it gets rotated
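The four flags above interact: the rotation size and the number of retained files together bound how much disk the audit log can use. The sketch below builds the flag list and gives a rough upper bound; the helper names and the example path are hypothetical, and the estimate assumes each file is rotated at exactly the configured size.

```python
def log_backend_flags(path, max_age_days, max_backups, max_size_mb):
    """Build the kube-apiserver arguments for the log backend."""
    return [
        f"--audit-log-path={path}",
        f"--audit-log-maxage={max_age_days}",
        f"--audit-log-maxbackup={max_backups}",
        f"--audit-log-maxsize={max_size_mb}",
    ]


def approx_max_disk_mb(max_backups, max_size_mb):
    """Rough upper bound: the active file plus every retained rotated file."""
    return (max_backups + 1) * max_size_mb
```

For example, keeping 10 rotated files of 100 MB each needs roughly 1100 MB of headroom on the node.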
3rd party location
If you are sending the audit logs to a 3rd party system, you need to pass the following to the API server flags:
- --audit-webhook-config-file specifies the path to a file with a webhook configuration. The webhook configuration is effectively a specialized kubeconfig (see the K8s documentation for more details).
- --audit-webhook-initial-backoff specifies the amount of time to wait after the first failed request before retrying. Subsequent requests are retried with exponential backoff.
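On the receiving side, the webhook backend delivers events as HTTP POST requests whose body is an `audit.k8s.io` `EventList`. The following is a minimal sketch of such a receiver using only the Python standard library. The handler class, the `serve` helper, and the bind address are hypothetical, and a real receiver would need TLS and authentication.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_event_list(body: bytes) -> list:
    """Return the audit events contained in a webhook request body."""
    payload = json.loads(body)
    if not isinstance(payload, dict) or payload.get("kind") != "EventList":
        raise ValueError("expected an audit.k8s.io EventList")
    return payload.get("items", [])


class AuditWebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            events = parse_event_list(self.rfile.read(length))
        except ValueError:  # also covers malformed JSON
            self.send_response(400)
            self.end_headers()
            return
        for event in events:
            # Placeholder for real storage or analysis.
            print(event.get("auditID"), event.get("verb"), event.get("requestURI"))
        self.send_response(200)
        self.end_headers()


def serve(host="127.0.0.1", port=8080):
    """Run the receiver until interrupted (hypothetical address and port)."""
    HTTPServer((host, port), AuditWebhookHandler).serve_forever()
```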
You can configure the K8s API server to buffer audit events before saving or streaming them. The tunable parameters are the buffer size, the maximum batch size, the maximum time the API server waits before unconditionally sending the events queued so far as a batch, the number of batches per second, and, in the case of a 3rd party system, the throttling burst (the number of batches that can be generated at the same moment).
There might be cases where your API server receives a large number of requests per second and therefore needs to handle and save or transmit a large number of records. If the audit log parameters are set too low, a burst of requests that the API server can’t handle will cause audit records to be dropped. The API server exposes metrics that measure how often this happens, and you can use them to set the parameters appropriately.
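One way to use those metrics is to compare the `apiserver_audit_error_total` counter against `apiserver_audit_event_total` in the text the API server exposes on its metrics endpoint. The sketch below parses that Prometheus text format by hand. The helper names are hypothetical, and it assumes sample lines carry no trailing timestamp.

```python
def sum_metric(metrics_text: str, name: str) -> float:
    """Sum every sample of a counter across all of its label sets."""
    total = 0.0
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        metric, _, value = line.rpartition(" ")
        # Accept "name" or "name{labels}" but not longer metric names.
        if metric == name or metric.startswith(name + "{"):
            total += float(value)
    return total


def audit_drop_ratio(metrics_text: str) -> float:
    """Rough share of audit events that hit an error while being exported."""
    events = sum_metric(metrics_text, "apiserver_audit_event_total")
    errors = sum_metric(metrics_text, "apiserver_audit_error_total")
    return errors / events if events else 0.0
```

A ratio that climbs during traffic peaks suggests the buffer or throttling settings are too tight for the request rate.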
For more information read: https://kubernetes.io/docs/tasks/debug-application...
Logs are only helpful if they are secure and untampered. A Kubernetes audit log becomes less effective if the information it records can be deleted or altered. As logs are essentially JSON files, they are commonly susceptible to theft, alteration, or corruption. Some practices that organizations can embrace to protect log files include:
- Log file encryption
- Setting specific authorization requirements/permissions for log file access
- Exporting and backing up logs to external systems
- Access control for administrators
- Alerts for log deletion, shutdown, and alteration (monitoring the policy file)
- Journaling and archiving
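One possible way to detect alteration, which is not something Kubernetes provides out of the box, is a SHA-256 hash chain over the log lines: each digest covers the line and the previous digest, so changing or removing a line invalidates everything after it. The function names below are hypothetical, and the scheme only helps if the digests are stored somewhere the attacker cannot rewrite.

```python
import hashlib

GENESIS = "0" * 64  # starting value for the chain


def chain_digests(lines):
    """Return one digest per line, each covering the line and all lines before it."""
    digests = []
    previous = GENESIS
    for line in lines:
        previous = hashlib.sha256((previous + line).encode("utf-8")).hexdigest()
        digests.append(previous)
    return digests


def first_tampered_index(lines, digests):
    """Return the index of the first line that fails verification, or None."""
    for i, (expected, actual) in enumerate(zip(digests, chain_digests(lines))):
        if expected != actual:
            return i
    if len(lines) != len(digests):
        # Lines were appended or truncated relative to the stored digests.
        return min(len(lines), len(digests))
    return None
```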
Attackers target log files to keep their activities undetected. As a best practice, record logs on a remote server, which makes them harder for attackers to access. Use the webhook option to stream the audit log records to a 3rd party solution that will not only store the records remotely, as required by some compliance frameworks, but will also protect them by adding security capabilities: policy, threat detection, abnormal activity detection, and incident response.
The Kubernetes API server can audit all the requests it gets. Audit logging gives organizations visibility into these ecosystems, enabling regulatory compliance and security. You can use it as another security layer because it is not intrusive and, with a well-tuned policy and backend, has little effect on the performance of your cluster and applications.