Predictive Analytics and Machine Learning in Cybersecurity: an Untapped Opportunity for ‘Negative’ Response Time
This blog was originally published by CXO REvolutionaries.
Written by Brad Moldenhauer, CISO - Americas, Zscaler.
The chief information security officer (CISO) is measured by his or her ability to reduce risk, control cost, and minimize friction among employees, data, and the business at large. The increasingly volatile threat landscape makes these objectives more difficult to achieve successfully, particularly risk reduction. And while there are many types of risk, the immediate threat is malicious actors compromising mission-critical data, applications, or services.
In the past, mitigating such risk required a suite of related, integrated technologies working in concert to recognize and block threats via logical policies. However, the increasing sophistication of threats – ranging from polymorphic malware to state-sponsored cybercriminals – requires a new, more proactive approach to address current and known threats and, to a significant extent, future threats as well.
To this end, artificial intelligence (AI) and machine learning (ML) capabilities offer promise and already have multiple, growing use cases in cybersecurity. When they are deployed on a modern computational architecture such as a cloud, processing power, memory, storage, and network bandwidth can be dynamically allocated in proportion to changing workloads.
AI/ML in cybersecurity delivers powerful pattern recognition capabilities applicable to many business scenarios and threat classes. Once informed by analysis of a large body of training data, these tools often recognize known and unknown threats with accuracy comparable to trained security professionals. Unlike human professionals, however, they work continuously, and their performance scales with the technical resources allocated to them.
One of the most powerful capabilities is predictive analytics, which detects patterns and then extrapolates them forward in time to predict a likely outcome. This capability is especially valuable in cybersecurity because the best response time to threats is negative: the host organization is informed of future threats and of what should be done to stop them, and defensive actions are taken without human intervention before the attack occurs.
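To make the idea of extrapolating forward in time concrete, here is a minimal Python sketch. It is only an illustration, not how any particular security product works: it fits a straight line to a short series of event counts and estimates when the trend would cross an alert threshold. The function names, the hourly failed-login counts, and the threshold of 30 are all hypothetical, and real predictive models are far more sophisticated than a linear fit.

```python
def fit_line(values):
    """Ordinary least-squares line through (0, values[0]), (1, values[1]), ...

    Needs at least two observations. Returns (slope, intercept).
    """
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_threshold_crossing(values, threshold):
    """Time step at which the fitted trend reaches the threshold.

    Returns None when the trend is flat or falling, because the line
    would then never reach a threshold above the current level.
    """
    slope, intercept = fit_line(values)
    if slope <= 0:
        return None
    # Solve slope * t + intercept == threshold for t.
    return (threshold - intercept) / slope


# Hypothetical hourly counts of failed logins against one service.
failed_logins = [10, 12, 14, 16, 18]
crossing = predict_threshold_crossing(failed_logins, threshold=30)

# Lead time: hours between the last observation and the predicted crossing.
lead_time = crossing - (len(failed_logins) - 1)
print(f"Trend reaches the threshold at hour {crossing:.1f}, "
      f"{lead_time:.1f} hours after the last observation")
```

The lead time is the point of the exercise: if a defensive action can be triggered inside that window, the response lands before the predicted event, which is what a negative response time means.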
Fulfilling the potential of AI/ML in security is complex, but not impossible
AI/ML-powered predictive analytics should, at least in theory, enable negative response times to threats. But the challenge of accurately predicting future threats is far from trivial, and some of today’s solutions fall short of delivering on their value proposition.
This is partly because of the sheer complexity of analyzing cyber threats. Consider that the most sophisticated ones are driven not by code, however cleverly programmed, but by experienced human specialists working in concert. Anticipating how such individuals will act, whether alone or as part of a team, is largely beyond the scope of any AI/ML predictive analytics solution available today. This is why CISOs interested in managing risk will continue to need skilled teams of experts with deep domain knowledge.
Some other challenges in delivering predictive analytics solutions, however, are within the security solution provider’s power to address (again, at least in theory). These challenges predominantly revolve around the data used to train the provider’s initial AI/ML models. Providers can also use such data to refine their models over time, thus improving the accuracy of their predictions.
The best predictive analytics results require the right training data – and plenty of it
Where can organizations get pertinent data? That simple question is not so simple because the training data must meet several requirements. For instance:
- It must accurately reflect the production environments (IT infrastructures and all related resources) to be protected.
- It must include numerous instances of false positives that might fool less sophisticated security solutions. AI/ML models should be capable of recognizing and passing over false positives with a high degree of confidence.
- It must include genuine security threats and attacks, including information about the status and composition of critical resources like dynamic libraries, scripts, executables, and other elements both before and after the attack. It’s best to use successful attacks carried out against modern, real-world infrastructure.
- Above all, the total volume of data used to train the AI/ML – whether data about false positives, the infrastructure generally, or successful attacks – must be truly colossal. All AI/ML technologies require massive volumes of training data to deliver accurate results, and those pertaining to predictive analytics are no exception.
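Some of these requirements can be checked mechanically before any model is trained. The Python sketch below is one hedged way to do so: it counts labeled samples and reports missing categories or insufficient volume. The label names ("benign", "false_positive", "attack") and the minimum sample count are hypothetical placeholders, not values from any real product. The first requirement, fidelity to the production environment, cannot be verified by counting labels and still needs human review.

```python
from collections import Counter

# Hypothetical label names for the three kinds of data listed above.
REQUIRED_LABELS = ("attack", "benign", "false_positive")


def audit_training_set(labels, min_total):
    """Count samples per label and list obvious gaps in coverage.

    Returns (counts, problems), where problems is a list of
    human-readable strings and is empty when no gap was found.
    """
    counts = Counter(labels)
    problems = []
    for label in REQUIRED_LABELS:
        if counts[label] == 0:
            problems.append(f"no '{label}' samples")
    total = sum(counts.values())
    if total < min_total:
        problems.append(f"only {total} samples, want at least {min_total}")
    return counts, problems


# A deliberately tiny, hypothetical data set: it has no false positives
# and nowhere near enough samples.
sample_labels = ["benign"] * 8 + ["attack"] * 2
counts, problems = audit_training_set(sample_labels, min_total=1000)
for problem in problems:
    print("Training data gap:", problem)
```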