Deepali Arora, PhD is Cmd’s Manager of Data Science and Analytics. She will go deep on the topic of machine learning in cybersecurity on Friday, October 11, 2019 at Day of Shecurity in San Francisco in her talk “Cutting through the buzzwords / fluff of Machine Learning and applying it to your security program”. This blog post provides a taste of what she’ll cover there.
As an industry, security is behind others in applying machine learning effectively. At the same time, it is also the industry that faces the biggest challenges in doing so. Why? Because security is extremely unpredictable. Without a clear understanding of which techniques a cybercriminal will use, it is difficult to build models that accurately learn and predict unknown outcomes.
Of course that doesn’t stop us from trying to figure it out. The best minds at top-tier companies are coming up with new applications for machine learning and artificial intelligence on a daily basis — but with limited success.
It seems there is a steady stream of companies pulling these advancements soon after release because they aren’t accurate enough. Recent examples include Amazon’s facial recognition matching 28 members of Congress to criminals, Microsoft’s chatbot that Twitter users turned racist in less than 24 hours, and Apple’s Face ID being defeated by a 3D mask.
One of the biggest causes of failures with machine learning in security is starting with the algorithm. Educational programs teach students about different AI algorithms that can be applied in general. However, in security, applications of these algorithms become a bit tricky.
This is because, in the security domain, you must have intimate knowledge of security data before you can determine which algorithm is appropriate, and that data is highly non-stationary: its statistical properties keep shifting over time. You have to understand the nature of the data you have and then determine the best-fit algorithm. From there, you examine the model’s errors and keep improving it based on the type of error, such as high bias or high variance. Fundamentally, you need a deep understanding of the data to create an appropriate model.
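To make the bias-versus-variance step concrete, here is a minimal sketch of how the two error types might be told apart from a model's training and validation error rates. The function name and both thresholds are hypothetical, chosen only for illustration; real cutoffs depend on the data and on what error rate the security team can tolerate.

```python
def diagnose_fit(train_error, validation_error,
                 target_error=0.05, gap_tolerance=0.05):
    """Label a model's error pattern from two error rates (0.0 to 1.0).

    target_error and gap_tolerance are illustrative assumptions,
    not recommended values.
    """
    # High bias: the model cannot even fit the data it was trained on.
    high_bias = train_error > target_error
    # High variance: the model fits training data far better than unseen data.
    high_variance = (validation_error - train_error) > gap_tolerance

    if high_bias and high_variance:
        return "high bias and high variance"
    if high_bias:
        return "high bias"
    if high_variance:
        return "high variance"
    return "acceptable"


if __name__ == "__main__":
    print(diagnose_fit(0.20, 0.22))  # underfitting pattern
    print(diagnose_fit(0.01, 0.15))  # overfitting pattern
```

Broadly, a high-bias result points toward a richer model or better features, while a high-variance result points toward more data or a simpler model. Knowing which features are worth adding is where knowledge of the security data comes in.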
Another cause of machine learning failures in security is not creating a robust enough model. The results must be statistically valid. Too often, excitement over an application of machine learning and its potential value outweighs concern about the statistical variability of its results, and new applications are released before they are accurate enough. The focus has to remain on statistical robustness if we are to truly reap the advantages of machine learning.
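One way to check whether a result is statistically solid before release is to put a confidence interval around it rather than quoting a single number. The sketch below uses a percentile bootstrap on alert precision. It assumes each reviewed alert has been labeled 1 (a real threat) or 0 (a false positive); the function name, resample count, and sample sizes are illustrative assumptions.

```python
import random


def bootstrap_precision_ci(outcomes, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile-bootstrap confidence interval for alert precision.

    outcomes: list of 1 (alert was a real threat) or 0 (false positive).
    Returns (lower, upper) bounds on precision.
    """
    if not outcomes:
        raise ValueError("need at least one labeled alert")
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(outcomes)
    # Resample the alerts with replacement and recompute precision each time.
    estimates = sorted(
        sum(rng.choices(outcomes, k=n)) / n for _ in range(n_resamples)
    )
    tail = (1.0 - confidence) / 2.0
    lower = estimates[int(tail * n_resamples)]
    upper = estimates[min(n_resamples - 1, int((1.0 - tail) * n_resamples))]
    return lower, upper


if __name__ == "__main__":
    small_pilot = [1] * 18 + [0] * 2       # 20 reviewed alerts
    larger_trial = [1] * 900 + [0] * 100   # 1,000 reviewed alerts
    print(bootstrap_precision_ci(small_pilot))
    print(bootstrap_precision_ci(larger_trial))
```

Both samples have the same observed precision of 90 percent, but the interval from the 20-alert pilot is much wider than the one from the 1,000-alert trial. A wide interval is a sign that the result is not yet robust enough to ship.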
The other important consideration for getting machine learning in security right is continuous optimization. This is not a “set it and forget it” exercise. Cybercriminals are ahead of security professionals in adopting new techniques. They are constantly evolving, which means that security models need to evolve alongside them. Keeping data scientists and security professionals focused on understanding the latest techniques, and on transferring that knowledge into maintaining and updating the machine learning model, will be critical to applied success.
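As one illustration of what continuous monitoring could look like in practice, the sketch below compares the mix of event categories a model was trained on against a recent window, using the population stability index (PSI). This is one drift measure among several, not a method taken from the talk. The command names and the smoothing value are hypothetical, and the cutoff used in the example is an assumption based on a common rule of thumb.

```python
import math
from collections import Counter


def population_stability_index(baseline, recent, smoothing=1e-6):
    """PSI between two samples of categorical events (e.g. command names).

    0.0 means identical category mixes; larger values mean more drift.
    smoothing stands in for categories missing from one sample so the
    logarithm stays defined.
    """
    if not baseline or not recent:
        raise ValueError("both samples must be non-empty")
    base_counts, recent_counts = Counter(baseline), Counter(recent)
    psi = 0.0
    for category in set(baseline) | set(recent):
        expected = max(base_counts[category] / len(baseline), smoothing)
        actual = max(recent_counts[category] / len(recent), smoothing)
        psi += (actual - expected) * math.log(actual / expected)
    return psi


if __name__ == "__main__":
    training_window = ["ls"] * 80 + ["ssh"] * 20
    this_week = ["ls"] * 40 + ["ssh"] * 20 + ["curl"] * 40
    drift = population_stability_index(training_window, this_week)
    # Assumed cutoff: 0.25 is a common rule of thumb for significant drift.
    if drift > 0.25:
        print("Category mix has shifted; review and retrain the model.")
```

A check like this could run on a schedule, with a high score prompting data scientists and security analysts to look together at what changed.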
Due to the nature of the ever-evolving landscape in cybersecurity, the potential applications for machine learning are limitless. However, the most important ones for cybersecurity professionals to tackle in the near future are the Internet of Things (IoT) and detecting malicious activity prior to execution of an attack.
As IoT devices grow in popularity, our world is becoming highly interconnected, and traditional security approaches will likely not be enough. With the number of pathways into an environment growing exponentially, there are more endpoints than those approaches can secure. We will need to incorporate machine learning to handle the complexity in this space.
Another clear opportunity for machine learning is using it to detect malicious activity before an attack is executed. Security tools today typically start by detecting an attack and then dive into incident response to try to minimize damage. The nirvana in security is recognizing the reconnaissance activity a cybercriminal carries out within an environment early enough to block them before they do any damage. Machine learning can be applied here, but we have to deepen our understanding of cybercriminal behavior.
The best way for companies to realize the full potential of machine learning in security is to develop a new way of working on it. Specifically, data scientists and security professionals need to collaborate. Each of these groups brings unique expertise to machine learning. But if they continue to work in silos, our industry will never overcome the challenges of applying machine learning to such an unpredictable environment.
My advice as someone who is trained in both data science and security is to combine these departments and sit your data scientists side-by-side with your security operations team. Train your data scientists in cybersecurity and vice versa. Help them to speak the same language and gain a deeper understanding of security data and how machine learning can be applied to catapult our protection from attacks. Only then will we be able to jump forward as an industry and start beating cybercriminals at their own game.
Grab your spot to see Deepali talk about how machine learning is evolving in security at the upcoming Day of Shecurity event in San Francisco on October 11, 2019.