Deep learning is inspired by the brain’s ability to learn: once a brain learns to identify an object, its identification becomes second nature. Similarly, as a deep learning-based "artificial brain" learns to detect any type of cyber threat, its prediction capabilities become instinctive. As a result, the most evasive and unknown cyber-attacks could be immediately detected and prevented. This is the aim of Deep Instinct, bringing a new approach to cybersecurity by enabling cyber-attacks to be identified and blocked in real-time before any harm can occur.
At the Deep Learning Summit in San Francisco this past January, Eli David, CTO of Deep Instinct, presented the evolution of artificial intelligence, from old rule-based systems to classical machine learning models and state-of-the-art deep learning models, with a live demonstration that revealed the giant leap in accurately detecting new malware. I asked Eli some questions to learn more about his work and his thoughts on the deep learning field.
What are the key factors that have enabled recent advancements in deep learning?
A combination of both algorithmic and hardware improvements. Due to numerous recent advances in methods for understanding and training deep neural networks, we are capable of training deep networks with tens of hidden layers. Fast GPUs allow the training time for such networks to be reduced by many tens of times. The combination of these two enables us to train large neural nets with billions of synapses. I have been teaching neural networks for the past decade, and yet, until a few years ago I would have considered training such huge networks inconceivable…
How are you currently applying deep learning to cybersecurity?
The most significant improvement of deep learning over classical machine learning is that you no longer need to do feature engineering. If you would like to use machine learning for computer vision, you need image processing experts to tell you which few (tens or hundreds of) features in an image are the most important. But if you use deep learning for computer vision, you just feed in the raw pixels, without caring much for image processing or feature extraction, and you still obtain a 20-30% improvement in accuracy on most computer vision benchmarks. The same holds true for cybersecurity, and specifically for detection of new malware and APTs (advanced persistent threats, the most advanced malware). Since signature-based methods are completely incapable of detecting new malware, current solutions for detection of new malware rely mostly on manually tuned heuristics, and some more advanced solutions use manually selected features which are then fed into classical machine learning modules. Additionally, most current methods rely on running the malware in a sandbox environment to obtain more information about it, thus allowing more accurate detection. This comes at the cost of doing "detection" only, rather than "prevention", since sandboxing is typically a very time-consuming process.
At Deep Instinct we are not trying to create any manual heuristics, and don't care about finding smart features either. Rather, in a very similar vein to how deep learning is applied to other domains, we feed datasets of many millions of malicious and legitimate files into our deep learning infrastructure (which is developed in-house in order to obtain an efficient infrastructure that can run deep networks even where limited resources are available, such as on mobile devices). That is, we rely on deep learning to learn for itself the useful high-level non-linear features necessary for accurate classification. Additionally, we are looking only at the static level of files and do not use any sandboxing. This allows us to provide real-time detection and prevention (only a few milliseconds for prediction, even on the slowest mobile devices), all the while running the deep learning module in prediction mode on the device (i.e., connectionless, without the need to send the suspected file to a cloud or appliance). Finally, due to the input-agnostic nature of deep learning, we are capable of detecting any malicious file type (EXE, DLL, PDF, DOC, Android APK, etc.), again, since we do not care about the specific structures of different file types.
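To make the "raw input, no feature engineering" idea concrete, here is a minimal sketch of how a file of any type might be turned into a fixed-length numeric vector that a neural network could consume. The function name `bytes_to_input`, the 4096-byte length, and the pad-and-scale scheme are all illustrative assumptions, not a description of Deep Instinct's actual preprocessing.

```python
from typing import List


def bytes_to_input(data: bytes, length: int = 4096) -> List[float]:
    """Map raw file bytes to a fixed-length vector of floats in [0, 1].

    Hypothetical preprocessing: truncate or zero-pad to `length` bytes,
    then scale each byte value (0-255) into the range [0, 1].
    """
    clipped = data[:length]
    padded = clipped + bytes(length - len(clipped))
    return [b / 255.0 for b in padded]


# The same function handles any file type, because it never parses the format.
exe_like = b"MZ\x90\x00" + b"\x00" * 10   # toy bytes, not a real executable
pdf_like = b"%PDF-1.4\n"                  # toy bytes, not a real PDF
print(len(bytes_to_input(exe_like)), len(bytes_to_input(pdf_like)))
```

Because nothing in this step depends on the file format, the same input pipeline could in principle serve EXE, PDF, or APK files alike, which is the sense in which the approach is input-agnostic.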
What are the main problems being solved by the application of deep learning to cybersecurity?
Assume that I show you a clear image of a dog, and assume that I modify just a few percent of the pixels in the image. You will still easily recognize that there is a dog in the image. Well, not if you were a cybersecurity solution. Many hundreds of thousands of new malware samples are introduced every day, and nearly all of them are based on extremely small mutations of previously known malware (by some estimates the vast majority of new malware is mutated by less than 2% in comparison to past malware). Yet, the current security solutions are completely incapable of detecting most of them. They look at the file, check its signature, which doesn't match anything they know, and then the advanced solutions run it in sandboxing mode for many seconds or minutes, and use heuristics or even classical machine learning to get an idea of whether it is malicious or legitimate (and even then the detection rates are abysmal).
A vivid demonstration of this incapability of current solutions in dealing with new malware is a live demonstration we usually present, where we take several well-known malicious files and allow the leading cybersecurity solutions to provide their predictions on them. Most of the solutions detect all of the files (as they had ample time to manually sign them). Next, we run an automatic mutator we have developed, which just modifies some non-functional parts of the malicious files to create new mutations that retain the same malicious behaviour as the original files (the mutator changes things such as strings, and not any functional part of the program). When we run the other solutions again on the mutated set of malware, most of them are not capable of detecting even a single file.
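The effect described in this demonstration can be illustrated with a toy sketch. It assumes, as a deliberate simplification, that a "signature" is just a SHA-256 hash of the whole file; real signature engines are more elaborate, and the `mutate_string` helper is a hypothetical stand-in, not the mutator mentioned in the interview. The byte strings are harmless placeholders.

```python
import hashlib


def signature(data: bytes) -> str:
    """Return a SHA-256 digest, standing in for a file signature."""
    return hashlib.sha256(data).hexdigest()


def mutate_string(data: bytes, old: bytes, new: bytes) -> bytes:
    """Hypothetical mutator: swap an embedded string for one of equal length."""
    if len(old) != len(new):
        raise ValueError("replacement must keep the file length unchanged")
    return data.replace(old, new)


# Toy stand-in for a known malicious file (harmless placeholder bytes).
known_file = b"\x4d\x5a...code..." + b"build-id:AAAA" + b"...more code..."
known_signatures = {signature(known_file)}

# Change a single non-functional byte inside an embedded string.
variant = mutate_string(known_file, b"AAAA", b"AAAB")

changed = sum(a != b for a, b in zip(known_file, variant))
print(signature(known_file) in known_signatures)  # True: the original is "signed"
print(signature(variant) in known_signatures)     # False: the variant is unknown
print(changed, "byte(s) differ out of", len(known_file))
```

Under this simplified model, one changed byte is enough to make the variant look entirely new to an exact-match lookup, even though the rest of the file is identical.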
Deep learning creates high-level invariant representations of objects and is resilient to changes in the raw data (e.g., the high-level representation of an elephant would remain largely unchanged even if the raw pixels of the image contain completely different views of the elephant). Using the same principle in cybersecurity, we are training deep networks on datasets of millions of files such that invariant representations are learned. Thus, when our deep net is given a mutated or modified file in the real world, it easily recognizes its true nature (malicious or legitimate). To further encourage this invariant property, during training we intentionally modify the dataset in each epoch (introducing mutations that affect tens of percent of each file). This makes the training phase much more difficult of course, but when completed, the resulting trained network is highly resilient to changes and mutations and detects nearly all new malware variants. On a typical benchmark of new malware and advanced APTs, Deep Instinct obtains a detection rate of around 99%.
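The per-epoch mutation idea resembles data augmentation in computer vision. The sketch below shows one possible shape for such a loop, under stated assumptions: `mutate` overwrites randomly chosen byte positions, which is only a stand-in because the interview does not say how the training-time mutations are generated, and `augmented_epochs`, the 20% rate, and the toy dataset are hypothetical.

```python
import random
from typing import Iterator, List


def mutate(sample: bytes, rate: float, rng: random.Random) -> bytes:
    """Overwrite a fraction `rate` of byte positions with random values.

    A random value may equal the original byte, so the number of bytes
    that actually differ is at most int(len(sample) * rate).
    """
    out = bytearray(sample)
    n_changes = int(len(out) * rate)
    for pos in rng.sample(range(len(out)), n_changes):
        out[pos] = rng.randrange(256)
    return bytes(out)


def augmented_epochs(dataset: List[bytes], epochs: int, rate: float,
                     seed: int = 0) -> Iterator[List[bytes]]:
    """Yield a freshly mutated copy of the dataset for each epoch."""
    rng = random.Random(seed)
    for _ in range(epochs):
        yield [mutate(sample, rate, rng) for sample in dataset]


# Toy dataset of two 100-byte "files" (placeholder content).
dataset = [bytes(range(100)), bytes(100)]
for epoch, batch in enumerate(augmented_epochs(dataset, epochs=3, rate=0.2)):
    # A real training loop would run the model's update step on `batch` here.
    diffs = [sum(a != b for a, b in zip(orig, mut))
             for orig, mut in zip(dataset, batch)]
    print(epoch, diffs)
```

Because the network never sees exactly the same bytes twice, it is pushed toward features that survive such perturbations, which is the intuition behind the invariance claim above.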
We constantly compare our results with those of over sixty different cybersecurity solutions, including some that use advanced heuristics and classical machine learning. On these new malware benchmarks, the best solutions after us obtain below 80% detection, with most of them yielding a 20-40% detection rate. This increase of roughly 20 percentage points in detection rate that we observe in cybersecurity is fairly consistent with improvements achieved by deep learning in other fields (e.g., in ImageNet the classification error dropped from 25% without deep learning to under 4% using deep learning).
Another major advantage of applying deep learning is the real-time prediction it provides, with just a few milliseconds to feed in a raw file and pass it through the deep neural network to obtain the prediction. This allows us to provide not only detection but also prevention in all cases (the moment a malicious file is detected, it is removed as well). Our brain works in a similar way; it takes us a long time to learn something, but once we have learned it, we can use it very quickly in prediction mode. Other cybersecurity solutions which attempt to detect new malware usually rely on heavy sandboxing, which often provides only a post-mortem detection. As a side note, this also explains the name of our company, Deep Instinct, since we rely on deep learning and the reaction times are always real-time, like an instinct.
What developments can we expect to see in deep learning and cybersecurity in the next 5 years?
Over the past two years we have observed an accelerating success of deep learning in most areas it is applied to. Even if we don't achieve the Holy Grail of human-level cognition within the next five years (though this will most probably happen in our lifetimes), we will see huge improvements in many additional domains. Specifically, I think the most promising area will be unsupervised learning, as most of the data in the world is unlabeled, and our own brain's neocortex is primarily a very good unsupervised learning box. While Deep Instinct is the first company using deep learning for cybersecurity, I would expect more companies to employ it in the upcoming years. However, the barrier to entry for deep learning is still quite high, especially for cybersecurity companies, which do not typically use AI methods (e.g., only a few solutions use classical machine learning), so it will take a few more years until deep learning becomes a widely used commodity technology within cybersecurity.
What advancements excite you most in the field?
I have always been fascinated by how our brain works, and the amazing feats it is capable of accomplishing. I believe that our brain is a Turing machine (a view shared by most AI researchers), and so there is no theoretical limitation to creating an artificial version of it which is as good as, or much better than, the carbon-based version. Deep learning is bringing us closer to this goal at a great and accelerating pace, and I foresee many exciting breakthroughs in the upcoming years, especially in unsupervised learning. While deep learning has successfully been applied to computer vision, speech, and text understanding, there are many other challenging domains which deep learning can potentially revolutionize. Based on this belief we founded Deep Instinct, and the results we obtained have vindicated it. I expect many additional areas to be revolutionized by deep learning, especially fields which involve large amounts of complex data (finance and medicine are two prime examples of such domains).
The next Deep Learning Summit in San Francisco will take place on 26-27 January, alongside the Virtual Assistant Summit. Super Early Bird tickets end on Friday 17 June. We are holding summits focused on AI, Deep Learning and Machine Intelligence in Berlin, London, Singapore, New York, San Francisco and Boston.