Peter Sadowski is a PhD student at the University of California, Irvine, where he studies deep learning and artificial neural networks. He has published work on stochastic algorithms for training neural networks, along with work on deep learning applications in diverse areas such as bioinformatics and high-energy physics. More generally, Peter is interested in data-driven solutions to problems of learning, inference, and optimization.
We took a few moments to catch up with Peter ahead of his presentation at the Deep Learning Summit, where he will discuss Deep Learning in High-Energy Physics.
What are the key factors that have enabled recent progress in deep learning? The deep learning models and algorithms were invented decades ago. The recent advancements have primarily come from increases in computing power and the development of some practical tricks that make it easier to train deep artificial neural networks. Some of these tricks (the dropout algorithm, rectifier units) are not fully understood yet, and only add to the skepticism that people have toward neural networks as a 'black box' method. But neural nets are very good at learning, and the recent success stories have gotten people's attention.
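As a rough illustration of the two tricks named above, here is a minimal sketch in plain Python. It assumes the common "inverted" form of dropout, in which surviving activations are rescaled during training; the function names and example values are hypothetical and are not taken from Peter's work.

```python
import random

def relu(values):
    """Rectifier unit: pass positive inputs through, clamp negatives to zero."""
    return [max(0.0, v) for v in values]

def dropout(values, rate, rng):
    """Inverted dropout (assumed variant): zero each activation with
    probability `rate` and scale the survivors by 1 / (1 - rate), so the
    expected activation is unchanged."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]

rng = random.Random(0)                      # fixed seed for repeatability
hidden = relu([-1.5, 0.3, 2.0, -0.2])       # negatives become 0.0
dropped = dropout(hidden, rate=0.5, rng=rng)
```

Dropout of this kind is applied only while training; at prediction time the activations are typically used unchanged.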
What are the main types of problems now being addressed in the neural network space? Neural network architectures can be specialized for structured input, such as images, sequences, or graph structures. The architectures that have been developed for computer vision and speech recognition have been very successful. However, there are many other deep learning applications that could benefit from specialized architectures, such as protein structure prediction and chemoinformatics. The challenge is to design architectures that are both expressive and fast to train.
Another important problem is to learn useful representations of data with deep learning, an example of this being Google's word2vec tool. Autoencoder neural networks were an early attempt at this, in which a neural network automatically compresses data to a low-dimensional space (essentially a non-linear, deep version of Principal Component Analysis), but the representation learning problem has recently received a lot more attention.
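To make the autoencoder idea concrete, the following is a toy sketch rather than a practical implementation. It assumes a single tanh bottleneck unit, 2-D input points, and plain full-batch gradient descent; the data, function names, and hyperparameters are all hypothetical.

```python
import math
import random

def init_params(seed=0):
    """Small random encoder and decoder weights (two of each)."""
    rng = random.Random(seed)
    return ([rng.uniform(-0.5, 0.5) for _ in range(2)],
            [rng.uniform(-0.5, 0.5) for _ in range(2)])

def reconstruct(point, enc, dec):
    """Compress a 2-D point to one number, then expand it back to 2-D."""
    code = math.tanh(enc[0] * point[0] + enc[1] * point[1])
    return code, [dec[0] * code, dec[1] * code]

def mean_squared_error(data, enc, dec):
    total = 0.0
    for p in data:
        _, out = reconstruct(p, enc, dec)
        total += (out[0] - p[0]) ** 2 + (out[1] - p[1]) ** 2
    return total / len(data)

def train(data, steps=2000, lr=0.05, seed=0):
    """Minimize reconstruction error by full-batch gradient descent."""
    enc, dec = init_params(seed)
    for _ in range(steps):
        g_enc, g_dec = [0.0, 0.0], [0.0, 0.0]
        for p in data:
            code, out = reconstruct(p, enc, dec)
            err = [out[0] - p[0], out[1] - p[1]]
            # Gradient of the squared error with respect to the code value.
            d_code = 2 * (err[0] * dec[0] + err[1] * dec[1])
            # Backpropagate through tanh: d tanh(z)/dz = 1 - tanh(z)^2.
            d_pre = d_code * (1 - code ** 2)
            for i in range(2):
                g_dec[i] += 2 * err[i] * code
                g_enc[i] += d_pre * p[i]
        for i in range(2):
            enc[i] -= lr * g_enc[i] / len(data)
            dec[i] -= lr * g_dec[i] / len(data)
    return enc, dec

# Toy data lying on a line, so one number per point captures most of it.
data = [(x / 10.0, 0.5 * x / 10.0) for x in range(-10, 11)]
enc, dec = train(data)
```

Here the single code value plays a role similar to the first principal component; deeper encoders and decoders with more non-linear layers can capture curved structure that a linear projection cannot.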
What are the practical applications of your work and what sectors are most likely to be affected? I've developed techniques for training deep neural networks and demonstrated the advantages of deep learning over shallow machine learning methods. I hope my work inspires others to use deep neural networks for their machine learning applications, both for supervised learning and representation learning. While most of the applications I've worked with have been in the natural sciences, the ideas transfer to other application domains.
What advancements excite you most in this field? I am particularly excited by recursive neural network models, which can be used for deep learning on sequence data. These models have myriad applications in natural language processing, time-series data, and bioinformatics. Recursive neural networks are appealing because they can efficiently represent very complex functions and the weight sharing helps with learning (as in a convolutional neural network for images). The main challenge is to incorporate long-range interactions and figure out how to train your model quickly.
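One way to picture the weight sharing mentioned above is a simple recurrent update that reuses the same parameters at every position of a sequence. The sketch below is only an illustration: it assumes a scalar hidden state and hand-picked parameter values, and it is not one of the specific models Peter refers to.

```python
import math

def run_recurrent(sequence, w_in, w_rec, bias, state=0.0):
    """Apply the same three parameters at every position of the sequence
    and return the hidden state after each step."""
    states = []
    for x in sequence:
        state = math.tanh(w_in * x + w_rec * state + bias)
        states.append(state)
    return states

# Hypothetical parameter values, chosen only for illustration.
params = dict(w_in=0.8, w_rec=0.5, bias=0.0)

# The same parameters handle sequences of any length.
short = run_recurrent([1.0, 0.0], **params)
long = run_recurrent([1.0, 0.0, 0.0, 0.0, 0.0], **params)
```

In this toy setting the influence of the first input shrinks with every step, which hints at why long-range interactions are hard to capture.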
The Deep Learning Summit is taking place in San Francisco on 29-30 January. For more information and to register, go to: re-work.co/deep-learning