Beating the Perils of Non-Convexity in Neural Nets

Neural networks, or "neural nets", have become a familiar term in the tech world and in discussions of artificial intelligence, since their use in machine learning has revolutionised performance across domains such as computer vision and speech recognition. However, training a neural network is a highly non-convex problem, and conventional stochastic gradient descent can get stuck in spurious local optima.

Hanie Sedghi, Research Scientist at the Allen Institute for Artificial Intelligence, proposes a computationally efficient method for training neural networks that also has guaranteed risk bounds. It is based on tensor decomposition, which is guaranteed to converge to the globally optimal solution under mild conditions.

Hanie will be speaking at the Machine Intelligence Summit in San Francisco, sharing expertise on how this framework can be leveraged to train feed-forward and recurrent neural networks in her presentation 'Beating Perils of Non-convexity: Guaranteed Training of Neural Networks Using Tensor Methods'. I asked her a few questions ahead of the summit this week to learn more.

Please tell us a bit more about your work.

My goal is to bridge theory and practice in machine learning by designing algorithms with theoretical guarantees that also work well and efficiently in practice. In particular, I am interested in modelling machine learning problems as optimization frameworks and designing algorithms that have efficiency guarantees in high dimensions and also outperform existing methods in practice. Recently, my colleagues and I have proposed a line of work in non-convex optimization that is guaranteed to recover global optima for various applications such as training neural networks, learning mixtures of generalized linear models (GLMs), and, more recently, training recurrent neural networks (RNNs). Our framework incorporates generative models for discriminative learning.

What do you feel are the leading factors enabling recent advancements and uptake of machine learning?

The successful demonstration of powerful methods that give state-of-the-art results has attracted increased interest and funding from industry. In addition, easy access to more computational power and big data has played an essential role in enabling recent advances.

What recent advancements have helped advance training in neural networks?

There are several factors. The main contributors are access to more computational power and massive data, which has in turn attracted more researchers and data scientists. These are hard to disentangle from advances in algorithms, models and architectures. Additionally, an open community culture that includes the sharing of ideas, source code and open software libraries has played an important role.

What present or potential future applications of deep learning excite you most?

There has been some progress in the area of natural language processing recently: neural nets can read a text and answer a span prediction question about it, as in the Stanford Question Answering Dataset (SQuAD). However, we are far from natural language understanding and complex reasoning using background knowledge. I am very enthusiastic to see deep learning progress in this area and transform the field.

We have seen great breakthroughs in machine vision and pattern recognition. I am excited to see progress in teaching machines common-sense knowledge through sensing the world. As an example, understanding video scenes to comprehend the process and predict what happens next is a direction I find very exciting.

Which industries do you feel will be most disrupted by deep learning, and machine learning in general, in the future?

Industries that require fast and automatic analysis of data, such as web services, have all been influenced by deep learning.
As the field of machine learning advances, its influence expands to more industries; a famous example is self-driving cars. If the models used in machine learning become more interpretable by human experts, we can expect to see the impact in a much wider range of domains, including medical diagnosis and law.

What developments can we expect to see in machine intelligence in the next 5 years?

I am hoping to see more theoretical analysis explaining how deep learning works, as well as an understanding of the restrictions on and boundaries of current methods. This will also lead to better methods and improvements to current approaches. I expect improvements in natural language understanding and complex reasoning. Moreover, there will be progress in unsupervised learning and reinforcement learning. In terms of applications, I think we will see advances in personalization in various domains; that is, systems will become more flexible and tailored to user preferences and needs, for example in personal assistants, smart homes and customer service.

Hanie Sedghi will be speaking at the Machine Intelligence Summit, taking place alongside the Machine Intelligence in Autonomous Vehicles Summit in San Francisco this week on 23-24 March. Meet with and learn from leading experts about how AI will impact transport, manufacturing, healthcare, retail and more. Tickets are now limited; register to attend here.

Other confirmed speakers include Nick Pentreath, Principal Engineer at IBM; Robinson Piramuthu, Head of AI – Computer Vision at eBay; Melody Guan, Deep Learning Resident at Google Brain; Ankur Handa, Research Scientist at OpenAI; and Zornitsa Kozareva, Manager of Machine Learning for Alexa at Amazon. View more speakers and topics here.

The Machine Intelligence Summit and Machine Intelligence in Autonomous Vehicles Summit will also take place in Amsterdam on 28-29 June. Book your place here.