Hugo Larochelle is a Research Scientist at Twitter Cortex and an Assistant Professor at the Université de Sherbrooke. Prior to this, he spent two years in the Machine Learning Group at the University of Toronto as a postdoctoral fellow under the supervision of Geoffrey Hinton, and obtained his PhD at the Université de Montréal under the supervision of Yoshua Bengio. He is also an Associate Editor for the IEEE Transactions on Pattern Analysis and Machine Intelligence and an editorial board member of the Journal of Artificial Intelligence Research.

At the RE•WORK Deep Learning Summit in Boston, Hugo discussed the reasons behind deep learning's current success in industry and future directions for research, with examples of how Twitter uses deep learning to learn representations of tweets, images, videos and users. I caught up with him ahead of the event to hear more.

What started your work in deep learning? I did my PhD under the supervision of Yoshua Bengio, from 2004 to 2009, which was the period during which the term "deep learning" came to life and the subject started gaining in popularity. In 2006, I worked with Yoshua on demonstrating that training autoencoder neural networks could help in initializing deep neural networks (paper "Greedy Layer-Wise Training of Deep Networks"). This then led to the proposal of the popular denoising autoencoder, thanks to a collaboration with Prof. Pascal Vincent (paper "Extracting and Composing Robust Features with Denoising Autoencoders"). Since then, I've remained very interested and involved in the development of unsupervised learning algorithms with neural networks.  
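For readers unfamiliar with the technique mentioned above, here is a minimal sketch of a denoising autoencoder in NumPy: the input is corrupted with masking noise and the network is trained to reconstruct the clean version. This is an illustrative toy (single hidden layer, tied weights, random data), not the code from the papers cited, and all names and hyperparameters are my own choices.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class DenoisingAutoencoder:
    """Toy single-layer denoising autoencoder with tied weights:
    corrupt the input, then learn to reconstruct the clean version."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        limit = np.sqrt(6.0 / (n_visible + n_hidden))
        self.W = self.rng.uniform(-limit, limit, size=(n_visible, n_hidden))
        self.b_hid = np.zeros(n_hidden)
        self.b_vis = np.zeros(n_visible)

    def corrupt(self, x, level=0.3):
        # Masking noise: randomly zero out a fraction of the inputs.
        return x * (self.rng.random(x.shape) > level)

    def forward(self, x_tilde):
        h = sigmoid(x_tilde @ self.W + self.b_hid)   # encoder
        x_hat = sigmoid(h @ self.W.T + self.b_vis)   # decoder (tied weights)
        return h, x_hat

    def train_step(self, x, lr=0.1, level=0.3):
        x_tilde = self.corrupt(x, level)
        h, x_hat = self.forward(x_tilde)

        # Squared reconstruction error against the *clean* input.
        loss = np.mean(np.sum((x_hat - x) ** 2, axis=1))

        # Backpropagation through decoder, then encoder.
        d_out = 2.0 * (x_hat - x) * x_hat * (1 - x_hat) / x.shape[0]
        d_hid = (d_out @ self.W) * h * (1 - h)

        # Tied weights receive gradient from both encoder and decoder paths.
        self.W -= lr * (x_tilde.T @ d_hid + d_out.T @ h)
        self.b_hid -= lr * d_hid.sum(axis=0)
        self.b_vis -= lr * d_out.sum(axis=0)
        return loss

# Usage: learn features from random binary "data".
if __name__ == "__main__":
    X = (np.random.default_rng(1).random((256, 64)) > 0.5).astype(float)
    dae = DenoisingAutoencoder(n_visible=64, n_hidden=32)
    for epoch in range(50):
        loss = dae.train_step(X)
    print("final reconstruction loss:", loss)
```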

What are the key factors that have enabled recent advancements in deep learning? Some of these factors are fairly well known: the ability to use GPUs for training on very large datasets, and advances in the learning algorithms and architectures for neural networks. But another that I think is rarely mentioned is that the neural network community itself has been changing, from my perspective at least. It seems to have become better at taking inspiration and lessons from other subfields of machine learning, such as probabilistic graphical modeling, Bayesian learning, optimization and reinforcement learning. It has also become better at making the latest advances accessible and easy to use for the larger community, through the development of excellent deep learning libraries.

What are the main types of problems now being addressed in the deep learning space? It seems clear that there is a significant resurgence of successful research on applying deep learning to unsupervised learning and reinforcement learning. Another type of problem gaining in popularity is the development of architectures that go beyond simple feedforward networks (such as neural Turing machines, memory networks, attentional models and many more), in the hope of better tackling more complex forms of intelligent reasoning.
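As a rough illustration of a common ingredient behind these architectures, the sketch below shows content-based soft attention over a set of memory slots. It is a simplified conceptual example in NumPy, not the mechanism of any specific model named above, and the function name and shapes are my own.

```python
import numpy as np

def soft_attention(query, memory):
    """Content-based soft attention: score each memory slot against a
    query, normalize the scores with a softmax, and return a weighted
    read of the memory. memory has shape (n_slots, d), query shape (d,)."""
    scores = memory @ query                  # similarity of query to each slot
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    read = weights @ memory                  # convex combination of slots
    return read, weights

# Usage: read from a random 5-slot memory with 4-dimensional content.
rng = np.random.default_rng(0)
memory = rng.standard_normal((5, 4))
query = rng.standard_normal(4)
read, weights = soft_attention(query, memory)
print(weights)  # attention distribution over the 5 slots
```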

How are you currently applying deep learning methods at Twitter? I am part of Twitter Cortex, a team of engineers, data scientists, and machine learning researchers tasked with developing machine learning models and a unifying representation for all of Twitter's data: everything from the text in Tweets to images, videos, user interactions and more. Deep learning technology enhances our ability to identify and surface the most relevant content on Twitter to our users. To do so effectively, we are working on identifying the topics of Tweets, classifying videos into different interest types, producing short textual descriptions of images, understanding people's interests on the platform, predicting engagement with Twitter content, and much more. In addition to developing mathematical models for Twitter data, we have also developed software that implements these models. A lot of this software has been open sourced and made publicly available. So far, we've released Torch Autograd, a library for very rapidly implementing deep learning models, as well as libraries that facilitate the training of deep learning models at scale, on many machines in parallel.
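To give a sense of what an autograd-style library automates, here is a toy reverse-mode automatic differentiation sketch in Python. It only illustrates the general idea of recording a computation and walking it backwards to obtain gradients; it is not the actual Torch Autograd API (which is built on Torch/Lua), and the class and method names are invented for this example.

```python
import math

class Var:
    """Toy reverse-mode autodiff: each operation records its inputs and
    local derivatives, and backward() walks the recorded graph to
    accumulate gradients."""

    def __init__(self, value, parents=()):
        self.value = value
        self.grad = 0.0
        self._parents = parents  # pairs of (parent Var, local derivative)

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def tanh(self):
        t = math.tanh(self.value)
        return Var(t, [(self, 1.0 - t * t)])

    def backward(self, grad=1.0):
        # Accumulate this node's gradient, then push it to its parents.
        self.grad += grad
        for parent, local in self._parents:
            parent.backward(grad * local)

# Usage: gradient of a one-neuron "network" y = tanh(w * x + b).
x, w, b = Var(0.5), Var(-1.2), Var(0.3)
y = (w * x + b).tanh()
y.backward()
print(w.grad, b.grad)  # dy/dw and dy/db
```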

What developments can we expect to see in deep learning in the next 5 years? If we continue to see the kind of progress we're currently seeing in deep reinforcement learning, I would be disappointed if we didn't see more successful applications of deep learning in contexts that involve interacting with users or adapting to their individual behavior. These could include any form of personalization, or even the ability for computers to understand and engage in certain forms of dialogue with humans.

What advancements excite you most in the field? Any advancement that aims at reducing the reliance of deep learning on large quantities of labelled data is exciting to me, as this reliance is a key (and stubborn) barrier to applying deep learning to more problems. This is one of the reasons why I've always been interested in unsupervised learning, as it could be part of the solution. Unsupervised learning is currently a very popular topic, but there are other subjects that I think are overlooked. I'm particularly interested in any progress on algorithms for lifelong learning (learning over very long time periods) and meta-learning (algorithms that learn how to learn). These are probably under-explored because they are quite hard. However, not only do they seem like key elements for ultimately achieving meaningful forms of AI, but progress in the short term could also facilitate the application of machine learning in practice.

Hugo Larochelle has spoken at several RE•WORK events throughout his career. View the most recent presentation here: