Emotions influence every aspect of our lives, from how we connect and communicate to our health, wellbeing and decision-making. Deep insight into consumers' unfiltered, unbiased emotional reactions to digital content is therefore an ideal way to judge that content's likability and effectiveness. Affectiva's vision is to bring emotion intelligence to our digital experiences. For the first time ever, the company can capture people's facial emotions globally and at scale, and it has amassed the world's largest emotion data repository: 2.6 million faces collected from over 75 countries, amounting to over 8 billion emotion data points. This is data that has never existed before, and it is being used by businesses around the world to derive insights into how people engage with digital content and to drive real-time interactions based on a user's emotional state.

Daniel McDuff is Principal Research Scientist at Affectiva, where he builds and applies scalable computer vision and machine learning tools for the automated recognition and analysis of emotions and physiology, leading the analysis of the largest database of human emotions. At the Deep Learning Summit, Daniel will present and discuss the value of adding emotional intelligence to digital experiences. We caught up with him ahead of the summit next month to hear his thoughts on recent advancements in deep learning, and what we can expect in the future.

What are the key factors that have enabled recent advancements in deep learning?

The volume of labeled (and unlabeled) training data has been a major factor in the recent advancements in deep learning. Although almost all learning methods benefit from greater numbers of training examples, deep learning methods have demonstrated particular success at leveraging large corpora of data. In the field of computer vision, and specifically object detection, the publicly available ImageNet dataset has played a significant role in advancing the state of the art. Such datasets act both as huge training corpora that most researchers would not be able to compile themselves and as benchmarks for comparing approaches. In addition, the computing power that is now available, and tools (such as cuDNN) that allow researchers to easily leverage GPU performance, are key factors. It is no longer necessary for researchers to have access to hardware as costly as the "Google Brain".

What are the main types of problems now being addressed in the deep learning space?

There are many problems currently being addressed in the deep learning space. Object detection and speech recognition are arguably the two research problems that have received the most attention. However, the success of deep learning techniques within those areas has led to them being applied in many other domains. This is only likely to increase in the coming years, as deep learning models are now easier to train and are consistently outperforming other methods.

What are the practical applications of your work, and what sectors are most likely to be affected?

Affectiva is currently leveraging deep learning for the automated detection of facial expressions and emotions. Facial expression detection is a challenging computer vision problem, especially when, as is the case in real life, the muscle movements are very subtle and vary in appearance from person to person. Deep learning methods offer the possibility of further improving the accuracy of our expression detection software and allowing us to detect a greater number of behaviors. We are applying emotion recognition in a number of sectors, including media measurement, mobile gaming and education. However, we believe the technology will have value in many other domains too.
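To make the expression-detection problem concrete, here is a minimal sketch of how a small convolutional network for facial expression classification might be set up. It is written in PyTorch purely for illustration; the architecture, input size and expression labels are all assumptions on our part, not Affectiva's actual model.

```python
# Minimal sketch (PyTorch): a small CNN that maps a preprocessed, grayscale
# face crop to a distribution over basic expression classes.
# Architecture, input size and labels are illustrative, not Affectiva's model.
import torch
import torch.nn as nn

EXPRESSIONS = ["neutral", "happy", "sad", "surprise", "anger"]  # example labels

class ExpressionNet(nn.Module):
    def __init__(self, num_classes=len(EXPRESSIONS)):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),  # 48x48 face crop in
            nn.ReLU(),
            nn.MaxPool2d(2),                             # -> 24x24
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                             # -> 12x12
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 12 * 12, 128),
            nn.ReLU(),
            nn.Linear(128, num_classes),                 # one logit per expression
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = ExpressionNet()
face = torch.randn(1, 1, 48, 48)           # stand-in for a detected, aligned face
probs = torch.softmax(model(face), dim=1)  # per-expression probabilities
```

In a real pipeline, a face detector and alignment step would normally run before classification, and the subtle, person-to-person variation Daniel describes is exactly what motivates training far larger models on far larger datasets than this toy example suggests.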
What developments can we expect to see in deep learning in the next 5 years?

Deep learning is already promising to be the dominant form of machine learning within computer vision, speech analysis and a number of other areas. I hope that the ability to build accurate recognition systems with the computing power available from one or two GPUs will allow researchers to develop and deploy new software in the real world. I expect that more focus will be given to unsupervised and semi-supervised training algorithms, as the amount of data only continues to increase.

What advancements excite you most in the field?

The ability to leverage large amounts of unlabeled data is extremely exciting for the field of affective computing. We are now able to collect large amounts of video data; however, human video coding is still a very time-consuming, expensive and challenging obstacle for emotion research. Any techniques that show the potential for learning from unlabeled, semi-labeled or noisily labeled data are exciting prospects.
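As a rough illustration of the semi-supervised direction Daniel describes, the sketch below pretrains an encoder on unlabeled face images by reconstruction, then fine-tunes it on a much smaller labeled set. Every layer size, class count and tensor here is a hypothetical stand-in, chosen only to show the two-stage pattern.

```python
# Minimal sketch (PyTorch): one common way to exploit unlabeled faces is to
# pretrain an encoder by reconstruction, then fine-tune it on the small
# human-coded set. All names and sizes are illustrative assumptions.
import torch
import torch.nn as nn

encoder = nn.Sequential(            # shared feature extractor
    nn.Flatten(),
    nn.Linear(48 * 48, 256),
    nn.ReLU(),
)
decoder = nn.Linear(256, 48 * 48)   # reconstruction head, used only in stage 1

# Stage 1: unsupervised pretraining on unlabeled frames (no human coding needed).
unlabeled = torch.rand(64, 1, 48, 48)           # stand-in for unlabeled face crops
recon = decoder(encoder(unlabeled))
pretrain_loss = nn.functional.mse_loss(recon, unlabeled.flatten(1))
pretrain_loss.backward()   # gradients for encoder + decoder (optimizer step omitted)

# Stage 2: supervised fine-tuning on the (much smaller) labeled set.
classifier = nn.Linear(256, 5)                  # e.g. five expression classes
labeled = torch.rand(8, 1, 48, 48)              # stand-in for human-coded faces
labels = torch.randint(0, 5, (8,))
logits = classifier(encoder(labeled))
finetune_loss = nn.functional.cross_entropy(logits, labels)
finetune_loss.backward()   # gradients for encoder + classifier (step omitted)
```

The appeal of this pattern for emotion research is that stage 1 scales with cheap unlabeled video, while the expensive human video coding is reserved for the small stage-2 set.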
The Deep Learning Summit is taking place in Boston on 26-27 May. For more information and to register, please visit the event website.

Join the conversation with the event hashtag #reworkDL