Dr. Jürgen Sturm heads the machine learning efforts at Metaio GmbH, the world-leading Augmented Reality technology provider. He and his team research machine learning techniques, such as deep learning and random forests, to track and augment the human body in camera images. The goal of Metaio’s machine learning efforts is to create immersive virtual shopping experiences, for example, to try on sunglasses or earrings. As Jürgen will be speaking at the Deep Learning Summit in San Francisco next month, we caught up with him to hear his thoughts on deep learning and augmented reality.

What are the key factors that have enabled recent advancements in deep learning?

The appearance of huge training datasets such as ImageNet has boosted progress in this field enormously. Deep learning algorithms now enable us to train better classifiers and to adapt them more quickly to our customers’ demands. In my talk, I will showcase our solutions for detecting, tracking, and augmenting faces in real time for virtual shopping applications.

What are the main types of problems now being addressed in the Neural Network space?

Deep learning has proven very successful for image recognition and classification tasks, where the image data can be filtered layer by layer through the neural network. However, other problems involving geometric computation (such as SLAM) can be solved with ease by humans but are difficult for today’s CNN architectures.

What are the practical applications of your work and what sectors are most likely to be affected?

At Metaio, as the world-leading provider of augmented reality, we are constantly looking for methods to improve the customer experience in virtual shopping applications. This involves mapping and scene understanding (What does the world look like? What objects, persons, or faces are present in the scene, what do they look like, and where are they in the image?), camera re-localization and tracking (Where is the camera?), and light estimation (Where does the light come from?). For all of these tasks, we see huge potential for deep learning and other machine learning techniques.

What advancements excite you most in the field?

Deep learning has led to significant improvements in various recognition tasks in computer vision: performance has nearly doubled every year, while the size of datasets has grown almost exponentially. At the same time, the learned networks require more and more memory (and thus processing power). Therefore, we are actively researching methods to improve efficiency, so that these approaches become applicable to mobile devices and wearables. To achieve this, we are investigating both algorithmic solutions and power-efficient hardware implementations.

The Deep Learning Summit is taking place in San Francisco on 29-30 January. You can get more information and register here.