Neural Abstraction Pyramid to Semantic RGB-D Perception

Dr. Sven Behnke is Professor and Head of Computer Science at the University of Bonn, where he also heads the Autonomous Intelligent Systems group. At the RE.WORK Deep Learning Summit in London, Sven will discuss the Neural Abstraction Pyramid, a deep learning architecture in which layer-by-layer unsupervised learning creates increasingly abstract image representations. The presentation will also cover more recent work on deep learning for object-class segmentation of images and semantic RGB-D perception.

We caught up with Sven ahead of the summit on 24-25 September to hear more about his work and the recent advancements in deep learning.

What are your recent developments on the Neural Abstraction Pyramid?
Our recent developments include discriminative training through max-pooling units, object-class segmentation for color and RGB-D images, training with a structured loss for object detection, and the application of recurrent deep networks to surface categorization in RGB-D video.

What sectors will be impacted by your research in deep learning for object-class segmentation of images and semantic RGB-D perception?
The biggest impact of our work will be on autonomous intelligent systems, in particular robots. The categorization of surfaces and the detection and pose estimation of objects are necessary for planning robot actions in complex, semi-structured domains.

What are the key factors that have enabled recent advancements in deep learning?
I see three key factors for the recent advances in deep learning. First, algorithmic advances, such as discriminative training through max-pooling units, drop-out regularization, and deep architectures that better match the statistics of the data, have improved performance on existing problems. Second, the collection and annotation of large data sets, such as ImageNet, created new challenges in which deep learning of non-linear, convolutional feature extraction outperformed other methods. Finally, the availability of general-purpose GPUs and other affordable parallel computers made it possible to train deep architectures on large data sets and to optimize their hyper-parameters.

What are the main types of problems now being addressed in the deep learning space?
Deep learning methods are mostly applied to pattern recognition problems, such as image categorization, object detection in images, object-class segmentation of images, and speech recognition.

Which industries do you think will be disrupted by deep learning in the future? And how?
Deep learning will have a big impact on industries that currently rely heavily on human perception. An example of this is road transportation, where human drivers are needed to perceive the traffic situation and to control a vehicle. Perception systems based on deep learning from multiple sensor modalities will reliably create environment models for planning autonomous driving. Another industry where similar techniques will have a big impact is surveillance, e.g. of public spaces, where scene understanding based on deep learning could replace human operators monitoring surveillance video.

What developments can we expect to see in deep learning in the next 5 years?
I expect deep learning methods to be applied to increasingly multi-modal problems with more structure in the data. This will open new application domains for deep learning, such as robotics, data mining, and knowledge discovery.

What advancements excite you most in the field?
I find the recent trend towards recurrent networks exciting. Scientific developments often advance in a spiral-like pattern. After the re-discovery of feed-forward convolutional neural networks, the use of recurrence is a logical next step for the processing of video streams, the iterative interpretation of ambiguous stimuli, and the modelling of recursive structure in the data.

The Deep Learning Summit is taking place in London on 24-25 September, alongside the Future Technology Summit.

Early Bird ticket prices end this week - book your place by Friday 31 July to save £200! For more information and to register, please visit the event website.


Community News - 5 Aug 2015
