This guest article was written by Susan Kuchinskas, a journalist and author who covers science and technology. Visit her website to read more of her articles.

Driverless cars will need to make the same complex, split-second decisions in traffic that people do before we set those autonomous vehicles loose on public roads. A new way of training automotive vision systems, built on an artificial intelligence technique called "deep learning," could shorten the road to full autonomy.
Deep learning is an area of machine learning that uses ranks of processors formed into layered neural networks, enabling computer systems to learn from huge amounts of data five to 20 times faster than earlier machine-learning methods.
Before deep learning, engineers trained automotive computer-vision systems by showing them images and video of the things they wanted them to recognize. The method is laborious: the set of images used to train recognition of a stop sign, for example, must include shots taken from many angles and in many placements. Even then, it's not possible to train a computer-vision system to recognize every eventuality. Deep learning enables the system to infer that an object is a stop sign after seeing just one example.
Deep learning leverages layers of computing processors working in parallel. Each processor in a layer works on just a small part of the image-recognition problem. The results from that layer are combined and passed up to the next layer, which performs slightly more complex tasks. If the image is a stop sign, for example, the first layer of artificial neurons might identify the edges of all the objects in the image. The next layer might group those edges into shapes, while succeeding layers would aggregate the shapes into the whole object.
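To make the layering concrete, here is a minimal sketch in Python (an illustration, not any vendor's production code): a first "layer" detects edges with a tiny hand-made convolution kernel, and a second "layer" pools those edge responses into a coarser shape map. The image, kernel, and layer sizes are all assumptions chosen for clarity; a real network learns its kernels from data and stacks many more layers.

```python
import numpy as np

def conv2d(image, kernel):
    """Naive 2-D convolution (valid padding): each output value
    combines one small patch of the input, mimicking how a neuron
    in the first layer sees only part of the image."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(x, size=2):
    """Second 'layer': aggregate neighboring responses into a coarser map."""
    h, w = x.shape
    return x[:h - h % size, :w - w % size].reshape(
        h // size, size, w // size, size).max(axis=(1, 3))

# A toy 6x7 image: dark region on the left, bright region on the right
image = np.zeros((6, 7))
image[:, 3:] = 1.0

# Layer 1: a horizontal-gradient kernel lights up where brightness changes
edges = np.abs(conv2d(image, np.array([[1.0, -1.0]])))

# Layer 2: pool the edge map down to a coarser "shape" summary
pooled = max_pool(edges)
```

Running this, `edges` is nonzero only along the vertical boundary, and `pooled` keeps that boundary while shrinking the map, which is the edges-to-shapes aggregation described above.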
The use of deep learning got its start in the late 1990s, when Yann LeCun, now director of artificial intelligence research at Facebook, used it to perform handwritten-character recognition. It got hot in 2012, when a team from the University of Toronto led by Alex Krizhevsky (now at Google) used deep learning to place first in the ImageNet Large Scale Visual Recognition Challenge, an international competition organized by Stanford University.
Google recently elevated deep learning to buzzword status with the release of the source code for Deep Dream, a way of visualizing what's going on in each layer of a neural network. This flooded the Internet with bizarre images of dog-snails and eye-studded pagodas.
Krizhevsky's team made three innovations, according to Mike Houston, a distinguished engineer working on deep learning for chipmaker NVidia. They used a much deeper neural network, meaning many more layers of processing. The researchers also randomly deactivated some of the artificial neurons during training, a regularization technique known as dropout, and they used faster, more powerful chips designed for graphics (GPUs) instead of standard central processing units (CPUs). "It's that work that is largely credited with bringing back modern neural networks," Houston says. "Now we see it everywhere."
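The random elimination Houston describes can be sketched in a few lines of plain Python (an illustration of the idea, not the team's actual implementation): during training, each activation is zeroed with probability p, and the survivors are scaled up so the layer's expected output is unchanged when dropout is switched off at test time.

```python
import random

def dropout(activations, p=0.5, training=True):
    """Inverted dropout: during training, zero each activation with
    probability p and scale survivors by 1/(1-p) so the expected
    sum of the layer's output stays the same."""
    if not training:
        return list(activations)  # at test time, pass activations through unchanged
    return [0.0 if random.random() < p else a / (1.0 - p)
            for a in activations]

layer_output = [0.5] * 8
thinned = dropout(layer_output, p=0.5)  # roughly half become 0.0, the rest 1.0
```

Because a different random subset of neurons is silenced on every training pass, no single neuron can be relied on too heavily, which helps the network generalize rather than memorize its training images.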
Better Chips, Bigger Tasks

Cheaper, more powerful GPUs let automakers expand the scope of their autonomous ambitions.
"You can't do serious work on deep learning and large-scale problems without GPUs," says Yoshua Bengio, a professor in the University of Montreal's Department of Computer Science and Operations Research and head of the university's Montreal Institute for Learning Algorithms (MILA). "With the kind of models we train for object recognition, such as those for autonomous cars, it might take two weeks with a GPU. With a CPU, that might be one year – something nobody wants to do." Another big advance, according to Bengio, came when NVidia began publishing code libraries that engineers could use instead of having to write their own low-level code.
In March, NVidia announced Drive PX, describing it as a self-driving car computer powered by deep learning. The development kits include a deep neural network (a "brain") that's already trained by showing it a variety of images so that, for example, it can distinguish between a delivery van and an ambulance. Automakers and suppliers can use Drive PX to help perfect their proprietary computer-vision systems.
GPUs are not the only way to approach deep learning for automotive computer vision systems. Tesla Motors, General Motors, Audi and other carmakers are using Mobileye's vision-processing system on a chip, which includes software to integrate it into vehicle computer-vision systems. CogniVue provides chips for computer vision systems that perform deep learning functions on high-resolution video in automotive systems. These purpose-built chips provide an alternative to GPUs.
New New Thing

Automakers and their suppliers are either already using deep learning in their advanced computer-vision systems or working with it in R&D. Audi says deep learning is at the core of its autonomous-vehicle development efforts. NVidia says it's working with more than 50 automakers, including Tesla.
GM is using the technology in its automated and autonomous driving research. "With conventional computer-vision methodology, there is only so much we can do," says Cem Saraydar, a director in GM's global R&D organization. "Deep learning will lead to significantly improved results."
Volvo Cars sees potential for deep learning beyond object detection. Says Erik Coelingh, Volvo's senior technical leader for safety and driver support technologies: "With deep learning there is the possibility to get information on the intention of the pedestrian, for example, if someone is about to cross the road."
Google didn't make researchers available for this article, but Google researcher Anelia Angelova, along with Krizhevsky and others, demonstrated a faster method of detecting pedestrians on the street using a combination of three GPU-based neural networks.
Google's team won the 2014 ImageNet Challenge with an error rate of 6.8 percent, but a trained human beat that with an error rate of 5.1 percent. In February 2015, a team from Microsoft bested both Google's system and the human baseline on the same task, with an error rate of just 4.94 percent. In other words, computers are now better than humans at this object-recognition task. Maybe they will be better drivers.
Learn more about the collision of connected cars, IoT and machine learning at RE.WORK Connect Summit in San Francisco on 12-13 November.

Image by Josh Sniffen, altered by DreamDeeply.com.
This is a guest blog and may not represent the views of RE.WORK. As a result, some opinions may even go against the views of RE.WORK, but they are posted in order to encourage debate and well-rounded knowledge sharing, and to allow alternate views to be presented to the RE.WORK community.