A California-based startup has developed a deep learning app for everyday consumers. The creators of the app, Orbeus, provide image and face recognition solutions for businesses through their Rekognition API engine. Their consumer product, Phototime, is a photo-tagging app with a difference: it makes sense of not just faces but also scenes and objects.
Their approach combines state-of-the-art deep learning techniques with proprietary modifications to cover a wide range of use cases, from traditional portraits to anything else you might photograph with your smartphone, such as whiteboard notes or ID cards. What sets Orbeus and Phototime apart from most other image recognition companies is the ability to go further than identifying just the face. Phototime can recognize, categorize and tag all aspects of a photo with startling accuracy, easily labelling photos with in-depth knowledge of ‘when’, ‘where’, and ‘what’.
At the RE.WORK Deep Learning Summit last month, Yi Li, CEO of Orbeus, discussed the technology behind Phototime and the Rekognition API. She also explained how the engine's performance is directly linked to the volume of high-quality data it is fed, a sentiment echoed by Greg Corrado, Senior Research Scientist at Google, who also spoke at the summit.
We caught up with Yi following her presentation to hear her thoughts on advancements in the field, and what we can expect to see in the future for deep learning and Orbeus.
What did you find most valuable at the Deep Learning Summit?
The extent and pace at which deep learning is impacting so many aspects of our lives. As the technology matures, practical solutions for everyday challenges are becoming possible. Orbeus' PhotoTime app is one example. We refined our deep learning technology to help you find the photos that you want, when you want, regardless of where you have them stored. We also offer face authentication and recognition services for businesses. At present, Orbeus relies on cloud computing for optimized results, which suits many of our customers' requirements. In the future, we hope to bring comparable quality to devices to meet the demand for running offline.
We enjoyed exchanging perspectives with Andrew Ng and many other attendees at the conference. In line with the consensus, we believe that killer applications are required for deep learning to be appreciated outside of the scientific community. Therefore, we continue to work hard to improve our technology, but also listen to our customers to meet their use-case demands.
What do you feel are the leading factors enabling recent advancements in deep learning?
We feel that hardware and data are the driving forces of this wave of the deep learning revolution. Outside of the scientific community, most people do not realize that deep learning algorithms are nothing new. Their power came to light in 2012 because more data and computing power became available. GPU providers like Nvidia made GPU computing a lot easier, which made it possible to train large networks on big data.
Developments in hardware, especially powerful GPUs, together with efficient open-source toolboxes such as Caffe and cuda-convnet, made it feasible to apply deep learning algorithms in practice. And because deep learning models have a large number of parameters, they require a large amount of training data to achieve good accuracy. So data, especially high-quality data with less noisy labels, is another very important factor in the advancement of deep learning.
Which industries do you think will be disrupted by deep learning in the future?
I think that deep learning will be used in many areas of our lives, even some that we haven't yet imagined. Voice recognition software, like that from Siri, Google and Nuance, has quickly become established. We will likely see a lot more smart home products soon, and cars too. Any area with data, especially unstructured data, would be an excellent candidate. Examples include financial, medical or commercial applications.
Deep learning has already shown its great power in speech and vision. With the integration of natural language processing, the advertising and entertainment industries will likely see disruption in the near term. Once deep learning can understand text, photo and video content, more accurate analysis of online activity will be possible, and thus more targeted advertisements can be served.
What do you feel is essential to future progress?
1. Exchanging ideas about which other areas deep learning can thrive in.
2. GPUs with far better computational power and larger memory, as well as direct multi-GPU communication, which will enable better DNN models that can handle data at tremendous scale.
3. Learning how to obtain useful data from the explosion of unstructured and noisy data, and developing algorithms that can actively seek useful structure or patterns in an unsupervised way.
What developments can we expect to see in deep learning in the next 5 years?
We believe that:
1. Object recognition will be much more accurate
2. Less power-consuming GPUs or mobile computational chips will be available
3. Limitations in visual recognition and speech recognition will be overcome
4. More useful applications based on deep learning will appear
What do you see in the future for Orbeus?
Orbeus will take an algorithmic approach to help keep your memories accessible and safe, so you never need to worry about losing them. We will do this by developing a cloud-local AI system that analyzes all of your personal photos and videos across different platforms. Orbeus will make all of your memories searchable and will understand your life and priorities more comprehensively.
Super Early Bird tickets are now available for the next Deep Learning Summit, which will take place in Boston this May.