An Introduction to Reinforcement Learning

We were delighted to be joined by Lex Fridman at the San Francisco edition of the Deep Learning Summit, where he took part in both a ‘Deep Dive’ session, which allowed for a great amount of attendee interaction and collaboration, and a fireside chat with OpenAI Co-Founder & Chief Scientist, Ilya Sutskever. The MIT researcher shared his thoughts on recent developments in AI and its current standing, highlighting its growth in recent years.

“It’s amazing how much has happened in recent history. The industrial revolution was just 300 years ago; human beings started creating automated processes during this time. Since then, we have seen a gigantic leap toward machines that are learning for themselves!”

Lex then referenced Lee Sedol, the South Korean 9-dan Go player who, at the time, was the only human to have beaten a top AI system at the board game, a feat which has since become all but impossible. Lex described this as a seminal moment, one which changed the course of not only deep learning but also reinforcement learning, and increased public belief in this subfield of AI. Since then, of course, we have seen video games and strategy games, including StarCraft, become central to the development of AI.

The comparison of Reinforcement Learning to human learning is one we often come across, and Lex referenced it as something which needed addressing. Humans seemingly learn through “very few examples”, as opposed to the large data sets needed in AI, but why is that? Lex gave what he thought were the three possible reasons:

Hardware - 230 million years of bipedal movement data
Imitation Learning - Observation of other humans walking
Algorithms - Learning algorithms better than back-propagation and stochastic gradient descent

Furthermore, it was suggested, to much laughter, that the most surprising and profound idea in Deep Learning is that it works at all. It is still not fully understood how a network that is over-parameterised can learn a function in such a lean manner, but for now we should be impressed by the current direction!

Lex then moved on to the topic he is best known for discussing on his podcast and social media: autonomous driving. The segue came through a discussion of the rule-based systems used in self-driving cars, which allow for greater reliability. That said, Lex also suggested that, in its current state, the technology's ability to surprise is not something overly welcomed. Does this really differ from humans though?

“There hasn’t been a truly intelligent rule-based system which is reliable (in AI). However, human beings are not very reliable or explainable either.”

Lex broke this quote down further, suggesting that whilst we expect accuracy from what is essentially a black box, we also place far too much pressure on AI's development. Everyday human mistakes are forgiven when met with an excuse, yet it takes only one AI-based mishap for news articles, blogs and media outlets (mostly outside of the industry) to write off years of development. As humans, we don’t want to know the truth; we simply want to understand. When this isn’t possible, ideas and developments are often rejected as futile. Lex wittily remarked that DL needs to become more like the person in school who studies social rules, finding out what is cool and uncool to do or say, suggesting that the openness and honesty seen in current algorithms makes it easy for their problems to meet a disgruntled audience.

So what’s next? A naive question perhaps, especially when development of RL systems is ongoing. However, Lex suggested that explainable AI is the next buzzword, something considered ‘sexy’ in the industry right now, even if seemingly overused. So what gets one of the top minds in AI excited in 2020?

RL robotics / Human Behaviour Development
Working with Boston Dynamics and collaborating on RL
Self-play is not written about enough. One of the most important things in AI, and a really powerful idea, which can explore objectives and behaviours

Speaker Bio

Lex Fridman is a researcher at MIT, working on deep learning approaches in the context of semi-autonomous vehicles, human sensing, personal robotics, and more generally human-centered artificial intelligence systems. He is particularly interested in understanding human behavior in the context of human-robot collaboration, and engineering learning-based methods that enrich that collaboration. Before joining MIT, Lex was at Google working on machine learning for large-scale behavior-based authentication.
