Deep Reinforcement Learning - What’s all the fuss about?

Deep Reinforcement Learning (DRL) is praised as a potential answer to a multitude of application-based problems previously considered too complex for a machine. Resolving these problems could bring wide-scale advances across industries including, but not limited to, healthcare, robotics and finance.

Deep Reinforcement Learning combines Deep Learning and Reinforcement Learning, but how do the two fit together, and how does adding Deep Learning improve Reinforcement Learning? The term Reinforcement Learning has been around for some time, but over the last few years we have seen growing excitement around DRL, and rightly so. Simply put, a Reinforcement Learning agent becomes a Deep Reinforcement Learning agent when layers of artificial neural networks are used somewhere within its algorithm, typically to approximate the agent's policy or value function. The easiest way of understanding DRL, as suggested in Skymind's guide to DRL, is to consider it in a video game setting. Before jumping headfirst down the DRL rabbit hole, I will first cover some key concepts which will be referenced throughout.

1. Agent - The agent is just as it sounds, an entity which takes actions. You are an agent, and for the benefit of this blog, Pacman is an agent. The agent itself learns by interacting with an environment, but more on this later.

2. Environment - The environment is the world, physical or simulated, in which the agent operates and performs: the setting of a game, for example.

3. State - The state is the immediate situation an agent finds itself in: a specific configuration of environmental inputs at a given moment, whether that moment is now or at some point in the future.

4. Action - An action is one of the possible decisions or movements an agent can make. In Pacman (I know, bear with me), this could be the decision to move up, down, left or right in order to avoid losing a life.

5. Reward - The reward is the feedback, both positive and negative, which allows the agent to understand the consequences of its actions in the current environment it finds itself in.
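To make these five concepts a little more concrete, here is a minimal Python sketch of how they might fit together. It is purely illustrative: `CorridorEnv`, `RandomAgent` and `run_episode` are hypothetical names invented for this blog, the reward values are arbitrary, and the agent does no learning at all yet, it simply acts at random.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: a corridor of cells 0..size-1.

    The agent starts in cell 0 and the episode ends at the last cell.
    """

    ACTIONS = ("left", "right")

    def __init__(self, size=5):
        self.size = size
        self.state = 0

    def reset(self):
        """Put the agent back at the start and return the initial state."""
        self.state = 0
        return self.state

    def step(self, action):
        """Apply an action and return (next_state, reward, done)."""
        move = 1 if action == "right" else -1
        # Walls at both ends: the agent cannot leave the corridor.
        self.state = min(max(self.state + move, 0), self.size - 1)
        done = self.state == self.size - 1
        # Assumed reward scheme: small penalty per move, bonus at the goal.
        reward = 1.0 if done else -0.1
        return self.state, reward, done


class RandomAgent:
    """An agent that ignores the state and picks actions at random."""

    def __init__(self, actions, seed=0):
        self.actions = actions
        self.rng = random.Random(seed)

    def act(self, state):
        return self.rng.choice(self.actions)


def run_episode(env, agent, max_steps=1000):
    """Run one episode and return (total_reward, steps_taken)."""
    state = env.reset()
    total_reward = 0.0
    for steps in range(1, max_steps + 1):
        action = agent.act(state)               # the agent chooses an action
        state, reward, done = env.step(action)  # the environment responds
        total_reward += reward                  # the reward is the feedback
        if done:
            return total_reward, steps
    return total_reward, max_steps


if __name__ == "__main__":
    total, steps = run_episode(CorridorEnv(), RandomAgent(CorridorEnv.ACTIONS))
    print(f"Finished after {steps} steps with total reward {total:.1f}")
```

Each pass through the loop is one round of the cycle described above: the agent observes a state, takes an action, and the environment hands back a new state and a reward.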

DRL is often compared to video games due to its similarity in processes. Let me explain: imagine that you are playing your favourite video game and you find yourself easily completing all levels on medium difficulty, so you decide to step up. The chances are, the higher difficulty will cause you to regularly fail due to unforeseen obstacles which present themselves. If you’re persistent, you will learn the obstacles and challenges throughout each level, which can then be beaten or avoided the next time. Not only this, but you would be learning each step in the hope of receiving maximum reward, be it coins, fruit, extra lives etc. Reinforcement learning works in much the same way: the agent evaluates its current state in relation to the environment and gathers feedback throughout, both positive (gaining more points in a level) and negative (losing lives and having to start again). Through this trial and error, the agent learns the most rewarding route to victory, all the while teaching itself the best method for each task. This culminates in a near-perfect methodology for the task, increasing efficiency and productivity.

Image Source - Sutton & Barto - Reinforcement Learning: An Introduction
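For the curious, the trial-and-error process described above can be sketched in a few lines of Python. This is only a minimal illustration using tabular Q-learning, one classic reinforcement learning algorithm among many. The corridor world, the reward values and the hyperparameters (`alpha`, `gamma`, `epsilon`) are all assumptions made up for this example, and there is nothing "deep" about it yet: a DRL agent would replace the table `q` with a neural network.

```python
import random

N_CELLS = 6          # hypothetical corridor: cells 0..5, goal at cell 5
ACTIONS = (-1, +1)   # move left or move right


def step(state, action):
    """Toy environment dynamics: return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_CELLS - 1)
    done = next_state == N_CELLS - 1
    # Assumed rewards: +1 for reaching the goal, -0.1 for every other move.
    reward = 1.0 if done else -0.1
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a table of action values by trial and error (Q-learning)."""
    rng = random.Random(seed)
    # q[state][i] estimates the long-term reward of ACTIONS[i] in that state.
    q = [[0.0 for _ in ACTIONS] for _ in range(N_CELLS)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: occasionally try a random action.
                a = rng.randrange(len(ACTIONS))
            else:
                # Exploit: pick the best-known action, breaking ties randomly.
                best = max(q[state])
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Move the estimate towards reward + discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action for every non-goal cell."""
    return [ACTIONS[row.index(max(row))] for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

After enough episodes, the learned policy should favour moving right in every cell, which is the shortest route to the goal. The agent is never told this; it works it out from the rewards alone, much like a persistent player learning a level.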

So, what’s all of the fuss about?

DRL has the potential to be used for a range of tasks, including those which could improve the quality of hospital care and enhance quality of life, for example through more patient care and more regularly scheduled appointments. The increase in automation could also see an end to tedious and exhausting tasks being undertaken by humans and, if we are to believe reports, it will not stop there, with over half of today’s work activities in the automation firing line by 2055. That said, we are still at the very first steps of DRL. It must also be recognised that, in its infancy, DRL has issues: many claim that it does not yet work and that, whilst the hype should be acknowledged, a huge amount of further research is needed. Can it solve all your problems? Well, that depends on who you ask, but there is no doubting the huge potential, which could create wide-scale changes in the next few years!

The Deep Reinforcement Learning Summit is set to take place in San Francisco in June, bringing together the brightest minds currently working in the field to discuss and present the latest industry research, theoretical breakthroughs and application methods. From imitation and multi-task learning to swarm robotics and end-to-end learning, the summit will showcase academic advancements in DRL as well as their impact in business & industry. Where do the challenges still lie in research? How can companies leverage progress? Have all your questions and more answered in San Francisco!

If you have any questions or would like any further DRL-focused blogs, message Luke on our livechat. Any feedback on our blogs is always welcome!

For more information on the agenda and speakers currently confirmed, visit the agenda. Secure your ticket to the summit before May 3rd to receive over 20% off your pass.