Reinforcement Learning in Problems with Continuous Action Spaces : a Comparative Study

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Axel Larsson; [2021]

Keywords: Actor-critic; deep learning; machine learning; reinforcement learning; Q-learning; Aktör-kritiker; djupinlärning; förstärkningsinlärning; maskininlärning; Q-inlärning

Abstract: Reinforcement learning (RL) is one of the three main areas of machine learning (ML) and rests on a solid theoretical foundation. RL can provide solutions to many real-world applications, such as self-driving cars and protein folding. One class of RL problems, in which an infinite number of actions is available from each state, has recently received significant attention: infinite action space RL problems. Several standard algorithms exist for RL problems, and choosing a suitable one for the nature of a given problem can be challenging. To compare RL algorithms, we implement them on different tasks and record the relevant results. To make the comparison fair, we tune the algorithms beforehand, iteratively testing and updating them. This study compares four different RL algorithms. Our results show that the RL algorithms that store the steps of their path, or that have a model of the environment, have the highest rate of convergence. By updating the value of every step of the path after a reward, instead of only looking backward a single step, these algorithms find a solution faster and more often. Having a model to help the algorithm plan ahead also contributed to faster and more stable learning. RL algorithms that use a deep neural network for evaluation are the least stable. These results can provide a good basis for selecting appropriate algorithms for infinite action space RL problems. They can also be built upon, making it simpler for researchers to develop improvements to the RL algorithms that exist today.
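The contrast the abstract draws between updating every step of the path and looking backward only a single step can be illustrated with a small sketch. The abstract does not name the four algorithms compared, so the code below is not the essay's implementation. It is a tabular toy example on an assumed five-state chain, with hypothetical function names, an assumed learning rate `alpha`, and an assumed discount factor `gamma`. It uses a one-step temporal-difference update for the single-step case and a Monte Carlo-style backward pass over the stored path for the full-path case.

```python
def one_step_update(values, path, rewards, alpha=0.5, gamma=0.9):
    """One-step (TD(0)-style) update: each state learns only from its successor."""
    v = list(values)
    for t in range(len(rewards)):
        s, s_next = path[t], path[t + 1]
        target = rewards[t] + gamma * v[s_next]
        v[s] += alpha * (target - v[s])
    return v


def full_path_update(values, path, rewards, alpha=0.5, gamma=0.9):
    """Full-path (Monte Carlo-style) update: walk the stored path backward
    and move every visited state toward its discounted return."""
    v = list(values)
    g = 0.0  # discounted return accumulated from the end of the episode
    for t in reversed(range(len(rewards))):
        g = rewards[t] + gamma * g
        s = path[t]
        v[s] += alpha * (g - v[s])
    return v


# Assumed toy episode: states 0..4 visited in order, state 4 is terminal,
# and the only reward arrives on the final transition.
path = [0, 1, 2, 3, 4]
rewards = [0.0, 0.0, 0.0, 1.0]
initial = [0.0] * 5

td_values = one_step_update(initial, path, rewards)
mc_values = full_path_update(initial, path, rewards)

print("one-step :", td_values)
print("full path:", mc_values)
```

After this single episode, the one-step variant has changed only the value of the state directly before the goal, while the full-path variant has assigned a positive value to all four non-terminal states. This is one plausible reading of why path-storing algorithms would find a solution in fewer episodes.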
