Deep reinforcement learning in distributed optimization

University essay from KTH/Skolan för teknikvetenskap (SCI)

Author: Marcus Lindström; Jahangir Jazayeri; [2018]


Abstract: Reinforcement learning has recently become a promising area of machine learning, with significant achievements. Recent successes include surpassing human experts on Atari games and AlphaGo becoming the first computer program ranked at the highest professional level in the game of Go, to mention a few. This project aims to apply policy gradient methods (PGM) in a multi-agent environment. PGM are widely regarded as applicable to more problems than, for instance, Deep Q-Learning, but they tend to converge to local optima. In this report we explore whether an optimal policy is achievable with PGM in a multi-agent framework. Numerical simulations implementing the method in an environment with up to 4 agents and moving obstacles showed convergence and demonstrated the efficiency of the approach. Relatively few collisions took place when the trained agents were tested. These results varied with parameters such as the learning rate and the number of neurons in the neural network. The conclusion was that very fast convergence to at least a locally optimal policy was achieved in this setting.
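
For readers unfamiliar with policy gradient methods, the following is a minimal sketch of the basic idea (a REINFORCE-style update with a softmax policy) on a toy three-armed bandit. It is not the essay's implementation or environment: the function names, the arm means, the learning rate, and the running-average baseline are all assumptions chosen for illustration, and the essay's multi-agent setting with neural-network policies is considerably richer.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_reinforce(arm_means, steps=5000, learning_rate=0.1, seed=0):
    """REINFORCE on a hypothetical bandit with Gaussian rewards.

    arm_means, steps, learning_rate and seed are illustrative
    assumptions, not values taken from the essay.
    """
    rng = random.Random(seed)
    prefs = [0.0] * len(arm_means)  # policy parameters, one per action
    baseline = 0.0                  # running average reward (variance reduction)
    for t in range(1, steps + 1):
        probs = softmax(prefs)
        action = rng.choices(range(len(prefs)), weights=probs)[0]
        reward = rng.gauss(arm_means[action], 0.5)
        baseline += (reward - baseline) / t
        advantage = reward - baseline
        for a in range(len(prefs)):
            # Gradient of log pi(action) with respect to prefs[a]
            # for a softmax policy: indicator(a == action) - probs[a].
            grad_log = (1.0 if a == action else 0.0) - probs[a]
            prefs[a] += learning_rate * advantage * grad_log
    return softmax(prefs)


final_probs = train_reinforce([0.0, 0.5, 2.0])
print(final_probs)
```

The update only uses rewards from actions the policy actually samples, so once probability mass concentrates on one action the others are rarely explored. This is one intuition for why such methods can settle on a locally optimal policy.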
