Reinforcement Learning for Pickup and Delivery Systems

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Erika Sandhagen; Sarah Magnusson; [2023]

Abstract: This project studied multi-agent reinforcement learning (RL) for a warehouse environment in which robots deliver packages. The RL algorithm Q-learning was implemented first, and the effect of its parameters on performance was investigated. Q-learning was successfully implemented with both centralized and decentralized learning. The performance of these implementations was compared, and the results show faster convergence for the decentralized version. Because of the large memory requirements of the Q-learning implementation, only relatively small environments could be simulated. To address this limitation, a Deep Q-Network (DQN) was implemented in an attempt to achieve scalability. Unfortunately, initial problems with the convergence of the network, followed by long run-times, left little time for studying DQN, and its scalability was not investigated thoroughly. Convergence of DQN was achieved and confirmed for a 3x3 grid with one agent and one package. Although the results are not fully sufficient, DQN seems to be a promising extension of Q-learning for the environment presented in the project, given more tuning of the network's hyperparameters and a more efficient implementation of the environment.
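For readers unfamiliar with tabular Q-learning, the following is a minimal sketch of the technique on a 3x3 grid with one agent and one package. It is not the authors' implementation: the grid layout (`START`, `PACKAGE`, `DROPOFF`), the reward values, the automatic pickup rule, and the hyperparameters are all assumptions chosen for illustration.

```python
import random

# Hypothetical environment: layout, rewards and pickup rule are assumptions.
SIZE = 3
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
START, PACKAGE, DROPOFF = (0, 0), (2, 2), (0, 2)


def step(state, action):
    """Apply one move and return (next_state, reward, done)."""
    (row, col), carrying = state
    d_row, d_col = ACTIONS[action]
    # Moves that would leave the grid keep the agent in place.
    row = min(max(row + d_row, 0), SIZE - 1)
    col = min(max(col + d_col, 0), SIZE - 1)
    if not carrying and (row, col) == PACKAGE:
        carrying = True  # the package is picked up on entering its cell
    if carrying and (row, col) == DROPOFF:
        return ((row, col), carrying), 10.0, True  # delivery ends the episode
    return ((row, col), carrying), -1.0, False  # each other step costs 1


def greedy_action(q, state):
    """Return the action with the highest stored value (unseen entries are 0)."""
    return max(range(len(ACTIONS)), key=lambda a: q.get((state, a), 0.0))


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {}  # maps (state, action) to an estimated return
    for _ in range(episodes):
        state = (START, False)
        for _ in range(50):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                action = greedy_action(q, state)
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else q.get((nxt, greedy_action(q, nxt)), 0.0)
            old = q.get((state, action), 0.0)
            # Q-learning update: move the estimate toward the bootstrapped target.
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
            if done:
                break
    return q


def greedy_steps(q, limit=20):
    """Follow the greedy policy; return steps until delivery, or None."""
    state = (START, False)
    for n in range(1, limit + 1):
        state, _, done = step(state, greedy_action(q, state))
        if done:
            return n
    return None
```

Calling `greedy_steps(train())` reports how many moves the learned policy needs to deliver the package. In this sketch the table holds at most 9 positions x 2 carrying flags x 4 actions = 72 entries; a table indexed by the joint state of several agents and packages grows much faster, which is consistent with the memory limitation the abstract describes.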
