Deep Reinforcement Learning for the Popular Game tag

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: Reinforcement learning can be compared to howhumans learn – by interaction, which is the fundamental conceptof this project. This paper aims to compare three differentlearning methods by creating two adversarial reinforcementlearning models and simulate them in the game tag. The threefundamental learning methods are ordinary Q-learning, Deep Qlearning(DQN), and Double Deep Q-learning (DDQN).The models for ordinary Q-learning are built using a table andthe models for both DQN and DDQN are constructed by using aPython module called TensorFlow. The environment is composedof a bounded square with two obstacles and two agents withadversarial objectives. The rewards are given primarily basedon the distance between the agents.By comparing the trained models it was established that onlyDDQN could solve the task well and generalize, whilst both theQ-model and DQN had more serious flaws. A comparison ofthe DDQN model against its average reward trends establishedthat the model still improved regardless of the constant averagereward.Conclusively, DDQN is the appropriate choice for this adversarialproblem whilst Q-learning and DQN should be avoided.Finally, a constant average reward can be caused by bothagents improving at a similar rate rather than a stagnation inperformance.

  AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)