Reinforcement Learning and the Game of Nim

University essay from KTH/Skolan för teknikvetenskap (SCI)

Author: William Lord; Paul Graham; [2015]


Abstract: This paper treats the concept of Reinforcement Learning (RL) applied to finding the winning strategy of the mathematical game Nim. Two algorithms, Q-learning and SARSA, were compared using several different sets of parameters in three different training regimes. An analysis of scalability was also undertaken. It was found that tuning parameters for optimality is difficult and time-consuming, yet the RL agents did learn a winning strategy, in essentially the same time for both algorithms. As for scalability, the analysis showed that increased learning time is indeed a problem in this approach. The relevance of the different training regimes as well as other conceptual matters of the approach are discussed. It is concluded that this usage of RL is a promising method, although it is laborious to optimize in this case and quickly becomes ineffective when the problem is scaled up. Ideas for future research on overcoming these limiting factors are discussed and proposed.
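
For readers unfamiliar with the two algorithms named in the abstract, the following is a minimal sketch of how their tabular update rules differ. It is not taken from the essay: the state encoding (a tuple of heap sizes), the action encoding (heap index, number removed), the default values of alpha and gamma, and all function names are assumptions made for illustration. The opponent's reply is assumed to be folded into the environment, so s_next is the position the agent sees on its next turn. The nim_sum helper gives the known closed-form criterion for Nim (under the normal play convention, where taking the last object wins, a position is lost for the player to move exactly when the XOR of the heap sizes is zero), which is the kind of reference a learned strategy could be checked against.

```python
from collections import defaultdict
from functools import reduce
from operator import xor


def nim_sum(heaps):
    """XOR of all heap sizes; zero means the player to move loses (normal play)."""
    return reduce(xor, heaps, 0)


def legal_actions(heaps):
    """All (heap_index, amount) pairs that remove at least one object."""
    return [(i, k) for i, h in enumerate(heaps) for k in range(1, h + 1)]


def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action available in s_next."""
    best_next = max((Q[(s_next, b)] for b in legal_actions(s_next)), default=0.0)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually chosen in s_next."""
    next_value = Q[(s_next, a_next)] if a_next is not None else 0.0
    Q[(s, a)] += alpha * (r + gamma * next_value - Q[(s, a)])


if __name__ == "__main__":
    Q = defaultdict(float)  # hypothetical Q-table, unseen entries default to 0.0
    # Taking the last object from a single heap of 2 ends the game with reward 1.
    q_learning_update(Q, (2,), (0, 2), 1.0, (0,))
    print(Q[((2,), (0, 2))])
    print(nim_sum((3, 4, 5)))
```

The only difference between the two functions is the bootstrap term: Q-learning uses the maximum over legal actions in the next state, while SARSA uses the value of the action the agent's own (typically exploratory) policy selected there.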
