Using deep reinforcement learning for personalizing review sessions on e-learning platforms with spaced repetition
Abstract: Spaced repetition is a learning technique in which material is reviewed multiple times, with gaps between reviews, to support efficient memorization and practice of skills. Two of the most common systems for providing spaced repetition on e-learning platforms are the Leitner and SuperMemo systems. Previous work has demonstrated that deep reinforcement learning (DRL) can achieve performance comparable to traditional benchmarks such as Leitner and SuperMemo in a flashcard-based setting with simulated learning behaviour. In this work, our main contribution is the introduction of two new reward functions for the DRL agent. The first is a realistically observable reward function that uses the average of the sum of outcomes in a sample of exercises. The second uses a Long Short-Term Memory (LSTM) network as a form of reward shaping to predict the rewards used by the DRL agent. Our results indicate that DRL performs well in both cases. However, when the LSTM-based reward function is used, the DRL agent learns a good policy more smoothly and more quickly. In addition, the quality of the student-tutor interaction data used to train the LSTM network affects the performance of the DRL agent.
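To make the first reward function and the Leitner baseline more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each exercise outcome is binary (1 for correct, 0 for incorrect) and reads "average of sum of outcomes" as the mean outcome over a random sample of exercises. It also assumes a common Leitner variant in which a correct answer promotes a card by one box and an incorrect answer sends it back to box 1. The names sampled_outcome_reward, leitner_update, attempt and num_boxes are hypothetical and chosen only for illustration.

```python
import random
from typing import Callable, Sequence


def sampled_outcome_reward(
    exercises: Sequence[str],
    attempt: Callable[[str], bool],
    sample_size: int,
    rng: random.Random,
) -> float:
    """Mean binary outcome on a random sample of exercises (assumed reading)."""
    if not exercises:
        raise ValueError("exercises must not be empty")
    if sample_size < 1:
        raise ValueError("sample_size must be at least 1")
    # Never ask for more exercises than are available.
    k = min(sample_size, len(exercises))
    sample = rng.sample(list(exercises), k)
    # attempt() is a hypothetical callback: True if the student answers correctly.
    outcomes = [1 if attempt(item) else 0 for item in sample]
    return sum(outcomes) / k


def leitner_update(box: int, correct: bool, num_boxes: int = 5) -> int:
    """Assumed Leitner variant: promote on success, reset to box 1 on failure."""
    if correct:
        return min(box + 1, num_boxes)
    return 1


if __name__ == "__main__":
    rng = random.Random(0)
    known = {"a", "b"}  # toy stand-in for a simulated student
    reward = sampled_outcome_reward(
        ["a", "b", "c", "d"], lambda item: item in known, 4, rng
    )
    print(reward)
```

The reward depends only on answers the platform can actually observe, which is the property the abstract calls "realistically observable". A simulated or real student would replace the toy attempt callback.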