Recommending digital books to children: A comparative study of different state-of-the-art recommendation system techniques

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: Collaborative filtering is a popular technique that uses behavior data, in the form of users’ interactions with or ratings of items in a system, to provide personalized item recommendations. This study compares three state-of-the-art Recommendation System models that implement this technique (Matrix Factorization, Multi-layer Perceptron, and Neural Matrix Factorization) using behavior data from a digital book platform for children. The field of Recommendation Systems is growing, and many platforms can benefit from personalizing the user experience and making the platform simpler to use. To enable a richer comparison and introduce a new take on the models, this study proposes a new way to represent the behavior data as model input: the Term Frequency-Inverse Document Frequency (TF-IDF) of interaction occurrences between users and books, as opposed to the traditional binary representation (positive if there has been any interaction, negative otherwise). Performance is measured by holding out the last book each user read and evaluating how highly the models would rank that book among recommendations for the user. To assess their value for the children’s reading platform, the models are also compared to the platform’s existing Recommendation System. The results indicate that Matrix Factorization performs best of the three models on children’s reading behavior data. However, because the other two models take longer to train and have more hyperparameters to tune, they may not have reached optimal hyperparameter settings, which affects the comparison among the three models. This limitation is discussed further in the study. All three models perform significantly better than the current system on the digital book platform.
The models using the proposed TF-IDF representation show notable promise, outperforming the binary representation on almost all numerical metrics for all models. These results suggest that future research could explore further ways of representing behavior data as input to these types of models.
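
As an illustration of the two input representations and the evaluation idea described in the abstract, the following is a minimal sketch, not the study's implementation. It assumes that each user plays the role of a "document" and each book the role of a "term", and it uses the plain, unsmoothed TF-IDF formula; the exact variant used in the study is not stated here. The function names (binary_weights, tfidf_weights, rank_of_held_out) and the sample data are hypothetical.

```python
import math
from collections import Counter


def binary_weights(interactions):
    """Traditional representation: 1.0 for every (user, book) pair seen at least once."""
    return {pair: 1.0 for pair in set(interactions)}


def tfidf_weights(interactions):
    """TF-IDF representation of interaction counts (assumed, unsmoothed variant).

    tf  = interactions of this user with this book / all interactions of this user
    idf = log(number of users / number of users who interacted with this book)
    """
    counts = Counter(interactions)                          # (user, book) -> count
    per_user_total = Counter(user for user, _ in interactions)
    users_per_book = Counter(book for _, book in counts)    # distinct users per book
    n_users = len(per_user_total)

    weights = {}
    for (user, book), count in counts.items():
        tf = count / per_user_total[user]
        idf = math.log(n_users / users_per_book[book])
        weights[(user, book)] = tf * idf
    return weights


def rank_of_held_out(scores, held_out):
    """1-based position of the held-out book when candidates are sorted by score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered.index(held_out) + 1


# Hypothetical interaction log: one (user, book) entry per reading event.
sample_interactions = [
    ("u1", "a"), ("u1", "a"), ("u1", "b"),
    ("u2", "b"),
    ("u3", "b"), ("u3", "c"),
]

if __name__ == "__main__":
    print(binary_weights(sample_interactions))
    print(tfidf_weights(sample_interactions))
    # Hypothetical model scores for one user whose last-read book "c" was held out.
    print(rank_of_held_out({"a": 0.2, "b": 0.9, "c": 0.5}, "c"))
```

Under this unsmoothed formula, a book that every user has interacted with receives a weight of zero, whereas the binary representation treats it like any other positive interaction. This difference is one reason a smoothed IDF might be preferred in practice.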
