Evaluating recommendation systems for a sparse boolean dataset

University essay from KTH/Skolan för datavetenskap och kommunikation (CSC)

Author: Jonas Daniels; [2016]

Abstract: Recommendation systems are an area within machine learning that has become increasingly relevant as everyday use of technology has expanded. The two most popular approaches to building a recommendation system are collaborative filtering and content-based filtering. Collaborative filtering in turn has two major sub-approaches: memory-based and model-based. This thesis explores both content-based and collaborative filtering as recommendation systems for a sparse boolean dataset. For the content-based approach, the term frequency-inverse document frequency (TF-IDF) algorithm was implemented. For the memory-based approach, the K-nearest neighbours method was used. For the model-based approach, two algorithms were implemented: singular value decomposition and alternating least squares. To evaluate the methods, a cross-approach evaluator was used that treats the recommendations as a search, one that the users were not aware of. Key values such as the number of test users who could receive a recommendation, time consumption, F1 score (precision and recall) and dataset size were used to compare the methods and reach conclusions. The finding of the study was that collaborative filtering was the most accurate choice for sparse datasets. The most accurate algorithm was the model-based singular value decomposition without any regularization against overfitting. A further step would be to evaluate the methods in an online environment with active users giving feedback in real time.
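
The abstract names F1 score (precision and recall) as one of the comparison values. As an illustration only, the following is a minimal sketch of how these values can be computed for a single user when recommendations are treated as search results. The function name, the item identifiers and the idea of a held-out set of relevant items are assumptions made for this example; the thesis's actual evaluator is not described in the abstract.

```python
def precision_recall_f1(recommended, relevant):
    """Compute precision, recall and F1 for one user's recommendations.

    recommended: items the system suggested (hypothetical input)
    relevant:    held-out items the user actually interacted with (hypothetical input)
    """
    recommended = set(recommended)
    relevant = set(relevant)
    hits = len(recommended & relevant)

    # Guard against empty sets, e.g. a user who could not receive a recommendation.
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0

    if precision + recall == 0:
        f1 = 0.0
    else:
        # F1 is the harmonic mean of precision and recall.
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Example with made-up item identifiers: 2 of 4 recommendations are relevant,
# and 2 of the 3 relevant items were found.
p, r, f1 = precision_recall_f1(["a", "b", "c", "d"], ["b", "d", "e"])
print(p, r, f1)
```

Averaging these per-user values over all test users is one common way to obtain a single score per method, though other aggregation choices are possible.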
