Developing Machine Learning-based Recommender System on Movie Genres Using KNN

University essay from Stockholms universitet/Institutionen för data- och systemvetenskap

Abstract: With an overwhelming number of movies available globally, finding movies that match individual preferences can be a daunting task for users. The vast selection makes it hard to pick a suitable movie, so movie service providers need to offer a recommendation system that adds value for their customers by helping them find movies that match their preferences. Previous studies on recommendation systems that use Machine Learning (ML) algorithms have demonstrated that these algorithms outperform some existing recommendation methods with regard to recommendation strategy. However, there is still room for improvement, especially in scenarios where users need to spend a considerable amount of time finding movies in their preferred genres. This prolonged search for the right movies can give rise to problems such as data sparsity and cold start. To address these issues, we propose a machine learning-based recommender system for movie genres using the K-Nearest Neighbours (KNN) algorithm. Our final system uses a slider bar in a Streamlit web app, allowing users to select their preferred movies and see recommendations for similar movies. By incorporating user preferences, our system provides personalized recommendations that are more likely to match the user's interests. To address our research question, “How and to what extent can a machine learning-based recommender system be developed focusing on movie genres where movie popularity can be predicted based on its content?”, we propose three main research objectives. Firstly, we investigate the use of a classification algorithm for recommending movies based on genres of interest. Secondly, we evaluate the performance of our classification algorithm with respect to movie viewers. 
Thirdly, we represent the popularity of movie genres based on content and investigate how this representation can inform the movie recommendation algorithm. Following an experimental research strategy, we extract and pre-process a dataset of movies and their associated genre labels from Kaggle. The dataset consists of two files derived from The Movie Database (TMDB) 5000 Movie Dataset. Using the extracted and pre-processed dataset, we develop a machine learning-based recommender system based on the similarity of movie genres. Using a slider bar, we vary the KNN algorithm to recommend movies of varying similarity to the selected movie, ranging from similar to diverse in genre. This approach can suggest a wider range of titles to users with diverse preferences. We evaluate the performance of the KNN classification algorithm on a user's genres of interest, measuring its accuracy, precision, recall, and F1-score. The algorithm's accuracy ranges from low to moderate across different values of K, indicating moderate effectiveness in predicting user preferences. Its precision ranges from moderate to high, implying that most of the movies it recommends are relevant to the user. The recall score improves with increasing K and reaches its maximum at K=15, demonstrating the algorithm's ability to retrieve relevant recommendations. The algorithm achieves a reasonable balance between precision and recall, with an average F1-score of 0.60. This indicates that the algorithm identifies and recommends relevant movies with moderate reliability. Furthermore, our results show that the popularity visualization technique using KNN is a useful tool for analysing and understanding the popularity of different movie genres, which can inform decisions related to marketing, distribution, and production in the movie industry. In conclusion, our machine learning-based recommender system using KNN for movie genres shows promise. 
It allows users to select their preferred movies and see recommendations for similar movies using a slider bar in a Streamlit web app. If confirmed by future research, the promising findings of this thesis can pave the way for developing and incorporating other classification algorithms and features for movie recommendation and evaluation. Furthermore, the adjustable range of the slider bar in the Streamlit web app allows users to customize their movie preferences and receive tailored recommendations.
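
The genre-similarity idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the toy catalogue, the function names, and the choice of cosine similarity over multi-hot genre vectors are all assumptions made for illustration, since the abstract does not state the distance metric or the feature encoding. The parameter `k` stands in for the kind of value a slider could control; whether the authors' slider maps directly to K is likewise an assumption.

```python
from math import sqrt

# Hypothetical toy catalogue; titles and genre labels are illustrative only
# and are not taken from the TMDB 5000 Movie Dataset.
MOVIES = {
    "Movie A": {"Action", "Adventure"},
    "Movie B": {"Action", "Adventure", "Fantasy"},
    "Movie C": {"Romance", "Drama"},
    "Movie D": {"Action", "Thriller"},
    "Movie E": {"Drama"},
}


def genre_vocabulary(movies):
    """Return the sorted list of all genres that occur in the catalogue."""
    return sorted(set().union(*movies.values()))


def encode(genres, vocabulary):
    """Multi-hot encode a set of genres against the vocabulary."""
    return [1 if genre in genres else 0 for genre in vocabulary]


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(title, movies, k):
    """Return the k movies whose genre vectors are most similar to `title`.

    Assumed metric: cosine similarity. Ties are broken alphabetically by title.
    """
    vocabulary = genre_vocabulary(movies)
    target = encode(movies[title], vocabulary)
    scored = [
        (cosine_similarity(target, encode(genres, vocabulary)), other)
        for other, genres in movies.items()
        if other != title
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:k]]


if __name__ == "__main__":
    # A small k keeps recommendations close in genre; a larger k admits
    # progressively less similar titles.
    print(recommend("Movie A", MOVIES, k=2))
    print(recommend("Movie A", MOVIES, k=4))
```

With a small `k` only the closest genre matches are returned, while a larger `k` pulls in titles that share fewer or no genres with the selected movie, which mirrors the "similar to diverse" range the abstract describes.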
