Identification of Problem Gambling via Recurrent Neural Networks: Predicting self-exclusion due to problem gambling within the remote gambling sector by means of recurrent neural networks

University essay from Umeå universitet/Institutionen för fysik

Abstract: In recent years the gambling industry has been moving towards giving its customers the possibility to gamble online instead of visiting a physical location. Aggressive marketing, fast growth and a multitude of actors in the market have resulted in a spike in the number of customers who have developed a gambling problem. Decision makers are trying to fight back by regulating markets in order to make the companies take responsibility and work towards preventing these problems. One method of working proactively in this regard is to identify vulnerable customers before they develop a destructive habit. In this work, a novel method of predicting which customers have a higher risk of gambling-related problems is explored. More concretely, a recurrent neural network with long short-term memory cells is created to process raw behaviour data, aggregated on a daily basis, and classify customers as high-risk or not. Supervised training is used in order to learn from historical data, where permanent self-exclusion due to gambling-related problems defines a problem gambler. The work consists of obtaining a locally optimal configuration of the network that improves performance in identifying problem gamblers who favour the casino section over the sports section, and analyzing the model to provide insights into the field. This project was carried out together with LeoVegas Mobile Gaming Group. The group offers both online casino games and sports betting in a number of countries in Europe. This collaboration made both data and industry expertise accessible for this work. The company currently has a model in production to perform these predictions, but wants to explore other approaches. The model that has been developed showed a significant increase in performance compared to the one that is currently used at the company.
Specifically, precision and recall, two important metrics for a two-class classification model, increased by 37% and 21%, respectively. Using raw time series data instead of aggregated data increased the model's responsiveness to changes in customers' behaviour over time. The model also scaled better with more history than the current model, which could be a result of the recurrent nature of the network.
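
For readers unfamiliar with the two metrics named above, the following is a minimal sketch of how precision and recall are computed for a two-class problem. It is not the thesis's evaluation code: the function name `precision_recall` and the label lists are hypothetical and made up purely for illustration, with 1 standing for a high-risk customer and 0 for everyone else.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # flagged and truly high-risk
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # flagged but not high-risk
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # high-risk but missed
    # Guard against division by zero when nothing is flagged or nothing is positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up example labels (not data from the thesis).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 1, 1, 0]
precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision answers "of the customers flagged as high-risk, how many really were?", while recall answers "of the customers who really were high-risk, how many were flagged?". In a setting where problem gamblers are a small minority, both are typically more informative than plain accuracy.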
