Minimizing the expected opportunity loss by optimizing the ordering of shipping methods in e-Commerce using Machine Learning

University essay from KTH/Matematik (Avd.)

Abstract: The shopping industry is changing rapidly as technology advances. This is especially true for online retail, where consumers can now buy much of what they need over the internet. To make the shopping experience as smooth as possible, companies develop their sites and checkouts to be as frictionless as possible. In this thesis, the shipping module of Klarna's checkout was analyzed and different models were created to understand how the likelihood of a customer finalizing a purchase (the conversion rate) could be improved. The shipping module consists of a number of shipping methods along with shipping carriers. Currently, the shipping methods and carriers are shown in the same static order to all customers, with no other sorting logic. That ordering is what was investigated in the thesis. Hence, the core problem is to understand how the opportunity loss could be minimized by a different ordering of the shipping methods, where the opportunity loss is derived from the difference in conversion rate between the control group (current setup) and a new model. To achieve this, a dataset was prepared and features were engineered in such a way that the same training and test datasets could be used for all algorithms. The features were engineered using a point-in-time concept so that no target leakage would be present. The target was a plain concatenation of the shipping method and the shipping carrier. Finally, three different methods for tackling this multiclass classification problem were investigated, namely Logistic Regression, Extreme Gradient Boosting and an Artificial Neural Network. The aim of each algorithm is to create a learner that has been trained on a given dataset and is able to predict the combination of shipping method and carrier given a certain set of features.
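As an illustration of how a multiclass prediction could drive the ordering, the following minimal sketch sorts the method-plus-carrier classes by predicted probability. It is not taken from the thesis: the function name `rank_shipping_options`, the label strings and the probability values are all hypothetical, and ordering by descending probability is an assumption about how the model output would be used.

```python
import numpy as np


def rank_shipping_options(class_labels, probabilities):
    """Return the class labels sorted from most to least likely.

    class_labels: list of "method|carrier" strings (hypothetical format).
    probabilities: one predicted probability per label, e.g. one row of
    the output of any multiclass classifier.
    """
    probs = np.asarray(probabilities, dtype=float)
    if probs.shape != (len(class_labels),):
        raise ValueError("expected exactly one probability per class label")
    # Negate so that argsort yields descending order; a stable sort keeps
    # the original (static) order for ties.
    order = np.argsort(-probs, kind="stable")
    return [class_labels[i] for i in order]


# Hypothetical example: three method-plus-carrier classes for one session.
labels = [
    "home_delivery|carrier_a",
    "pickup_point|carrier_a",
    "pickup_point|carrier_b",
]
probs = [0.20, 0.55, 0.25]
ranked = rank_shipping_options(labels, probs)
print(ranked)
```

The stable sort means that options with equal predicted probability keep their position from the existing static ordering.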
By the end of the investigation, it was concluded that using a model to predict the most relevant shipping method (plus carrier) for the customer had a positive effect on the conversion rate and, in turn, on sales. The overall accuracy was 65.09% for the Logistic Regression, 71.61% for the Extreme Gradient Boosting and 70.88% for the Artificial Neural Network. Once the models were trained, they were used in a back-simulation (serving as a proxy for an A/B test) on a validation set to see the effect on the conversion rate. Here, the results showed a conversion rate of 84.85% for the Logistic Regression model, 84.95% for the Extreme Gradient Boosting and 85.02% for the Artificial Neural Network. The control group, which was a random sample under the current logic, had a conversion rate of 84.21%. Thus, implementing the Artificial Neural Network would increase Klarna's sales by about 6.5 SEK per session.
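The reported conversion rates can be compared with the control group with simple arithmetic. The sketch below only restates the figures from the abstract and computes each model's lift in percentage points; the names `CONTROL_RATE`, `MODEL_RATES` and `lift_in_percentage_points` are hypothetical, and the thesis's own back-simulation procedure is not reproduced here.

```python
# Conversion rates (in percent) as reported in the abstract.
CONTROL_RATE = 84.21
MODEL_RATES = {
    "Logistic Regression": 84.85,
    "Extreme Gradient Boosting": 84.95,
    "Artificial Neural Network": 85.02,
}


def lift_in_percentage_points(model_rate, control_rate):
    """Difference between a model's conversion rate and the control's."""
    # Rounded to two decimals to match the precision of the reported rates.
    return round(model_rate - control_rate, 2)


lifts = {
    name: lift_in_percentage_points(rate, CONTROL_RATE)
    for name, rate in MODEL_RATES.items()
}

for name, lift in lifts.items():
    print(f"{name}: +{lift:.2f} percentage points over control")
```

On these figures the lifts are 0.64, 0.74 and 0.81 percentage points respectively, so the Artificial Neural Network shows the largest improvement over the control group.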
