Fraud or Not?

University essay from Uppsala universitet/Statistiska institutionen

Abstract: This paper uses statistical learning to examine and compare three statistical methods for predicting credit card fraud. The methods compared are Logistic Regression, K-Nearest Neighbour and Random Forest. They are estimated on a data set of nearly 300,000 credit card transactions, and their performance is assessed with classification of fraud as the outcome variable. The three models have different properties and advantages. The K-NN model performed best in this paper but has a disadvantage: it predicts the outcome accurately without explaining the data. Random Forest explains the variables but performs less precisely. The Logistic Regression model seems to be unfit for this specific data set.
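
The remark that K-NN predicts without explaining can be illustrated with a minimal sketch. This is not the essay's implementation: the function name knn_predict, the two features, the value k=3 and the six toy transactions are all hypothetical, chosen only to show that the method stores the training data and takes a majority vote among the nearest neighbours, without estimating any coefficients that could be interpreted.

```python
from collections import Counter
import math


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Rank training points by Euclidean distance to the query.
    order = sorted(range(len(train_X)),
                   key=lambda i: math.dist(train_X[i], query))
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data (assumption): two scaled features per transaction.
train_X = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.25),
           (0.90, 0.80), (0.85, 0.95), (0.80, 0.90)]
train_y = [0, 0, 0, 1, 1, 1]  # 1 = fraud, 0 = legitimate

pred_a = knn_predict(train_X, train_y, (0.12, 0.18))
pred_b = knn_predict(train_X, train_y, (0.88, 0.90))
print(pred_a, pred_b)
```

The prediction depends only on which stored transactions lie closest to the query, so the model gives no per-variable effect of the kind a regression coefficient or a Random Forest variable importance would provide.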
