The impact of distance, feature weighting and selection for KNN in credit default prediction

University essay from Högskolan i Skövde/Institutionen för informationsteknologi

Author: Huicheng Zhang; [2020]


Abstract: With the rapid spread of the credit card business around the world, credit risk has also expanded dramatically. Large numbers of credit card customer defaults have caused huge losses to financial institutions such as banks, so accurately identifying customers who will default is particularly important. We investigate the K Nearest Neighbor (KNN) algorithm and evaluate the impact of alternative distance functions, feature weighting, and feature selection on the accuracy and the area under the curve (AUC) of a credit card default prediction model. For our evaluation, we use a credit card user dataset from Taiwan. We find that the Mahalanobis distance function performs best, and that feature weighting and feature selection can improve the accuracy and AUC of the model.
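To illustrate how the distance function and feature weights enter a KNN classifier, here is a minimal sketch in plain Python. It is not the essay's implementation: the function names (`euclidean`, `manhattan`, `knn_predict`), the toy data, and the weight values are all hypothetical. It also does not implement the Mahalanobis distance, which would additionally require the inverse covariance matrix of the features.

```python
import math
from collections import Counter


def euclidean(a, b, weights=None):
    """Weighted Euclidean distance; every feature has weight 1 by default."""
    if weights is None:
        weights = [1.0] * len(a)
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def manhattan(a, b, weights=None):
    """Weighted Manhattan (city-block) distance."""
    if weights is None:
        weights = [1.0] * len(a)
    return sum(w * abs(x - y) for w, x, y in zip(weights, a, b))


def knn_predict(train_X, train_y, query, k=3, distance=euclidean, weights=None):
    """Return the majority label among the k training points nearest to query."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: distance(pair[0], query, weights),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data (not the Taiwan dataset): two features, label 1 = default.
train_X = [[0.0, 0.0], [0.0, 1.0], [0.1, 0.5], [1.0, 0.0], [1.0, 1.0], [0.9, 0.5]]
train_y = [0, 0, 0, 1, 1, 1]
query = [0.6, 0.5]

# Same classifier, three configurations: default distance, alternative
# distance function, and feature weighting that emphasises the first feature.
pred_euclidean = knn_predict(train_X, train_y, query, k=3)
pred_manhattan = knn_predict(train_X, train_y, query, k=3, distance=manhattan)
pred_weighted = knn_predict(train_X, train_y, query, k=3, weights=[2.0, 0.5])
```

In this sketch, setting a feature's weight to zero removes it from the distance entirely, which is one simple way to view feature selection as a special case of feature weighting.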
