Predicting Subprime Customers' Probability of Default Using Transaction and Debt Data from NPLs

University essay from KTH/Matematisk statistik

Abstract: This thesis aims to predict the probability of default (PD) of non-performing loan (NPL) customers using transaction and debt data, as part of developing a credit scoring model for Hoist Finance. Many NPL customers face financial exclusion because of their default and are therefore regarded as bad customers. Hoist Finance, a company that manages NPLs, believes that not all customers conventionally classed as subprime are high-risk, and wants to offer them financial inclusion through favourable loans. In this thesis, logistic regression was used to model the PD of NPL customers at Hoist Finance based on 12 months of data. Different feature selection (FS) methods were explored. The best model used l1-regularization for FS and predicted, with 85.71% accuracy, that 6,277 out of 27,059 customers had a PD between 0% and 10%, which supports this belief. Analysis of the PD showed that it increased almost linearly with an increase in debt quantity, original total claim amount, or number of missed payments. The analysis also showed that payment behaviour in the last quarter had the most predictive power. At the same time, analysis of the type II error showed that the model was unable to capture some bad payment behaviour because it placed too much emphasis on the last quarter.
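
To illustrate how l1-regularization can act as a feature selection step inside logistic regression, the following is a minimal sketch using proximal gradient descent in plain Python. It is not the thesis's implementation: the data are synthetic, the two features are hypothetical stand-ins (an informative payment signal and a pure-noise column), and the values of `lam`, `lr`, and `steps` are arbitrary assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(z, t):
    # Proximal operator of t * |z|: shrinks z toward zero and
    # returns exactly 0.0 when |z| <= t, which is what drops features.
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def predict_pd(x, w, b):
    # Predicted probability of default for one feature vector.
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def fit_l1_logistic(X, y, lam, lr=0.5, steps=500):
    # Proximal gradient descent on the mean log-loss plus lam * ||w||_1.
    # The intercept b is left unpenalized.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        residuals = [predict_pd(xi, w, b) - yi for xi, yi in zip(X, y)]
        grad_b = sum(residuals) / n
        grad_w = [sum(r * xi[j] for r, xi in zip(residuals, X)) / n
                  for j in range(d)]
        b -= lr * grad_b
        w = [soft_threshold(wj - lr * gj, lr * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


# Synthetic data (assumption): column 0 drives default, column 1 is noise.
rng = random.Random(0)
X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
y = [1 if x[0] > 0 else 0 for x in X]

w, b = fit_l1_logistic(X, y, lam=0.05)
pds = [predict_pd(x, w, b) for x in X]
low_risk = sum(1 for p in pds if p <= 0.10)

print("weights:", w, "intercept:", b)
print("customers with PD <= 10%:", low_risk)
```

Features whose weight is driven to exactly zero are effectively removed from the model, and a larger `lam` removes more of them. The final count mirrors the abstract's 0% to 10% PD band, but only on the toy data.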
