Clustering and classification of prepaid mortgages

University essay from KTH/Matematik (Avd.)

Abstract: This thesis aims to cluster and classify mortgages issued by a financial institution. Machine learning techniques are applied to historical data in order to discover possible structure and predictability in prepaid mortgages. To discover the underlying structure of the data, k-means clustering on principal components is performed to cluster customers with mortgages. A logistic regression model is trained to predict how likely (future) customers with mortgages are to prepay their loans, that is, to move them to another institution. The classification model is evaluated using confusion matrices at different classification thresholds. The results show that, based on historical data, the model detects clusters that include a higher proportion of prepaid mortgages. This indicates an underlying structure that can be used to assess how likely the customers within each cluster are to leave. The results from the logistic regression show that using a high classification threshold significantly improves precision.
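
The threshold-based evaluation mentioned in the abstract can be illustrated with a minimal sketch. It assumes a classifier has already produced prepayment probabilities; the scores, labels, and function names below are hypothetical toy values chosen for illustration and are not data or code from the thesis. The sketch only shows how raising the threshold can trade recall for precision.

```python
from typing import Sequence, Tuple


def confusion_matrix(
    probs: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn), predicting 'prepaid' when prob >= threshold."""
    tp = fp = fn = tn = 0
    for p, y in zip(probs, labels):
        predicted_prepaid = p >= threshold
        if predicted_prepaid and y == 1:
            tp += 1
        elif predicted_prepaid and y == 0:
            fp += 1
        elif not predicted_prepaid and y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision(tp: int, fp: int) -> float:
    """Share of predicted prepayments that were actually prepaid."""
    return tp / (tp + fp) if (tp + fp) else float("nan")


def recall(tp: int, fn: int) -> float:
    """Share of actual prepayments that the classifier caught."""
    return tp / (tp + fn) if (tp + fn) else float("nan")


# Hypothetical predicted probabilities and true labels (1 = prepaid).
probs = [0.95, 0.85, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10, 0.05]
labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]

for threshold in (0.5, 0.8):
    tp, fp, fn, tn = confusion_matrix(probs, labels, threshold)
    print(
        f"threshold={threshold}: tp={tp} fp={fp} fn={fn} tn={tn} "
        f"precision={precision(tp, fp):.2f} recall={recall(tp, fn):.2f}"
    )
```

On this toy data the higher threshold removes the false positives but also misses more actual prepayments, which is the kind of trade-off a confusion matrix per threshold makes visible.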
