Unsupervised learning with mixed type data: for detecting money laundering

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: The purpose of this master's thesis is to perform a cluster analysis on parts of Handelsbanken's customer database. The ambition is to explore whether this can help identify typical customer profiles at risk of involvement in illegal activities such as money laundering. A literature study is conducted to determine which of the clustering methods described in the literature are most suitable for the problem. The most important constraints are that the data consists of mixed-type attributes (categorical and numerical) and that it contains a large number of outliers. An extension of the self-organising map and the k-prototypes algorithm were chosen for the clustering. It is concluded that clusters exist in the data, although outliers are present. More work is needed on handling missing values in the dataset.
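
To illustrate how mixed-type attributes can be compared, the following is a minimal sketch of the dissimilarity measure commonly associated with k-prototypes: squared Euclidean distance on the numerical attributes plus a weighted count of mismatches on the categorical attributes. The function name, the example attribute values, and the weight `gamma` are hypothetical and are not taken from the thesis.

```python
def mixed_dissimilarity(x_num, x_cat, proto_num, proto_cat, gamma=1.0):
    """Distance between one record and one cluster prototype.

    Numerical part: squared Euclidean distance.
    Categorical part: number of attributes whose values differ.
    gamma balances the two parts (assumed value, chosen by the user).
    """
    numeric_part = sum((a - b) ** 2 for a, b in zip(x_num, proto_num))
    categorical_part = sum(a != b for a, b in zip(x_cat, proto_cat))
    return numeric_part + gamma * categorical_part


# Hypothetical, already-scaled customer record and cluster prototype.
customer_num = [0.2, 1.5]
customer_cat = ["private", "SE"]
prototype_num = [0.0, 1.0]
prototype_cat = ["corporate", "SE"]

d = mixed_dissimilarity(customer_num, customer_cat,
                        prototype_num, prototype_cat, gamma=0.5)
print(round(d, 2))
```

In this sketch a larger `gamma` gives the categorical attributes more influence on cluster assignment. Because the numerical part is a squared distance, extreme values dominate quickly, which is one reason outliers matter for this kind of clustering.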
