Evaluating feature selection in a marketing classification problem

University essay from Linnéuniversitetet/Institutionen för datavetenskap (DV)

Abstract: Machine learning is becoming increasingly popular for prediction and classification tasks in many fields. In banks, telemarketing uses this approach by gathering information from phone calls made to clients during past campaigns. Phone calls can be annoying and time-consuming for both parties, the marketing department and the client. This project is therefore intended to show that feature selection can improve machine learning models.

A Portuguese bank gathered data on phone calls and client statistics, such as current job, salary and employment status, to estimate the probability that a person would buy the offered product and/or service. The C4.5 decision tree (J48) and the multilayer perceptron (MLP) are the machine learning models used in the experiments. For feature selection, the correlation-based feature selection (Cfs), Chi-squared attribute selection and RELIEF attribute selection algorithms are used. The WEKA framework provides the tools to implement and test the experiments carried out in this research.

The results were very close for the two data mining models, with a slight advantage for C4.5 in correct classifications and for MLP on the ROC curve rate. These results confirmed that feature selection improves classification and/or prediction results.
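To make the idea of Chi-squared attribute selection concrete, the following is a minimal sketch in plain Python, not the WEKA implementation used in the essay. It computes the Pearson chi-squared statistic between each categorical attribute and the class label and ranks the attributes by that score. The function names, the attribute names and the four toy records are hypothetical and were invented for illustration only; they are not taken from the bank's data. Numeric attributes would need to be discretized first, which this sketch does not do.

```python
from collections import Counter


def chi_squared_score(feature_values, labels):
    """Pearson chi-squared statistic between one categorical feature and the class."""
    n = len(labels)
    feature_counts = Counter(feature_values)
    label_counts = Counter(labels)
    observed = Counter(zip(feature_values, labels))

    score = 0.0
    for value, value_count in feature_counts.items():
        for label, label_count in label_counts.items():
            # Expected cell count if the feature and the class were independent.
            expected = value_count * label_count / n
            score += (observed[(value, label)] - expected) ** 2 / expected
    return score


def rank_features(rows, labels):
    """Return (feature name, score) pairs sorted by descending chi-squared score."""
    names = list(rows[0].keys())
    scores = {
        name: chi_squared_score([row[name] for row in rows], labels)
        for name in names
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy records, invented for illustration only (not the bank's data).
rows = [
    {"job": "admin", "contact": "cellular"},
    {"job": "admin", "contact": "telephone"},
    {"job": "student", "contact": "cellular"},
    {"job": "student", "contact": "telephone"},
]
labels = ["no", "no", "yes", "yes"]

ranking = rank_features(rows, labels)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy example "job" separates the two classes perfectly and receives the highest score, while "contact" is independent of the class and scores zero. A feature selection step would keep the top-ranked attributes and pass only those to the classifier.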
