Evaluation of hyperparameter optimization methods for Random Forest classifiers

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Rasmus Nygren; Aleks Petkov; [2021]

Keywords: ;

Abstract: In order to create a machine learning model, one is often tasked with selecting certain hyperparameters which configure the behavior of the model. The performance of the model can vary greatly depending on how these hyperparameters are selected, thus making it relevant to investigate the effects of hyperparameter optimization on the classification accuracy of a machine learning model. In this study, we train and evaluate a Random Forest classifier whose hyperparameters are set to default values and compare its classification accuracy to another classifier whose hyperparameters are obtained through the use of the hyperparameter optimization (HPO) methods Random Search, Bayesian Optimization and Particle Swarm Optimization. This is done on three different datasets, and each HPO method is evaluated based on the classification accuracy change it induces across the datasets. We found that every HPO method yielded a total classification accuracy increase of approximately 2-3% across all datasets compared to the accuracies obtained using the default hyperparameters. However, due to limitations of time, data and computational resources, no assertions can be made as to whether the observed positive effect is generalizable at a larger scale. Instead, we could conclude that the utility of HPO methods is dependent on the dataset at hand. 

  AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)