Hyperparameters relationship to the test accuracy of a convolutional neural network

University essay from Högskolan i Skövde/Institutionen för informationsteknologi

Abstract: Machine learning for image classification is a hot topic of increasing popularity. The aim of this study is therefore to provide a better understanding of convolutional neural network hyperparameters by comparing the test accuracy of convolutional neural network models with different hyperparameter value configurations. The focus of the study is to see whether the hyperparameter values used influence the learning process. For the experiments, convolutional neural network models were developed in the programming language Python using the library Keras. The dataset used for this study is CIFAR-10, which includes 60,000 colour images in 10 categories ranging from man-made objects to different animal species. Grid search is used for instantiating models with varying learning rate and momentum values and with varying width and depth values. Learning rate is only tested in combination with momentum, and width is only tested in combination with depth. Activation functions, convolutional layers and batch size are tested individually. Grid search is compared against Bayesian optimization to see which technique finds the better learning rate and momentum values. The results illustrate that the impact of different hyperparameters on the overall test accuracy varies. Learning rate and momentum affect the test accuracy greatly; suboptimal values for learning rate and momentum can decrease the test accuracy severely. Activation function, width and depth, convolutional layers and batch size have a lesser impact on test accuracy. Regarding Bayesian optimization compared to grid search, the results show that Bayesian optimization does not necessarily find better hyperparameter values.
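The grid search over learning rate and momentum described in the abstract can be illustrated with a minimal sketch. It does not reproduce the essay's Keras models or its CIFAR-10 experiments: as an assumption made purely for illustration, training is replaced by gradient descent with momentum on the toy objective f(w) = w**2. The names train_and_score and grid_search and the candidate values are hypothetical.

```python
from itertools import product


def train_and_score(learning_rate, momentum, steps=50):
    """Toy stand-in for training a model (an assumption, not the essay's CNN).

    Runs gradient descent with momentum on f(w) = w**2 and returns a score
    in [0, 1] that is higher when the final loss is lower.
    """
    w, velocity = 5.0, 0.0
    for _ in range(steps):
        grad = 2.0 * w  # derivative of w**2
        velocity = momentum * velocity - learning_rate * grad
        w += velocity
        if abs(w) > 1e6:  # treat a diverged run as a failed configuration
            return 0.0
    loss = w * w
    return 1.0 / (1.0 + loss)


def grid_search(learning_rates, momenta):
    """Score every (learning rate, momentum) pair and return the best one."""
    results = {}
    for lr, m in product(learning_rates, momenta):
        results[(lr, m)] = train_and_score(lr, m)
    best = max(results, key=results.get)
    return best, results


# Hypothetical candidate values; the essay's actual grid is not given here.
learning_rates = [0.001, 0.01, 0.1, 1.5]
momenta = [0.0, 0.5, 0.9]

best, results = grid_search(learning_rates, momenta)
print("best (learning rate, momentum):", best)
```

Because the two hyperparameters are searched jointly, the grid contains every pairing of the candidate values, which mirrors the abstract's statement that learning rate is only tested combined with momentum. In a real experiment, train_and_score would train a model and return its test accuracy.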
