A Comparative Study of Black-box Optimization Algorithms for Tuning of Hyper-parameters in Deep Neural Networks

University essay from Luleå tekniska universitet/Institutionen för teknikvetenskap och matematik

Author: Olof Skogby Steinholtz (2018)


Abstract: Deep neural networks (DNNs) have been applied successfully across a range of data-intensive applications, including computer vision, language modeling, bioinformatics and search engines. The hyper-parameters of a DNN are parameters that remain fixed during model training and heavily influence DNN performance; hence, regardless of application, the design phase of constructing a DNN model is critical. By framing the selection and tuning of hyper-parameters as an expensive black-box optimization (BBO) problem, the obstacles encountered in manual, by-hand tuning can be addressed with an automated, algorithmic approach. In this work, the following BBO algorithms are evaluated side by side on two hyper-parameter optimization problem instances: the Nelder-Mead Algorithm (NM), Particle Swarm Optimization (PSO), Bayesian Optimization with Gaussian Processes (BO-GP) and the Tree-structured Parzen Estimator (TPE). The instances are Problem 1, incorporating a convolutional neural network, and Problem 2, incorporating a recurrent neural network. A simple Random Search (RS) algorithm is also included in the experiments as a baseline for performance comparison. The results show that the TPE algorithm achieves the overall highest performance with respect to mean solution quality and speed of improvement, with comparatively low trial-to-trial variability on both Problem 1 and Problem 2. The NM, PSO and BO-GP algorithms are shown to be capable of outperforming the RS baseline on Problem 1, but fail to do so on Problem 2.
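
To make the black-box framing concrete, the sketch below shows how hyper-parameter tuning can be posed as minimizing a scalar objective over a search space. It is not the essay's code: it assumes the Python hyperopt library (which happens to implement both TPE and random search), a hypothetical stand-in objective, and an illustrative two-dimensional search space. In a real experiment the objective would instead train and validate a CNN or RNN for each configuration.

    # A minimal sketch (not the thesis code) of hyper-parameter tuning as
    # black-box optimization, using hyperopt's TPE and random-search algorithms.
    from hyperopt import fmin, tpe, rand, hp, Trials

    # Black-box objective: maps a hyper-parameter configuration to a scalar
    # loss. A cheap, hypothetical surrogate keeps the sketch runnable; its
    # minimum sits near learning_rate=1e-3 and n_units=128.
    def objective(params):
        lr = params["learning_rate"]
        n_units = params["n_units"]
        return (lr - 1e-3) ** 2 * 1e6 + (n_units - 128) ** 2 / 1e4

    # Search space: a continuous (log-uniform) and a discretized hyper-parameter.
    space = {
        "learning_rate": hp.loguniform("learning_rate", -10, 0),  # e^-10 .. 1
        "n_units": hp.quniform("n_units", 16, 512, 16),
    }

    # TPE run under a fixed evaluation budget; swapping algo=rand.suggest
    # gives a random-search baseline analogous to the RS baseline above.
    trials = Trials()
    best = fmin(objective, space, algo=tpe.suggest, max_evals=50, trials=trials)
    print("best configuration found:", best)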
