The Impact of Deep Neural Network Pruning on the Hyperparameter Performance Space: An Empirical Study

University essay from Göteborgs universitet/Institutionen för data- och informationsteknik

Abstract: As deep learning models continue to grow in size and computational requirements, efficient models for deployment on resource-constrained devices become crucial. Structured pruning has emerged as a proven method to speed up models and reduce computational requirements. Structured pruning removes filters, channels, or groups of operations from a network, effectively modifying its architecture. Since the optimal hyperparameters of a model are tightly coupled to its architecture, it is unclear how pruning affects the choice of hyperparameters. To answer this question, we investigate the impact of deep neural network pruning on the hyperparameter performance space. In this work, we perform a series of experiments on popular classification models, ResNet-56, MobileNetV2, and ResNet-50, using the CIFAR-10 and ImageNet datasets. We examine the effect of uniform and non-uniform structured magnitude pruning on the learning rate and weight decay. Specifically, we explore how pruning affects their relationship and the risk associated with not tuning these hyperparameters after pruning. The experiments reveal that pruning does not have a significant impact on the optimal learning rate and weight decay, suggesting that extensive hyperparameter tuning after pruning may not be crucial for optimal performance. Overall, this study provides insight into the complex dynamics between pruning, model performance, and optimal hyperparameters. The findings offer guidance for optimising and fine-tuning pruned models and contribute to advancing model compression and hyperparameter tuning, highlighting the interplay between model architecture and hyperparameters.
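To make the idea of uniform structured magnitude pruning concrete, the following is a minimal NumPy sketch (not the code used in the thesis; the function name and toy layer shapes are our own): each convolutional filter is ranked by its L1 norm, and the lowest-magnitude fraction is removed, shrinking the layer's output dimension.

```python
import numpy as np

def prune_filters_l1(weights, sparsity):
    """Uniform structured magnitude pruning of a conv layer (illustrative sketch).

    weights: array of shape (out_channels, in_channels, kH, kW)
    sparsity: fraction of filters to remove, e.g. 0.5 removes half.
    Returns the kept filters and their original indices.
    """
    n_filters = weights.shape[0]
    n_keep = max(1, int(round(n_filters * (1.0 - sparsity))))
    # L1 norm of each filter: sum of |w| over in_channels, kH, kW
    norms = np.abs(weights).reshape(n_filters, -1).sum(axis=1)
    # Keep the n_keep filters with the largest L1 norms, preserving order
    keep = np.sort(np.argsort(norms)[-n_keep:])
    return weights[keep], keep

# Toy layer: 8 filters, 3 input channels, 3x3 kernels
rng = np.random.default_rng(0)
w = rng.normal(size=(8, 3, 3, 3))
pruned, kept = prune_filters_l1(w, sparsity=0.5)
print(pruned.shape)  # (4, 3, 3, 3)
```

In a real network, pruning a filter here would also require removing the corresponding input channel from the next layer; under the non-uniform schemes mentioned in the abstract, `sparsity` would instead vary per layer.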
