Investigating minimal Convolution Neural Networks (CNNs) for realtime embedded eye feature detection

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Wei-hong Sung; [2020]

Abstract: With the rapid rise of neural networks, many tasks that were difficult to solve with traditional methods can now be solved well, especially in computer vision. However, as the tasks have become more complex, the networks used to solve them have become deeper and larger. Although some embedded systems are powerful nowadays, most still suffer from memory and computation limitations, which makes it hard to deploy large neural networks on them. This project explores different methods for compressing a large model. We first train a baseline model, YOLOv3[1], a well-known object detection network, and then compress it with two methods. The first method is pruning based on sparsity training: after sparsity training, we prune channels according to the values of their scaling factors. Building on this idea, we make three explorations. Firstly, we adopt a union mask strategy to solve the dimension problem of the shortcut-related layers in YOLOv3[1]. Secondly, we try to absorb the shifting factor information into subsequent layers. Finally, we implement layer pruning and combine it with channel pruning. The second method is pruning with Neural Architecture Search (NAS), which uses a deep reinforcement learning framework to automatically find the best compression ratio for each layer. At the end of this report, we analyze the key findings and conclusions of our experiments and propose future work that could improve the project.
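
The channel-pruning step and the union mask strategy mentioned in the abstract can be illustrated with a minimal sketch. It is not the essay's implementation: it assumes the scaling factors are per-channel values (as in batch normalization), that a single global threshold on their magnitudes decides which channels to keep, and that "union mask" means every layer feeding the same shortcut addition keeps the union of the channels selected for any of them. The function names, layer names, and numbers are hypothetical.

```python
from typing import Dict, List


def global_threshold(gammas: Dict[str, List[float]], prune_ratio: float) -> float:
    """Return the |gamma| value at or below which channels are pruned network-wide."""
    magnitudes = sorted(abs(g) for layer in gammas.values() for g in layer)
    cut = int(len(magnitudes) * prune_ratio)  # number of channels to drop
    # If nothing is to be dropped, use a threshold below every magnitude.
    return magnitudes[cut - 1] if cut > 0 else -1.0


def channel_masks(gammas: Dict[str, List[float]], threshold: float) -> Dict[str, List[bool]]:
    """Keep a channel when its |gamma| is strictly above the threshold."""
    masks = {}
    for name, layer in gammas.items():
        mask = [abs(g) > threshold for g in layer]
        if not any(mask):
            # Keep the strongest channel so the layer is not removed entirely.
            best = max(range(len(layer)), key=lambda i: abs(layer[i]))
            mask[best] = True
        masks[name] = mask
    return masks


def union_mask(masks: Dict[str, List[bool]], group: List[str]) -> Dict[str, List[bool]]:
    """Give all layers joined by a shortcut addition the same kept channels."""
    merged = [any(flags) for flags in zip(*(masks[name] for name in group))]
    return {name: list(merged) for name in group}


# Hypothetical scaling factors after sparsity training (illustrative values only).
gammas = {
    "conv1": [0.9, 0.01, 0.5, 0.02],
    "conv2": [0.03, 0.8, 0.6, 0.04],
    "conv3": [0.7, 0.05],
}
threshold = global_threshold(gammas, prune_ratio=0.5)
masks = channel_masks(gammas, threshold)
# Assume conv1 and conv2 are added element-wise by a shortcut connection.
masks.update(union_mask(masks, ["conv1", "conv2"]))
```

In this sketch the union keeps a channel in both shortcut layers if either layer selected it, so the two tensors still have matching dimensions when added, at the cost of pruning fewer channels than the per-layer masks alone would.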
