Memory Efficient Semantic Segmentation for Embedded Systems

University essay from Lunds universitet/Institutionen för datavetenskap

Abstract: Convolutional neural networks (CNNs) have made rapid progress in recent years and are considered state-of-the-art in fields such as computer vision. However, CNNs are computationally intensive, which makes them challenging to deploy on embedded devices such as smartphones, security cameras, and cars. This thesis investigates different neural network compression techniques to determine which yields the lowest memory consumption with the smallest drop in accuracy. The techniques tested are pruning (selectively removing parts of the network), quantization (storing network parameters at lower precision), and tiling (splitting the dataflow inside the network). The compression techniques are evaluated on DeepLab v3+, a neural network for semantic segmentation. Compared to the baselines used in other work on neural network compression, our baseline, DeepLab, has significantly fewer parameters. We then selected a compressed version of DeepLab and tested it on an Axis P3227-LV network camera with two different implementations: one using TensorFlow Lite, and one using a custom implementation of DeepLab written from scratch that utilizes a novel memory allocation algorithm. We find that memory usage is reduced by one third with pruning, by one half with quantization, and by almost two thirds with our custom implementation. In total, combining all tested compression techniques with our custom implementation reduced memory consumption from 170 MB (TensorFlow Lite) to 20 MB with only a minor reduction in accuracy.
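As an illustration of the quantization step the abstract describes, the following is a minimal sketch (not taken from the thesis) of post-training weight quantization with TensorFlow Lite. The SavedModel path and output filename are hypothetical; the thesis may have used a different conversion pipeline.

    # Sketch only: post-training quantization with TensorFlow Lite.
    # "deeplab_saved_model/" stands in for a hypothetical DeepLab v3+
    # SavedModel export; it is not a path from the thesis.
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_saved_model("deeplab_saved_model/")
    # Default optimizations quantize the weights to 8-bit integers,
    # i.e. the parameters are stored at lower precision than the
    # original 32-bit floats, shrinking the model on disk and in memory.
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    tflite_model = converter.convert()

    with open("deeplab_quantized.tflite", "wb") as f:
        f.write(tflite_model)

Weight-only quantization of this kind roughly halves to quarters the storage needed for parameters, which is consistent with the abstract's reported halving of memory usage from quantization alone.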
