Convolutional Neural Network Quantisation for Accelerating Inference in Visual Embedded Systems

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Andrea Leopardi; [2018]


Abstract: Progress in Deep Learning has enabled many Computer Vision tasks to be carried out with a high level of accuracy. To reach these results, neural networks have been designed with progressively deeper architectures. While this approach has led to strong performance on complex tasks, it has also brought a growing demand for computational resources. At the same time, a need has emerged to run Deep Learning techniques on resource-limited devices, which requires local computing and the ability to process data at the edge.

Network reduction is a feasible approach for addressing this issue and improving Deep Learning performance on embedded systems and low-resource devices. In particular, network quantisation is proposed here as a versatile and effective method of network reduction. It approximates the parameters and processing units of a neural network, reducing execution time with no significant loss in accuracy. Network quantisation algorithms are complementary to other network reduction techniques and can be applied on top of already-designed models.

The study of network quantisation in this work is part of a project to develop a visual embedded system in the field of Advanced Driver-Assistance Systems. The project uses Deep Learning to perform object detection in real time. Quantisation is adopted to accelerate neural network inference specifically on Zynq MPSoC platforms. An evaluation of the quantisation algorithms applicable to this work led to the selection of a model designed on hardware: we reproduced this model outside its native framework and analysed it against other models and on different platforms. The results achieved support the validity of quantisation as a network reduction technique for accelerating inference. In particular, quantisation proves very effective in embedded systems because it handles integer values instead of floating-point values and is well suited to improved hardware designs.
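
As an illustration of what handling integer values instead of floating-point values can mean in practice, the following is a minimal sketch of uniform symmetric 8-bit quantisation. It is not the model selected in the essay; the function names (`quantise_symmetric`, `int_dot`), the single per-tensor scale and the sample values are assumptions made for this example only.

```python
def quantise_symmetric(values, num_bits=8):
    """Map floats to signed integers sharing one scale (hypothetical helper)."""
    qmax = 2 ** (num_bits - 1) - 1           # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def int_dot(qa, qb):
    """Dot product computed purely on integers, as integer hardware would."""
    return sum(a * b for a, b in zip(qa, qb))


# Illustrative values only; they do not come from the essay.
weights = [0.5, -1.0, 0.25, 0.75]
inputs = [1.0, 0.5, -0.5, 0.25]

qw, scale_w = quantise_symmetric(weights)
qx, scale_x = quantise_symmetric(inputs)

# The integer accumulator is rescaled once, after the whole dot product.
approx = int_dot(qw, qx) * scale_w * scale_x
exact = sum(w * x for w, x in zip(weights, inputs))

print(f"exact={exact:.4f} approx={approx:.4f}")
```

In this sketch the multiply-accumulate loop touches only integers, and a single floating-point rescale is applied at the end, which is one reason integer quantisation tends to map well onto embedded hardware.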
