Efficiency of CNN on Heterogeneous Processing Devices
Abstract: In the development of advanced driver assistance systems, computer vision problemsneed to be optimized to run efficiently on embedded platforms. Convolutional neural network(CNN) accelerators have proven to be very efficient for embedded camera platforms,such as the ones used for automotive vision systems. Therefore, the focus of this thesisis to evaluate the efficiency of a CNN on a future embedded heterogeneous processingdevice. The memory size in an embedded system is often very limited, and it is necessary todivide the input into multiple tiles. In addition, there are power and speed constraintsthat needs to be met to be able to use a computer vision system in a car. To increaseefficiency and optimize the memory usage, different methods for CNN layer fusion areproposed and evaluated for a variety of tile sizes. Several different layer fusion methods and input tile sizes are chosen as optimal solutions,depending on the depth of the layers in the CNN. The solutions investigated inthe thesis are most efficient for deep CNN layers, where the number of channels is high.
AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)