Impact of Video Compression on the Performance of Object Detection Algorithms in Automotive Applications

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Kristian Kajak; [2020]

Keywords: ;

Abstract: The aim of this study is to generally expose the impact of using video compression methods on the performance accuracy of a neural network-based pedestrian detection system. Moreover, the emphasis is on investigating the consequences of using both compressed training and testing images with such neural network applications. This thesis includes a theoretical background into object detection and encoding systems, provides argumentative analysis for the detector, dataset, and evaluation method selection, and furthermore, describes how the selected detector is modified to be compatible with the selected dataset. The presented experiments use a modified MS-CNN pedestrian detection system on the CityScapes/CityPersons dataset. The Caltech benchmark evaluation method is used for comparing the detection performance of the detectors. The HEVC and JPEG 2000 encoding methods are used for data compression. This thesis reveals several interesting findings. For one, the results show that a significant amount of compression can be applied to the images before the detector’s performance starts degrading and that the detector is quite robust in dealing with compression artifacts to a certain extent. Secondly, peak signal-to-noise ratio of the data alone does not determine how the detector behaves, and other variables, such as the encoding method also affect the performance. Thirdly, the performance is most of all affected when detecting small-sized pedestrians (or pedestrians at a far distance). Fourthly, in terms of compressing training data, compared to a detector trained on non-compressed data, the detector trained solely on compressed images performs more accurate detections on lower quality data but performs less accurate detections on higher quality data. Furthermore, the results show that a detector trained on data consisting of both low and high quality variants of each frame beholds best detection performance on all quality scales.

  AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)