Pedestrian Detection on Dewarped Fisheye Images using Deep Neural Networks

University essay from

Abstract: In the field of autonomous vehicles, Advanced Driver Assistance Systems (ADAS) play a key role. Their applications range from aiding critical safety systems to assisting with trivial parking scenarios. To optimize the use of resources, trivial ADAS applications are often limited to low-cost sensors. As a result, sensors such as cameras and ultrasonic sensors are preferred over LiDAR (Light Detection and Ranging) and RADAR (RAdio Detection And Ranging) for assisting the driver with parking. In a parking scenario, to ensure the safety of people in and around the car, the sensors need to detect objects around the car in real time. With the advancements in Deep Learning, Deep Neural Networks (DNN) are becoming increasingly effective at detecting objects with real-time performance. Therefore, the thesis aims to investigate the viability of Deep Neural Networks using fisheye cameras to detect pedestrians around the car. To achieve this objective, an experiment was conducted on a test vehicle equipped with multiple fisheye cameras. Three Deep Neural Networks, namely YOLOv3 (You Only Look Once), its faster variant Tiny-YOLOv3, and ResNet-50, were chosen to detect pedestrians. The networks were trained on a fisheye image dataset with the help of transfer learning. After training, the models were also compared to pre-trained models that had been trained to detect pedestrians on normal images. Our experiments show that the YOLOv3 variants performed well but had difficulty localizing the pedestrians. The ResNet model failed to generate acceptable detections and thus performed poorly. The three models produced detections with real-time performance for a single camera, but when scaled to multiple cameras, the detection speed fell short of real-time performance. The YOLOv3 variants could detect pedestrians successfully on dewarped fisheye images, but the pipeline still needs a better dewarping algorithm to lessen the distortion effects. Further, the models need to be optimized in order to generate detections with real-time performance on multiple cameras and to fit the model on an embedded system.
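
The drop in detection speed when moving from one camera to several can be illustrated with simple arithmetic. The sketch below assumes that a single model instance processes the camera streams one after another, without batching; this is an assumption for illustration, not a description of the pipeline used in the thesis. The function names and all numbers are hypothetical and are not results from the experiments.

```python
def per_camera_fps(model_fps: float, num_cameras: int) -> float:
    """Frame rate each camera effectively gets when one model
    instance handles all cameras in turn (assumed: no batching)."""
    if num_cameras < 1:
        raise ValueError("num_cameras must be at least 1")
    return model_fps / num_cameras


def meets_real_time(model_fps: float, num_cameras: int, target_fps: float) -> bool:
    """True if every camera still reaches the target frame rate."""
    return per_camera_fps(model_fps, num_cameras) >= target_fps


if __name__ == "__main__":
    # Hypothetical figures: a model running at 30 frames per second
    # and a 15 frames-per-second real-time target.
    for cameras in (1, 4):
        fps = per_camera_fps(30.0, cameras)
        ok = meets_real_time(30.0, cameras, 15.0)
        print(f"{cameras} camera(s): {fps:.1f} fps per camera, real-time: {ok}")
```

Under these assumed figures, a model that comfortably meets the target on one camera falls below it on four, which is the kind of gap that model optimization would need to close.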
