Fine-tuning of Fully Convolutional Networks for Vehicle Detection in Satellite Images: Data Augmentation and Hard Examples Mining

University essay from KTH/Geoinformatik

Author: Julie Imbert; [2019]

Keywords: vehicle; deep learning; satellite; image

Abstract: Earth observation satellites, operated by both private companies and governmental agencies, allow us to take a close look at the Earth from above. Images are acquired with shorter revisit times and at spatial resolutions that reveal new details. Nevertheless, collecting the data is only half of the work. As the amount and quality of available satellite data increase, developing processing methods that exploit them efficiently and rapidly becomes essential. On images at around 30 cm resolution, vehicles can be detected well but are still considered small objects. Even the best models can still produce false alarms and missed detections. The aim of this master's thesis is to improve vehicle detection with networks that already perform well, using only their initial training dataset. WorldView-3 satellite images are used for this work because their resolution is close to 30 cm. Areas all over the world are considered. A custom U-Net convolutional network is trained for vehicle detection on two different datasets. When such an architecture is designed carefully with state-of-the-art methods and then trained with the right parameters on a relevant and sufficiently large dataset, very high scores can be reached. For this reason, improving such a network once it has been trained on all the available data, in order to gain the last few points of score, is a real challenge. The performance of the U-Net is analysed both on a test set and on its own training dataset. Based on the performance on the test set, specific data augmentations are chosen to improve the network through a short fine-tuning training. This method makes it possible to improve the network on its own specific weaknesses. It is an efficient way to avoid directly applying numerous data augmentations that would not all be necessary and would increase training times. Based on the performance on the training dataset, examples that the network failed to learn are identified. 
Missed vehicles and false alarms are then used to design new datasets on which the network is fine-tuned, in order to reduce these types of mistakes on the test set. These fine-tuning trainings are performed with adapted parameters to avoid catastrophic forgetting. The aim is to focus the network's fine-tuning on false positives or false negatives, so that it can learn features it might have missed during the first training. Using data augmentation as a fine-tuning method increased the performance of a model. A gain close to 2.57 points in F1-score was obtained with a specific augmentation. The hard example mining strategy yielded more variable results. In the best case, an improvement of 1.4 in F1-score was observed. The method made it possible to steer the network toward improving either recall or precision, but at the cost of a deterioration in the other metric. A simultaneous improvement of both metrics was not achieved.
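
The metrics and the selection step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the implementation used in the thesis: the names `TileResult`, `precision_recall_f1` and `select_hard_tiles`, the per-tile error counts, and the `min_errors` threshold are all hypothetical and chosen only for illustration. The sketch computes precision, recall and F1-score from detection counts, then picks the training tiles with the most missed vehicles (false negatives) or false alarms (false positives) as candidates for a fine-tuning dataset.

```python
from dataclasses import dataclass


@dataclass
class TileResult:
    """Detection counts for one training tile (hypothetical structure)."""
    tile_id: str
    true_positives: int
    false_positives: int   # false alarms
    false_negatives: int   # missed vehicles


def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Standard precision, recall and F1-score from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def select_hard_tiles(results, focus="fn", min_errors=1):
    """Return tiles with at least `min_errors` errors of the chosen type,
    worst first. `focus` is "fn" (missed vehicles) or "fp" (false alarms).
    The threshold is an illustrative assumption, not a value from the thesis."""
    if focus not in ("fn", "fp"):
        raise ValueError("focus must be 'fn' or 'fp'")

    def errors(r: TileResult) -> int:
        return r.false_negatives if focus == "fn" else r.false_positives

    hard = [r for r in results if errors(r) >= min_errors]
    return sorted(hard, key=errors, reverse=True)


# Made-up counts, for illustration only.
results = [
    TileResult("tile_a", true_positives=8, false_positives=0, false_negatives=2),
    TileResult("tile_b", true_positives=5, false_positives=3, false_negatives=0),
    TileResult("tile_c", true_positives=10, false_positives=0, false_negatives=0),
]

missed_vehicle_tiles = select_hard_tiles(results, focus="fn")
false_alarm_tiles = select_hard_tiles(results, focus="fp")
```

Fine-tuning on tiles selected with `focus="fn"` targets recall, while `focus="fp"` targets precision, which is consistent with the trade-off between the two metrics reported in the abstract.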
