Indoor scene verification: Evaluation of indoor scene representations for the purpose of location verification

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: When a person looks at two pictures taken in an indoor location, it is fairly easy to tell whether they were taken in exactly the same place, even if the person has never visited the location. This is possible because humans can attend to multiple factors, such as spatial properties (window shape, room shape), common patterns (floor, walls) or the presence of specific objects (furniture, lighting). Changes in camera pose, illumination or furniture location, or digital alteration of the image (e.g. watermarks), have little influence on this ability. Traditional approaches to measuring the perceptual similarity of images have struggled to reproduce this skill. This thesis defines the Indoor scene verification (ISV) problem as deciding whether or not two indoor scene images were taken in the same indoor space. It explores the capabilities of state-of-the-art perceptual similarity metrics by introducing two new datasets designed specifically for this problem. Perceptual hashing, ORB, FaceNet and NetVLAD are evaluated as baseline candidates. NetVLAD gives the best results on both datasets and is therefore chosen as the baseline for the experiments that aim to improve on it. Three such experiments are carried out, testing the impact of using a different training dataset, changing the deep neural network architecture and introducing a new loss function. Quantitative analysis of the AUC score shows that switching from VGG16 to MobileNetV2 improves on the baseline.
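To make the AUC-based evaluation mentioned in the abstract concrete, the following is a minimal sketch of how a verification metric could be scored over image pairs. It is not the thesis's implementation: it assumes each image has already been mapped to an embedding vector by some model, uses cosine similarity as an illustrative pair score, and the function names cosine_similarity and pairwise_auc are hypothetical.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (illustrative pair score)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def pairwise_auc(scores, labels):
    """Area under the ROC curve for pair scores.

    scores: similarity score per image pair (higher = more likely same place).
    labels: 1/True if the pair shows the same indoor space, 0/False otherwise.

    Computed as the probability that a randomly chosen same-place pair
    scores higher than a randomly chosen different-place pair, with ties
    counted as one half.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]
    neg = scores[~labels]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("need at least one same-place and one different-place pair")
    # Compare every positive score with every negative score.
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (len(pos) * len(neg)))
```

An AUC of 1.0 means every same-place pair scores above every different-place pair, while 0.5 corresponds to chance. Because the score is threshold-free, it allows different representations to be compared without first fixing a decision threshold.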
