Instance Segmentation on depth images using Swin Transformer for improved accuracy on indoor images

University essay from Linköpings universitet/Artificiell intelligens och integrerade datorsystem

Author: Alfred Hagberg; Mustaf Abdullahi Musse; [2022]

Keywords: Instance Segmentation; segmentation; deep learning; semantic segmentation; swin transformer; mask rcnn; rcnn; cascade mask rcnn; slam; simultaneous localization and mapping; object detection; COCO; NYU dataset; vision transformer

Abstract: The Simultaneous Localisation And Mapping (SLAM) problem is a fundamental open problem in autonomous mobile robotics. Instance segmentation, a technique that simultaneously solves object detection and semantic segmentation, is one of the most actively researched recent techniques for enhancing SLAM methods. In this thesis, we implement an instance segmentation system using Swin Transformer combined with two state-of-the-art instance segmentation methods, namely Cascade Mask RCNN and Mask RCNN. We show that depth information improves the average precision (AP) by approximately 7%. We also show that the Swin Transformer backbone model can work well with depth images. Our results also show that Cascade Mask RCNN outperforms Mask RCNN. However, the results should be considered with caution due to the small size of the NYU-depth v2 dataset. Most instance segmentation research uses the COCO dataset, which has a hundred times more images than the NYU-depth v2 dataset but does not include depth information for its images.
