Automatic Semantic Segmentation of Indoor Datasets

University essay from Blekinge Tekniska Högskola/Institutionen för datavetenskap

Abstract:

Background: In recent years, computer vision has undergone significant advancements, revolutionizing fields such as robotics, augmented reality, and autonomous systems. Key to this transformation is Simultaneous Localization and Mapping (SLAM), a fundamental technology that allows machines to navigate and interact intelligently with their surroundings. Challenges persist in harmonizing spatial and semantic understanding, as conventional methods often treat these tasks separately, limiting comprehensive evaluations with shared datasets. As applications continue to evolve, the demand for accurate and efficient image segmentation ground truth becomes paramount. Manual annotation, the traditional approach, is both costly and resource-intensive, hindering the scalability of computer vision systems. This thesis addresses the need for a cost-effective and scalable solution by focusing on the creation of accurate and efficient image segmentation ground truth, bridging the gap between spatial and semantic tasks.

Objective: This thesis addresses the challenge of creating an efficient image segmentation ground truth to complement datasets with spatial ground truth. The primary objective is to reduce the time and effort required to annotate datasets.

Method: Our methodology adopts a systematic approach to evaluate and combine existing annotation techniques, focusing on precise object detection and robust segmentation. By merging these approaches, we aim to enhance annotation accuracy while streamlining the annotation process. This approach is systematically applied and evaluated across multiple datasets, including the NYU V2 dataset (consisting of over 1449 images), ARID (a real-world sequential dataset), and Italian flats (a sequential dataset created in Blender).

Results: The developed pipeline demonstrates promising outcomes, showing a substantial reduction in annotation time compared to manual annotation, thereby addressing the cost and resource intensiveness of the traditional approach. We observe that, although not initially optimized for SLAM datasets, the pipeline performs exceptionally well on both the ARID and Italian flats datasets, highlighting its adaptability to real-world scenarios.

Conclusion: This research introduces an innovative annotation pipeline, offering a systematic and efficient approach to annotation. It aims to bridge the gap between spatial and semantic tasks, addressing the pressing need for comprehensive annotation tools in this domain.
