Towards Large Scale Façade Parsing: A Deep Learning Pipeline Using Mask R-CNN

University essay from Göteborgs universitet/Institutionen för data- och informationsteknik

Abstract: This thesis tries to find a methodology that create a working pipeline for building facade parsing, which allows to access large scale panorama imagery from Google Street View (GSV) and implement on deep learning models. We propose a semiautomated pipeline that integrates multiple systems for large-scale building facade parsing. One of the aim of pipeline is to alleviate the limitations of using street-level panorama images for deep learning application through using rectilinear projection. The rectilinear projection is used to transform high-resolution 360 view street level panorama imagery to series of normal perspective images. The other aim of the pipeline is to automatically retrieve large scale panorama imagery from GSV and implement the Mask R-CNN deep learning model to building facade parsing. The pipeline process includes i) Street level panorama imagery extraction from GSV and Mapillary, ii) Apply rectilinear projection on panorama imagery to convert into a series of images, iii) Retrieve building footprint and identify building facade images in the map, iv) Image pre-processing, v) Implement and evaluate the Mask R-CNN model to detect building facade classes, vi) Generate inference, such as - the window-to-wall ratio of the detected classes, and vii) Test with other external procedural building systems and depict the result. In this thesis work, we have tried to use different frameworks and tools along with deep learning models for large-scale building facade parsing. The thesis discusses about the methodology and technique, implementation, and experiments to develop the pipeline based on Google Street View panorama imagery and Mask R-CNN. As a result, we designed a semi-automated pipeline that comprises the abovementioned steps and processes. While developing the pipeline, we explored a wide range of topics and integrated a variety of tools, frameworks, and algorithms. Using the proposed semi-automated approach, we were able to generate datasets and train them on a deep learning model, resulting in significant inferences that will serve as the foundation for the future development of the domain.

  AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)