A real-time multi-modal fusion model for visible and infrared images: A light-weight and real-time CNN-based fusion model for visible and infrared images in surveillance

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: Infrared images highlight semantically important regions such as pedestrians and are robust to changes in luminance, while visible images provide abundant background detail and good visual quality. Multi-modal image fusion for surveillance aims to generate an informative fused image from the two source images in real time, so as to facilitate surveillance observation or object detection tasks. In this work, we first investigate conventional methods, such as multi-scale transform-based and subspace-based methods, as well as deep learning-based methods including AE-, CNN- and GAN-based approaches, in detail. After a full discussion of their advantages and disadvantages, a CNN-based approach is chosen for its robustness and end-to-end nature. A novel real-time CNN-based model is proposed with an optimized architecture and loss functions. The model builds on a DenseNet-style structure to reuse features from earlier layers, but the number of layers and the network depth are heavily reduced to improve fusion efficiency. The size of the feature maps is kept constant to avoid information loss. The loss function combines a pixel intensity loss, a gradient loss and a decomposition loss. The intensity and gradient losses use a maximum strategy to preserve the highlighted semantic areas, and the decomposition loss compares the reconstructed images with the source images, pushing the fusion model to retain more features. The model is trained on the MSRS dataset and evaluated on the LLVIP, MSRS and TNO datasets against 9 state-of-the-art algorithms, both qualitatively and quantitatively. The good visual quality of the proposed model and its strong results on 10 evaluation metrics comprehensively and objectively demonstrate its fusion ability.
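To illustrate the maximum strategy mentioned above, the following is a minimal sketch in PyTorch of how the intensity and gradient losses could be computed; the tensor names (fused, vis, ir), the Sobel-based gradient operator, and the loss weights are illustrative assumptions, not the author's exact implementation, and the decomposition loss is omitted.

```python
# Hypothetical sketch of maximum-strategy intensity and gradient losses
# for visible/infrared fusion (single-channel inputs of shape N x 1 x H x W).
import torch
import torch.nn.functional as F

def sobel_gradient(img):
    """Approximate per-pixel gradient magnitude with Sobel kernels."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]],
                      device=img.device).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    gx = F.conv2d(img, kx, padding=1)
    gy = F.conv2d(img, ky, padding=1)
    return gx.abs() + gy.abs()

def fusion_loss(fused, vis, ir, w_int=1.0, w_grad=1.0):
    """Intensity and gradient losses that take the element-wise maximum of
    the two source images as the target, so bright semantic regions
    (e.g. pedestrians in the infrared image) are preserved in the fusion."""
    # Intensity loss: fused pixels should match the brighter source pixel.
    loss_int = F.l1_loss(fused, torch.max(vis, ir))
    # Gradient loss: fused gradients should match the stronger source gradient.
    loss_grad = F.l1_loss(sobel_gradient(fused),
                          torch.max(sobel_gradient(vis), sobel_gradient(ir)))
    return w_int * loss_int + w_grad * loss_grad
```

In this sketch, taking the element-wise maximum of the two sources (and of their gradients) lets whichever modality carries the stronger signal at each pixel act as the supervision target, which matches the abstract's goal of keeping infrared highlights while retaining visible-image texture.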
