
Multi-Modality Image Fusion and Object Detection Based on Semantic Information


Bibliographic Details
Main Authors: Liu, Yong; Zhou, Xin; Zhong, Wei
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10216995/
https://www.ncbi.nlm.nih.gov/pubmed/37238472
http://dx.doi.org/10.3390/e25050718
_version_ 1785048429771096064
author Liu, Yong
Zhou, Xin
Zhong, Wei
author_facet Liu, Yong
Zhou, Xin
Zhong, Wei
author_sort Liu, Yong
collection PubMed
description Infrared and visible image fusion (IVIF) aims to provide informative images by combining complementary information from different sensors. Existing deep-learning-based IVIF methods focus on strengthening the network with increasing depth, but often ignore the importance of transmission characteristics, resulting in the degradation of important information. In addition, although many methods use various loss functions or fusion rules to retain the complementary features of both modalities, the fusion results often retain redundant or even invalid information. To extract the effective information from both infrared and visible images accurately, without omission or redundancy, and to better serve downstream tasks such as object detection with the fused image, we propose a multi-level structure search attention fusion network guided by semantic information, which fuses infrared and visible images in an end-to-end manner. Our network makes two main contributions: the use of neural architecture search (NAS) and a newly designed multi-level adaptive attention module (MAAB). Together, these enable the network to retain the typical characteristics of the two modalities while removing information that is useless for the detection task from the fusion results. In addition, our loss function and joint training method establish a reliable relationship between the fusion network and the subsequent detection task. Extensive experiments on the new M3FD dataset show that our fusion method achieves leading performance in both subjective and objective evaluations, and improves mAP on the object detection task by 0.5% over the second-best method (FusionGAN).
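
The record gives no implementation details for the multi-level adaptive attention module (MAAB), whose actual structure is found via NAS. As a rough illustration of the general idea of attention-weighted fusion of two modality feature maps, here is a minimal, hypothetical PyTorch sketch; the `AttentionFusion` name, the squeeze-and-excitation-style channel gate, and all hyperparameters are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Fuse infrared/visible feature maps with a learned channel gate (assumed design)."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        # Squeeze-and-excitation-style gate over the concatenated features:
        # global pooling, bottleneck, then per-channel weights in (0, 1).
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, 2 * channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.merge = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, feat_ir: torch.Tensor, feat_vis: torch.Tensor) -> torch.Tensor:
        x = torch.cat([feat_ir, feat_vis], dim=1)  # (B, 2C, H, W)
        x = x * self.gate(x)                       # down-weight uninformative channels
        return self.merge(x)                       # (B, C, H, W) fused features

# Example: fuse two 64-channel feature maps.
fusion = AttentionFusion(channels=64)
fused = fusion(torch.randn(2, 64, 128, 128), torch.randn(2, 64, 128, 128))
```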
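Likewise, the abstract says a loss function and joint training couple the fusion network to the downstream detector, without giving the objective. A common way to realize such coupling is to add the detector's task loss to pixel-level fusion terms; the sketch below, with assumed intensity and gradient terms and weights `alpha`, `beta`, `gamma`, is one plausible form, not the paper's actual loss.

```python
import torch
import torch.nn.functional as F

def _grads(img: torch.Tensor):
    """Horizontal and vertical finite-difference image gradients."""
    dx = img[..., :, 1:] - img[..., :, :-1]
    dy = img[..., 1:, :] - img[..., :-1, :]
    return dx, dy

def joint_loss(fused, ir, vis, det_loss, alpha=1.0, beta=1.0, gamma=0.1):
    """Pixel-level fusion terms plus the detector's task loss (assumed form)."""
    # Intensity term: keep the brighter of the two sources at each pixel,
    # a common IVIF heuristic that keeps thermal targets salient.
    intensity = F.l1_loss(fused, torch.max(ir, vis))
    # Gradient term: match the stronger edge response of either modality.
    fdx, fdy = _grads(fused)
    idx_, idy_ = _grads(ir)
    vdx, vdy = _grads(vis)
    gradient = (F.l1_loss(fdx.abs(), torch.max(idx_.abs(), vdx.abs())) +
                F.l1_loss(fdy.abs(), torch.max(idy_.abs(), vdy.abs())))
    # det_loss: the detector's loss computed on the fused image; letting its
    # gradients flow back into the fusion network is what "joint training" means here.
    return alpha * intensity + beta * gradient + gamma * det_loss
```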
format Online
Article
Text
id pubmed-10216995
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10216995 2023-05-27 Multi-Modality Image Fusion and Object Detection Based on Semantic Information Liu, Yong Zhou, Xin Zhong, Wei Entropy (Basel) Article Infrared and visible image fusion (IVIF) aims to provide informative images by combining complementary information from different sensors. Existing deep-learning-based IVIF methods focus on strengthening the network with increasing depth, but often ignore the importance of transmission characteristics, resulting in the degradation of important information. In addition, although many methods use various loss functions or fusion rules to retain the complementary features of both modalities, the fusion results often retain redundant or even invalid information. To extract the effective information from both infrared and visible images accurately, without omission or redundancy, and to better serve downstream tasks such as object detection with the fused image, we propose a multi-level structure search attention fusion network guided by semantic information, which fuses infrared and visible images in an end-to-end manner. Our network makes two main contributions: the use of neural architecture search (NAS) and a newly designed multi-level adaptive attention module (MAAB). Together, these enable the network to retain the typical characteristics of the two modalities while removing information that is useless for the detection task from the fusion results. In addition, our loss function and joint training method establish a reliable relationship between the fusion network and the subsequent detection task. Extensive experiments on the new M3FD dataset show that our fusion method achieves leading performance in both subjective and objective evaluations, and improves mAP on the object detection task by 0.5% over the second-best method (FusionGAN). MDPI 2023-04-26 /pmc/articles/PMC10216995/ /pubmed/37238472 http://dx.doi.org/10.3390/e25050718 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Liu, Yong
Zhou, Xin
Zhong, Wei
Multi-Modality Image Fusion and Object Detection Based on Semantic Information
title Multi-Modality Image Fusion and Object Detection Based on Semantic Information
title_full Multi-Modality Image Fusion and Object Detection Based on Semantic Information
title_fullStr Multi-Modality Image Fusion and Object Detection Based on Semantic Information
title_full_unstemmed Multi-Modality Image Fusion and Object Detection Based on Semantic Information
title_short Multi-Modality Image Fusion and Object Detection Based on Semantic Information
title_sort multi-modality image fusion and object detection based on semantic information
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10216995/
https://www.ncbi.nlm.nih.gov/pubmed/37238472
http://dx.doi.org/10.3390/e25050718
work_keys_str_mv AT liuyong multimodalityimagefusionandobjectdetectionbasedonsemanticinformation
AT zhouxin multimodalityimagefusionandobjectdetectionbasedonsemanticinformation
AT zhongwei multimodalityimagefusionandobjectdetectionbasedonsemanticinformation