
Structural Similarity Loss for Learning to Fuse Multi-Focus Images


Bibliographic Details
Main Authors: Yan, Xiang, Gilani, Syed Zulqarnain, Qin, Hanlin, Mian, Ajmal
Format: Online Article Text
Language: English
Published: MDPI 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7699701/
https://www.ncbi.nlm.nih.gov/pubmed/33233568
http://dx.doi.org/10.3390/s20226647
_version_ 1783616109530841088
author Yan, Xiang
Gilani, Syed Zulqarnain
Qin, Hanlin
Mian, Ajmal
author_facet Yan, Xiang
Gilani, Syed Zulqarnain
Qin, Hanlin
Mian, Ajmal
author_sort Yan, Xiang
collection PubMed
description Convolutional neural networks have recently been used for multi-focus image fusion. However, some existing methods have resorted to adding Gaussian blur to focused images to simulate defocus, thereby generating data (with ground truth) for supervised learning. Moreover, they classify pixels as ‘focused’ or ‘defocused’ and use the classified results to construct the fusion weight maps, which necessitates a series of post-processing steps. In this paper, we present an end-to-end learning approach for directly predicting the fully focused output image from multi-focus input image pairs. The proposed approach uses a CNN architecture trained to perform fusion without the need for ground-truth fused images. The CNN computes its loss from the image structural similarity (SSIM), a metric that is widely accepted for fused image quality evaluation. In addition, the loss function uses the standard deviation of a local window of the image to automatically estimate the importance of each source image in the final fused image. Our network can accept images of variable sizes and hence we are able to utilize real benchmark datasets, instead of simulated ones, to train our network. The model is a feed-forward, fully convolutional neural network that can process images of variable sizes at test time. Extensive evaluation on benchmark datasets shows that our method outperforms, or is comparable with, existing state-of-the-art techniques on both objective and subjective benchmarks.
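
The description above outlines the core training signal: an SSIM-based loss in which each source image's contribution is weighted by its local standard deviation, taken as a proxy for focus. The following is a minimal PyTorch sketch of such a no-reference loss, written to illustrate the idea rather than reproduce the authors' implementation; the function names, the 7x7 window, the stability constants, and the normalized-std weighting rule are all assumptions.

import torch
import torch.nn.functional as F

def local_std(x, window=7):
    # Per-pixel standard deviation over a sliding window (window size is an
    # illustrative choice, not taken from the paper).
    pad = window // 2
    mu = F.avg_pool2d(x, window, stride=1, padding=pad)
    mu_sq = F.avg_pool2d(x * x, window, stride=1, padding=pad)
    return (mu_sq - mu * mu).clamp(min=0.0).sqrt()

def ssim_map(x, y, window=7, c1=0.01 ** 2, c2=0.03 ** 2):
    # Per-pixel SSIM between x and y, computed with uniform local windows;
    # c1 and c2 are the usual stability constants for images scaled to [0, 1].
    pad = window // 2
    mu_x = F.avg_pool2d(x, window, stride=1, padding=pad)
    mu_y = F.avg_pool2d(y, window, stride=1, padding=pad)
    var_x = F.avg_pool2d(x * x, window, stride=1, padding=pad) - mu_x * mu_x
    var_y = F.avg_pool2d(y * y, window, stride=1, padding=pad) - mu_y * mu_y
    cov_xy = F.avg_pool2d(x * y, window, stride=1, padding=pad) - mu_x * mu_y
    num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    den = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return num / den

def fusion_ssim_loss(fused, src_a, src_b, window=7):
    # No-reference fusion loss: at each pixel, weight the SSIM of the fused
    # image against each source by that source's local standard deviation,
    # used here as a proxy for how in-focus the source is at that location.
    std_a = local_std(src_a, window)
    std_b = local_std(src_b, window)
    w_a = std_a / (std_a + std_b + 1e-8)  # normalized importance weights
    w_b = 1.0 - w_a
    score = w_a * ssim_map(fused, src_a, window) + w_b * ssim_map(fused, src_b, window)
    return (1.0 - score).mean()  # minimize 1 - weighted SSIM

Here fused, src_a, and src_b are (N, 1, H, W) tensors in [0, 1]; because every operation is a sliding-window average, the loss accepts inputs of any spatial size, consistent with the variable-size training described in the abstract.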
format Online
Article
Text
id pubmed-7699701
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7699701 2020-11-29 Structural Similarity Loss for Learning to Fuse Multi-Focus Images Yan, Xiang Gilani, Syed Zulqarnain Qin, Hanlin Mian, Ajmal Sensors (Basel) Article Convolutional neural networks have recently been used for multi-focus image fusion. However, some existing methods have resorted to adding Gaussian blur to focused images to simulate defocus, thereby generating data (with ground truth) for supervised learning. Moreover, they classify pixels as ‘focused’ or ‘defocused’ and use the classified results to construct the fusion weight maps, which necessitates a series of post-processing steps. In this paper, we present an end-to-end learning approach for directly predicting the fully focused output image from multi-focus input image pairs. The proposed approach uses a CNN architecture trained to perform fusion without the need for ground-truth fused images. The CNN computes its loss from the image structural similarity (SSIM), a metric that is widely accepted for fused image quality evaluation. In addition, the loss function uses the standard deviation of a local window of the image to automatically estimate the importance of each source image in the final fused image. Our network can accept images of variable sizes and hence we are able to utilize real benchmark datasets, instead of simulated ones, to train our network. The model is a feed-forward, fully convolutional neural network that can process images of variable sizes at test time. Extensive evaluation on benchmark datasets shows that our method outperforms, or is comparable with, existing state-of-the-art techniques on both objective and subjective benchmarks. MDPI 2020-11-20 /pmc/articles/PMC7699701/ /pubmed/33233568 http://dx.doi.org/10.3390/s20226647 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Yan, Xiang
Gilani, Syed Zulqarnain
Qin, Hanlin
Mian, Ajmal
Structural Similarity Loss for Learning to Fuse Multi-Focus Images
title Structural Similarity Loss for Learning to Fuse Multi-Focus Images
title_full Structural Similarity Loss for Learning to Fuse Multi-Focus Images
title_fullStr Structural Similarity Loss for Learning to Fuse Multi-Focus Images
title_full_unstemmed Structural Similarity Loss for Learning to Fuse Multi-Focus Images
title_short Structural Similarity Loss for Learning to Fuse Multi-Focus Images
title_sort structural similarity loss for learning to fuse multi-focus images
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7699701/
https://www.ncbi.nlm.nih.gov/pubmed/33233568
http://dx.doi.org/10.3390/s20226647
work_keys_str_mv AT yanxiang structuralsimilaritylossforlearningtofusemultifocusimages
AT gilanisyedzulqarnain structuralsimilaritylossforlearningtofusemultifocusimages
AT qinhanlin structuralsimilaritylossforlearningtofusemultifocusimages
AT mianajmal structuralsimilaritylossforlearningtofusemultifocusimages