
Structural Similarity Loss for Learning to Fuse Multi-Focus Images


Bibliographic Details
Main Authors: Yan, Xiang, Gilani, Syed Zulqarnain, Qin, Hanlin, Mian, Ajmal
Format: Online Article Text
Language: English
Published: MDPI 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7699701/
https://www.ncbi.nlm.nih.gov/pubmed/33233568
http://dx.doi.org/10.3390/s20226647
_version_ 1783616109530841088
author Yan, Xiang
Gilani, Syed Zulqarnain
Qin, Hanlin
Mian, Ajmal
author_facet Yan, Xiang
Gilani, Syed Zulqarnain
Qin, Hanlin
Mian, Ajmal
author_sort Yan, Xiang
collection PubMed
description Convolutional neural networks have recently been used for multi-focus image fusion. However, some existing methods have resorted to adding Gaussian blur to focused images to simulate defocus, thereby generating data (with ground truth) for supervised learning. Moreover, they classify pixels as ‘focused’ or ‘defocused’ and use the classified results to construct the fusion weight maps, which necessitates a series of post-processing steps. In this paper, we present an end-to-end learning approach for directly predicting the fully focused output image from multi-focus input image pairs. The proposed approach uses a CNN architecture trained to perform fusion without the need for ground-truth fused images. The CNN computes its loss from the image structural similarity (SSIM), a metric that is widely accepted for fused image quality evaluation. In addition, the loss function uses the standard deviation of a local window of the image to automatically estimate the importance of each source image in the final fused image. Our network can accept images of variable sizes and hence we are able to utilize real benchmark datasets, instead of simulated ones, to train our network. The model is a feed-forward, fully convolutional neural network that can process images of variable sizes at test time. Extensive evaluation on benchmark datasets shows that our method outperforms, or is comparable with, existing state-of-the-art techniques on both objective and subjective benchmarks.
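
The description above outlines the core training signal: an SSIM-based loss in which each source image's contribution is weighted by its local standard deviation, taken as a proxy for focus. The following is a minimal PyTorch sketch of such a no-reference loss, written to illustrate the idea rather than reproduce the authors' implementation; the function names, the 7x7 window, the stability constants, and the normalized-std weighting rule are all assumptions.

import torch
import torch.nn.functional as F

def local_std(x, window=7):
    # Per-pixel standard deviation over a sliding window (window size is an
    # illustrative choice, not taken from the paper).
    pad = window // 2
    mu = F.avg_pool2d(x, window, stride=1, padding=pad)
    mu_sq = F.avg_pool2d(x * x, window, stride=1, padding=pad)
    return (mu_sq - mu * mu).clamp(min=0.0).sqrt()

def ssim_map(x, y, window=7, c1=0.01 ** 2, c2=0.03 ** 2):
    # Per-pixel SSIM between x and y, computed with uniform local windows;
    # c1 and c2 are the usual stability constants for images scaled to [0, 1].
    pad = window // 2
    mu_x = F.avg_pool2d(x, window, stride=1, padding=pad)
    mu_y = F.avg_pool2d(y, window, stride=1, padding=pad)
    var_x = F.avg_pool2d(x * x, window, stride=1, padding=pad) - mu_x * mu_x
    var_y = F.avg_pool2d(y * y, window, stride=1, padding=pad) - mu_y * mu_y
    cov_xy = F.avg_pool2d(x * y, window, stride=1, padding=pad) - mu_x * mu_y
    num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    den = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return num / den

def fusion_ssim_loss(fused, src_a, src_b, window=7):
    # No-reference fusion loss: at each pixel, weight the SSIM of the fused
    # image against each source by that source's local standard deviation,
    # used here as a proxy for how in-focus the source is at that location.
    std_a = local_std(src_a, window)
    std_b = local_std(src_b, window)
    w_a = std_a / (std_a + std_b + 1e-8)  # normalized importance weights
    w_b = 1.0 - w_a
    score = w_a * ssim_map(fused, src_a, window) + w_b * ssim_map(fused, src_b, window)
    return (1.0 - score).mean()  # minimize 1 - weighted SSIM

Here fused, src_a, and src_b are (N, 1, H, W) tensors in [0, 1]; because every operation is a sliding-window average, the loss accepts inputs of any spatial size, consistent with the variable-size training described in the abstract.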
format Online
Article
Text
id pubmed-7699701
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7699701 2020-11-29 Structural Similarity Loss for Learning to Fuse Multi-Focus Images Yan, Xiang Gilani, Syed Zulqarnain Qin, Hanlin Mian, Ajmal Sensors (Basel) Article Convolutional neural networks have recently been used for multi-focus image fusion. However, some existing methods have resorted to adding Gaussian blur to focused images to simulate defocus, thereby generating data (with ground truth) for supervised learning. Moreover, they classify pixels as ‘focused’ or ‘defocused’ and use the classified results to construct the fusion weight maps, which necessitates a series of post-processing steps. In this paper, we present an end-to-end learning approach for directly predicting the fully focused output image from multi-focus input image pairs. The proposed approach uses a CNN architecture trained to perform fusion without the need for ground-truth fused images. The CNN computes its loss from the image structural similarity (SSIM), a metric that is widely accepted for fused image quality evaluation. In addition, the loss function uses the standard deviation of a local window of the image to automatically estimate the importance of each source image in the final fused image. Our network can accept images of variable sizes and hence we are able to utilize real benchmark datasets, instead of simulated ones, to train our network. The model is a feed-forward, fully convolutional neural network that can process images of variable sizes at test time. Extensive evaluation on benchmark datasets shows that our method outperforms, or is comparable with, existing state-of-the-art techniques on both objective and subjective benchmarks. MDPI 2020-11-20 /pmc/articles/PMC7699701/ /pubmed/33233568 http://dx.doi.org/10.3390/s20226647 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Yan, Xiang
Gilani, Syed Zulqarnain
Qin, Hanlin
Mian, Ajmal
Structural Similarity Loss for Learning to Fuse Multi-Focus Images
title Structural Similarity Loss for Learning to Fuse Multi-Focus Images
title_full Structural Similarity Loss for Learning to Fuse Multi-Focus Images
title_fullStr Structural Similarity Loss for Learning to Fuse Multi-Focus Images
title_full_unstemmed Structural Similarity Loss for Learning to Fuse Multi-Focus Images
title_short Structural Similarity Loss for Learning to Fuse Multi-Focus Images
title_sort structural similarity loss for learning to fuse multi-focus images
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7699701/
https://www.ncbi.nlm.nih.gov/pubmed/33233568
http://dx.doi.org/10.3390/s20226647
work_keys_str_mv AT yanxiang structuralsimilaritylossforlearningtofusemultifocusimages
AT gilanisyedzulqarnain structuralsimilaritylossforlearningtofusemultifocusimages
AT qinhanlin structuralsimilaritylossforlearningtofusemultifocusimages
AT mianajmal structuralsimilaritylossforlearningtofusemultifocusimages