HyFormer: Hybrid Transformer and CNN for Pixel-Level Multispectral Image Land Cover Classification

To effectively solve the problems that most convolutional neural networks cannot be applied to the pixelwise input in remote sensing (RS) classification and cannot adequately represent the spectral sequence information, we propose a new multispectral RS image classification framework called HyFormer...


Bibliographic Details
Main Authors: Yan, Chuan; Fan, Xiangsuo; Fan, Jinlong; Yu, Ling; Wang, Nayi; Chen, Lin; Li, Xuyang
Format: Online Article Text
Language: English
Published: MDPI 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9967485/
https://www.ncbi.nlm.nih.gov/pubmed/36833777
http://dx.doi.org/10.3390/ijerph20043059
author Yan, Chuan
Fan, Xiangsuo
Fan, Jinlong
Yu, Ling
Wang, Nayi
Chen, Lin
Li, Xuyang
collection PubMed
description Most convolutional neural networks cannot be applied to pixelwise input in remote sensing (RS) classification and cannot adequately represent spectral sequence information. To address both problems, we propose HyFormer, a new Transformer-based framework for multispectral RS image classification. First, we design a network that combines fully connected (FC) layers with a convolutional neural network (CNN): the 1D pixelwise spectral sequences obtained from the FC layers are reshaped into a 3D spectral feature matrix that serves as the CNN input. The FC layers raise the dimensionality of the features and increase their expressiveness, which solves the problem that a 2D CNN cannot achieve pixel-level classification. Second, features from three levels of the CNN are extracted and combined with the linearly transformed spectral information to strengthen the representation, and the result is used as the input of the Transformer encoder, so that the global modelling capability of the Transformer improves the CNN features. Finally, skip connections between adjacent encoders enhance the fusion of information from different levels. The pixel classification results are obtained by an MLP head. We focus on the feature distribution in the eastern part of Changxing County and the central part of Nanxun District, Zhejiang Province, and conduct experiments on Sentinel-2 multispectral RS images. For the Changxing County study area, the overall accuracy of HyFormer is 95.37%, compared with 94.15% for the Transformer (ViT). For the Nanxun District study area, the overall accuracy of HyFormer is 95.4%, compared with 94.69% for the Transformer (ViT). HyFormer therefore performs better than the Transformer on the Sentinel-2 dataset.
format Online
Article
Text
id pubmed-9967485
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9967485 2023-02-27 Int J Environ Res Public Health Article MDPI 2023-02-09 /pmc/articles/PMC9967485/ /pubmed/36833777 http://dx.doi.org/10.3390/ijerph20043059 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title HyFormer: Hybrid Transformer and CNN for Pixel-Level Multispectral Image Land Cover Classification
topic Article
work_keys_str_mv AT yanchuan hyformerhybridtransformerandcnnforpixellevelmultispectralimagelandcoverclassification
AT fanxiangsuo hyformerhybridtransformerandcnnforpixellevelmultispectralimagelandcoverclassification
AT fanjinlong hyformerhybridtransformerandcnnforpixellevelmultispectralimagelandcoverclassification
AT yuling hyformerhybridtransformerandcnnforpixellevelmultispectralimagelandcoverclassification
AT wangnayi hyformerhybridtransformerandcnnforpixellevelmultispectralimagelandcoverclassification
AT chenlin hyformerhybridtransformerandcnnforpixellevelmultispectralimagelandcoverclassification
AT lixuyang hyformerhybridtransformerandcnnforpixellevelmultispectralimagelandcoverclassification
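
The first step in the description above (passing one pixel's 1D spectral sequence through a fully connected layer and reshaping the result into a 3D feature matrix for a 2D CNN) can be illustrated with a minimal sketch. This is not the authors' implementation: the band count, the target shape, the random weights, and the helper names `fully_connected` and `reshape_to_3d` are all assumptions chosen only for illustration.

```python
import random


def fully_connected(spectrum, weights, bias):
    """One dense layer: out[j] = sum_i spectrum[i] * weights[i][j] + bias[j]."""
    return [
        sum(x * w_row[j] for x, w_row in zip(spectrum, weights)) + bias[j]
        for j in range(len(bias))
    ]


def reshape_to_3d(vector, channels, height, width):
    """Reshape a flat vector into a channels x height x width nested list."""
    if len(vector) != channels * height * width:
        raise ValueError("vector length must equal channels * height * width")
    it = iter(vector)
    return [
        [[next(it) for _ in range(width)] for _ in range(height)]
        for _ in range(channels)
    ]


# Hypothetical sizes, not taken from the paper.
NUM_BANDS = 10      # spectral values available for one pixel
C, H, W = 4, 8, 8   # shape of the 3D feature matrix passed to the CNN

rng = random.Random(0)
# Random placeholder weights; a trained model would learn these.
weights = [
    [rng.uniform(-0.1, 0.1) for _ in range(C * H * W)]
    for _ in range(NUM_BANDS)
]
bias = [0.0] * (C * H * W)

pixel_spectrum = [rng.random() for _ in range(NUM_BANDS)]  # one pixel, 1D
features = fully_connected(pixel_spectrum, weights, bias)  # length C*H*W
feature_cube = reshape_to_3d(features, C, H, W)            # 3D CNN input

print(len(feature_cube), len(feature_cube[0]), len(feature_cube[0][0]))
# prints: 4 8 8
```

The point of the sketch is only the shape change: a single pixel has no spatial neighbourhood, so expanding its spectrum with a dense layer and folding the output into a small grid gives a 2D CNN something to convolve over.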