HyFormer: Hybrid Transformer and CNN for Pixel-Level Multispectral Image Land Cover Classification
To effectively solve the problems that most convolutional neural networks cannot be applied to the pixelwise input in remote sensing (RS) classification and cannot adequately represent the spectral sequence information, we propose a new multispectral RS image classification framework called HyFormer...
Main authors: | Yan, Chuan; Fan, Xiangsuo; Fan, Jinlong; Yu, Ling; Wang, Nayi; Chen, Lin; Li, Xuyang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9967485/ https://www.ncbi.nlm.nih.gov/pubmed/36833777 http://dx.doi.org/10.3390/ijerph20043059 |
_version_ | 1784897276330639360 |
---|---|
author | Yan, Chuan; Fan, Xiangsuo; Fan, Jinlong; Yu, Ling; Wang, Nayi; Chen, Lin; Li, Xuyang |
author_sort | Yan, Chuan |
collection | PubMed |
description | To address two limitations of most convolutional neural networks in remote sensing (RS) classification, namely that they cannot be applied to pixelwise input and cannot adequately represent spectral sequence information, we propose HyFormer, a new Transformer-based multispectral RS image classification framework. First, a network combining a fully connected (FC) layer and a convolutional neural network (CNN) is designed: the 1D pixelwise spectral sequences obtained from the FC layers are reshaped into a 3D spectral feature matrix that serves as the input of the CNN. The FC layers raise the dimensionality of the features and increase their expressiveness, and the reshaping solves the problem that a 2D CNN cannot achieve pixel-level classification. Second, the features from three levels of the CNN are extracted and combined with the linearly transformed spectral information to enhance the information expression capability; the combined features are used as the input of the Transformer encoder, whose powerful global modelling capability improves the CNN features. Finally, skip connections between adjacent encoders enhance the fusion of information from different levels. The pixel classification results are obtained by an MLP head. The paper focuses mainly on the feature distribution in the eastern part of Changxing County and the central part of Nanxun District, Zhejiang Province, and conducts experiments based on Sentinel-2 multispectral RS images. For the Changxing County study area, the overall accuracy of HyFormer is 95.37%, compared with 94.15% for Transformer (ViT). For the Nanxun District study area, the overall accuracy of HyFormer is 95.4%, compared with 94.69% for Transformer (ViT). The performance of HyFormer on the Sentinel-2 dataset is therefore better than that of the Transformer. |
format | Online Article Text |
id | pubmed-9967485 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-9967485 2023-02-27 HyFormer: Hybrid Transformer and CNN for Pixel-Level Multispectral Image Land Cover Classification Yan, Chuan; Fan, Xiangsuo; Fan, Jinlong; Yu, Ling; Wang, Nayi; Chen, Lin; Li, Xuyang Int J Environ Res Public Health Article MDPI 2023-02-09 /pmc/articles/PMC9967485/ /pubmed/36833777 http://dx.doi.org/10.3390/ijerph20043059 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | HyFormer: Hybrid Transformer and CNN for Pixel-Level Multispectral Image Land Cover Classification |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9967485/ https://www.ncbi.nlm.nih.gov/pubmed/36833777 http://dx.doi.org/10.3390/ijerph20043059 |
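
The description field says that a fully connected layer expands each pixel's 1D spectral sequence and that the result is reshaped into a 3D feature matrix for the CNN. The following is a minimal, standard-library-only sketch of that single step, not the authors' implementation. The band count (10), the cube shape (4 x 3 x 3), the random weights, and the function names `fully_connected`, `reshape_to_3d`, and `pixel_to_feature_cube` are all illustrative assumptions; the record does not state the actual layer sizes, and the CNN, Transformer encoder, and MLP head stages are not shown.

```python
import random


def fully_connected(spectrum, weights, bias):
    """Linear layer: out[j] = sum_i spectrum[i] * weights[i][j] + bias[j]."""
    return [
        sum(x * w_row[j] for x, w_row in zip(spectrum, weights)) + bias[j]
        for j in range(len(bias))
    ]


def reshape_to_3d(vector, channels, height, width):
    """Reshape a flat vector into a channels x height x width nested list."""
    if len(vector) != channels * height * width:
        raise ValueError("vector length must equal channels * height * width")
    cube = []
    for c in range(channels):
        plane = []
        for h in range(height):
            start = (c * height + h) * width
            plane.append(vector[start:start + width])
        cube.append(plane)
    return cube


def pixel_to_feature_cube(spectrum, channels=4, height=3, width=3, seed=0):
    """Project one pixel's spectrum with an FC layer, then reshape to 3D.

    The shape and the random (untrained) weights are assumptions made
    only so the sketch runs; a real model would learn the weights.
    """
    rng = random.Random(seed)
    out_dim = channels * height * width
    weights = [
        [rng.uniform(-0.1, 0.1) for _ in range(out_dim)] for _ in spectrum
    ]
    bias = [0.0] * out_dim
    flat = fully_connected(spectrum, weights, bias)
    return reshape_to_3d(flat, channels, height, width)


# One pixel with 10 reflectance values (the band count is an assumption).
pixel = [0.12, 0.10, 0.09, 0.15, 0.30, 0.35, 0.38, 0.40, 0.25, 0.18]
cube = pixel_to_feature_cube(pixel)
print(len(cube), len(cube[0]), len(cube[0][0]))  # prints: 4 3 3
```

Because every pixel is projected and reshaped independently, a 2D convolution can run on the resulting cube even though the input was a single pixel rather than an image patch, which is the property the description attributes to this step.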