Dual Crisscross Attention Module for Road Extraction from Remote Sensing Images

Traditional pixel-based semantic segmentation methods for road extraction take each pixel as the recognition unit. Therefore, they are constrained by the restricted receptive field, in which pixels do not receive global road information. These phenomena greatly affect the accuracy of road extraction...

Bibliographic Details
Main Authors: Chen, Chuan; Zhao, Huilin; Cui, Wei; He, Xin
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8537039/
https://www.ncbi.nlm.nih.gov/pubmed/34696086
http://dx.doi.org/10.3390/s21206873
author Chen, Chuan
Zhao, Huilin
Cui, Wei
He, Xin
author_sort Chen, Chuan
collection PubMed
description Traditional pixel-based semantic segmentation methods for road extraction treat each pixel as the recognition unit. They are therefore constrained by a restricted receptive field, in which pixels do not receive global road information, and this greatly reduces the accuracy of road extraction. To enlarge the receptive field, non-local neural networks let each pixel receive global information. However, their space complexity is enormous, and they introduce considerable information redundancy in road extraction. To reduce the space complexity, the Crisscross Network (CCNet), which uses a crisscross-shaped attention area, is applied. The key component of CCNet is the Crisscross Attention (CCA) module. In contrast to non-local neural networks, CCNet lets each pixel perceive correlation information only from the horizontal and vertical directions. However, when CCNet is used for road extraction from remote sensing (RS) images, the directionality of its attention area is insufficient because it is restricted to the horizontal and vertical directions. Due to the recurrent mechanism, the similarity of some pixel pairs in oblique directions cannot be calculated correctly and will be intensely dilated. To address these problems, we propose a dedicated attention module for road extraction, the Dual Crisscross Attention (DCCA) module, which consists of the CCA module, a Rotated Crisscross Attention (RCCA) module and a Self-adaptive Attention Fusion (SAF) module. The DCCA module is embedded into the Dual Crisscross Network (DCNet). In the CCA and RCCA modules, the similarities of pixel pairs are represented by an energy map. To remove the influence of the heterogeneous part, a heterogeneous filter function (HFF) is used to filter the energy map. The SAF module then distributes the weights of the CCA and RCCA modules according to the actual road shape. The output of the DCCA module is the fusion of the CCA and RCCA modules by the SAF module, which lets pixels perceive local information and eight-direction non-local information. The geometric information of roads improves the accuracy of road extraction. The experimental results show that DCNet with the DCCA module improves the road IoU by 4.66% compared to CCNet with a single CCA module and by 3.47% compared to CCNet with a single RCCA module.
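The attention areas described in the abstract can be illustrated with a minimal sketch. It only enumerates which pixel positions a given pixel attends to; it does not implement the energy map, the HFF, or the SAF fusion. The function names are hypothetical, and the sketch assumes that the "rotated" crisscross area consists of the two 45-degree diagonals through the pixel, which is an interpretation of the abstract's "eight-direction" wording rather than something the record states.

```python
def crisscross_area(h, w, i, j):
    """Positions in the same row or column as (i, j), excluding (i, j).

    This is the horizontal/vertical attention area of a CCA-style module.
    """
    row = {(i, c) for c in range(w) if c != j}
    col = {(r, j) for r in range(h) if r != i}
    return row | col


def rotated_crisscross_area(h, w, i, j):
    """Positions on the two diagonals through (i, j), excluding (i, j).

    Assumption: the rotated area is the crisscross turned by 45 degrees.
    """
    area = set()
    for r in range(h):
        d = r - i
        if d == 0:
            continue
        # One diagonal moves right as the row grows, the other moves left.
        for c in (j + d, j - d):
            if 0 <= c < w:
                area.add((r, c))
    return area


def dual_area(h, w, i, j):
    """Union of both areas: the eight directions around (i, j)."""
    return crisscross_area(h, w, i, j) | rotated_crisscross_area(h, w, i, j)


if __name__ == "__main__":
    h = w = 5
    i = j = 2
    cca = crisscross_area(h, w, i, j)
    rcca = rotated_crisscross_area(h, w, i, j)
    both = dual_area(h, w, i, j)
    # Draw the areas: '+' for row/column, 'x' for diagonals, 'o' for the pixel.
    for r in range(h):
        line = ""
        for c in range(w):
            if (r, c) == (i, j):
                line += "o"
            elif (r, c) in cca:
                line += "+"
            elif (r, c) in rcca:
                line += "x"
            else:
                line += "."
        print(line)
    print(len(cca), len(rcca), len(both), "of", h * w - 1, "other pixels")
```

For the centre pixel of a 5 x 5 grid, each area covers 8 positions and their union covers 16 of the 24 other pixels, whereas a full non-local block would attend to all 24. The saving grows with image size, since each area scales with the side length and not with the pixel count.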
format Online
Article
Text
id pubmed-8537039
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8537039 2021-10-24 Dual Crisscross Attention Module for Road Extraction from Remote Sensing Images Chen, Chuan Zhao, Huilin Cui, Wei He, Xin Sensors (Basel) Article MDPI 2021-10-16 /pmc/articles/PMC8537039/ /pubmed/34696086 http://dx.doi.org/10.3390/s21206873 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Dual Crisscross Attention Module for Road Extraction from Remote Sensing Images
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8537039/
https://www.ncbi.nlm.nih.gov/pubmed/34696086
http://dx.doi.org/10.3390/s21206873