
Attention-Based Context Aware Network for Semantic Comprehension of Aerial Scenery

It is essential for researchers to have a proper interpretation of remote sensing images (RSIs) and precise semantic labeling of their component parts. Although FCN (Fully Convolutional Networks)-like deep convolutional network architectures have been widely applied in the perception of autonomous cars, there are still two challenges in the semantic segmentation of RSIs. The first is to identify details in high-resolution images with complex scenes and to solve the class-mismatch issues; the second is to capture the edge of objects finely without being confused by the surroundings. HRNET has the characteristics of maintaining high-resolution representation by fusing feature information with parallel multi-resolution convolution branches. We adopt HRNET as a backbone and propose to incorporate the Class-Oriented Region Attention Module (CRAM) and Class-Oriented Context Fusion Module (CCFM) to analyze the relationships between classes and patch regions and between classes and local or global pixels, respectively. Thus, the perception capability of the model for the detailed part in the aerial image can be enhanced. We leverage these modules to develop an end-to-end semantic segmentation model for aerial images and validate it on the ISPRS Potsdam and Vaihingen datasets. The experimental results show that our model improves the baseline accuracy and outperforms some commonly used CNN architectures.
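
The abstract describes the design only at a high level: an HRNet-style backbone that keeps a high-resolution feature stream, with two class-oriented attention modules (CRAM and CCFM) refining the features before per-pixel classification. The paper's own module definitions are not reproduced in this record, so the following is only a minimal PyTorch-style sketch of how such a backbone-plus-class-oriented-attention segmentation model could be wired together; the module internals, layer sizes, and names (PlaceholderBackbone, ClassRegionAttention, AerialSegNet) are illustrative assumptions, not the authors' implementation.

# Illustrative sketch only: backbone + class-oriented attention + segmentation head,
# loosely mirroring the structure described in the abstract (HRNet backbone, CRAM, CCFM).
# The placeholder modules below are assumptions made for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PlaceholderBackbone(nn.Module):
    """Stand-in for an HRNet-style backbone that keeps a high-resolution feature map."""
    def __init__(self, in_ch=3, feat_ch=64):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.BatchNorm2d(feat_ch), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.BatchNorm2d(feat_ch), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.stem(x)  # (B, feat_ch, H, W)


class ClassRegionAttention(nn.Module):
    """Hypothetical class-vs-region attention: coarse class scores re-weight region features."""
    def __init__(self, feat_ch, num_classes, regions=8):
        super().__init__()
        self.regions = regions
        self.classify = nn.Conv2d(feat_ch, num_classes, 1)
        self.project = nn.Conv2d(num_classes, feat_ch, 1)

    def forward(self, feat):
        coarse = self.classify(feat)                           # per-pixel class scores
        pooled = F.adaptive_avg_pool2d(coarse, self.regions)   # class scores per patch region
        attn = torch.sigmoid(self.project(pooled))             # region-level attention map
        attn = F.interpolate(attn, size=feat.shape[-2:], mode="bilinear", align_corners=False)
        return feat + feat * attn                              # residual re-weighting


class AerialSegNet(nn.Module):
    """End-to-end segmentation model: backbone -> class-oriented attention -> pixel classifier."""
    def __init__(self, num_classes=6, feat_ch=64):
        super().__init__()
        self.backbone = PlaceholderBackbone(feat_ch=feat_ch)
        self.attention = ClassRegionAttention(feat_ch, num_classes)
        self.head = nn.Conv2d(feat_ch, num_classes, 1)

    def forward(self, x):
        feat = self.backbone(x)
        feat = self.attention(feat)
        return self.head(feat)                                 # (B, num_classes, H, W) logits


if __name__ == "__main__":
    model = AerialSegNet(num_classes=6)                        # 6 classes, as in ISPRS Potsdam/Vaihingen
    with torch.no_grad():
        logits = model(torch.randn(1, 3, 256, 256))
    print(logits.shape)                                        # torch.Size([1, 6, 256, 256])

The sketch only mirrors the data flow (high-resolution backbone features, a class-guided attention step, then a segmentation head); the actual CRAM/CCFM designs and the HRNet multi-resolution fusion branches would replace the placeholders in the authors' model.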


Bibliographic Details
Main Authors: Shi, Weipeng, Qin, Wenhu, Yun, Zhonghua, Ping, Peng, Wu, Kaiyang, Qu, Yuke
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8002143/
https://www.ncbi.nlm.nih.gov/pubmed/33799737
http://dx.doi.org/10.3390/s21061983
_version_ 1783671394769305600
author Shi, Weipeng
Qin, Wenhu
Yun, Zhonghua
Ping, Peng
Wu, Kaiyang
Qu, Yuke
author_facet Shi, Weipeng
Qin, Wenhu
Yun, Zhonghua
Ping, Peng
Wu, Kaiyang
Qu, Yuke
author_sort Shi, Weipeng
collection PubMed
description It is essential for researchers to have a proper interpretation of remote sensing images (RSIs) and precise semantic labeling of their component parts. Although FCN (Fully Convolutional Networks)-like deep convolutional network architectures have been widely applied in the perception of autonomous cars, there are still two challenges in the semantic segmentation of RSIs. The first is to identify details in high-resolution images with complex scenes and to solve the class-mismatch issues; the second is to capture the edge of objects finely without being confused by the surroundings. HRNET has the characteristics of maintaining high-resolution representation by fusing feature information with parallel multi-resolution convolution branches. We adopt HRNET as a backbone and propose to incorporate the Class-Oriented Region Attention Module (CRAM) and Class-Oriented Context Fusion Module (CCFM) to analyze the relationships between classes and patch regions and between classes and local or global pixels, respectively. Thus, the perception capability of the model for the detailed part in the aerial image can be enhanced. We leverage these modules to develop an end-to-end semantic segmentation model for aerial images and validate it on the ISPRS Potsdam and Vaihingen datasets. The experimental results show that our model improves the baseline accuracy and outperforms some commonly used CNN architectures.
format Online
Article
Text
id pubmed-8002143
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8002143 2021-03-28 Attention-Based Context Aware Network for Semantic Comprehension of Aerial Scenery Shi, Weipeng Qin, Wenhu Yun, Zhonghua Ping, Peng Wu, Kaiyang Qu, Yuke Sensors (Basel) Article It is essential for researchers to have a proper interpretation of remote sensing images (RSIs) and precise semantic labeling of their component parts. Although FCN (Fully Convolutional Networks)-like deep convolutional network architectures have been widely applied in the perception of autonomous cars, there are still two challenges in the semantic segmentation of RSIs. The first is to identify details in high-resolution images with complex scenes and to solve the class-mismatch issues; the second is to capture the edge of objects finely without being confused by the surroundings. HRNET has the characteristics of maintaining high-resolution representation by fusing feature information with parallel multi-resolution convolution branches. We adopt HRNET as a backbone and propose to incorporate the Class-Oriented Region Attention Module (CRAM) and Class-Oriented Context Fusion Module (CCFM) to analyze the relationships between classes and patch regions and between classes and local or global pixels, respectively. Thus, the perception capability of the model for the detailed part in the aerial image can be enhanced. We leverage these modules to develop an end-to-end semantic segmentation model for aerial images and validate it on the ISPRS Potsdam and Vaihingen datasets. The experimental results show that our model improves the baseline accuracy and outperforms some commonly used CNN architectures. MDPI 2021-03-11 /pmc/articles/PMC8002143/ /pubmed/33799737 http://dx.doi.org/10.3390/s21061983 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Shi, Weipeng
Qin, Wenhu
Yun, Zhonghua
Ping, Peng
Wu, Kaiyang
Qu, Yuke
Attention-Based Context Aware Network for Semantic Comprehension of Aerial Scenery
title Attention-Based Context Aware Network for Semantic Comprehension of Aerial Scenery
title_full Attention-Based Context Aware Network for Semantic Comprehension of Aerial Scenery
title_fullStr Attention-Based Context Aware Network for Semantic Comprehension of Aerial Scenery
title_full_unstemmed Attention-Based Context Aware Network for Semantic Comprehension of Aerial Scenery
title_short Attention-Based Context Aware Network for Semantic Comprehension of Aerial Scenery
title_sort attention-based context aware network for semantic comprehension of aerial scenery
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8002143/
https://www.ncbi.nlm.nih.gov/pubmed/33799737
http://dx.doi.org/10.3390/s21061983
work_keys_str_mv AT shiweipeng attentionbasedcontextawarenetworkforsemanticcomprehensionofaerialscenery
AT qinwenhu attentionbasedcontextawarenetworkforsemanticcomprehensionofaerialscenery
AT yunzhonghua attentionbasedcontextawarenetworkforsemanticcomprehensionofaerialscenery
AT pingpeng attentionbasedcontextawarenetworkforsemanticcomprehensionofaerialscenery
AT wukaiyang attentionbasedcontextawarenetworkforsemanticcomprehensionofaerialscenery
AT quyuke attentionbasedcontextawarenetworkforsemanticcomprehensionofaerialscenery