Aerial scene understanding in the wild: Multi-scene recognition via prototype-based memory networks

Aerial scene recognition is a fundamental visual task and has attracted increasing research interest in the last few years. Most current research focuses on categorizing an aerial image into one scene-level label, while in real-world scenarios, there often exist multiple scenes i...

Bibliographic Details
Main authors: Hua, Yuansheng, Mou, Lichao, Lin, Jianzhe, Heidler, Konrad, Zhu, Xiao Xiang
Format: Online Article Text
Language: English
Published: Elsevier 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8218792/
https://www.ncbi.nlm.nih.gov/pubmed/34219969
http://dx.doi.org/10.1016/j.isprsjprs.2021.04.006
_version_ 1783710811888287744
author Hua, Yuansheng
Mou, Lichao
Lin, Jianzhe
Heidler, Konrad
Zhu, Xiao Xiang
author_facet Hua, Yuansheng
Mou, Lichao
Lin, Jianzhe
Heidler, Konrad
Zhu, Xiao Xiang
author_sort Hua, Yuansheng
collection PubMed
description Aerial scene recognition is a fundamental visual task and has attracted increasing research interest in the last few years. Most current research focuses on categorizing an aerial image into one scene-level label, while in real-world scenarios, there often exist multiple scenes in a single image. Therefore, in this paper, we take a step forward to a more practical and challenging task, namely multi-scene recognition in single images. Moreover, we note that manually producing annotations for such a task is extraordinarily time- and labor-consuming. To address this, we propose a prototype-based memory network that recognizes multiple scenes in a single image by leveraging massive well-annotated single-scene images. The proposed network consists of three key components: 1) a prototype learning module, 2) a prototype-inhabiting external memory, and 3) a multi-head attention-based memory retrieval module. More specifically, we first learn the prototype representation of each aerial scene from single-scene aerial image datasets and store it in an external memory. Afterwards, a multi-head attention-based memory retrieval module is devised to retrieve scene prototypes relevant to query multi-scene images for final predictions. Notably, only a limited number of annotated multi-scene images are needed in the training phase. To facilitate the progress of aerial scene recognition, we produce a new multi-scene aerial image (MAI) dataset. Experimental results on various dataset configurations demonstrate the effectiveness of our network. Our dataset and code are publicly available.
format Online
Article
Text
id pubmed-8218792
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-8218792 2021-07-01 Aerial scene understanding in the wild: Multi-scene recognition via prototype-based memory networks Hua, Yuansheng Mou, Lichao Lin, Jianzhe Heidler, Konrad Zhu, Xiao Xiang ISPRS J Photogramm Remote Sens Article Aerial scene recognition is a fundamental visual task and has attracted increasing research interest in the last few years. Most current research focuses on categorizing an aerial image into one scene-level label, while in real-world scenarios, there often exist multiple scenes in a single image. Therefore, in this paper, we take a step forward to a more practical and challenging task, namely multi-scene recognition in single images. Moreover, we note that manually producing annotations for such a task is extraordinarily time- and labor-consuming. To address this, we propose a prototype-based memory network that recognizes multiple scenes in a single image by leveraging massive well-annotated single-scene images. The proposed network consists of three key components: 1) a prototype learning module, 2) a prototype-inhabiting external memory, and 3) a multi-head attention-based memory retrieval module. More specifically, we first learn the prototype representation of each aerial scene from single-scene aerial image datasets and store it in an external memory. Afterwards, a multi-head attention-based memory retrieval module is devised to retrieve scene prototypes relevant to query multi-scene images for final predictions. Notably, only a limited number of annotated multi-scene images are needed in the training phase. To facilitate the progress of aerial scene recognition, we produce a new multi-scene aerial image (MAI) dataset. Experimental results on various dataset configurations demonstrate the effectiveness of our network. Our dataset and code are publicly available.
Elsevier 2021-07 /pmc/articles/PMC8218792/ /pubmed/34219969 http://dx.doi.org/10.1016/j.isprsjprs.2021.04.006 Text en © 2021 The Author(s) https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Hua, Yuansheng
Mou, Lichao
Lin, Jianzhe
Heidler, Konrad
Zhu, Xiao Xiang
Aerial scene understanding in the wild: Multi-scene recognition via prototype-based memory networks
title Aerial scene understanding in the wild: Multi-scene recognition via prototype-based memory networks
title_full Aerial scene understanding in the wild: Multi-scene recognition via prototype-based memory networks
title_fullStr Aerial scene understanding in the wild: Multi-scene recognition via prototype-based memory networks
title_full_unstemmed Aerial scene understanding in the wild: Multi-scene recognition via prototype-based memory networks
title_short Aerial scene understanding in the wild: Multi-scene recognition via prototype-based memory networks
title_sort aerial scene understanding in the wild: multi-scene recognition via prototype-based memory networks
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8218792/
https://www.ncbi.nlm.nih.gov/pubmed/34219969
http://dx.doi.org/10.1016/j.isprsjprs.2021.04.006
work_keys_str_mv AT huayuansheng aerialsceneunderstandinginthewildmultiscenerecognitionviaprototypebasedmemorynetworks
AT moulichao aerialsceneunderstandinginthewildmultiscenerecognitionviaprototypebasedmemorynetworks
AT linjianzhe aerialsceneunderstandinginthewildmultiscenerecognitionviaprototypebasedmemorynetworks
AT heidlerkonrad aerialsceneunderstandinginthewildmultiscenerecognitionviaprototypebasedmemorynetworks
AT zhuxiaoxiang aerialsceneunderstandinginthewildmultiscenerecognitionviaprototypebasedmemorynetworks
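
Illustrative note: the description field of this record names three components (prototype learning, an external memory of prototypes, and multi-head attention-based retrieval). The sketch below is one possible reading of that idea in plain Python, not the authors' implementation. Everything in it is an assumption made for illustration: prototypes are taken to be per-scene feature means, the attention heads simply split the feature vector into chunks instead of using learned projections, and the names `mean_prototypes`, `retrieve`, and the toy feature values are hypothetical.

```python
import math


def mean_prototypes(features_by_scene):
    """Build the memory: one prototype per scene, here the mean feature vector.

    Averaging is an assumed, simple stand-in for a learned prototype.
    """
    memory = {}
    for scene, feats in features_by_scene.items():
        dim = len(feats[0])
        memory[scene] = [sum(f[i] for f in feats) / len(feats) for i in range(dim)]
    return memory


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def retrieve(query, memory, num_heads=2):
    """Scaled dot-product attention of one query over the prototype memory.

    Each head attends over its own slice of the feature dimensions.
    Returns the concatenated retrieved vector and per-head attention weights.
    """
    scenes = list(memory)
    dim = len(query)
    if dim % num_heads != 0:
        raise ValueError("feature dimension must be divisible by num_heads")
    head_dim = dim // num_heads
    retrieved, weights_per_head = [], []
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        q = query[lo:hi]
        keys = [memory[s][lo:hi] for s in scenes]
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(head_dim) for k in keys]
        w = softmax(scores)
        weights_per_head.append(dict(zip(scenes, w)))
        # Weighted sum of prototype slices (keys double as values in this sketch).
        for i in range(head_dim):
            retrieved.append(sum(w[j] * keys[j][i] for j in range(len(scenes))))
    return retrieved, weights_per_head


# Toy single-scene features (made-up numbers, 4 dimensions each).
single_scene_features = {
    "beach": [[1.0, 0.0, 0.9, 0.1], [0.8, 0.2, 1.0, 0.0]],
    "farmland": [[0.0, 1.0, 0.1, 0.9], [0.2, 0.8, 0.0, 1.0]],
}
memory = mean_prototypes(single_scene_features)

# A made-up query whose first half resembles "beach" and second half "farmland".
query = [0.9, 0.1, 0.2, 0.8]
retrieved, weights = retrieve(query, memory, num_heads=2)
```

In this toy setup, different heads can favor different scene prototypes for the same query, which is the intuition behind using several heads when an image contains more than one scene. A real model would feed the retrieved vector to a multi-label classifier; that step is left out here.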