Aerial scene understanding in the wild: Multi-scene recognition via prototype-based memory networks
Aerial scene recognition is a fundamental visual task and has attracted an increasing research interest in the last few years. Most of current researches mainly deploy efforts to categorize an aerial image into one scene-level label, while in real-world scenarios, there often exist multiple scenes i...
Main authors: | Hua, Yuansheng; Mou, Lichao; Lin, Jianzhe; Heidler, Konrad; Zhu, Xiao Xiang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8218792/ https://www.ncbi.nlm.nih.gov/pubmed/34219969 http://dx.doi.org/10.1016/j.isprsjprs.2021.04.006 |
_version_ | 1783710811888287744 |
---|---|
author | Hua, Yuansheng Mou, Lichao Lin, Jianzhe Heidler, Konrad Zhu, Xiao Xiang |
author_sort | Hua, Yuansheng |
collection | PubMed |
description | Aerial scene recognition is a fundamental visual task and has attracted an increasing research interest in the last few years. Most of current researches mainly deploy efforts to categorize an aerial image into one scene-level label, while in real-world scenarios, there often exist multiple scenes in a single image. Therefore, in this paper, we propose to take a step forward to a more practical and challenging task, namely multi-scene recognition in single images. Moreover, we note that manually yielding annotations for such a task is extraordinarily time- and labor-consuming. To address this, we propose a prototype-based memory network to recognize multiple scenes in a single image by leveraging massive well-annotated single-scene images. The proposed network consists of three key components: 1) a prototype learning module, 2) a prototype-inhabiting external memory, and 3) a multi-head attention-based memory retrieval module. To be more specific, we first learn the prototype representation of each aerial scene from single-scene aerial image datasets and store it in an external memory. Afterwards, a multi-head attention-based memory retrieval module is devised to retrieve scene prototypes relevant to query multi-scene images for final predictions. Notably, only a limited number of annotated multi-scene images are needed in the training phase. To facilitate the progress of aerial scene recognition, we produce a new multi-scene aerial image (MAI) dataset. Experimental results on variant dataset configurations demonstrate the effectiveness of our network. Our dataset and codes are publicly available. |
format | Online Article Text |
id | pubmed-8218792 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-8218792 2021-07-01 Aerial scene understanding in the wild: Multi-scene recognition via prototype-based memory networks Hua, Yuansheng Mou, Lichao Lin, Jianzhe Heidler, Konrad Zhu, Xiao Xiang ISPRS J Photogramm Remote Sens Article Elsevier 2021-07 /pmc/articles/PMC8218792/ /pubmed/34219969 http://dx.doi.org/10.1016/j.isprsjprs.2021.04.006 Text en © 2021 The Author(s) https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). |
title | Aerial scene understanding in the wild: Multi-scene recognition via prototype-based memory networks |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8218792/ https://www.ncbi.nlm.nih.gov/pubmed/34219969 http://dx.doi.org/10.1016/j.isprsjprs.2021.04.006 |