
Scale Enhancement Pyramid Network for Small Object Detection from UAV Images

Object detection is challenging in large-scale images captured by unmanned aerial vehicles (UAVs), especially when detecting small objects with significant scale variation. Most solutions employ the fusion of different scale features by building multi-scale feature pyramids to ensure that the detail...
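The multi-scale fusion mentioned in the summary above can be illustrated with a minimal sketch. It assumes a generic top-down feature pyramid on 1D lists, not SEPNet's actual architecture: the function names, the factor-of-two spacing between levels, and the sample values are all hypothetical, and the learned lateral and smoothing convolutions of a real feature pyramid are omitted.

```python
def upsample_nearest(feature, factor=2):
    """Repeat each value `factor` times (nearest-neighbour upsampling in 1D)."""
    return [value for value in feature for _ in range(factor)]


def fuse_top_down(levels):
    """Fuse 1D feature maps ordered fine -> coarse by top-down addition.

    Assumption for this sketch: each level is exactly half as long as the
    one before it. Returns the fused maps in the same fine -> coarse order.
    """
    fused = [list(levels[-1])]  # the coarsest level is passed through as-is
    for fine in reversed(levels[:-1]):
        upsampled = upsample_nearest(fused[0])
        if len(upsampled) != len(fine):
            raise ValueError("each level must be half the length of the previous one")
        # Element-wise sum mixes coarse (semantic) and fine (detail) responses.
        fused.insert(0, [a + b for a, b in zip(fine, upsampled)])
    return fused


# Hypothetical feature maps: fine (length 8), middle (length 4), coarse (length 2).
levels = [
    [1, 0, 0, 2, 0, 0, 0, 1],
    [1, 2, 0, 1],
    [3, 1],
]
pyramid = fuse_top_down(levels)
for level in pyramid:
    print(level)
```

Each fused level carries the sum of its own values and everything upsampled from the coarser levels, which is the sense in which a pyramid combines detail with semantic information.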


Bibliographic Details
Main authors: Sun, Jian, Gao, Hongwei, Wang, Xuna, Yu, Jiahui
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9689004/
https://www.ncbi.nlm.nih.gov/pubmed/36421553
http://dx.doi.org/10.3390/e24111699
author Sun, Jian
Gao, Hongwei
Wang, Xuna
Yu, Jiahui
collection PubMed
description Object detection is challenging in large-scale images captured by unmanned aerial vehicles (UAVs), especially when detecting small objects with significant scale variation. Most solutions fuse features of different scales by building multi-scale feature pyramids so that both detail and semantic information are abundant. Although feature fusion benefits object detection, it still requires the long-range dependency information needed to detect small objects with significant scale variation. We propose a simple yet effective scale enhancement pyramid network (SEPNet) to address these problems. SEPNet consists of a context enhancement module (CEM) and a feature alignment module (FAM). Technically, the CEM combines multi-scale atrous convolution and multi-branch grouped convolution to model global relationships. Additionally, it enhances object feature representation, preventing features with lost spatial information from flowing into the feature pyramid network (FPN). The FAM adaptively learns pixel offsets to preserve feature consistency. It adjusts the location of sampling points in the convolutional kernel, effectively alleviating the information conflict caused by fusing adjacent features. Results indicate that SEPNet achieves an AP score of 18.9% on VisDrone, which is 7.1% higher than the AP score of the state-of-the-art detector RetinaNet, and an AP score of 81.5% on PASCAL VOC.
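The atrous (dilated) convolution named in the description can be sketched in one dimension. This is a minimal illustration under stated assumptions, not the CEM itself: the function names, the all-ones kernel, and the dilation rates 1, 2 and 3 are hypothetical choices made for the example, and the record does not say which rates SEPNet uses.

```python
def dilated_conv1d(signal, kernel, rate):
    """'Valid' 1D cross-correlation whose kernel taps are `rate` samples apart."""
    span = (len(kernel) - 1) * rate  # distance between the first and last tap
    return [
        sum(weight * signal[start + tap * rate] for tap, weight in enumerate(kernel))
        for start in range(len(signal) - span)
    ]


def receptive_field(kernel_size, rate):
    """Number of input positions covered by one output of a dilated kernel."""
    return (kernel_size - 1) * rate + 1


# A single impulse makes it easy to see which outputs each rate reaches.
signal = [0, 0, 0, 0, 1, 0, 0, 0, 0]
kernel = [1, 1, 1]

# One branch per (assumed) dilation rate, as in a multi-scale arrangement.
responses = {rate: dilated_conv1d(signal, kernel, rate) for rate in (1, 2, 3)}
for rate, response in responses.items():
    print(rate, receptive_field(len(kernel), rate), response)
```

With the same three weights, a larger rate widens the span of input each output sees, which is why stacking branches with different rates is a common way to gather context at several scales without adding parameters.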
format Online
Article
Text
id pubmed-9689004
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9689004 2022-11-25 Scale Enhancement Pyramid Network for Small Object Detection from UAV Images Sun, Jian Gao, Hongwei Wang, Xuna Yu, Jiahui Entropy (Basel) Article Object detection is challenging in large-scale images captured by unmanned aerial vehicles (UAVs), especially when detecting small objects with significant scale variation. Most solutions fuse features of different scales by building multi-scale feature pyramids so that both detail and semantic information are abundant. Although feature fusion benefits object detection, it still requires the long-range dependency information needed to detect small objects with significant scale variation. We propose a simple yet effective scale enhancement pyramid network (SEPNet) to address these problems. SEPNet consists of a context enhancement module (CEM) and a feature alignment module (FAM). Technically, the CEM combines multi-scale atrous convolution and multi-branch grouped convolution to model global relationships. Additionally, it enhances object feature representation, preventing features with lost spatial information from flowing into the feature pyramid network (FPN). The FAM adaptively learns pixel offsets to preserve feature consistency. It adjusts the location of sampling points in the convolutional kernel, effectively alleviating the information conflict caused by fusing adjacent features. Results indicate that SEPNet achieves an AP score of 18.9% on VisDrone, which is 7.1% higher than the AP score of the state-of-the-art detector RetinaNet, and an AP score of 81.5% on PASCAL VOC. MDPI 2022-11-21 /pmc/articles/PMC9689004/ /pubmed/36421553 http://dx.doi.org/10.3390/e24111699 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland.
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Scale Enhancement Pyramid Network for Small Object Detection from UAV Images
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9689004/
https://www.ncbi.nlm.nih.gov/pubmed/36421553
http://dx.doi.org/10.3390/e24111699