Scale Enhancement Pyramid Network for Small Object Detection from UAV Images

Object detection is challenging in large-scale images captured by unmanned aerial vehicles (UAVs), especially when detecting small objects with significant scale variation. Most solutions employ the fusion of different scale features by building multi-scale feature pyramids to ensure that the detail...

Bibliographic Details
Main Authors:	Sun, Jian; Gao, Hongwei; Wang, Xuna; Yu, Jiahui
Format:	Online Article Text
Language:	English
Published:	MDPI 2022
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9689004/ https://www.ncbi.nlm.nih.gov/pubmed/36421553 http://dx.doi.org/10.3390/e24111699

collection	PubMed
description	Object detection is challenging in large-scale images captured by unmanned aerial vehicles (UAVs), especially when detecting small objects with significant scale variation. Most solutions fuse features of different scales by building multi-scale feature pyramids so that both detail and semantic information are abundant. Although feature fusion benefits object detection, it still requires long-range dependency information, which is necessary for detecting small objects with significant scale variation. We propose a simple yet effective scale enhancement pyramid network (SEPNet) to address these problems. SEPNet consists of a context enhancement module (CEM) and a feature alignment module (FAM). Technically, the CEM combines multi-scale atrous convolution and multi-branch grouped convolution to model global relationships. Additionally, it enhances object feature representation, preventing features with lost spatial information from flowing into the feature pyramid network (FPN). The FAM adaptively learns pixel offsets to preserve feature consistency. It adjusts the location of sampling points in the convolutional kernel, effectively alleviating the information conflict caused by fusing adjacent features. Results indicate that SEPNet achieves an AP score of 18.9% on VisDrone, which is 7.1% higher than the AP score of the state-of-the-art detector RetinaNet, and an AP score of 81.5% on PASCAL VOC.
id	pubmed-9689004
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
journal	Entropy (Basel)
published	2022-11-21
license	© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
