Scale Enhancement Pyramid Network for Small Object Detection from UAV Images
Object detection is challenging in large-scale images captured by unmanned aerial vehicles (UAVs), especially when detecting small objects with significant scale variation. Most solutions employ the fusion of different scale features by building multi-scale feature pyramids to ensure that the detail...
Main authors: | Sun, Jian, Gao, Hongwei, Wang, Xuna, Yu, Jiahui |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9689004/ https://www.ncbi.nlm.nih.gov/pubmed/36421553 http://dx.doi.org/10.3390/e24111699 |
_version_ | 1784836413459529728 |
---|---|
author | Sun, Jian Gao, Hongwei Wang, Xuna Yu, Jiahui |
author_facet | Sun, Jian Gao, Hongwei Wang, Xuna Yu, Jiahui |
author_sort | Sun, Jian |
collection | PubMed |
description | Object detection is challenging in large-scale images captured by unmanned aerial vehicles (UAVs), especially when detecting small objects with significant scale variation. Most solutions fuse features of different scales by building multi-scale feature pyramids so that both detail and semantic information are abundant. Although feature fusion benefits object detection, it still requires the long-range dependency information needed to detect small objects with significant scale variation. We propose a simple yet effective scale enhancement pyramid network (SEPNet) to address these problems. SEPNet consists of a context enhancement module (CEM) and a feature alignment module (FAM). Technically, the CEM combines multi-scale atrous convolution and multi-branch grouped convolution to model global relationships. Additionally, it enhances object feature representation, preventing features with lost spatial information from flowing into the feature pyramid network (FPN). The FAM adaptively learns pixel offsets to preserve feature consistency. It adjusts the location of sampling points in the convolutional kernel, effectively alleviating the information conflict caused by fusing adjacent features. Results indicate that SEPNet achieves an AP score of 18.9% on VisDrone, which is 7.1% higher than the AP score of the state-of-the-art detector RetinaNet, and an AP score of 81.5% on PASCAL VOC. |
format | Online Article Text |
id | pubmed-9689004 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-9689004 2022-11-25 Scale Enhancement Pyramid Network for Small Object Detection from UAV Images Sun, Jian Gao, Hongwei Wang, Xuna Yu, Jiahui Entropy (Basel) Article Object detection is challenging in large-scale images captured by unmanned aerial vehicles (UAVs), especially when detecting small objects with significant scale variation. Most solutions fuse features of different scales by building multi-scale feature pyramids so that both detail and semantic information are abundant. Although feature fusion benefits object detection, it still requires the long-range dependency information needed to detect small objects with significant scale variation. We propose a simple yet effective scale enhancement pyramid network (SEPNet) to address these problems. SEPNet consists of a context enhancement module (CEM) and a feature alignment module (FAM). Technically, the CEM combines multi-scale atrous convolution and multi-branch grouped convolution to model global relationships. Additionally, it enhances object feature representation, preventing features with lost spatial information from flowing into the feature pyramid network (FPN). The FAM adaptively learns pixel offsets to preserve feature consistency. It adjusts the location of sampling points in the convolutional kernel, effectively alleviating the information conflict caused by fusing adjacent features. Results indicate that SEPNet achieves an AP score of 18.9% on VisDrone, which is 7.1% higher than the AP score of the state-of-the-art detector RetinaNet, and an AP score of 81.5% on PASCAL VOC. MDPI 2022-11-21 /pmc/articles/PMC9689004/ /pubmed/36421553 http://dx.doi.org/10.3390/e24111699 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland.
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Sun, Jian Gao, Hongwei Wang, Xuna Yu, Jiahui Scale Enhancement Pyramid Network for Small Object Detection from UAV Images |
title | Scale Enhancement Pyramid Network for Small Object Detection from UAV Images |
title_full | Scale Enhancement Pyramid Network for Small Object Detection from UAV Images |
title_fullStr | Scale Enhancement Pyramid Network for Small Object Detection from UAV Images |
title_full_unstemmed | Scale Enhancement Pyramid Network for Small Object Detection from UAV Images |
title_short | Scale Enhancement Pyramid Network for Small Object Detection from UAV Images |
title_sort | scale enhancement pyramid network for small object detection from uav images |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9689004/ https://www.ncbi.nlm.nih.gov/pubmed/36421553 http://dx.doi.org/10.3390/e24111699 |
work_keys_str_mv | AT sunjian scaleenhancementpyramidnetworkforsmallobjectdetectionfromuavimages AT gaohongwei scaleenhancementpyramidnetworkforsmallobjectdetectionfromuavimages AT wangxuna scaleenhancementpyramidnetworkforsmallobjectdetectionfromuavimages AT yujiahui scaleenhancementpyramidnetworkforsmallobjectdetectionfromuavimages |