ssFPN: Scale Sequence (S(2)) Feature-Based Feature Pyramid Network for Object Detection
Object detection is a fundamental task in computer vision. Over the past several years, convolutional neural network (CNN)-based object detection models have significantly improved detection accuracy in terms of average precision (AP). Furthermore, feature pyramid networks (FPNs) are essential module...
Main authors: | Park, Hye-Jin; Kang, Ji-Woo; Kim, Byung-Gyu |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10181723/ https://www.ncbi.nlm.nih.gov/pubmed/37177636 http://dx.doi.org/10.3390/s23094432 |
_version_ | 1785041642921656320 |
---|---|
author | Park, Hye-Jin Kang, Ji-Woo Kim, Byung-Gyu |
author_facet | Park, Hye-Jin Kang, Ji-Woo Kim, Byung-Gyu |
author_sort | Park, Hye-Jin |
collection | PubMed |
description | Object detection is a fundamental task in computer vision. Over the past several years, convolutional neural network (CNN)-based object detection models have significantly improved detection accuracy in terms of average precision (AP). Furthermore, feature pyramid networks (FPNs) are essential modules for object detection models to consider various object scales. However, the AP for small objects is lower than the AP for medium and large objects. It is difficult to recognize small objects because they do not have sufficient information, and information is lost in deeper CNN layers. This paper proposes a new FPN model named ssFPN (scale sequence ($S^2$) feature-based feature pyramid network) to detect multi-scale objects, especially small objects. We propose a new scale sequence ($S^2$) feature that is extracted by 3D convolution on the level axis of the FPN. It is defined and extracted from the FPN to strengthen the information on small objects based on scale-space theory. Motivated by this theory, we regard the FPN as a scale space and extract the scale sequence ($S^2$) feature by three-dimensional convolution on the level axis of the FPN. The defined feature is basically scale-invariant and is built on a high-resolution pyramid feature map for small objects. Additionally, the designed $S^2$ feature can be extended to most object detection models based on FPNs. We also designed a feature-level super-resolution approach to show the efficiency of the scale sequence ($S^2$) feature. We verified that the scale sequence ($S^2$) feature could improve the classification accuracy for low-resolution images by training a feature-level super-resolution model.
To demonstrate the effect of the scale sequence ($S^2$) feature, experiments with object detection models that incorporate it, including both one-stage and two-stage models, were conducted on the MS COCO dataset. For the two-stage object detection models Faster R-CNN and Mask R-CNN with the $S^2$ feature, AP improvements of up to 1.6% and 1.4%, respectively, were achieved. Additionally, the $AP_S$ of each model was improved by 1.2% and 1.1%, respectively. Furthermore, the one-stage object detection models in the YOLO series were improved. For YOLOv4-P5, YOLOv4-P6, YOLOR-P6, YOLOR-W6, and YOLOR-D6 with the $S^2$ feature, AP improvements of 0.9%, 0.5%, 0.5%, 0.1%, and 0.1%, respectively, were observed. For small object detection, the $AP_S$ increased by 1.1%, 1.1%, 0.9%, 0.4%, and 0.1%, respectively. Experiments using the feature-level super-resolution approach with the proposed scale sequence ($S^2$) feature were conducted on the CIFAR-100 dataset. By training the feature-level super-resolution model, we verified that ResNet-101 with the $S^2$ feature trained on LR images achieved a 55.2% classification accuracy, which was 1.6% higher than for ResNet-101 trained on HR images. |
format | Online Article Text |
id | pubmed-10181723 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10181723 2023-05-13 ssFPN: Scale Sequence (S(2)) Feature-Based Feature Pyramid Network for Object Detection Park, Hye-Jin Kang, Ji-Woo Kim, Byung-Gyu Sensors (Basel) Article MDPI 2023-04-30 /pmc/articles/PMC10181723/ /pubmed/37177636 http://dx.doi.org/10.3390/s23094432 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Park, Hye-Jin Kang, Ji-Woo Kim, Byung-Gyu ssFPN: Scale Sequence (S(2)) Feature-Based Feature Pyramid Network for Object Detection |
title | ssFPN: Scale Sequence (S(2)) Feature-Based Feature Pyramid Network for Object Detection |
title_full | ssFPN: Scale Sequence (S(2)) Feature-Based Feature Pyramid Network for Object Detection |
title_fullStr | ssFPN: Scale Sequence (S(2)) Feature-Based Feature Pyramid Network for Object Detection |
title_full_unstemmed | ssFPN: Scale Sequence (S(2)) Feature-Based Feature Pyramid Network for Object Detection |
title_short | ssFPN: Scale Sequence (S(2)) Feature-Based Feature Pyramid Network for Object Detection |
title_sort | ssfpn: scale sequence (s(2)) feature-based feature pyramid network for object detection |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10181723/ https://www.ncbi.nlm.nih.gov/pubmed/37177636 http://dx.doi.org/10.3390/s23094432 |
work_keys_str_mv | AT parkhyejin ssfpnscalesequences2featurebasedfeaturepyramidnetworkforobjectdetection AT kangjiwoo ssfpnscalesequences2featurebasedfeaturepyramidnetworkforobjectdetection AT kimbyunggyu ssfpnscalesequences2featurebasedfeaturepyramidnetworkforobjectdetection |
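
The abstract in this record describes treating the FPN levels as a scale space and applying a 3D convolution along the level axis. The following is a minimal single-channel sketch of that idea in plain Python, not the authors' implementation. Everything in it is an assumption made for illustration: the function names (`upsample_nearest`, `stack_levels`, `conv3d_over_levels`), the nearest-neighbour resizing to the finest level, the zero padding in the spatial axes, and the toy kernel weights. A real model would use learned multi-channel kernels in a deep learning framework.

```python
# Illustrative sketch only: names, resizing scheme, and kernel values are
# assumptions, not taken from the ssFPN paper or its code.

def upsample_nearest(fmap, out_h, out_w):
    """Resize a 2D map (list of rows) to out_h x out_w by nearest neighbour."""
    in_h, in_w = len(fmap), len(fmap[0])
    return [[fmap[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
            for y in range(out_h)]


def stack_levels(pyramid):
    """Resize every level to the finest level's size and stack them.

    pyramid[0] is assumed to be the highest-resolution level. The result is
    indexed as volume[level][y][x], so the first axis is the level axis.
    """
    out_h, out_w = len(pyramid[0]), len(pyramid[0][0])
    return [upsample_nearest(level, out_h, out_w) for level in pyramid]


def conv3d_over_levels(volume, kernel):
    """3D cross-correlation over (level, y, x).

    Zero padding keeps the spatial size; the level axis is left unpadded, so
    a kernel as deep as the pyramid collapses it to a single plane.
    """
    n_l, h, w = len(volume), len(volume[0]), len(volume[0][0])
    k_l, k_h, k_w = len(kernel), len(kernel[0]), len(kernel[0][0])
    pad_h, pad_w = k_h // 2, k_w // 2
    out = []
    for l in range(n_l - k_l + 1):
        plane = []
        for y in range(h):
            row = []
            for x in range(w):
                acc = 0.0
                for dl in range(k_l):
                    for dy in range(k_h):
                        for dx in range(k_w):
                            yy, xx = y + dy - pad_h, x + dx - pad_w
                            if 0 <= yy < h and 0 <= xx < w:
                                acc += kernel[dl][dy][dx] * volume[l + dl][yy][xx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Toy pyramid: three single-channel levels at 4x4, 2x2, and 1x1.
pyramid = [
    [[4.0] * 4 for _ in range(4)],
    [[8.0] * 2 for _ in range(2)],
    [[16.0]],
]
# A 3x1x1 kernel that mixes the three levels at each pixel.
kernel = [[[0.5]], [[0.25]], [[0.25]]]

feature = conv3d_over_levels(stack_levels(pyramid), kernel)
print(len(feature), len(feature[0]), len(feature[0][0]))
```

Because the toy kernel spans all three levels, the output has one plane at the resolution of the finest level, which mirrors the abstract's statement that the feature is built on a high-resolution pyramid feature map.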