Swin-Transformer-Based YOLOv5 for Small-Object Detection in Remote Sensing Images

This study aimed to address the problems of low detection accuracy and inaccurate positioning of small-object detection in remote sensing images. An improved architecture based on the Swin Transformer and YOLOv5 is proposed. First, Complete-IOU (CIOU) was introduced to improve the K-means clustering...

Bibliographic Details
Main authors:	Cao, Xuan, Zhang, Yanwei, Lang, Song, Gong, Yan
Format:	Online Article Text
Language:	English
Published:	MDPI 2023
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10098803/ https://www.ncbi.nlm.nih.gov/pubmed/37050694 http://dx.doi.org/10.3390/s23073634

_version_	1785024902124797952
author	Cao, Xuan Zhang, Yanwei Lang, Song Gong, Yan
author_facet	Cao, Xuan Zhang, Yanwei Lang, Song Gong, Yan
author_sort	Cao, Xuan
collection	PubMed
description	This study addresses the low detection accuracy and inaccurate localization of small objects in remote sensing images. An improved architecture based on the Swin Transformer and YOLOv5 is proposed. First, Complete-IOU (CIOU) was introduced to improve the K-means clustering algorithm, which was then used to generate anchors of appropriate size for the dataset. Second, a modified CSPDarknet53 structure combined with the Swin Transformer was proposed to retain sufficient global context information and extract more differentiated features through multi-head self-attention. For the path-aggregation neck, a simple and efficient weighted bidirectional feature pyramid network was proposed for effective cross-scale feature fusion. In addition, an extra prediction head and new feature fusion layers were added for small objects. Finally, Coordinate Attention (CA) was introduced into the YOLOv5 network to improve the accuracy of small-object features in remote sensing images. The effectiveness of the proposed method was demonstrated through several experiments on DOTA (Dataset for Object Detection in Aerial Images). The mean average precision (mAP) on the DOTA dataset reached 74.7%. Compared with YOLOv5, the proposed method improved mAP by 8.9%, achieving higher accuracy for small-object detection in remote sensing images.
format	Online Article Text
id	pubmed-10098803
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-10098803 2023-04-14 Swin-Transformer-Based YOLOv5 for Small-Object Detection in Remote Sensing Images Cao, Xuan Zhang, Yanwei Lang, Song Gong, Yan Sensors (Basel) Article MDPI 2023-03-31 /pmc/articles/PMC10098803/ /pubmed/37050694 http://dx.doi.org/10.3390/s23073634 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title	Swin-Transformer-Based YOLOv5 for Small-Object Detection in Remote Sensing Images
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10098803/ https://www.ncbi.nlm.nih.gov/pubmed/37050694 http://dx.doi.org/10.3390/s23073634
