YOLOv5s-FP: A Novel Method for In-Field Pear Detection Using a Transformer Encoder and Multi-Scale Collaboration Perception

Precise pear detection and recognition is an essential step toward modernizing orchard management. However, due to the ubiquitous occlusion in orchards and various locations of image acquisition, the pears in the acquired images may be quite small and occluded, causing high false detection and objec...

Bibliographic Details
Main authors: Li, Yipu, Rao, Yuan, Jin, Xiu, Jiang, Zhaohui, Wang, Yuwei, Wang, Tan, Wang, Fengyi, Luo, Qing, Liu, Lu
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9823628/
https://www.ncbi.nlm.nih.gov/pubmed/36616628
http://dx.doi.org/10.3390/s23010030
_version_ 1784866206925193216
author Li, Yipu
Rao, Yuan
Jin, Xiu
Jiang, Zhaohui
Wang, Yuwei
Wang, Tan
Wang, Fengyi
Luo, Qing
Liu, Lu
author_facet Li, Yipu
Rao, Yuan
Jin, Xiu
Jiang, Zhaohui
Wang, Yuwei
Wang, Tan
Wang, Fengyi
Luo, Qing
Liu, Lu
author_sort Li, Yipu
collection PubMed
description Precise pear detection and recognition is an essential step toward modernizing orchard management. However, due to the ubiquitous occlusion in orchards and various locations of image acquisition, the pears in the acquired images may be quite small and occluded, causing high false detection and object loss rate. In this paper, a multi-scale collaborative perception network YOLOv5s-FP (Fusion and Perception) was proposed for pear detection, which coupled local and global features. Specifically, a pear dataset with a high proportion of small and occluded pears was proposed, comprising 3680 images acquired with cameras mounted on a ground tripod and a UAV platform. The cross-stage partial (CSP) module was optimized to extract global features through a transformer encoder, which was then fused with local features by an attentional feature fusion mechanism. Subsequently, a modified path aggregation network oriented to collaboration perception of multi-scale features was proposed by incorporating a transformer encoder, the optimized CSP, and new skip connections. The quantitative results of utilizing the YOLOv5s-FP for pear detection were compared with other typical object detection networks of the YOLO series, recording the highest average precision of 96.12% with less detection time and computational cost. In qualitative experiments, the proposed network achieved superior visual performance with stronger robustness to the changes in occlusion and illumination conditions, particularly providing the ability to detect pears with different sizes in highly dense, overlapping environments and non-normal illumination areas. Therefore, the proposed YOLOv5s-FP network was practicable for detecting in-field pears in a real-time and accurate way, which could be an advantageous component of the technology for monitoring pear growth status and implementing automated harvesting in unmanned orchards.
format Online
Article
Text
id pubmed-9823628
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-98236282023-01-08 YOLOv5s-FP: A Novel Method for In-Field Pear Detection Using a Transformer Encoder and Multi-Scale Collaboration Perception Li, Yipu Rao, Yuan Jin, Xiu Jiang, Zhaohui Wang, Yuwei Wang, Tan Wang, Fengyi Luo, Qing Liu, Lu Sensors (Basel) Article Precise pear detection and recognition is an essential step toward modernizing orchard management. However, due to the ubiquitous occlusion in orchards and various locations of image acquisition, the pears in the acquired images may be quite small and occluded, causing high false detection and object loss rate. In this paper, a multi-scale collaborative perception network YOLOv5s-FP (Fusion and Perception) was proposed for pear detection, which coupled local and global features. Specifically, a pear dataset with a high proportion of small and occluded pears was proposed, comprising 3680 images acquired with cameras mounted on a ground tripod and a UAV platform. The cross-stage partial (CSP) module was optimized to extract global features through a transformer encoder, which was then fused with local features by an attentional feature fusion mechanism. Subsequently, a modified path aggregation network oriented to collaboration perception of multi-scale features was proposed by incorporating a transformer encoder, the optimized CSP, and new skip connections. The quantitative results of utilizing the YOLOv5s-FP for pear detection were compared with other typical object detection networks of the YOLO series, recording the highest average precision of 96.12% with less detection time and computational cost. In qualitative experiments, the proposed network achieved superior visual performance with stronger robustness to the changes in occlusion and illumination conditions, particularly providing the ability to detect pears with different sizes in highly dense, overlapping environments and non-normal illumination areas. 
Therefore, the proposed YOLOv5s-FP network was practicable for detecting in-field pears in a real-time and accurate way, which could be an advantageous component of the technology for monitoring pear growth status and implementing automated harvesting in unmanned orchards. MDPI 2022-12-20 /pmc/articles/PMC9823628/ /pubmed/36616628 http://dx.doi.org/10.3390/s23010030 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Li, Yipu
Rao, Yuan
Jin, Xiu
Jiang, Zhaohui
Wang, Yuwei
Wang, Tan
Wang, Fengyi
Luo, Qing
Liu, Lu
YOLOv5s-FP: A Novel Method for In-Field Pear Detection Using a Transformer Encoder and Multi-Scale Collaboration Perception
title YOLOv5s-FP: A Novel Method for In-Field Pear Detection Using a Transformer Encoder and Multi-Scale Collaboration Perception
title_full YOLOv5s-FP: A Novel Method for In-Field Pear Detection Using a Transformer Encoder and Multi-Scale Collaboration Perception
title_fullStr YOLOv5s-FP: A Novel Method for In-Field Pear Detection Using a Transformer Encoder and Multi-Scale Collaboration Perception
title_full_unstemmed YOLOv5s-FP: A Novel Method for In-Field Pear Detection Using a Transformer Encoder and Multi-Scale Collaboration Perception
title_short YOLOv5s-FP: A Novel Method for In-Field Pear Detection Using a Transformer Encoder and Multi-Scale Collaboration Perception
title_sort yolov5s-fp: a novel method for in-field pear detection using a transformer encoder and multi-scale collaboration perception
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9823628/
https://www.ncbi.nlm.nih.gov/pubmed/36616628
http://dx.doi.org/10.3390/s23010030
work_keys_str_mv AT liyipu yolov5sfpanovelmethodforinfieldpeardetectionusingatransformerencoderandmultiscalecollaborationperception
AT raoyuan yolov5sfpanovelmethodforinfieldpeardetectionusingatransformerencoderandmultiscalecollaborationperception
AT jinxiu yolov5sfpanovelmethodforinfieldpeardetectionusingatransformerencoderandmultiscalecollaborationperception
AT jiangzhaohui yolov5sfpanovelmethodforinfieldpeardetectionusingatransformerencoderandmultiscalecollaborationperception
AT wangyuwei yolov5sfpanovelmethodforinfieldpeardetectionusingatransformerencoderandmultiscalecollaborationperception
AT wangtan yolov5sfpanovelmethodforinfieldpeardetectionusingatransformerencoderandmultiscalecollaborationperception
AT wangfengyi yolov5sfpanovelmethodforinfieldpeardetectionusingatransformerencoderandmultiscalecollaborationperception
AT luoqing yolov5sfpanovelmethodforinfieldpeardetectionusingatransformerencoderandmultiscalecollaborationperception
AT liulu yolov5sfpanovelmethodforinfieldpeardetectionusingatransformerencoderandmultiscalecollaborationperception