Real-time dense small object detection algorithm based on multi-modal tea shoots

INTRODUCTION: The difficulties in tea shoot recognition are that the recognition is affected by lighting conditions, it is challenging to segment images with similar backgrounds to the shoot color, and the occlusion and overlap between leaves. METHODS: To solve the problem of low accuracy of dense s...

Bibliographic Details
Main authors: Shuai, Luyu, Chen, Ziao, Li, Zhiyong, Li, Hongdan, Zhang, Boda, Wang, Yuchao, Mu, Jiong
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects: Plant Science
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10391178/
https://www.ncbi.nlm.nih.gov/pubmed/37534292
http://dx.doi.org/10.3389/fpls.2023.1224884
_version_ 1785082645927952384
author Shuai, Luyu
Chen, Ziao
Li, Zhiyong
Li, Hongdan
Zhang, Boda
Wang, Yuchao
Mu, Jiong
author_facet Shuai, Luyu
Chen, Ziao
Li, Zhiyong
Li, Hongdan
Zhang, Boda
Wang, Yuchao
Mu, Jiong
author_sort Shuai, Luyu
collection PubMed
description INTRODUCTION: Tea shoot recognition is difficult because it is affected by lighting conditions, because images whose backgrounds resemble the shoot color are hard to segment, and because leaves occlude and overlap one another. METHODS: To solve the problem of low accuracy in dense small object detection of tea shoots, this paper proposes a real-time dense small object detection algorithm based on multimodal optimization. First, RGB, depth, and infrared images are collected to form a multimodal image set, and complete shoot object labeling is performed. Then, the YOLOv5 model is improved and applied to dense and tiny tea shoot detection. Next, based on the improved YOLOv5 model, this paper designs two data layer-based multimodal image fusion methods and a feature layer-based multimodal image fusion method; meanwhile, a cross-modal fusion module (FFA) based on frequency domain and attention mechanisms is designed for the feature layer fusion method to adaptively align and focus critical regions in intra- and inter-modal channel and frequency domain dimensions. Finally, an objective-based scale matching method is developed to further improve the detection performance of small dense objects in natural environments with the assistance of transfer learning techniques. RESULTS AND DISCUSSION: The experimental results indicate that the improved YOLOv5 model increases the mAP50 value by 1.7% over the benchmark model while using fewer parameters and less computational effort. Compared with the single modality, the multimodal image fusion method increases the mAP50 value in all cases, with the method introducing the FFA module obtaining the highest mAP50 value of 0.827. When the pre-training strategy is applied after scale matching, the mAP values improve by 1% and 1.4% on the two datasets. The research idea of multimodal optimization in this paper can provide a basis and technical support for dense small object detection.
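The description reports results as mAP50, that is, mean average precision where a predicted box counts as correct only if its intersection over union (IoU) with a ground-truth box is at least 0.5. The following is a minimal sketch of that matching criterion only, not the authors' evaluation code; the names `Box`, `iou`, and `is_true_positive` are hypothetical, and boxes are assumed to be given as (x_min, y_min, x_max, y_max).

```python
from typing import Tuple

# Assumed box layout (hypothetical): (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Return the intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if any.
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp to zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """Apply the IoU >= 0.5 criterion that the '50' in mAP50 refers to."""
    return iou(pred, truth) >= threshold
```

For small, densely packed objects such as tea shoots, a shift of a few pixels changes the IoU considerably, which is one reason this threshold is demanding for small object detection.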
format Online
Article
Text
id pubmed-10391178
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10391178 2023-08-02 Real-time dense small object detection algorithm based on multi-modal tea shoots Shuai, Luyu Chen, Ziao Li, Zhiyong Li, Hongdan Zhang, Boda Wang, Yuchao Mu, Jiong Front Plant Sci Plant Science INTRODUCTION: Tea shoot recognition is difficult because it is affected by lighting conditions, because images whose backgrounds resemble the shoot color are hard to segment, and because leaves occlude and overlap one another. METHODS: To solve the problem of low accuracy in dense small object detection of tea shoots, this paper proposes a real-time dense small object detection algorithm based on multimodal optimization. First, RGB, depth, and infrared images are collected to form a multimodal image set, and complete shoot object labeling is performed. Then, the YOLOv5 model is improved and applied to dense and tiny tea shoot detection. Next, based on the improved YOLOv5 model, this paper designs two data layer-based multimodal image fusion methods and a feature layer-based multimodal image fusion method; meanwhile, a cross-modal fusion module (FFA) based on frequency domain and attention mechanisms is designed for the feature layer fusion method to adaptively align and focus critical regions in intra- and inter-modal channel and frequency domain dimensions. Finally, an objective-based scale matching method is developed to further improve the detection performance of small dense objects in natural environments with the assistance of transfer learning techniques. RESULTS AND DISCUSSION: The experimental results indicate that the improved YOLOv5 model increases the mAP50 value by 1.7% over the benchmark model while using fewer parameters and less computational effort. Compared with the single modality, the multimodal image fusion method increases the mAP50 value in all cases, with the method introducing the FFA module obtaining the highest mAP50 value of 0.827. When the pre-training strategy is applied after scale matching, the mAP values improve by 1% and 1.4% on the two datasets. The research idea of multimodal optimization in this paper can provide a basis and technical support for dense small object detection. Frontiers Media S.A. 2023-07-18 /pmc/articles/PMC10391178/ /pubmed/37534292 http://dx.doi.org/10.3389/fpls.2023.1224884 Text en Copyright © 2023 Shuai, Chen, Li, Li, Zhang, Wang and Mu https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Plant Science
Shuai, Luyu
Chen, Ziao
Li, Zhiyong
Li, Hongdan
Zhang, Boda
Wang, Yuchao
Mu, Jiong
Real-time dense small object detection algorithm based on multi-modal tea shoots
title Real-time dense small object detection algorithm based on multi-modal tea shoots
title_full Real-time dense small object detection algorithm based on multi-modal tea shoots
title_fullStr Real-time dense small object detection algorithm based on multi-modal tea shoots
title_full_unstemmed Real-time dense small object detection algorithm based on multi-modal tea shoots
title_short Real-time dense small object detection algorithm based on multi-modal tea shoots
title_sort real-time dense small object detection algorithm based on multi-modal tea shoots
topic Plant Science
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10391178/
https://www.ncbi.nlm.nih.gov/pubmed/37534292
http://dx.doi.org/10.3389/fpls.2023.1224884
work_keys_str_mv AT shuailuyu realtimedensesmallobjectdetectionalgorithmbasedonmultimodalteashoots
AT chenziao realtimedensesmallobjectdetectionalgorithmbasedonmultimodalteashoots
AT lizhiyong realtimedensesmallobjectdetectionalgorithmbasedonmultimodalteashoots
AT lihongdan realtimedensesmallobjectdetectionalgorithmbasedonmultimodalteashoots
AT zhangboda realtimedensesmallobjectdetectionalgorithmbasedonmultimodalteashoots
AT wangyuchao realtimedensesmallobjectdetectionalgorithmbasedonmultimodalteashoots
AT mujiong realtimedensesmallobjectdetectionalgorithmbasedonmultimodalteashoots