Real-time dense small object detection algorithm based on multi-modal tea shoots
INTRODUCTION: Tea shoot recognition is difficult because it is affected by lighting conditions, because backgrounds similar in color to the shoots are hard to segment, and because leaves occlude and overlap one another. METHODS: To solve the problem of low accuracy of dense s...
Main Authors: | Shuai, Luyu; Chen, Ziao; Li, Zhiyong; Li, Hongdan; Zhang, Boda; Wang, Yuchao; Mu, Jiong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2023 |
Subjects: | Plant Science |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10391178/ https://www.ncbi.nlm.nih.gov/pubmed/37534292 http://dx.doi.org/10.3389/fpls.2023.1224884 |
_version_ | 1785082645927952384 |
---|---|
author | Shuai, Luyu Chen, Ziao Li, Zhiyong Li, Hongdan Zhang, Boda Wang, Yuchao Mu, Jiong |
author_facet | Shuai, Luyu Chen, Ziao Li, Zhiyong Li, Hongdan Zhang, Boda Wang, Yuchao Mu, Jiong |
author_sort | Shuai, Luyu |
collection | PubMed |
description | INTRODUCTION: Tea shoot recognition is difficult because it is affected by lighting conditions, because backgrounds similar in color to the shoots are hard to segment, and because leaves occlude and overlap one another. METHODS: To solve the problem of low accuracy in dense small object detection of tea shoots, this paper proposes a real-time dense small object detection algorithm based on multimodal optimization. First, RGB, depth, and infrared images are collected to form a multimodal image set, and complete shoot object labeling is performed. Second, the YOLOv5 model is improved and applied to dense and tiny tea shoot detection. Third, based on the improved YOLOv5 model, this paper designs two data layer-based multimodal image fusion methods and one feature layer-based multimodal image fusion method; for the feature layer fusion method, a cross-modal fusion module (FFA) based on frequency domain and attention mechanisms is designed to adaptively align and focus on critical regions in intra- and inter-modal channel and frequency domain dimensions. Finally, an objective-based scale matching method is developed to further improve the detection performance of small dense objects in natural environments with the assistance of transfer learning techniques. RESULTS AND DISCUSSION: The experimental results indicate that the improved YOLOv5 model increases the mAP50 value by 1.7% compared to the benchmark model while using fewer parameters and less computation. Compared with a single modality, the multimodal image fusion methods increase the mAP50 value in all cases, with the method introducing the FFA module obtaining the highest mAP50 value of 0.827. When the pre-training strategy is applied after scale matching, the mAP values improve by 1% and 1.4% on the two datasets. The research idea of multimodal optimization in this paper can provide a basis and technical support for dense small object detection. |
format | Online Article Text |
id | pubmed-10391178 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-10391178 2023-08-02 Real-time dense small object detection algorithm based on multi-modal tea shoots Shuai, Luyu Chen, Ziao Li, Zhiyong Li, Hongdan Zhang, Boda Wang, Yuchao Mu, Jiong Front Plant Sci Plant Science INTRODUCTION: Tea shoot recognition is difficult because it is affected by lighting conditions, because backgrounds similar in color to the shoots are hard to segment, and because leaves occlude and overlap one another. METHODS: To solve the problem of low accuracy in dense small object detection of tea shoots, this paper proposes a real-time dense small object detection algorithm based on multimodal optimization. First, RGB, depth, and infrared images are collected to form a multimodal image set, and complete shoot object labeling is performed. Second, the YOLOv5 model is improved and applied to dense and tiny tea shoot detection. Third, based on the improved YOLOv5 model, this paper designs two data layer-based multimodal image fusion methods and one feature layer-based multimodal image fusion method; for the feature layer fusion method, a cross-modal fusion module (FFA) based on frequency domain and attention mechanisms is designed to adaptively align and focus on critical regions in intra- and inter-modal channel and frequency domain dimensions. Finally, an objective-based scale matching method is developed to further improve the detection performance of small dense objects in natural environments with the assistance of transfer learning techniques. RESULTS AND DISCUSSION: The experimental results indicate that the improved YOLOv5 model increases the mAP50 value by 1.7% compared to the benchmark model while using fewer parameters and less computation. Compared with a single modality, the multimodal image fusion methods increase the mAP50 value in all cases, with the method introducing the FFA module obtaining the highest mAP50 value of 0.827. When the pre-training strategy is applied after scale matching, the mAP values improve by 1% and 1.4% on the two datasets. The research idea of multimodal optimization in this paper can provide a basis and technical support for dense small object detection. Frontiers Media S.A. 2023-07-18 /pmc/articles/PMC10391178/ /pubmed/37534292 http://dx.doi.org/10.3389/fpls.2023.1224884 Text en Copyright © 2023 Shuai, Chen, Li, Li, Zhang, Wang and Mu https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Plant Science Shuai, Luyu Chen, Ziao Li, Zhiyong Li, Hongdan Zhang, Boda Wang, Yuchao Mu, Jiong Real-time dense small object detection algorithm based on multi-modal tea shoots |
title | Real-time dense small object detection algorithm based on multi-modal tea shoots |
title_full | Real-time dense small object detection algorithm based on multi-modal tea shoots |
title_fullStr | Real-time dense small object detection algorithm based on multi-modal tea shoots |
title_full_unstemmed | Real-time dense small object detection algorithm based on multi-modal tea shoots |
title_short | Real-time dense small object detection algorithm based on multi-modal tea shoots |
title_sort | real-time dense small object detection algorithm based on multi-modal tea shoots |
topic | Plant Science |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10391178/ https://www.ncbi.nlm.nih.gov/pubmed/37534292 http://dx.doi.org/10.3389/fpls.2023.1224884 |
work_keys_str_mv | AT shuailuyu realtimedensesmallobjectdetectionalgorithmbasedonmultimodalteashoots AT chenziao realtimedensesmallobjectdetectionalgorithmbasedonmultimodalteashoots AT lizhiyong realtimedensesmallobjectdetectionalgorithmbasedonmultimodalteashoots AT lihongdan realtimedensesmallobjectdetectionalgorithmbasedonmultimodalteashoots AT zhangboda realtimedensesmallobjectdetectionalgorithmbasedonmultimodalteashoots AT wangyuchao realtimedensesmallobjectdetectionalgorithmbasedonmultimodalteashoots AT mujiong realtimedensesmallobjectdetectionalgorithmbasedonmultimodalteashoots |
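
The description above reports results as mAP50, which conventionally means mean average precision computed with an intersection-over-union (IoU) threshold of 0.5 for matching a predicted box to a ground-truth box. The following is a minimal sketch of that matching criterion only, not the authors' evaluation code; the function names `iou` and `is_true_positive` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) tuples; this layout is an assumption
    for illustration, not taken from the record above.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero
    # when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, truth_box, threshold=0.5):
    """A prediction counts as a match when IoU >= threshold (0.5 for mAP50)."""
    return iou(pred_box, truth_box) >= threshold
```

For small, densely packed objects such as tea shoots, a shift of a few pixels changes the IoU substantially, which is one reason small object detection tends to score lower on this metric than detection of large objects.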