Prediction of plant-level tomato biomass and yield using machine learning with unmanned aerial vehicle imagery

BACKGROUND: The objective of this study is twofold. First, ascertain the important variables that predict tomato yields from plant height (PH) and vegetation index (VI) maps. The maps were derived from images taken by unmanned aerial vehicles (UAVs). Second, examine the accuracy of predictions of to...

Bibliographic Details
Main Authors: Tatsumi, Kenichi; Igarashi, Noa; Mengxue, Xiao
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8281694/
https://www.ncbi.nlm.nih.gov/pubmed/34266447
http://dx.doi.org/10.1186/s13007-021-00761-2
_version_ 1783722879264751616
author Tatsumi, Kenichi
Igarashi, Noa
Mengxue, Xiao
author_facet Tatsumi, Kenichi
Igarashi, Noa
Mengxue, Xiao
author_sort Tatsumi, Kenichi
collection PubMed
description BACKGROUND: The objective of this study is twofold. First, ascertain the important variables that predict tomato yields from plant height (PH) and vegetation index (VI) maps. The maps were derived from images taken by unmanned aerial vehicles (UAVs). Second, examine the accuracy of predictions of tomato fresh shoot masses (SM), fruit weights (FW), and the number of fruits (FN) from multiple machine learning algorithms using selected variable sets. To realize our objective, ultra-high-resolution RGB and multispectral images were collected by a UAV on ten days during the 2020 tomato growing season. From these images, 756 total variables, including first- (e.g., average, standard deviation, skewness, range, and maximum) and second-order (e.g., gray-level co-occurrence matrix features and growth rates of PH and VIs) statistics for each plant, were extracted. Several selection algorithms (i.e., Boruta, DALEX, genetic algorithm, least absolute shrinkage and selection operator, and recursive feature elimination) were used to select the variable sets useful for predicting SM, FW, and FN. Random forests, ridge regressions, and support vector machines were used to predict the yield using the top five selected variable sets. RESULTS: First-order statistics of PH and VIs collected during the early to mid-fruit formation periods, about one month prior to harvest, were important variables for predicting SM. Similar to the case for SM, variables collected approximately one month prior to harvest were important for predicting FW and FN. Furthermore, variables related to PH were unimportant for prediction. Compared with predictions obtained using only first-order statistics, those obtained using the second-order statistics of VIs were more accurate for FW and FN. The prediction accuracy of SM, FW, and FN by models constructed from all variables (rRMSE = 8.8–28.1%) was better than that from first-order statistics (rRMSE = 10.0–50.1%).
CONCLUSIONS: In addition to basic statistics (e.g., average and standard deviation), we derived second-order statistics of PH and VIs at the plant level using the ultra-high-resolution UAV images. Our findings indicated that our variable selection method reduced the number of variables needed for tomato yield prediction, improving the efficiency of phenotypic data collection and assisting with the selection of high-yield lines within breeding programs. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13007-021-00761-2.
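The abstract reports accuracy as rRMSE but does not define it. As a rough illustration only, the sketch below assumes the common definition: the root-mean-square error divided by the mean of the observed values, expressed as a percentage. The function name `rrmse` and the sample numbers are hypothetical and are not taken from the study.

```python
import math


def rrmse(observed, predicted):
    """Relative RMSE in percent.

    Assumed definition: RMSE divided by the mean of the observed values,
    times 100. The article may normalize differently.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all plants.
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    mean_observed = sum(observed) / len(observed)
    return 100.0 * math.sqrt(mse) / mean_observed


# Hypothetical per-plant fruit weights in grams (illustrative values only).
observed = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 310.0]
print(round(rrmse(observed, predicted), 1))  # prints 5.0
```

Under this definition, a lower value means the typical prediction error is small relative to the average observed value, which is how the reported ranges (8.8–28.1% versus 10.0–50.1%) can be compared across SM, FW, and FN despite their different units.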
format Online
Article
Text
id pubmed-8281694
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8281694 2021-07-16 Prediction of plant-level tomato biomass and yield using machine learning with unmanned aerial vehicle imagery Tatsumi, Kenichi Igarashi, Noa Mengxue, Xiao Plant Methods Research
BioMed Central 2021-07-15 /pmc/articles/PMC8281694/ /pubmed/34266447 http://dx.doi.org/10.1186/s13007-021-00761-2 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Tatsumi, Kenichi
Igarashi, Noa
Mengxue, Xiao
Prediction of plant-level tomato biomass and yield using machine learning with unmanned aerial vehicle imagery
title Prediction of plant-level tomato biomass and yield using machine learning with unmanned aerial vehicle imagery
title_full Prediction of plant-level tomato biomass and yield using machine learning with unmanned aerial vehicle imagery
title_fullStr Prediction of plant-level tomato biomass and yield using machine learning with unmanned aerial vehicle imagery
title_full_unstemmed Prediction of plant-level tomato biomass and yield using machine learning with unmanned aerial vehicle imagery
title_short Prediction of plant-level tomato biomass and yield using machine learning with unmanned aerial vehicle imagery
title_sort prediction of plant-level tomato biomass and yield using machine learning with unmanned aerial vehicle imagery
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8281694/
https://www.ncbi.nlm.nih.gov/pubmed/34266447
http://dx.doi.org/10.1186/s13007-021-00761-2
work_keys_str_mv AT tatsumikenichi predictionofplantleveltomatobiomassandyieldusingmachinelearningwithunmannedaerialvehicleimagery
AT igarashinoa predictionofplantleveltomatobiomassandyieldusingmachinelearningwithunmannedaerialvehicleimagery
AT mengxuexiao predictionofplantleveltomatobiomassandyieldusingmachinelearningwithunmannedaerialvehicleimagery