
Leveraging machine learning tools and algorithms for analysis of fruit fly morphometrics

Analysis of landmark-based morphometric measurements taken on body parts of insects has been a useful taxonomic approach alongside DNA barcoding in insect identification. Statistical analysis of morphometrics has largely been dominated by traditional methods and approaches such as principal compon...


Bibliographic Details
Main authors: Salifu, Daisy; Ibrahim, Eric Ali; Tonnang, Henri E. Z.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9065030/
https://www.ncbi.nlm.nih.gov/pubmed/35505067
http://dx.doi.org/10.1038/s41598-022-11258-w
author Salifu, Daisy
Ibrahim, Eric Ali
Tonnang, Henri E. Z.
collection PubMed
description Analysis of landmark-based morphometric measurements taken on body parts of insects has been a useful taxonomic approach alongside DNA barcoding in insect identification. Statistical analysis of morphometrics has largely been dominated by traditional methods such as principal component analysis (PCA), canonical variate analysis (CVA) and discriminant analysis (DA). However, advances in computing power have created a paradigm shift towards modern tools such as machine learning. Herein, we assess the predictive performance of four machine learning classifiers: K-nearest neighbor (KNN), random forest (RF), support vector machines (SVMs; linear, polynomial and radial kernels) and artificial neural networks (ANNs), on fruit fly morphometrics that were previously analysed using PCA and CVA. KNN and RF performed poorly, with overall model accuracy lower than the “no-information rate” (NIR) (p value > 0.1). The SVM models had a predictive accuracy of > 95%, significantly higher than the NIR (p < 0.001), with Kappa > 0.78 and an area under the curve (AUC) of the receiver operating characteristic > 0.91; the ANN model had a predictive accuracy of 96%, significantly higher than the NIR, with a Kappa of 0.83 and an AUC of 0.98. Wing veins 2, 3, 8, 10, 14 and tibia length were of higher importance than other variables based on both SVM and ANN models. We conclude that SVM and ANN models could be used to discriminate fruit fly species, or any other morphologically similar pest taxa, based on wing vein and tibia length measurements. These algorithms could be used as candidates for developing integrated, smart application software for insect discrimination and identification. The variable importance results in this study would be useful to future studies in deciding what must be measured.
format Online
Article
Text
id pubmed-9065030
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-9065030 2022-05-04 Sci Rep Article Nature Publishing Group UK 2022-05-03 /pmc/articles/PMC9065030/ /pubmed/35505067 http://dx.doi.org/10.1038/s41598-022-11258-w Text en © The Author(s) 2022
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Leveraging machine learning tools and algorithms for analysis of fruit fly morphometrics
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9065030/
https://www.ncbi.nlm.nih.gov/pubmed/35505067
http://dx.doi.org/10.1038/s41598-022-11258-w
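
The description field in this record judges each classifier by its accuracy relative to the no-information rate (NIR) and by Cohen's Kappa. As a rough illustration of what those quantities measure, the sketch below computes them in plain Python. It is not the authors' code: the record does not say which software they used, the function names are hypothetical, the labels are made-up toy data, and the one-sided exact binomial test is only one common way to obtain a "accuracy > NIR" p value.

```python
from math import comb


def no_information_rate(y_true):
    """Share of the most frequent class, i.e. the accuracy of always
    predicting the majority class."""
    counts = {}
    for label in y_true:
        counts[label] = counts.get(label, 0) + 1
    return max(counts.values()) / len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Agreement beyond chance: (p_o - p_e) / (1 - p_e).
    Undefined (division by zero) when expected agreement p_e equals 1."""
    n = len(y_true)
    classes = set(y_true) | set(y_pred)
    p_o = accuracy(y_true, y_pred)
    p_e = sum((y_true.count(c) / n) * (y_pred.count(c) / n) for c in classes)
    return (p_o - p_e) / (1 - p_e)


def p_value_accuracy_gt_nir(y_true, y_pred):
    """One-sided exact binomial test (assumed here for illustration):
    probability of at least the observed number of correct predictions
    if the true success rate were equal to the NIR."""
    n = len(y_true)
    k = sum(t == p for t, p in zip(y_true, y_pred))
    nir = no_information_rate(y_true)
    return sum(comb(n, i) * nir**i * (1 - nir) ** (n - i) for i in range(k, n + 1))


# Toy labels for two hypothetical species "A" and "B" (not data from the study).
y_true = ["A"] * 6 + ["B"] * 4
y_pred = ["A"] * 5 + ["B"] + ["B"] * 3 + ["A"]

print("NIR:     ", no_information_rate(y_true))
print("Accuracy:", accuracy(y_true, y_pred))
print("Kappa:   ", cohen_kappa(y_true, y_pred))
print("p value: ", p_value_accuracy_gt_nir(y_true, y_pred))
```

A model whose accuracy does not exceed the NIR, as reported for KNN and RF in the description, does no better than always guessing the most common species, which is why a high raw accuracy on imbalanced classes can be uninformative on its own.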